Thursday, March 29, 2012

Distributed processing "farm" using Service Broker as middleware.

In our current project, we are attempting to use Service Broker as the middleware for a "queuing" solution that helps us throttle and load-balance incoming messages. These messages could potentially be either long-running jobs or simply an incoming flood of messages received from our integration web services, which are invoked via BizTalk.

For posting messages onto the "to do" queue, we are implementing the "fire & forget" pattern that has been discussed here previously and on Remus' blog. This aspect of the design seems to work fine for us. Currently, activation is occurring on the target queue, but we don't want to hold a thread open there on the server while waiting for our synchronous calls to the various web services on the processing "farm" to complete.
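For illustration, a bare-bones version of that posting side might look something like the following sketch; the service, contract, and message type names here are invented placeholders, not the project's actual objects.

    DECLARE @handle UNIQUEIDENTIFIER;

    BEGIN DIALOG CONVERSATION @handle
        FROM SERVICE [InitiatorService]
        TO SERVICE 'TargetService'
        ON CONTRACT [JobContract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @handle
        MESSAGE TYPE [JobRequest]
        (N'<job>payload goes here</job>');

    -- In the strict "fire & forget" variant the initiator ends its side of the
    -- conversation right after the SEND and never sees a reply or an error;
    -- other variants leave ending the conversation to the target instead.
    END CONVERSATION @handle;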

The next evolution of our design is to move activation off of the primary SQL cluster itself (i.e. activation currently happens on the clustered SQL boxes) and onto the individual processing nodes. For this model, we are looking at using SQL Express on each of the nodes, as has been suggested here on the forums for similar scenarios.

For resilience to node failures, we do not want to "route" the messages to individual nodes hosting their own queues, but rather have those nodes do a "read" from the primary queue, perform the per-message processing, and issue either a committed EndDialog or EndDialogWithError based on the success of each task/message.

To invoke the processing on each of the nodes, we need some mechanism to send a "wake up" and trigger the "reads", since no message is sent to the node itself that could drive activation off a queue receiving the actual "job". Along those lines, we are considering a "wake up" queue on each node: a simple "wake up" message could be sent to all nodes/queues, and activation on those queues would then invoke the processing "loop" for each node.
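A rough sketch of what that broadcast could look like from the central server, assuming one wake-up service per node (all names below are invented for illustration):

    -- Send an empty "wake up" notification to one node's wake-up service;
    -- repeat (or loop over a node list) for every node in the farm.
    DECLARE @handle UNIQUEIDENTIFIER;

    BEGIN DIALOG CONVERSATION @handle
        FROM SERVICE [CentralNotifyService]
        TO SERVICE 'Node1WakeUpService'
        ON CONTRACT [WakeUpContract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @handle MESSAGE TYPE [WakeUp];

    END CONVERSATION @handle;

Activation on each node's wake-up queue would then kick off that node's processing loop.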

My question is how to best establish this "wake up" call to each node. I think I've read about a queue that is internal to Broker itself that receives a message when new items are received in any queue. My initial thought is to put activation on that queue and have a procedure that sends the "wake up" to each of the nodes in our processing farm.

I am looking for any input where others have attempted to solve this type of problem with Broker.

Regards,

Bradley A. Hehe

Hi Brad!

Have you already had a look at the external activation mechanism available in Service Broker? With external activation you can subscribe to the event that fires when a new message arrives at a queue. You can then activate an external application (like a console application or a WinForms application) that processes the original message from the queue outside of SQL Server, in its own process. Doing the processing of a message in an external process should solve the problem you described.

You can find a sample here (http://www.gotdotnet.com/codegallery/codegallery.aspx?id=9f7ae2af-31aa-44dd-9ee8-6b6b6d3d6319) (the ExternalActivator sample) that implements the whole infrastructure needed for this task, along with some documentation on setting it up.
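For context, the external activation mechanism works off a QUEUE_ACTIVATION event notification; setting one up looks roughly like this (the queue and service names below are placeholders, not taken from the sample):

    -- Queue and service that will receive the QUEUE_ACTIVATION event notifications.
    CREATE QUEUE ExternalActivatorQueue;

    CREATE SERVICE ExternalActivatorService
        ON QUEUE ExternalActivatorQueue
        ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);

    -- Raise a notification whenever activation would start on the job queue.
    CREATE EVENT NOTIFICATION JobQueueActivationNotification
        ON QUEUE JobQueue
        FOR QUEUE_ACTIVATION
        TO SERVICE 'ExternalActivatorService', 'current database';

The external activator application then receives from ExternalActivatorQueue and launches the configured program outside of SQL Server.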

HTH

Klaus Aschenbrenner
www.csharp.at
www.sqljunkies.com/weblog/klaus.aschenbrenner

|||

Brad Hehe wrote:

For resilience to node failures, we do not want to "route" the messages to individual nodes hosting their own queues, but rather have those nodes do a "read" from the primary queue, perform the per-message processing, and issue either a committed EndDialog or EndDialogWithError based on the success of each task/message.

I have a little trouble understanding the reasoning here. What is the role of the SQL Express instance on each node? If you want to do reads from the node using TDS (i.e. ADO.NET or OLE DB), then why do you need a SQL Express instance on the node? If you want to involve the SQL Express instances in linked servers (distributed queries) or other forms of DTC, then you would be far better off routing the messages instead.
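For comparison, routing to a node's own instance is mostly a matter of route definitions on top of Service Broker endpoints; a minimal sketch with invented names and addresses:

    -- On the central server: route the node's service to its SQL Express instance.
    -- Assumes Service Broker endpoints are already created and permissioned on both sides.
    CREATE ROUTE Node1Route
        WITH SERVICE_NAME = 'Node1ProcessingService',
             ADDRESS = 'TCP://node1.example.com:4022';

    -- On the node: a route back to the initiating service on the cluster.
    CREATE ROUTE ClusterRoute
        WITH SERVICE_NAME = 'CentralJobService',
             ADDRESS = 'TCP://sqlcluster.example.com:4022';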

HTH,
~ Remus

|||

As for ExternalActivator, I'll go back and examine that one again. I was under the impression that I could only have 1 external "activator" listening to the events. In my case, I'd have a "farm" of "external" processes listening for events to process.

As for Remus's reply, I was looking to solve the problem without developing anything beyond a "procedure" for activation on the processing node itself. I was trying to avoid creating a "service" (i.e. a Windows service) or some similar mechanism that polls the main database cluster for jobs. I was hoping to send some form of "wake up" or "heartbeat" to the processing nodes to tell them to go out, find work, and process through it... something more event-driven than polling-based.

|||

Unfortunately, there's no way for a procedure to read a message from a queue on a remote machine. I think you really need to have your processing nodes open a connection to the database where the queue is and receive messages. This is actually much simpler than any of the solutions suggested so far: you just write a program that opens a connection, receives messages from the queue, processes them, and then receives more messages. If you need more processing, just start up more jobs.

If one of the processing nodes goes down, the connections will be broken, the transactions will roll back, the messages will appear on the queue again, and another node can pick them up. This is absolutely the most efficient way to do load balancing, because the processing nodes only pull as much work off the queue as they can handle. As soon as they're done, they pull more; if there's no work left, they wait on a receive, so no resources are wasted. If something goes wrong with a processing application or node, the transaction rolls back and another node or application picks it up.

If instead you pull messages out into a local database on each processing node to process them, then if one of the processing nodes goes down, it takes the messages with it and they don't get handled until the node comes back up. They aren't lost, but they are delayed.

|||

I follow you there Roger. Let me run this past you then.

The folks who preceded me on this effort went with Broker to avoid going down the road of developing a "service" on each node that "polls" a simple table sitting on the server. Essentially, your proposed solution is doing the same, correct? It is "polling" a Broker queue rather than a table, with the benefits of conversation grouping, sequencing, etc. All of those are "gains" from using Broker over a table. Additionally, there is the "WAITFOR" syntax, and it is this "WAITFOR" syntax that I'm thinking justifies using Broker over the traditional polling approach.

From a resource-friendliness view of the world, is Broker (A) the same in server resource utilization as, or (B) more efficient than, the traditional approach of polling/updating a "jobs" table? Given that processing cycles and I/O on the cluster are among our most expensive resources, we are looking to ensure we use them wisely. The previous thought process was that polling would be hammering an already very busy database.

|||

Polling is efficient only if there's always something in the table to select each time. A select that returns nothing is a waste of resources, so unless the incoming rate is at least as large as your maximum processing rate, polling is inefficient. WAITFOR alleviates this by not returning until there is something to return. Service Broker is optimized for pulling a message from the queue using the least possible resources. Because Service Broker is part of the database, it can do things directly that an external application can't.
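To make that concrete, the loop each processing node would run over its connection to the central database is essentially a WAITFOR (RECEIVE ...) inside a transaction. A rough sketch, with the queue, message type, and processing procedure names invented for illustration:

    DECLARE @handle  UNIQUEIDENTIFIER,
            @msg     XML,
            @msgType SYSNAME;

    WHILE (1 = 1)
    BEGIN
        SET @handle = NULL;

        BEGIN TRANSACTION;

        -- Block until a message arrives or the timeout expires.
        WAITFOR (
            RECEIVE TOP (1)
                @handle  = conversation_handle,
                @msg     = CAST(message_body AS XML),
                @msgType = message_type_name
            FROM JobQueue
        ), TIMEOUT 5000;

        IF @handle IS NULL
        BEGIN
            ROLLBACK TRANSACTION;   -- nothing arrived within the timeout; wait again
            CONTINUE;
        END;

        IF @msgType = 'JobRequest'          -- assumed application message type
        BEGIN
            BEGIN TRY
                EXEC dbo.ProcessJob @msg;   -- the per-message work (assumed procedure)
                END CONVERSATION @handle;   -- committed "EndDialog" on success
            END TRY
            BEGIN CATCH
                END CONVERSATION @handle
                    WITH ERROR = 50001 DESCRIPTION = 'Processing failed';
            END CATCH;
        END
        ELSE IF @msgType = 'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
            END CONVERSATION @handle;

        COMMIT TRANSACTION;
    END;

If a node dies mid-processing, its connection breaks, the open transaction rolls back, and the message becomes available to the other nodes again. Real code would also need to handle doomed transactions (XACT_STATE) and poison messages.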
