Category: ESB Portal


Most developers who have worked with BizTalk for a while will realize the benefits of using direct binding on orchestration ports to improve flexibility and loose coupling. What is not often realized is that using direct rather than specify-later port binding comes with a trade-off: the increased flexibility results in less hand-holding for administrators, and it is entirely possible for them to stop or unenlist an orchestration or send port, resulting in routing failures and unprocessed messages which might be difficult to replay. Seeing as guaranteed delivery is one of the biggest selling points on most projects involving BizTalk Server, this is too big a hole to overlook. In this blog post I will detail how hand-holding is relaxed for direct binding and introduce error handling patterns that can be used in orchestrations to overcome routing failures.

One of the things most BizTalk administrators might notice when orchestrations use direct binding is that starting an orchestration no longer requires starting all the send ports that the orchestration publishes messages to. At run time this means that the loose coupling offered by direct binding has removed the guarantee that subscribers to your published messages will be active. If an orchestration publishes a message for which there are no subscribers, then a non-resumable routing failure instance will be created and a PersistenceException will be raised in the orchestration.

I have had some pretty concerned colleagues ask me about the dreaded PersistenceException. This is nothing more than a curiously named exception representing a routing failure. The way I tend to deal with such exceptions in guaranteed delivery scenarios is to create a scope around my send shape (possibly around other closely associated shapes as well) and to catch the PersistenceException (this is in the Microsoft.XLANGs.BaseTypes namespace and its assembly is referenced by default in BizTalk Server projects). If the exception is encountered, I raise an alert to administrators (via the ESB Portal or relevant exception handling framework) advising of the routing failure, and suspend the orchestration instance. Once the administrators have fixed the problem they can resume the orchestration, which will loop back to before the send shape and resend the message.
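The control flow of that pattern can be sketched in plain C#. This is only an illustration of the loop, not orchestration code: RoutingFailureException is a stand-in I have made up so the sketch compiles outside BizTalk (in a real orchestration you would catch the PersistenceException in a scope's exception block), and the send, alert and waitForResume delegates are hypothetical placeholders for the send shape, the alerting call and the suspend shape.

```csharp
using System;

// Stand-in for the routing failure exception so the sketch compiles outside
// BizTalk. Hypothetical type: a real orchestration catches PersistenceException.
public class RoutingFailureException : Exception
{
    public RoutingFailureException(string message) : base(message) { }
}

public static class GuaranteedSend
{
    // send:          publishes the message; throws if nothing subscribes to it.
    // alert:         notifies administrators of the routing failure.
    // waitForResume: stands in for the suspend shape; returns once resumed.
    // Returns the number of send attempts that were needed.
    public static int SendWithResume(Action send, Action<string> alert, Action waitForResume)
    {
        int attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                send();
                return attempts;          // delivered, leave the loop
            }
            catch (RoutingFailureException ex)
            {
                alert("Routing failure: " + ex.Message);
                waitForResume();          // loop back and resend once resumed
            }
        }
    }

    public static void Main()
    {
        bool sendPortEnlisted = false;    // simulates an unenlisted send port

        int attempts = SendWithResume(
            () => { if (!sendPortEnlisted) throw new RoutingFailureException("no subscribers found"); },
            Console.WriteLine,
            () => { sendPortEnlisted = true; });   // the administrator fixes the port

        Console.WriteLine("Delivered after " + attempts + " attempt(s)");
    }
}
```

The point of the loop is that the message is never dropped: the only way out is a successful send.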

Single Message

Now of course this pattern comes with a bit more effort in terms of development but one has to ask whether their system can afford to lose a message or at the very least have to go through painful and manual message replay processes.

Another fun scenario I’ve encountered that can go very wrong is taking advantage of direct binding in orchestrations in tandem with a publish subscribe pattern with multiple subscribers to the same message. Say you have an orchestration that sends out a message which is subscribed to by a logging/auditing port and another port which actually performs an important action such as updating a LOB system. If the LOB send port was in an unenlisted state then the message would be directed to your auditing send port, no PersistenceException would be raised in your orchestration and the message would never update the LOB system. So much for guaranteed delivery…

What I would do in this case is create two copies of the same message in the orchestration, each with some sort of instructional context property that is used to direct the message to the relevant send port. The values of this context property should be abstract, such as “Audit” or “UpdateLOB”, rather than the name of the send port, since this doesn’t steer too far away from the concept of loose coupling. I then wrap the send shapes for the two messages in an atomic scope so that only one persistence point is encountered when the messages are sent out (this is a whole other subject, but it is important to keep your persistence points, and thus your I/O against your SQL Server, to a minimum), and wrap the atomic scope in a long running scope which catches a PersistenceException. Finally, I implement the same exception handling pattern I mentioned earlier in this post.

Multiple messages

Once again this comes at a greater effort in development and introduces more overhead on the BizTalk runtime, muddies up your orchestration with logic which is arguably plumbing rather than business logic, and somewhat takes away from how loosely coupled your orchestration is.  That said it does ensure guaranteed delivery.

The exception handling patterns that I’ve discussed here stem from my own cautious nature but I have seen them pay off in multiple instances and have also seen the cost when such thought isn’t applied when required. I wouldn’t say they are required in every orchestration but I would at least encourage developers to think of these scenarios when they decide to what extent they are going to handle exceptions.  At the very least if such patterns aren’t implemented do think about putting a section or at least a blurb in your support documentation to discuss how to recover from such failures.


A colleague of mine was discussing my blog post “Routing exceptions on send ports to the ESB Exception Management Portal without turning on routing for failed messages (Part 1)” with me since he wanted to implement the pattern on his own project, but he wanted to take it a bit further. He wanted the generation of NACKs on ports to be set at run time rather than design time, so that the exception handling was a bit more global and enabling exception handling for a send port became more of an administration task than a development task. We managed to best the challenge with surprising ease and I thought it was time to write a second part to the aforementioned blog post.

In the solution mentioned in the previous blog post, the subscriptions for NACK messages from specific send ports were generated based on orchestration activation filters and thus obviously can’t be changed at run time as these are design time properties only. The next best alternative was to look at processing NACKs by adjusting the ALL.Exceptions send port in the Microsoft.Practices.ESB application or a similar send port. We thought that at the very least we were going to have to write some custom pipeline components to extract the error details from our NACK message and write them to the ErrorType and other relevant context properties for the ESB Pipeline Components to process appropriately.

The big surprise was to find out that the ESB Exception Encoder pipeline component was written with NACK messages in mind as well (if you use Reflector on the assembly this will be very quickly apparent), and will process them with no extra effort whatsoever. In the below screenshot you’ll see that the ALL.Exceptions send port has had its filter criteria adjusted such that it now subscribes to NACK messages from SendPort1 and SendPort2 in addition to its existing filters thus prompting those send ports to start generating NACKs which will get sent to the ESB Exception database with no orchestration.

Filters

Adding more send ports to this list is something that can easily be done at run time and thus requires no development effort. Below is a side by side comparison of NACK errors in the ESB Portal, the error on the left being generated by a send port as described in this article and the error on the right being generated by an orchestration as described in the previous blog post.

Side by side comparison

The first huge benefit concerns the Service Instance ID. For the orchestration generated error, the Service Instance ID is that of the orchestration that threw the exception, and the Service Instance ID of the suspended send port has to be found in the Fault Description. For the send port generated error, the Service Instance ID listed is the correct one for the send port that has been suspended. This removes a fair bit of confusion.

It is also quite nice that the send port name is listed under service name rather than the exception handling orchestration’s name, and that the endpoint URL appears in the Application Scope rather than whatever value you set it to in the orchestration. The exception type, error type, and machine name are always set to unknown for the send port generated errors. While this might be seen as less friendly than the orchestration generated errors, one has to remember that the machine name listed for the orchestration generated errors was the machine on which the exception handling orchestration was executed rather than the one on which the now suspended send port instance was executed. If one was really keen they could write a pipeline component that replaces the unknown values with whatever else they wanted after the ESB pipeline components have finished executing, but I’m not sure how much value this would really add.

I think this is a cleaner solution than the one I blogged about previously, and while that one has its place in specific scenarios I think this one will be my default position from here on out.

The ESB Exception Management Portal has made it a whole lot easier to drive exception handling and alerting within BizTalk applications however it does come at a price when you’re dealing with exceptions encountered on send ports.  In order to route a failed message to the ESB Portal one must turn on routing for failed messages on the send port.  While this results in the failed message being routed to the ESB Exception database along with the exception details, it also results in the messaging instance being terminated.  This makes sense if the intention is to use the ESB Exception Management Portal for message resubmission however the out of the box sample portal contains many constraints that make this tedious (or impossible for context heavy messaging solutions) without extending the portal itself.

I was recently asked to implement exception handling for a solution involving orchestrations that send out messages in a fire and forget fashion to the message box using direct binding. Each message then gets routed to a send port, where it is mapped and run through a flat file pipeline before being written to the file system.  Retries were enabled on the send port, and it was not envisioned that there would be problems on this send port unless there was a massive infrastructure failure, in which case all the messaging instances would fail once retries were exhausted.

The customer was already using the ESB Portal for alerting on other BizTalk applications however none of those required resubmission or resumption of failed instances whereas this application did require these features.  Routing the aforementioned exceptions to the ESB Portal would result in a very tedious resubmission cycle as the messages would have to be resubmitted one at a time, and because this is a very context heavy solution the ESB Portal’s resubmission feature would need to be extended to resubmit certain context properties as well as the message body.  However disabling failed routing would mean that we couldn’t take advantage of the exception notification functionality which is part of the ESB stack and since we didn’t have access to SCOM or any other monitoring software this could mean that the problem might go unrecognized for a prolonged period of time delaying the isolation and resolution of the problem’s root cause.

After doing some research I managed to implement a pattern utilizing delivery notifications that allowed me to achieve my design goals (as below) without making any major changes to my existing application.

  • Use direct binding within the orchestration on the logical send port
  • Send out the message to the message box in a fire and forget fashion on the one-way logical port as the orchestration needs to carry on with further processing without waiting for an acknowledgement that the message was delivered
  • Utilize BizTalk’s out of the box retry functionality on the send port
  • Make use of the ESB Exception Portal and Notification service to surface and raise alerts for the exceptions
  • Use the BizTalk Administration console to resume failed messaging instances on send ports

In order to implement this I had to whip up an orchestration which receives a message via direct binding, the activating receive shape having the below filters.  The very fact that there is a component that subscribes to an acknowledgement (in this case very specifically a negative acknowledgement and very specifically from the send port named SendPort) means that the send port will now start generating acknowledgements of that specific type.

Filter

Doing this makes the orchestration subscribe to all negative acknowledgements generated by the send port named “SendPort”.  The activating receive shape should receive a message of type System.Xml.XmlDocument (I referred to this message as nACK in my orchestration).

I also added references to Microsoft.Practices.ESB.ExceptionHandling (at C:\Program Files (x86)\Microsoft BizTalk ESB Toolkit 2.1\Bin\Microsoft.Practices.ESB.ExceptionHandling.dll on my PC) and Microsoft.Practices.ESB.ExceptionHandling.Schemas.Faults (C:\Program Files (x86)\Microsoft BizTalk ESB Toolkit 2.1\Bin\Microsoft.Practices.ESB.ExceptionHandling.Schemas.Faults.dll) to my orchestration project.  I then created a multi-part message type whose body was of type Microsoft.Practices.ESB.ExceptionHandling.Schemas.Faults.FaultMessage and created a message called eSBFault of this multi-part message type.

Now we need to work on extracting the fault message.  If you actually inspect a NACK message you will see that it is just a SOAP envelope as below.

SoapEnvelope

Next up I extracted the error message into an orchestration variable (faultMessage which is a string) by pasting the below into an orchestration expression shape.

faultMessage = xpath(nACK.Body, "string(/*[local-name()='Envelope' and namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']/*[local-name()='Body' and namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']/*[local-name()='Fault' and namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']/*[local-name()='detail' and namespace-uri()='']/*[local-name()='NACK' and namespace-uri()='http://schema.microsoft.com/BizTalk/2003/NACKMessage.xsd']/*[local-name()='ErrorDescription' and namespace-uri()=''])");
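If you want to check that path outside an orchestration, the same lookup can be sketched in plain C# with XmlDocument. The sample envelope below is hand-written for illustration (the element layout follows the XPath above, but the error text is made up), and the NackReader class and its method names are my own, not part of BizTalk or the ESB Toolkit.

```csharp
using System;
using System.Xml;

public static class NackReader
{
    // Hand-written sample that mirrors the structure addressed by the XPath:
    // Envelope/Body/Fault in the SOAP namespace, detail and ErrorDescription
    // in no namespace, NACK in the BizTalk NACK namespace.
    public const string SampleNack =
@"<SOAP:Envelope xmlns:SOAP=""http://schemas.xmlsoap.org/soap/envelope/"">
  <SOAP:Body>
    <SOAP:Fault>
      <detail>
        <ns0:NACK Type=""NACK"" xmlns:ns0=""http://schema.microsoft.com/BizTalk/2003/NACKMessage.xsd"">
          <ErrorDescription>Sample error: the send adapter could not write to the folder.</ErrorDescription>
        </ns0:NACK>
      </detail>
    </SOAP:Fault>
  </SOAP:Body>
</SOAP:Envelope>";

    // Returns the ErrorDescription text, or an empty string if it is missing.
    public static string ExtractErrorDescription(string nackXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(nackXml);

        // Prefixes are local to this query; only the namespace URIs must match.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("soap", "http://schemas.xmlsoap.org/soap/envelope/");
        ns.AddNamespace("nack", "http://schema.microsoft.com/BizTalk/2003/NACKMessage.xsd");

        XmlNode node = doc.SelectSingleNode(
            "/soap:Envelope/soap:Body/soap:Fault/detail/nack:NACK/ErrorDescription", ns);
        return node == null ? string.Empty : node.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractErrorDescription(SampleNack));
    }
}
```

Using a namespace manager is just a more readable way of writing the local-name()/namespace-uri() tests; inside an expression shape the long form is still needed because the xpath function has no way to register prefixes.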

Now that we have the error message extracted I created another expression shape within a scope which throws a new System.Exception with the faultMessage as the exception message (the code would look like throw new System.Exception(faultMessage);).

Next step is to create an exception block in the scope that catches a System.Exception.  Now we want to construct our eSBFault message so create a construct shape for this message and drag in a message assignment shape.  The code should look like the below.

Construct ESBFault

Note the use of the BTS.AckOwnerID context property which allows us to expose the send port’s messaging instance ID which will help BizTalk support specialists identify the failed messaging instance.  Also note that the message we are attaching to the fault is the NACK message as we do not have access to the original message that was sent to the send port (this is one weakness with this design pattern but the message is visible within the BizTalk Administration Console and this might suit your purposes just fine as it did mine).

Now we just need to send the eSBFault message to the message box with direct binding and it will be automatically routed to the ESB Exception database from which you can set up alerts.  The orchestration should look somewhat like the below.

Orchestration

With the above implemented the NACKs will now be routed to the ESB Exception database and surfaced in the ESB Exception Management Portal as below, and you will be able to search for the suspended message instances within the BizTalk Administration Console and resume them once the root cause of the problem has been rectified.

Fault

If anyone out there knows other good exception handling patterns making use of the ESB Portal then I would definitely be keen to be told about them.

EDIT: I have now added a second part to this blog as a colleague and I found ways to improve on this pattern (the above will still be relevant if you need the orchestration for additional processing before sending a message to the ESB Portal).  Take a look at Part 2 here.

When attempting to resubmit an XML message to a receive location using the HTTP transport channel, you will get an error message saying that “A potentially dangerous Request.Form value was detected from the client”. (Make sure you first make the changes to the required stored procedures to set the correct content type for XML messages, see http://midheach.wordpress.com/2012/03/23/esb-management-portal-customization/.)

This is because there are XML tags in the request message which would typically be considered unsafe.  Assuming your ESB Portal web site is only targeted at BizTalk administrator types you can always relax these restrictions by adding the below highlighted line into the web.config in your ESB Portal web directory.
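The highlighted line itself is not reproduced here, so the fragment below is only a sketch of the standard ASP.NET settings that switch request validation off, not necessarily the exact line from my web.config. It assumes the portal runs on .NET 4.0 or later, where validateRequest="false" only takes effect if the 2.0 request validation mode is also selected.

```xml
<configuration>
  <system.web>
    <!-- Assumption: .NET 4.0 or later. Without requestValidationMode="2.0"
         the validateRequest setting on pages is ignored. -->
    <httpRuntime requestValidationMode="2.0" />
    <!-- Turns off the check that rejects markup in posted form values. -->
    <pages validateRequest="false" />
  </system.web>
</configuration>
```

Your web.config will most likely already contain httpRuntime and pages elements, in which case add the attributes to the existing elements rather than creating duplicates.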

Because these restrictions would effectively be relaxed on the entire website, this might not be ideal for everyone.  If someone finds a better way to get around this problem please let me know.

A small oversight (at least in my mind) in the ESB Portal quickly becomes a major irritant after using it for a while.  The default sort order on the faults page is by severity which, if you’re like me, is almost never the logical choice when viewing the page.  I would much rather see it sorted by the time when the fault occurred, and if I want to view the data any other way I can always change the sort order or apply filters.

The ESB Portal uses an ASP.NET grid and its default implementation doesn’t specify a sort order, thus on page load it will always sort by its first column, which is severity.  Thus you’ve got two choices: either change the order of the columns, or add some new behavior which defines the default sort order.  I’ve explored the latter path.

In order to make the changes, you’ll want to open the ESB.Portal solution, expand the ESB.Portal project, and within that open the FaultList.ascx.cs file which is in the Lists directory (you’ll have to expand FaultList.ascx to see the .cs file).  You’re interested in the very first method – Page_Load.  You’ll want to add in the highlighted code from the below screenshot.

If you haven’t previously installed the portal then you can perform a build and use the msi installer package to deploy the portal.  If you have previously installed the portal and got it working then chances are you don’t want to start from scratch and only want to apply the updated dlls.  These would be the Microsoft.Practices.ESB.Portal.dll and Microsoft.Practices.ESB.Portal.XmlSerializers.dll files in the bin folder of your ESB.Portal project, and you’ll want to copy them to your bin folder of the ESB Portal folder that is used to host your IIS virtual application.

You can always take this a bit further and read in the default sort order from the web.config file if you want to make this more configurable, and you can of course replace DateTime in the Fault.Sort() method call with any SortingOrder you want.
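As a rough sketch of the configurable variant, the helper below turns a configured string into a sort order and falls back to DateTime when the setting is missing or invalid. Everything here is an assumption for illustration: the SortingOrder enum is a stand-in with made-up members rather than the portal's own type, and in Page_Load you would feed it a value read from an appSettings key of your choosing.

```csharp
using System;

// Hypothetical stand-in for the portal's sort-order enum; the real type and
// its members may differ.
public enum SortingOrder { Severity, DateTime, Application }

public static class DefaultSort
{
    // Maps a configured value (for example one read from web.config) to a
    // sort order, falling back to DateTime when it is missing or not valid.
    public static SortingOrder Resolve(string configuredValue)
    {
        SortingOrder parsed;
        if (!string.IsNullOrEmpty(configuredValue)
            && Enum.TryParse(configuredValue, true, out parsed)   // case-insensitive
            && Enum.IsDefined(typeof(SortingOrder), parsed))      // rejects stray numbers
        {
            return parsed;
        }
        return SortingOrder.DateTime;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("severity"));
        Console.WriteLine(Resolve(null));
        Console.WriteLine(Resolve("not-a-column"));
    }
}
```

Falling back rather than throwing means a typo in the config file still gives you a usable faults page.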
