Category: XSD Schemas

Back in June I wrote a blog post exploring how the BizTalk XML Validator pipeline component can be used to prevent duplicate values in repeating records, with the duplicate check scoped to a single element/attribute value or to a combination of them (do have a read of that post for an overview of how this can be done in schemas with an elementFormDefault of unqualified or with no namespaces).  At the time, however, I hit a major problem: I could not figure out the syntax to get this to work in schemas with an elementFormDefault or attributeFormDefault of qualified.  Even though the BizTalk project containing the schemas built successfully, executing the XML Validator pipeline component kept failing with the error “The prefix ‘x’ in XPath cannot be resolved” (x being the namespace prefix I was trying to use in the unique constraint), and I was unable to work around it.


While looking at implementing a workaround on a solution I was working on (reversing the elementFormDefault on the contained schemas from qualified back to unqualified), my colleague Shikhar and I worked out how to solve the namespace prefix problem.  Put very simply, it looks like unique constraints in a schema do not respect namespace prefixes declared at the schema level; the prefixes must instead be declared at the constraint level.

For the purpose of this exercise I’ve created a new type schema containing a complexType definition called ItemMetadata, which has an attribute called DateOfAvailability and an element called Price.  The elementFormDefault and attributeFormDefault attributes are both set to qualified on this schema.  I’ve then referenced it via an import statement from the Purchases schema I used in my previous example and added ItemMetadata as a child node under the repeating Purchase node so that the schema structure now looks like the below.


The unique constraint that I had working in the previous blog post specified that if there is more than one Purchase node that contains the same combination of values for the PurchaseOrder, SupplierName, ItemBarCode, ItemDescription and SpecialDeals elements as well as the PurchaseDate attribute then the XML Validator should throw an exception.  None of the aforementioned elements or attributes belonged to a namespace since the elementFormDefault and attributeFormDefault on the schema were set to unqualified.  I now want to add the DateOfAvailability attribute and Price element which both belong to the namespace http://BizTalk_Server_Project6.Type to the constraint.  The namespace must be associated with a prefix and as we have previously seen we can’t do this on the schema level, so the prefix should instead be declared on the constraint level and can then be used on individual field xpaths as below.
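For readers who cannot see the screenshot, here is a minimal sketch of what a constraint with a constraint-level prefix might look like in raw XSD. It is trimmed to a few of the fields listed above and is not the exact schema from the solution: the schemaLocation file name, the meta and t prefixes, the xs:string types and the precise nesting are assumptions, while the two namespaces and the constraint name (spelled as it appears in the validator's error message) come from the post.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:meta="http://BizTalk_Server_Project6.Type"
           targetNamespace="http://uniquepurchases"
           elementFormDefault="unqualified"
           attributeFormDefault="unqualified">
  <!-- Hypothetical file name for the imported type schema -->
  <xs:import namespace="http://BizTalk_Server_Project6.Type"
             schemaLocation="ItemMetadataType.xsd"/>
  <xs:element name="Purchases">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Purchase" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="PurchaseOrder" type="xs:string"/>
              <xs:element name="SupplierName" type="xs:string"/>
              <!-- Local element, so unqualified; its children are qualified -->
              <xs:element name="ItemMetadata" type="meta:ItemMetadata"/>
            </xs:sequence>
            <xs:attribute name="PurchaseDate" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- The prefix t is declared on the constraint itself -->
    <xs:unique name="SupplierPurchaseOrderConstriant"
               xmlns:t="http://BizTalk_Server_Project6.Type">
      <xs:selector xpath="Purchase"/>
      <xs:field xpath="PurchaseOrder"/>
      <xs:field xpath="SupplierName"/>
      <xs:field xpath="@PurchaseDate"/>
      <xs:field xpath="ItemMetadata/@t:DateOfAvailability"/>
      <xs:field xpath="ItemMetadata/t:Price"/>
    </xs:unique>
  </xs:element>
</xs:schema>
```

The unqualified fields need no prefix because they belong to no namespace; only the attribute and element that come from the qualified type schema carry the t prefix.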


Using the namespace prefixes as above the below XML now generates the error “There is a duplicate key sequence ‘123213 Al Pacino PurchaseDate_0 42129841 Anti-dandruff shampoo 10% off 10/11/2013 $13.00’ for the http://uniquepurchases:SupplierPurchaseOrderConstriant key or unique identity constraint.” as expected.  If you wanted to make use of the qualified elements or attributes in multiple nodes in multiple unique constraints in the same schema then you would need to declare the namespace prefix on each constraint, potentially reusing the same prefix value since it is scoped to that specific constraint and thus there won’t be a clash.


I’ve uploaded the source code (as a Visual Studio 2012/BizTalk 2013 solution) to my Google Drive account if you are interested in giving it a try.

In conclusion, duplicate value checks in a BizTalk solution are very well served by schema validation using unique constraints (even though the schema designer does not support unique constraints and they have to be added manually). Alternative methods of enforcing such validation would most likely turn out to be much harder to implement and support, and wouldn’t be anywhere near as efficient.

I was recently confronted with an interesting problem: I was asked to design a BizTalk schema containing a complex type in which at least one of a given set of child elements must always be present, while any or all of the others may optionally be supplied as well. Upon questioning the business requirement it turned out that I didn’t actually have to enforce this in the schema, as this level of business logic was better served by a downstream system. However, I decided that this was an interesting enough scenario (albeit not necessarily a common or practical one) to dive into a bit deeper and see how it could be implemented using an XSD schema. Note that while I have used the BizTalk 2009 schema editor in Visual Studio 2008 to create my schemas, aside from the wording of error messages nothing in this blog post is really BizTalk specific.

Let’s assume that we are dealing with the XML structure below and we are advised that the record structure must always contain either an Element1 or an Element2 element or both (the rest of the elements are all optional).

Basic XML structure

One way we can achieve the above goal is by making use of choice structures as below.

2 Elements Untyped

Note that in the above screenshot the choice consists of a sequence that contains both Element1 and an optional Element2, and a second sequence that contains only Element2. This caters for scenarios where both elements exist, or only one exists. See the below screenshots which illustrate that if either Element1 or Element2 is missing then that is not a problem, but if both are missing then the XML message is invalid.

Valid NoElement2 Valid NoElement1 Invalid NoElement1or2
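For anyone reading without the screenshots, the choice structure described above could be written in raw XSD roughly as follows. This is a minimal sketch rather than the exact schema from the post: the root name Record, the xs:string types and the placement of the remaining optional elements after the choice are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Record">
    <xs:complexType>
      <xs:sequence>
        <xs:choice>
          <!-- Branch 1: Element1 is mandatory, Element2 may follow -->
          <xs:sequence>
            <xs:element name="Element1" type="xs:string"/>
            <xs:element name="Element2" type="xs:string" minOccurs="0"/>
          </xs:sequence>
          <!-- Branch 2: Element2 on its own -->
          <xs:sequence>
            <xs:element name="Element2" type="xs:string"/>
          </xs:sequence>
        </xs:choice>
        <!-- The remaining elements are all optional -->
        <xs:element name="Element3" type="xs:string" minOccurs="0"/>
        <xs:element name="Element4" type="xs:string" minOccurs="0"/>
        <xs:element name="Element5" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The first child element a validator encounters decides the branch: Element1 selects the first sequence and Element2 the second, so the content model stays unambiguous.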

When I originally approached the problem I thought of creating a choice structure with three sequences: one with Element1, the second with Element2, and the third with both Element1 and Element2. However this is not allowed, and trying to implement it will result in an error stating “Multiple definitions of element xxx causes the content model to become ambiguous…” as below.

Ambiguous error

Now what if we want to change the data type for Element2 (which exists in both of the sequences in the choice)? If we try to change the data type from string to int on the Element2 element in one of the sequences using the BizTalk/Visual Studio schema editor we will get an error stating “Elements with the same name and in the same scope must have the same type” as below.

Same type error

The reason for this is that the data type must be the same for elements with the same name that exist in multiple sequences in a choice structure. Changing them one at a time is not an option. One way to get around this is to open the schema in an XML editor and change the data type for both of the Element2 elements at the same time. An even cleaner solution is to extract the definition of Element2 into a simple type and have both Element2 elements use that simple type, so that if the data type for Element2 needs to change in the future it only has to be changed on the simple type. In the below screenshot this has been done for Element1, Element2, Element3, Element4, and Element5, and we can now change data types with ease.

2 Elements Typed
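In raw XSD, extracting the types might look something like the sketch below. The type names Element1Type and Element2Type and the root name Record are hypothetical, and only the two elements in the choice are shown.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Element1Type">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <!-- Changing the base here changes both Element2 declarations at once -->
  <xs:simpleType name="Element2Type">
    <xs:restriction base="xs:int"/>
  </xs:simpleType>

  <xs:element name="Record">
    <xs:complexType>
      <xs:choice>
        <xs:sequence>
          <xs:element name="Element1" type="Element1Type"/>
          <xs:element name="Element2" type="Element2Type" minOccurs="0"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="Element2" type="Element2Type"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because both Element2 declarations point at the same named type, they can never drift apart and trigger the "same name, same scope, same type" error.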

Now what if we wanted to add Element3 to the choice as well, with the new rule stating that at least one of Element1, Element2, or Element3 must be provided, in any combination of one, two or three of the elements? This can be achieved quite easily with the below structure. Note that the pattern is to have all the elements in the first sequence where only Element1 is mandatory, then to remove Element1 in the second sequence with Element2 now being mandatory, and to have the third sequence contain only Element3, which is now mandatory. Adding more elements to the mix would require you to extend this pattern.

3 Elements
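The three-element pattern described above can be sketched in raw XSD as follows. As before, the root name Record and the xs:string types are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Record">
    <xs:complexType>
      <xs:choice>
        <!-- Element1 mandatory, the rest optional -->
        <xs:sequence>
          <xs:element name="Element1" type="xs:string"/>
          <xs:element name="Element2" type="xs:string" minOccurs="0"/>
          <xs:element name="Element3" type="xs:string" minOccurs="0"/>
        </xs:sequence>
        <!-- Element1 removed, Element2 now mandatory -->
        <xs:sequence>
          <xs:element name="Element2" type="xs:string"/>
          <xs:element name="Element3" type="xs:string" minOccurs="0"/>
        </xs:sequence>
        <!-- Only Element3, now mandatory -->
        <xs:sequence>
          <xs:element name="Element3" type="xs:string"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Each sequence starts with a different mandatory element, which is what keeps the choice deterministic as more elements are added.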

How about if Element2 is always mandatory, and you additionally need to provide either Element1 or Element3 or both? You can achieve this with the below structure.

Element2 Mandatory
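One way this rule could be expressed in raw XSD is sketched below. It may not match the screenshot exactly: it assumes the elements appear in the order Element1, Element2, Element3, and the root name Record and the xs:string types are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Record">
    <xs:complexType>
      <xs:choice>
        <!-- Element1 supplied: Element2 is required, Element3 is optional -->
        <xs:sequence>
          <xs:element name="Element1" type="xs:string"/>
          <xs:element name="Element2" type="xs:string"/>
          <xs:element name="Element3" type="xs:string" minOccurs="0"/>
        </xs:sequence>
        <!-- Element1 absent: Element2 and Element3 are both required -->
        <xs:sequence>
          <xs:element name="Element2" type="xs:string"/>
          <xs:element name="Element3" type="xs:string"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance containing Element2 alone matches neither branch and is therefore rejected.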

I hope that this blog post has illustrated to you how you can create schemas that specify multiple combinations of optional/mandatory rules for a given complex type through the use of choice structures with child sequence groups.

A customer asked me recently if the BizTalk business rules engine was a good place to search for repeating elements containing duplicate values in an XML message that is received on a one way BizTalk hosted WCF service and to throw an exception back to a service consumer if a duplicate is found. My initial gut feeling was that the BRE wasn’t quite the best place to do this and I decided to search for alternatives before exploring this option further.

I was surprised to find out that the W3C standards for XML schemas cater for unique constraints in your XML messages, letting you define keys scoped to a complex type that specify which element/attribute, or combination of elements and attributes, is not allowed to repeat. This is not a feature of XSD schemas that I or anyone I spoke to had heard of before, and a common reaction was that no one had ever encountered such a requirement. I decided to find out whether BizTalk supports these unique constraints and whether the XML Validator pipeline component could be used to enforce them.

The first thing I discovered is that there is no way in the schema designer to define the unique constraints, at least not that I could find. I decided I would follow the W3C guidelines and handcraft my constraint. Take a look at the below schema and take note of how the constraint I’ve defined prevents the same SupplierName value from being used across two Purchase records.
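For readers who cannot see the schema screenshot, a handcrafted constraint of this kind might look like the minimal sketch below. The constraint name UniqueSupplierName, the xs:string types and the absence of a target namespace are assumptions made to keep the example short; the Purchases, Purchase, PurchaseOrder and SupplierName names come from the post.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified">
  <xs:element name="Purchases">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Purchase" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="PurchaseOrder" type="xs:string"/>
              <xs:element name="SupplierName" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Identity constraints follow the complexType inside the element -->
    <xs:unique name="UniqueSupplierName">
      <!-- selector: the repeating nodes to compare with each other -->
      <xs:selector xpath="Purchase"/>
      <!-- field: the value that must not repeat across those nodes -->
      <xs:field xpath="SupplierName"/>
    </xs:unique>
  </xs:element>
</xs:schema>
```

The constraint is declared on the Purchases element, so its scope is a single Purchases document and the selector path is relative to that element.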


Validating the below instance file against the schema returns an error as expected, since a duplicate SupplierName of Redmond has been used in two of the Purchase records. The error is reasonably detailed, giving the name of the unique constraint that was violated as well as the duplicate value that broke it. Providing unique values for the SupplierName elements results in the instance file validating successfully. Note that the XML editor in Visual Studio also highlights the fact that a constraint has been violated, as in the below screenshot (look at the close tag of the second Purchase record).


Now what if we wanted to add a second constraint on the PurchaseOrder element as well? Not a problem, just add a second unique constraint. Now both constraints must be respected in order for an instance file to successfully validate.


Another interesting scenario is to extend the duplicate checking to actually check for combinations of multiple elements. Let’s throw attributes into the mix as well. The below unique constraint now allows for duplicate SupplierName or PurchaseOrder elements or duplicate PurchaseDate attributes but not duplicates of all three in combination.
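A combined constraint of this kind could be sketched as the fragment below. It is intended to sit inside the Purchases element declaration, after its complexType, with the xs prefix bound to the XML Schema namespace on the enclosing schema; the constraint name is hypothetical and PurchaseDate is assumed to be an attribute of Purchase.

```xml
<xs:unique name="UniqueSupplierOrderAndDate">
  <xs:selector xpath="Purchase"/>
  <!-- All three values together form the key that must be unique -->
  <xs:field xpath="SupplierName"/>
  <xs:field xpath="PurchaseOrder"/>
  <!-- Attributes are addressed with a leading @ -->
  <xs:field xpath="@PurchaseDate"/>
</xs:unique>
```

Listing several fields in one constraint checks the combination, whereas separate constraints with one field each check every value independently.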


We can also extend the duplicate check to elements in contained records. In the below example the ItemBarCode element in the child ItemDetails record has now been added to the unique constraint.
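As a rough sketch, a field in a contained record is addressed by a relative path from the selected node. The fragment below assumes ItemDetails is a direct child record of Purchase, that the constraint lives inside the Purchases element declaration, and that the xs prefix is bound on the enclosing schema; the constraint name is hypothetical.

```xml
<xs:unique name="UniquePurchaseWithItem">
  <xs:selector xpath="Purchase"/>
  <xs:field xpath="SupplierName"/>
  <xs:field xpath="PurchaseOrder"/>
  <xs:field xpath="@PurchaseDate"/>
  <!-- Child record element, reached by stepping down from Purchase -->
  <xs:field xpath="ItemDetails/ItemBarCode"/>
</xs:unique>
```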


Running a duplicate message through an actual XML Validator pipeline component also throws the error as expected.


Now what if one of the elements in your duplicate check is optional? In the below screenshot the optional ItemDescription element has been added to the unique constraint, and even though neither of the Purchase records contains an ItemDescription element and all the other elements in the constraint are duplicated, the constraint is not deemed to be violated. The constraint is only violated if the ItemDescription element is specified with a duplicate value. If it is missing, even in two records in which every other element contains duplicate values, the constraint won’t be violated.

Optionals don't count

Another scenario… What if the element in your duplicate check is a repeating element? I extended the duplicate check to the SpecialDeals element as well and what I found was that the second I had more than one SpecialDeals element in a given Purchase/ItemDetails record I would get an error stating that only one element was expected. Adding an element to a unique constraint disallows it from being a repeating element.


(UPDATE 16/12/2013 – This post used to contain a further section that detailed problems I encountered back in June 2013 when dealing with elements or attributes which belong to a namespace in unique constraints.  At the time I could not figure out how to get them to work and I never managed to find any supporting information to help me. I have since found the correct syntax and have added a blog post detailing it here, so I have deleted the remaining sections of this post so as not to mislead anyone into thinking that including qualified elements/attributes in unique constraints is not supported.)

One of the interesting challenges I’ve faced in the last year was creating a flat file schema for a well established custom flat file format that was in use at one of my clients.  What made this flat file format different from most that I had dealt with in the past was that this flat file contained repeating delimited records, each record containing tags that were found in the middle of the record.

Now my responsibility on this project was to output flat files of this format which wouldn’t be a big deal since all the records had the same number of delimited elements and I could just make a generic repeating record in my schema with the right number of elements. However I couldn’t help but think that one day it might be a requirement to read in files of this format as well and designing the schema correctly now could save a lot of rework later (it turns out that this was a good call since one of my colleagues has been tasked with exactly this requirement). In order to future proof the schema it must have knowledge of the tags and be able to read in instance files and apply the correct record type to each line, the obvious problem with this being that you can only apply a tag at the record level.

Let’s illustrate this with an example. Say we have a file that contains details about animals, though the type of details might differ depending on whether the animal is a cat or a dog. Regardless of what the type of animal is, there are some common elements that are shared between the two, and these appear before the tag that identifies whether the record in question is dealing with a dog or a cat. After the identifier the data elements might differ (there could even be a different number of elements supplied but I won’t touch on that).

Example file

A very generic approach to creating this schema would be to have a generic repeating record structure as below which can easily be created using the flat file schema generation wizard.


Now the problem with this is that while it is easy to write output files with this schema by flattening your dog and cat structures from source documents into this generic structure in a map, it is more problematic when you are picking up files of this type and mapping to a typed structure (far from impossible, but difficult and harder to understand, especially as the number of record types grows), forcing you to keep more logic in your map than you really should have to. Validating an instance file gives us rather ill-defined XML.


The way I got around this problem was to have a generic Animal repeating record which contains all the common elements as well as a choice record segment (see this MSDN article for some more details on choice group nodes) called DiscreteAnimalType. Within this choice record I created a record for each of the different animal types with their respective elements, and I applied a tag identifier at the record level. Note that I started with the generic schema structure created by the flat file schema generation wizard and then adjusted it manually in the schema designer.
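Stripped of the BizTalk flat file annotations (delimiters, tag identifiers and so on, which are deliberately omitted here), the shape described above might look roughly like the sketch below. Only Animal and DiscreteAnimalType come from the post; the root name Animals and every field name are hypothetical placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Animals">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Animal" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Common elements that appear before the tag -->
              <xs:element name="Name" type="xs:string"/>
              <xs:element name="Age" type="xs:string"/>
              <!-- Exactly one typed record per line, picked by its tag -->
              <xs:element name="DiscreteAnimalType">
                <xs:complexType>
                  <xs:choice>
                    <xs:element name="Dog">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="Breed" type="xs:string"/>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="Cat">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="Colour" type="xs:string"/>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:choice>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

In the real schema the Dog and Cat records would each carry their own tag identifier, which is what lets the flat file disassembler decide which branch of the choice a given line belongs to.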


Validating the same instance file now gives us a much better defined XML structure.


Does anyone else have a better approach to this problem? (I have of course simplified it a lot for the sake of this blog post.) I hope this makes people think about future proofing their flat file schemas a bit more.

Last week I ran into a really interesting issue. When I was running Visual Studio unit tests to validate instances of XML formatted EDIFact messages (see my previous blog post on BizTalk testing for more details on schema testing) I found that my tests just seemed to take forever and timed out with no reason provided.


I decided to apply a break point in my unit test and debugged it.  Much to my surprise when I stepped into the line of code that was actually validating the instance file I saw the below.

EDIFact Properties

Well, that’s really painful.  The unit test was timing out because it was waiting for user input on the above screen, however unless the test was being run in debug mode the screen was not being displayed at all.

So far I haven’t found any way to get the out of the box ValidateInstance method to work on EDIFact schemas because I can’t suppress the above screen.  Interestingly enough, if the instance file actually fails validation then the test will fail rather than time out.  An alternative I explored was to use some non-standard test methods to test instances of EDIFact messages against my schemas.  In my previous blog post discussing BizTalk unit testing I mentioned that I regularly use non-standard methods to extend my schema unit tests anyway, since the out of the box ValidateInstance method doesn’t return an error message upon failure and doesn’t work when your schemas import/include other schemas.  I chose to use only these extension methods (as described here), bypassing the out of the box testing methods altogether, and the tests now complete successfully.  Note that in the below screenshot the TestExtensions.ValidateSchema method contains an extension method as described here.

