I recently encountered a problem at a customer site where it was common practice to clone the production BizTalk Server 2009 environment into its own isolated environment for staging and testing purposes. The production environment used a failover cluster to ensure high availability of the Enterprise Single Sign-On master secret. In the testing environment, however, only one of the master secret servers was cloned, and the DNS alias assigned to the failover cluster was repointed to the IP address of the remaining server.
For those who don’t know much about what the Enterprise Single Sign-On service does, here is a short summary. BizTalk stores a lot of its configuration information, including port bindings, in the SSODB. All of that configuration information is encrypted, since some of it might be sensitive, such as credentials associated with a web service. The Enterprise Single Sign-On service fetches the master secret from the master secret server, and that secret is used to decrypt the configuration information held in the SSODB. The BizTalk runtime therefore cannot work without the master secret (though it does cache the master secret so that it can keep running through shorter outages of the master secret server).
Everything (and I’m talking message exchanges, orchestration execution, deployments, binding updates and so on) appeared to work fine in this cloned environment until we tried to register a new adapter (not an adapter handler but the adapter itself). Regardless of whether we did this through the BizTalk Administration Console or through WMI/PowerShell scripts, we encountered the following error and were unable to proceed: “Cannot perform encryption or decryption because the secret is not available from the master secret server”. The same error also appeared in the event log of the master secret server, so the BizTalk runtime server was clearly communicating with the master secret server.
The user we were logged in as was a member of the BizTalk Administrators and SSO Administrators groups, so a lack of permissions was quickly ruled out as the cause of the issue. Searching for this error message on Google returned tons of results, all of which suggested that we needed to restore the master secret using the backup key file. However, the administrator who cloned the environment advised us that he had already done so as part of the cloning process. Just to be on the safe side he ran the restore again, but we were still stuck with the same issue.
After some more reading I found this MSDN article, which mentions changing the master secret server name as part of the process of clustering your secret server. Surely the same process must apply when you are backing out of a cluster.
Sure enough, running the SSOManage command line utility (found in the C:\Program Files\Common Files\Enterprise Single Sign-On folder) with the -displaydb switch displayed the cluster DNS alias rather than the master secret server’s computer name. Even though that DNS alias had been repointed to the SSO server’s IP address, that does not appear to be good enough.
To fix this I had to create an XML file (as per the aforementioned MSDN article) that contains the correct SSO server name rather than the cluster name.
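For reference, a minimal sketch of such a file, assuming the update file layout used in the MSDN clustering article (an sso root element with globalInfo and secretServer beneath it). STAGING-SSO01 is a hypothetical computer name; substitute the actual name of the remaining master secret server.

```xml
<sso>
  <globalInfo>
    <!-- Hypothetical name: use the real computer name of the master secret server, not the cluster alias -->
    <secretServer>STAGING-SSO01</secretServer>
  </globalInfo>
</sso>
```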
I then had to run the SSOManage command line tool with the -updatedb switch, pointing it at this XML file. After a restart of the Enterprise Single Sign-On service the problem was gone.
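Since this environment was cloned regularly, the two steps could in principle be scripted as part of the cloning process. The following is only a sketch in Python under several assumptions: the element names follow the update file layout from the MSDN clustering article, ENTSSO is taken to be the service name of the Enterprise Single Sign-On service, the SSOManage path is the default folder mentioned earlier, and the server name and function names are hypothetical. It defaults to a dry run that writes the file and returns the commands without executing anything.

```python
import subprocess
from pathlib import Path
from xml.etree import ElementTree as ET

# Assumption: default install folder, as mentioned earlier in this post.
SSOMANAGE = r"C:\Program Files\Common Files\Enterprise Single Sign-On\ssomanage.exe"


def build_update_xml(secret_server: str) -> str:
    """Return the update file content naming the master secret server."""
    sso = ET.Element("sso")
    global_info = ET.SubElement(sso, "globalInfo")
    ET.SubElement(global_info, "secretServer").text = secret_server
    return ET.tostring(sso, encoding="unicode")


def repoint_master_secret(secret_server: str, xml_path: str, dry_run: bool = True):
    """Write the update file and return (optionally run) the follow-up commands."""
    Path(xml_path).write_text(build_update_xml(secret_server), encoding="utf-8")
    commands = [
        [SSOMANAGE, "-updatedb", str(xml_path)],
        # Assumption: ENTSSO is the service name. /y also stops dependent
        # services (such as BizTalk host instances), which then need to be
        # started again separately.
        ["net", "stop", "ENTSSO", "/y"],
        ["net", "start", "ENTSSO"],
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # raises if a step fails
    return commands
```

Calling repoint_master_secret("STAGING-SSO01", r"C:\Temp\sso-update.xml") with the default dry_run=True lets you inspect the generated file and the planned commands before running anything against a real SSO server.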
According to this TechNet article, even if the master secret isn’t currently available, the BizTalk runtime should be able to continue working using an in-memory cached copy of the master secret until the host instance or the local Enterprise Single Sign-On service is restarted. This makes the particular problem I’ve described in this blog post quite weird, because even restarting these services and the BizTalk server itself did not render the BizTalk runtime unusable, or prevent us from performing other configuration tasks through either WMI/PowerShell or the BizTalk Administration Console. This leads me to believe that, for whatever reason, repointing the cluster’s DNS alias to the master secret server’s IP address works for many functions but not for certain ones, including the manipulation of adapter registrations. I would not use DNS alias repointing as a solution in such circumstances if I were you, and would instead change the name of the registered master secret server using the SSOManage command line utility as I have described.