Thursday, November 29, 2012

Overriding the default 10 minutes transaction timeout in a WCF SQL adapter

The solution

To be able to override the System.Transactions default timeout of 10 minutes in the code, like using SqlAdapterInboundTransactionBehavior in our WCF SQL Adapter, you have to change certain settings in machine.config. Remember you have both the 32 and 64 bit versions and versions for both .NET framework version 2 and 4.
In the sectionGroup system.transactions, you have to change the attribute allowExeDefinition of the machineSettings section from MachineOnly to MachineToApplication.
As an example, we show our change to the 32-bit .Net 2 version.
<sectionGroup name="system.transactions" type="System.Transactions.Configuration.TransactionsSectionGroup, System.Transactions, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089, Custom=null">
           <section name="defaultSettings" type="System.Transactions.Configuration.DefaultSettingsSection, System.Transactions, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089, Custom=null"/>
           <section name="machineSettings" type="System.Transactions.Configuration.MachineSettingsSection, System.Transactions, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089, Custom=null" allowDefinition="MachineOnly" allowExeDefinition="MachineToApplication"/>
        </sectionGroup>
Further you have to configure the maxTimeout to some safe number, we go for 1 hour:
<system.transactions>
   <machineSettings maxTimeout="01:00:00" />
 </system.transactions>
 
I only tested doing this in machine.config, but as I understand  the setting allowExeDefinition="MachineToApplication" we should be able to do this in the BizTalk host instance configuration files (BTSNTSvc.exe.config and BTSNTSvc64.exe.config) as well.
Remember to restart the relevant host instances.

The problem

In a WCF SQL receive adapter using transaction every 10 minutes (actually sometimes also every 5 and 15 minutes) we were getting the following error in the Application Log with event id 5719:
There was a failure executing the receive pipeline: “…” Source: “…” Receive Port: “…” URI: “mssql://… “ Reason: 0x8004d00e
Followed by these two warnings (event id 5740):
The adapter "WCF-SQL" raised an error message. Details "System.Transactions.TransactionException: The operation is not valid for the state of the transaction. ---> System.TimeoutException: Transaction Timeout
   --- End of inner exception stack trace ---
   at System.Transactions.TransactionState.EnlistVolatile(InternalTransaction tx, IEnlistmentNotification enlistmentNotification, EnlistmentOptions enlistmentOptions, Transaction atomicTransaction)
   at System.Transactions.Transaction.EnlistVolatile(IEnlistmentNotification enlistmentNotification, EnlistmentOptions enlistmentOptions)
   at Microsoft.BizTalk.Adapter.Wcf.Runtime.TxnBatch2..ctor(IBTTransportProxy transportProxy, ControlledTermination control, Transaction transaction, ManualResetEvent orderedEvent, Boolean makeSuccessCall)
   at Microsoft.BizTalk.Adapter.Wcf.Runtime.SingleMessageReceiveTxnBatch2.StartProcessFailures()".
 
And
The adapter "WCF-SQL" raised an error message. Details "System.Runtime.InteropServices.COMException (0xC0C01627): Exception from HRESULT: 0xC0C01627".
 
If we removed the custom pipeline we got this adapter warning instead:
The adapter "WCF-SQL" raised an error message. Details "Microsoft.ServiceModel.Channels.Common.AdapterException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.. Endpoint Address - mssql://… ---> System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out
   --- End of inner exception stack trace ---
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)
   at Microsoft.Adapters.Sql.SqlAdapterInboundHandler.Polling_TryReceive(TimeoutHelper timeoutHelper, Message& wcfMessage)
   at Microsoft.Adapters.Sql.SqlAdapterInboundHandler.TryReceive(TimeSpan timeout, Message& message, IInboundReply& reply)
   --- End of inner exception stack trace ---
   at Microsoft.Adapters.Sql.SqlAdapterInboundHandler.TryReceive(TimeSpan timeout, Message& message, IInboundReply& reply)
   at Microsoft.ServiceModel.Channels.Common.Channels.AdapterInputChannel.TryReceive(TimeSpan timeout, Message& message)
   at System.ServiceModel.Dispatcher.InputChannelBinder.TryReceive(TimeSpan timeout, RequestContext& requestContext)
   at System.ServiceModel.Dispatcher.ErrorHandlingReceiver.TryReceive(TimeSpan timeout, RequestContext& requestContext)".
 
Correctly we saw the problem as a transaction timeout issue. However we could not understand the timing, one error every 10 minutes, as we had set the transaction timeout of the SqlAdapterInboundTransactionBehavior to 55 minutes according to http://msdn.microsoft.com/en-us/library/dd788041.aspx:

 

Finding the solution

We enabled MSDTC tracing, and transformed the binary trace via:
Msdtcvtr  -tracelog <NameofTheDtcTrace.log>  -o OutPutFile
In the TDC trace we saw the following event pattern on the BizTalk Server:
1.       transaction propagated from parent node 'DbServerX', Description = ''
2.       received request to abort the transaction from non beginner
3.       transaction is aborting
4.       transaction has been aborted
5.       child node received abort request from parent node ' DbServerX '
6.       child node achnowledging the delivery of abort request from parent node ' DbServerX '
 
with 10 minutes between step 1 and 2.
We interpret like this:
1.       The transaction is initiated by the BizTalk database (unfortunately the BizTalk databases and the database we are polling is located on the same server)
2.       After 10 minutes (the default WCF timeout) the WCF SQL adapter (which must be the non beginner) aborts the transaction.
A bit of Googeling and I found:
which finally gave us the solution above.
 

No comments: