Service Broker's External Activator service stopping when the server is under load

  • Thursday, May 24, 2012 3:13 PM
     

    Hi,

    We are using Service Broker and External Activator to manage web service requests that keep local repositories in sync with remote data sources outside our organisation. The process works very well and is surprisingly efficient; definitely a great toolset.

    However, we are currently encountering one issue. The SQL Server instance in question is also used for BI, so it is frequently under load from long-running queries. When these queries run, the External Activator service stops and writes the following to the Event Log:

    Inner Exception:
    System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
       at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
       at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
       at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
       at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
       at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
       at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
       at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
       at System.Data.SqlClient.SqlInternalTransaction.Commit()
       at System.Data.SqlClient.SqlTransaction.Commit()
       at ExternalActivator.QueueReader.CommitTransaction(SqlCommand cmd)
       at ExternalActivator.QueueReader.Run()
       at ExternalActivator.NotificationService.Start(IConfigurationManager configMgr)

    We have updated the worker threads that the External Activator starts to use longer command timeouts, and they now continue processing the queues throughout periods of server load (waiting out the storm).

    However, we are not aware of any External Activator setting that extends the command timeout used for monitoring the notification queues. We have considered two options: using the Resource Governor, which may not work because it only affects new connections under load, not existing ones; or writing another program that monitors the External Activator service and restarts it if it stops. Separating the broker into its own SQL instance or its own server is out of the question, as we are currently resource constrained.
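    For anyone weighing the "monitor and restart" option, here is a minimal sketch of such a watchdog in Python. It is only an illustration: the service name SSBExternalActivator is an assumption (check the real name with `sc query` on your host), the parsing assumes the English `sc query` output format, and the polling interval is arbitrary.

```python
import subprocess
import time

# Assumption: the Windows service name; verify it on your own host.
SERVICE_NAME = "SSBExternalActivator"


def parse_state(sc_output: str) -> str:
    """Return the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        parts = line.split()
        # The relevant line looks like: "STATE              : 4  RUNNING"
        if parts and parts[0] == "STATE":
            return parts[-1]
    return "UNKNOWN"


def check_and_restart(run=subprocess.run, service=SERVICE_NAME) -> bool:
    """Issue `sc start` if the service is stopped. Returns True if a start was issued."""
    result = run(["sc", "query", service], capture_output=True, text=True)
    if parse_state(result.stdout) == "STOPPED":
        run(["sc", "start", service], capture_output=True, text=True)
        return True
    return False


def watch(interval_seconds: int = 60) -> None:
    """Poll forever; meant to be run as a scheduled task or its own service."""
    while True:
        if check_and_restart():
            print(f"{SERVICE_NAME} was stopped; start requested")
        time.sleep(interval_seconds)
```

    Calling watch() starts the polling loop. The `run` parameter exists only so the check can be exercised without touching a real service. A watchdog like this hides the symptom rather than fixing the timeout, so it works best as a safety net.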

    We were wondering if anyone else has run into the same issue and how they overcame it?

    Any relevant suggestions much appreciated.
    Jay :)


    If you shake a kettle, does it boil faster?

All Replies

  • Monday, May 28, 2012 6:16 AM
    Moderator
     
     

    Hi Jay,

    Thank you for your question. 

    I am trying to involve someone more familiar with this topic to take a further look at this issue. Some delay may be expected while the question is transferred. Your patience is greatly appreciated.

    Thank you for your understanding and support.


    Best Regards,
    Iric

  • Monday, May 28, 2012 7:53 PM
     
     
    Thanks Iric, any assistance would be greatly appreciated. Happy to wait for a 'best' answer...


  • Tuesday, May 29, 2012 4:45 PM
     
     Answered

    Hi Jay,

    I took a look at the source code where we do a receive from the notification queue. It sets cmd.CommandTimeout = 0, which makes the command wait indefinitely (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.commandtimeout.aspx). So we should not be getting a command timeout. Unfortunately, this is currently not configurable. If it were a command timeout, you would see a SqlClient.SqlCommand function at the top of the stack.
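    As a rough illustration of that rule of thumb, the check can be sketched in Python. This is only a heuristic built on the remark above, not an official diagnostic, and the function name is made up for the example. It looks for frames that belong to System.Data.SqlClient.SqlCommand and ignores frames that merely take a SqlCommand as a parameter.

```python
def is_command_timeout(stack_trace: str) -> bool:
    """Heuristic: True if any stack frame is a SqlCommand method."""
    for line in stack_trace.splitlines():
        frame = line.strip()
        # Match the frame's own type, not SqlCommand appearing in a parameter list.
        if frame.startswith("at System.Data.SqlClient.SqlCommand."):
            return True
    return False
```

    Fed the stack from the original post, which goes through SqlTransaction.Commit() and contains no SqlCommand frames, this returns False.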

    The stack that you posted is a SqlConnection timeout, which you can change in your EAService.config file by setting the Connection Timeout parameter in the connection string section (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring.aspx). I think from your post you are already doing this. You could try increasing it even further.

    However, we should not be exiting the process for either a connection or a command timeout. I will file a design change request to see if we can make this more robust in a future version.

    This is probably not much help, but I would try increasing the connection timeout value in the EAService.config file further.

    Bill (Microsoft CTS)

    • Marked As Answer by Jay Kidd Tuesday, May 29, 2012 9:07 PM
  • Tuesday, May 29, 2012 9:20 PM
     
     

    Thanks Bill. You were correct, it was a connection timeout rather than a command timeout. My mistake.

    I have increased the timeout to 5 minutes in the connection string. I then loaded the server for 30 minutes, and the External Activator service was still up at the end of it. So the problem appears to have a successful workaround.
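    For reference, the connection string timeout is given in seconds, so 5 minutes is 300. Below is a small Python sketch of the edit itself. The server and database names in the usage note are hypothetical, and the helper is a naive splitter that assumes no quoted values containing semicolons. It replaces any existing timeout keyword (SqlClient accepts Connect Timeout, Connection Timeout and Timeout as synonyms) with a single Connection Timeout entry.

```python
def with_connect_timeout(conn_str: str, seconds: int) -> str:
    """Return conn_str with its connection timeout set to `seconds`."""
    timeout_keys = {"connect timeout", "connection timeout", "timeout"}
    kept = []
    for part in conn_str.split(";"):
        if not part.strip():
            continue  # skip empty segments such as the one after a trailing ';'
        key = part.split("=", 1)[0].strip().lower()
        if key not in timeout_keys:
            kept.append(part.strip())
    kept.append(f"Connection Timeout={seconds}")
    return ";".join(kept) + ";"
```

    For example, with_connect_timeout("Server=myhost;Database=mydb;Integrated Security=true;", 300) produces a string ending in "Connection Timeout=300;", which can then be pasted into the connection string in EAService.config.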

    Cheers,
    Jay :)



  • Thursday, March 14, 2013 12:00 AM
     
     

    I'm experiencing what looks like the same issue at the moment.  I get four or five instances of "Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding." daily since installing the Service Broker External Activator into our production environment.

    I've just upped the connection timeout from the default (which I think is 15 seconds) to 300 seconds.  I'll report back if this fixes it, but I am concerned to learn that my SQL Server is sometimes unable to respond to a connection request within 15 seconds.

    I'd be interested in seeing the service made a bit more fault tolerant (i.e. not terminating due to a temporary inability to connect to the database). However, we have a watchdog process that monitors for critical services not running and starts them back up again, so we're somewhat shielded from this problem for the moment.

    Michael

  • Thursday, March 14, 2013 9:28 PM
     
     
    Yup - after extending the timeout, no more notifications about this service shutting down.