Workflow unloaded from memory gets loaded and locked on the other machine
-
February 9, 2012, 12:28
Hi,
We are using WF for the first time in a project for one of our customers. It is a long-running service and is in production now. We deployed to two machines in a load-balanced scenario, with the workflows configured to unload when idle. Both boxes run the AppFabric services and point to a common SQL Express instance. There are three services on each box, one of which is the entry service. After receiving a request, the entry service calls the services underneath it on localhost.
Deployment model below.
The problem:
When a workflow is unloaded from memory, it gets loaded and locked on the other machine. Because of the two-way communication between the services, a service that tries to call back to the calling service using localhost gets an error, since the workflow instance is loaded and locked on the other machine. We don't really want to keep the instances loaded in memory while they are idle. Any ideas on how to stop the other machine from loading the workflow and locking it? Let me know if this needs further explanation. For now, we are working around it by running the services on a single box.
All replies
-
February 14, 2012, 16:11 Moderator
Your diagram looks like you have two separate instances of SQL Express. I am pretty sure you cannot cluster SQL Express. I am wondering whether using SQL Express in this configuration may result in the persistence services not working as expected.
I would try to implement some kind of synchronization logic where you perhaps call a stored procedure to write to a table unless it has already been written to. This would be a kind of coordinated locking mechanism.
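A minimal sketch of that "write to a table unless it has already been written to" idea follows. It is not AppFabric's own locking and not a stored procedure: SQLite stands in for SQL Server so the example runs on its own, and the `instance_claims` table, the `try_claim` and `release` helpers, and the expiry time are all hypothetical names chosen for illustration.

```python
import sqlite3
import time

# Hypothetical claim table: one row per workflow instance currently claimed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS instance_claims (
    instance_id TEXT PRIMARY KEY,
    machine     TEXT NOT NULL,
    expires_at  REAL NOT NULL
)
"""

def try_claim(conn, instance_id, machine, ttl_seconds=30.0, now=None):
    """Return True if `machine` holds the claim on `instance_id` after this call."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commit on success, roll back on error
        # Drop an expired claim so a crashed host cannot block the instance forever.
        conn.execute(
            "DELETE FROM instance_claims WHERE instance_id = ? AND expires_at <= ?",
            (instance_id, now),
        )
        # INSERT OR IGNORE writes the row only if no claim exists yet.
        conn.execute(
            "INSERT OR IGNORE INTO instance_claims VALUES (?, ?, ?)",
            (instance_id, machine, now + ttl_seconds),
        )
        row = conn.execute(
            "SELECT machine FROM instance_claims WHERE instance_id = ?",
            (instance_id,),
        ).fetchone()
    return row is not None and row[0] == machine

def release(conn, instance_id, machine):
    """Remove the claim, but only if `machine` is the current holder."""
    with conn:
        conn.execute(
            "DELETE FROM instance_claims WHERE instance_id = ? AND machine = ?",
            (instance_id, machine),
        )
```

The primary key is what makes the write conditional: a second host's insert is ignored while a live claim exists, so only one machine sees `True`. A host that already holds the claim also gets `True` again, without the expiry being extended.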
Thanks,
If this answers your question, please use the "Answer" button to say so | Ben Cline
- Marked as answer by WaheedHussain, February 14, 2012, 17:21
- Unmarked as answer by WaheedHussain, February 14, 2012, 17:21
-
February 14, 2012, 17:34
Thanks for the reply.
Yes, there are two instances of SQL Express, but only the one on box02-v is used by AppFabric on both boxes. The other one (box01-v) is kept synchronized with the data and serves as a failover.
Since AppFabric on both boxes points to box02-v, you wouldn't expect the service to load anything from the box01-v database, would you?
The sync script is very simple:
-- AppFabric control-command tables
TRUNCATE TABLE [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[InstanceControlCommandsTable];
SET IDENTITY_INSERT [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[InstanceControlCommandsTable] ON;
INSERT INTO [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[InstanceControlCommandsTable]
    ([ID], [InstanceId], [ServiceIdentifier], [Type], [ExecutionAttempts],
     [LastExecutionAttemptAt], [CurrentMachine], [LockExpiration], [Exception])
SELECT * FROM [box01-v].[PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[InstanceControlCommandsTable];
SET IDENTITY_INSERT [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[InstanceControlCommandsTable] OFF;
GO

TRUNCATE TABLE [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsTable];
INSERT INTO [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsTable]
SELECT * FROM [box01-v].[PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsTable];
GO

TRUNCATE TABLE [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsCleanupTable];
INSERT INTO [PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsCleanupTable]
SELECT * FROM [box01-v].[PersistenceStore].[Microsoft.ApplicationServer.DurableInstancing].[AbandonedInstanceControlCommandsCleanupTable];
GO

-- Workflow instance store tables
TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[RunnableInstancesTable];
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[RunnableInstancesTable]
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[RunnableInstancesTable];
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[KeysTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[KeysTable] ON;
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[KeysTable]
    ([Id], [SurrogateKeyId], [SurrogateInstanceId], [EncodingOption], [Properties], [IsAssociated])
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[KeysTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[KeysTable] OFF;
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[LockOwnersTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[LockOwnersTable] ON;
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[LockOwnersTable]
    ([Id], [SurrogateLockOwnerId], [LockExpiration], [WorkflowHostType], [MachineName],
     [EnqueueCommand], [DeletesInstanceOnCompletion], [PrimitiveLockOwnerData],
     [ComplexLockOwnerData], [WriteOnlyPrimitiveLockOwnerData],
     [WriteOnlyComplexLockOwnerData], [EncodingOption])
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[LockOwnersTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[LockOwnersTable] OFF;
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[InstanceMetadataChangesTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[InstanceMetadataChangesTable] ON;
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[InstanceMetadataChangesTable]
    ([SurrogateInstanceId], [ChangeTime], [EncodingOption], [Change])
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[InstanceMetadataChangesTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[InstanceMetadataChangesTable] OFF;
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[ServiceDeploymentsTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[ServiceDeploymentsTable] ON;
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[ServiceDeploymentsTable]
    ([Id], [ServiceDeploymentHash], [SiteName], [RelativeServicePath],
     [RelativeApplicationPath], [ServiceName], [ServiceNamespace])
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[ServiceDeploymentsTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[ServiceDeploymentsTable] OFF;
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[InstancePromotedPropertiesTable];
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[InstancePromotedPropertiesTable]
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[InstancePromotedPropertiesTable];
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[SqlWorkflowInstanceStoreVersionTable];
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[SqlWorkflowInstanceStoreVersionTable]
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[SqlWorkflowInstanceStoreVersionTable];
GO

TRUNCATE TABLE [PersistenceStore].[System.Activities.DurableInstancing].[InstancesTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[InstancesTable] ON;
INSERT INTO [PersistenceStore].[System.Activities.DurableInstancing].[InstancesTable]
    ([Id], [SurrogateInstanceId], [SurrogateLockOwnerId], [PrimitiveDataProperties],
     [ComplexDataProperties], [WriteOnlyPrimitiveDataProperties],
     [WriteOnlyComplexDataProperties], [MetadataProperties], [DataEncodingOption],
     [MetadataEncodingOption], [Version], [PendingTimer], [CreationTime], [LastUpdated],
     [WorkflowHostType], [ServiceDeploymentId], [SuspensionExceptionName],
     [SuspensionReason], [BlockingBookmarks], [LastMachineRunOn], [ExecutionStatus],
     [IsInitialized], [IsSuspended], [IsReadyToRun], [IsCompleted])
SELECT * FROM [box01-v].[PersistenceStore].[System.Activities.DurableInstancing].[InstancesTable];
SET IDENTITY_INSERT [PersistenceStore].[System.Activities.DurableInstancing].[InstancesTable] OFF;
GO
Thanks,
Waheed
-
February 14, 2012, 22:12 Moderator
When does this script run? I am not sure this kind of script is supported for AppFabric. Although it is certainly possible to manipulate the data like this, I am not sure you would want to.
Are you working off of a sample implementation where this type of synchronization is documented?
Basically, the problem with this approach is that you do not have any synchronization of the service persistence. If you switch back to just a single AppFabric instance, does everything work (although without redundancy)?
Thanks,
Ben Cline
-
February 15, 2012, 9:34
We haven't reached the stage of testing the failover yet. What is stopping things from working is AppFabric loading the instance automatically on the other box (bearing in mind that the services on both boxes use the same database on box02-v), locking it, and then not releasing it. My question is: what could cause the workflow instance to load on the other box, when that particular instance would never be called by anything on that box? All service URLs are localhost, as you can see in the picture.
The script runs hourly, and the same problem happens if we disable this sync.
Sorry, what do you mean by synchronization of the service persistence?
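One way to see which box currently holds an instance is to join the instance rows to their lock owners. The sketch below is only a diagnostic illustration: it runs against a small SQLite stand-in rather than the real store, uses only column names that appear in the sync script above, and assumes that `SurrogateLockOwnerId` links `InstancesTable` to `LockOwnersTable` and is empty for an unlocked instance. The sample rows and the `build_demo_store` and `lock_report` helpers are made up.

```python
import sqlite3

def build_demo_store():
    """Create an in-memory stand-in with just the columns the query needs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE LockOwnersTable (
            SurrogateLockOwnerId INTEGER PRIMARY KEY,
            MachineName          TEXT,
            LockExpiration       TEXT
        );
        CREATE TABLE InstancesTable (
            Id                   TEXT PRIMARY KEY,
            SurrogateLockOwnerId INTEGER,
            LastMachineRunOn     TEXT,
            IsSuspended          INTEGER
        );
        -- Made-up sample data: one locked instance, one unlocked instance.
        INSERT INTO LockOwnersTable VALUES (1, 'box01-v', '2012-02-15 09:40:00');
        INSERT INTO InstancesTable VALUES ('wf-locked', 1, 'box01-v', 0);
        INSERT INTO InstancesTable VALUES ('wf-idle', NULL, 'box02-v', 0);
    """)
    return conn

# LEFT JOIN keeps unlocked instances in the result, with NULL lock-owner columns.
LOCK_QUERY = """
SELECT i.Id, i.LastMachineRunOn, o.MachineName, o.LockExpiration
FROM InstancesTable AS i
LEFT JOIN LockOwnersTable AS o
       ON o.SurrogateLockOwnerId = i.SurrogateLockOwnerId
ORDER BY i.Id
"""

def lock_report(conn):
    """Map each instance id to the machine holding its lock (None if unlocked)."""
    return {row[0]: row[2] for row in conn.execute(LOCK_QUERY)}
```

Against the real database the same SELECT would need the schema-qualified table names from the sync script, and it should be treated as read-only troubleshooting.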
Thanks,
Waheed
-
March 16, 2012, 20:04
I am having this same issue. Sadly it seems like your question hasn't been addressed.
My workflow is on a farm, and immediate subsequent calls to the workflow fail, and the server reports the same error you describe.
From the workflow, I set an app-specific flag indicating that the workflow has completed its current step, immediately before it reaches the next Pick activity, which contains Receive activities as its branches. In the web app, once the flag has been set, the user takes an action to resume this workflow. However, if the user acts too quickly (let's say about 20 seconds after the flag is set), the server logs that the workflow instance is locked by another host.
How do I check whether that instance of the workflow is free to receive further requests?
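If the lock is only held briefly, one client-side workaround is to retry the call with a growing delay instead of checking the lock up front. This is a generic sketch, not a WF or AppFabric API: `InstanceLockedError` is a hypothetical stand-in for whatever fault the client receives when the instance is locked by another host, and `send` is any callable that makes the service call.

```python
import time

class InstanceLockedError(Exception):
    """Hypothetical stand-in for the 'instance is locked by another host' fault."""

def call_with_retry(send, attempts=5, first_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    Waits first_delay, then first_delay * factor, and so on between attempts,
    and re-raises the last InstanceLockedError if every attempt fails.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except InstanceLockedError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the fault
            sleep(delay)
            delay *= factor
```

With the defaults, a call that keeps failing waits 1, 2, 4, and 8 seconds between its five attempts before giving up. This only helps if the lock is eventually released; an instance that stays locked on the other host still needs the root cause fixed.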
Thank you.

