recovery and persistence

  • Question

  • My workflow is hosted in a WorkflowApplication. I do not use WorkflowServiceHost. When my workflow completes, the instance is deleted from the instance store. If it crashes, it must load the instance from the store and recover to the pre-crash state. So when my application loads, it has to check whether there is an existing instance and otherwise create a new one.

    I have seen other threads and reviewed the AbsoluteDelay sample. The approach is to call WaitForEvents to check for runnable instances. But that only works if an instance exists and that instance becomes runnable. The sample code is so confident that a runnable instance will emerge that it waits for TimeSpan.MaxValue. I do not have that option. I have to wait for a finite period. HostLockRenewalPeriod + RunnableInstancesDetectionPeriod (35 seconds by default) is a long delay for normal startup, but even that does not guarantee a successful load. Waiting for 60 seconds is unacceptable.
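
    For reference, both periods are properties of SqlWorkflowInstanceStore. The following is a minimal sketch of where they are set, assuming a SQL instance store; the class name StoreFactory, the connectionString parameter and the shortened values are illustrative assumptions, not recommendations. Shorter periods may reduce the wait before an orphaned instance becomes runnable, at the cost of more database traffic.

    ```csharp
    using System;
    using System.Activities.DurableInstancing;

    static class StoreFactory
    {
        // connectionString is a placeholder supplied by the caller.
        public static SqlWorkflowInstanceStore Create(string connectionString)
        {
            return new SqlWorkflowInstanceStore(connectionString)
            {
                // Defaults are 30 s and 5 s respectively (35 s combined).
                // The values below are arbitrary examples.
                HostLockRenewalPeriod = TimeSpan.FromSeconds(10),
                RunnableInstancesDetectionPeriod = TimeSpan.FromSeconds(2),
                // Remove instance data once the workflow completes.
                InstanceCompletionAction = InstanceCompletionAction.DeleteAll
            };
        }
    }
    ```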

    All I want to do is check if a valid instance exists in the store. If found then wait for it to load if and when it becomes runnable, otherwise create a new instance. How do I do that?


    Dick Page
    Monday, March 7, 2011 6:13 PM

All replies

  • My requirement seems pretty basic. Is there really no built-in way of doing it? Do I have to create my own data access components to query the instance store database? InstanceView sounds like it should be of some use to me, but it is always empty. The only instance store command that sounds of any relevance is QueryActivatableWorkflowsCommand, though the documentation seems to suggest this is only relevant to the inner workings of a WorkflowServiceHost (a generic host?).

    QueryActivatableWorkflowsCommand cmd = new QueryActivatableWorkflowsCommand();
    InstanceView v = store.Execute(handle, cmd, TimeSpan.FromSeconds(30));

    But this yields nothing. Maybe because my workflow is not Activatable, even though I can load it.

    I am very confused about InstanceOwners, InstanceViews, InstanceHandles and DefaultInstanceOwners. Are they specific to a machine, a user, a service? How do I create and persist a workflow instance in one place and load it in another if the applicationId is unknown?


    Dick Page
    Thursday, March 10, 2011 1:50 PM
  • Hi Dick, 

    If you want to resume instances from persistence, you may run into the problem of mapping each instance to the right workflow definition. In that case you have to deal with the persistence store directly by issuing some commands, as I did here. See this post:

    http://social.msdn.microsoft.com/Forums/en-US/wfprerelease/thread/5d194202-aedb-4856-ac6d-0347b86ee92e

    However, you are asking about resuming a FAILED workflow, and as far as I know this is not supported. If you want to know which instances failed, you should track them in your own persistence tables (using a TrackingParticipant) along with their status, so that you can create a new instance with the same input parameters and start another workflow.

    I hope this helps you choose the right approach.

    Cheers


    Adriano
    Thursday, March 10, 2011 2:32 PM
  • Hi Adriano

    In fact I recall seeing that thread somewhere back in the fog of time. I shall review it again.

    One thing that immediately sticks out is the point about TimeSpan.Zero. My own experience shows that passing in TimeSpan.Zero will always throw a TimeoutException - even though I know there is a runnable instance in the db - because it cannot complete in zero time. TimeSpan.Zero is not interpreted as "ignore any time constraint". The AbsoluteDelay sample uses TimeSpan.MaxValue.

    I'm not great with VB, but it looks like rather than use WaitForEvents, you create an InstanceHandle and delete an existing instance owner if it is invalid, then execute a TryLoadRunnableWorkflowCommand. I assume it would be invalid if the InstanceHandle was created on a different host, thread, or whatever. Is it not like a hash code?

    Anyway I'll play around with it.

    Thanks

    (Sorry if some of my comments don't make sense. I am still dazed and confused.)


    Dick Page
    Thursday, March 10, 2011 4:21 PM
  • Hi Dick, 

    Yes, TimeSpan.Zero is not correct; in fact I said later in that post that the timeout should be set to, say, 5 seconds.

    A TimeoutException here simply means "no events arrived within the timeout", so you should treat it as a benign exception.

    I agree with you that an InstanceHandle should normally never become invalid, BUT we discovered some cases in our testing where this could happen, so we handle this case.

    I know this is not simple and is full of workarounds, but I can definitely tell you that this approach works. We have had a service up and running 24x7 for 3-4 months now.

    I hope that vNext will simplify this scenario.

    Cheers

     


    Adriano
    Thursday, March 10, 2011 4:33 PM
  • This is my current understanding.  Folks, please correct me if I am wrong.

    1. My initial question was about when instances may or may not exist. Basically there is no built-in support. You have to query the database and maintain your own association between workflow IDs and workflow definitions. That would be less of a hassle if you could pass in your own Guid, but you can't (although I think it is possible with the WorkflowServiceHost management endpoint?).

    2. The definition of runnable instances is perplexing. I run my application, which fires up a workflow which runs to the first bookmark preceded by a Persist activity. I then close the application to simulate a crash. If I have configured the workflow to unload on idle, the database Instances table shows an instance row item with no current lock, with idle status and a list of the blocking bookmarks. There is no entry in the runnable instances table i.e. the instance is not runnable. If I have configured the workflow to not unload on idle, the Instances table shows a row item that is locked, with executing status, the bookmarks column is empty and an entry pops up in the RunnableInstances table after the lock expires.

    In the first case both application and database are in sync, yet the workflow instance is not considered to be Runnable. I can only reload it if I know the instanceId (or wait till it becomes runnable if it has a timer which mine doesn't but I suppose I could put one in simply to make it runnable after it unloads).

    In the second case the application and the database are not in sync. The db assumes the instance is still loaded in memory and gives it Runnable status (seems a bit weird to me). I can reload the instance by calling WaitForEvents on the instance store and then calling LoadRunnableInstance() or executing a TryLoadRunnableWorkflowCommand. The instance becomes runnable when the lock expires. This requires a single wait of 60 seconds, or shorter timeouts in a while loop with catch blocks for the InstanceNotReadyExceptions. I think you can short cut the process by cancelling an existing lock by calling Free() on the InstanceHandle, but I've not been able to reproduce this without throwing any number of exceptions.

    3. An InstanceHandle could perhaps be more accurately described as a lock handle. It creates a temporary lock entry in the database. The lock owner is some mashup of host and thread or whatever? Lock owners are also temporary. So you can create a workflow instance on one host/thread and load it on another.
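
    The "shorter timeouts in a while loop" idea from point 2 can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (RecoverySketch, CreateOwner, TryLoadRunnable), the 5-second poll and the 1-second back-off are made up for the example, and the caller is assumed to have set app.InstanceStore to the same store beforehand. It covers only the second case above (a locked instance that becomes runnable once the lock expires), not an idle, unlocked instance.

    ```csharp
    using System;
    using System.Activities;
    using System.Activities.DurableInstancing;
    using System.Runtime.DurableInstancing;
    using System.Threading;

    static class RecoverySketch
    {
        // Registers this host as a workflow owner. The caller keeps the
        // returned handle for WaitForEvents and frees it at shutdown.
        public static InstanceHandle CreateOwner(InstanceStore store)
        {
            InstanceHandle ownerHandle = store.CreateInstanceHandle();
            InstanceView view = store.Execute(
                ownerHandle, new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
            store.DefaultInstanceOwner = view.InstanceOwner;
            return ownerHandle;
        }

        // Polls with short timeouts until the overall budget is used up.
        // Returns true if a runnable instance was loaded into app.
        public static bool TryLoadRunnable(InstanceStore store, InstanceHandle ownerHandle,
                                           WorkflowApplication app, TimeSpan budget)
        {
            DateTime deadline = DateTime.UtcNow + budget;
            while (DateTime.UtcNow < deadline)
            {
                try
                {
                    foreach (InstancePersistenceEvent e in
                             store.WaitForEvents(ownerHandle, TimeSpan.FromSeconds(5)))
                    {
                        if (e.Equals(HasRunnableWorkflowEvent.Value))
                        {
                            app.LoadRunnableInstance();
                            return true;
                        }
                    }
                }
                catch (TimeoutException)
                {
                    // Benign: no event arrived within this poll interval.
                }
                catch (InstanceNotReadyException)
                {
                    // The instance was not loadable yet; back off briefly and retry.
                    Thread.Sleep(TimeSpan.FromSeconds(1));
                }
            }
            return false;
        }
    }
    ```

    If TryLoadRunnable returns false, the host would fall back to starting a new instance with app.Run(); if it returns true, app.Run() resumes the loaded one.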

    WorkflowApplication is engineered for a scenario where the workflow instance is created, loaded, unloaded and completed in a single host application instance. This is not the real world, where applications crash or are shut down and where workflow execution shifts between different hosts. I also think the design is very unfriendly. WorkflowServiceHost / AppFabric is better suited for real world scenarios, but I've not played around with it yet.

    Workflow lifecycle is a complicated state machine. The documentation covers just a fraction of it.


    Dick Page
    Friday, March 11, 2011 1:13 PM