Refactoring workflow

  • Question

  • I am a WF newbie. I am trying to retrofit a workflow into my WCF app. My general approach has been to group sections of code in CodeActivities. I was beginning to get somewhere when the designer got unusably slow, so I refactored my activities into a separate project. I changed the NS values in the xaml file but it was still slow, and now the new assembly simply fails to load when I start up the app.

    I assume I have fouled up the xaml file somehow. What are the gotchas to look out for when refactoring WF?
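
    For context, the namespace change described above typically looks like the fragment below. This is a minimal sketch, not the actual file: `MyApp.MainWorkflow`, `MyApp.Activities` and `LookupCustomer` are hypothetical names. While activity types live in the same assembly as the XAML, the `clr-namespace` mapping can omit the assembly; once they move to a separate project, the mapping generally needs an explicit `assembly=` part, and the workflow project needs a reference to that assembly.

    ```xml
    <!-- Hypothetical names throughout; a sketch of the mapping, not the poster's file. -->
    <Activity x:Class="MyApp.MainWorkflow"
        xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:act="clr-namespace:MyApp.Activities;assembly=MyApp.Activities">
      <!-- Before the move, the mapping would have been:
           xmlns:act="clr-namespace:MyApp.Activities" -->
      <Sequence>
        <act:LookupCustomer />
      </Sequence>
    </Activity>
    ```

    A mismatch between the assembly name in the mapping and the real output assembly name is one plausible cause of a load failure after such a move, so it is worth checking first.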

    (I'm finding it hard to love WF. Working with the designer is a soul destroying experience).


    Dick Page
    Friday, March 4, 2011 9:46 PM

Answers

  • In terms of slowness - I know there are some designer perf issues related to validation of large nested workflows, and workflows with lots of expressions. Also there were some design perf issues related to rendering deeply nested activities. If this sounds like you, with extremely large/deep workflow trees, you might want to see if you can reorganize the code groupings into larger chunks of work, basically to save the designer from recursion.

    (Something to keep in mind - if you have many chunks of work which you like viewing as distinct, but they are all going to run synchronously, sequentially, with no need to e.g. persist your workflow in the middle of that work, then you might not be getting much benefit from having them as separate activities in the first place.)

    If on the other hand you still see this issue even with very small workflows in the designer, then it might be something I don't know about yet, but maybe something we could figure out with more info. So I'd be interested to hear whatever else you have to add. (Sidetrack: about soul-destroying, is it just the slowness, or is there more?)
    Tim

    • Marked as answer by Andrew_Zhu Friday, March 11, 2011 2:17 AM
    Tuesday, March 8, 2011 2:25 AM

All replies

  • For me the WF designer is sub-standard. It is unusably slow, frequently resulting in the white screen of death. It seems unable to cope with custom types and code activities that wrap deep object graphs, although I also find it slow with some of the samples, e.g. StateMachineWithPick. I have an i7 processor, 8 GB RAM and a decent graphics card with Windows 7.

    I expect it to work like Class Designer. Once loaded, performance is not an issue and it handles consistency errors gracefully. 


    Dick Page

    Monday, March 21, 2011 10:01 AM
  • Still interested to hear more if you have time, as this sort of feedback can help drive perf and stress testing to make the next version better.

    You mentioned code activities that wrap deep object graphs, and it especially interests me.
    -Are you talking about performance issues even for a very small workflow, but one which has many properties/dependencies?
    -Are you talking about code activities with properties that point to object graphs? Are these also resulting in really huge XAML files? (Or is it just a huge object graph created automatically from a compact XAML?)

    Also, I'd like to ask whether VS10 Service Pack 1 which is in Beta now makes any noticeable improvement to the problems you're seeing. If not, I'd encourage you to open a connect bug with detailed information about how to reproduce, in case they have time to look into it for final SP1.

    Hope this helps,
    Tim

    Monday, March 21, 2011 5:53 PM
  • The performance is so bad that it is very difficult for me to get any kind of handle on causality or behavioural patterns. I am not able to narrow down the problem domain by elimination. I believe it is my CodeActivities that are causing the problem but I cannot be certain. The main pattern I can discern is that there is something it does not like on some occasions in one of my custom types. The top level flowchart renders quite quickly. I then drill down into a sequence which has most of the guts. Again it renders reasonably quickly. I then move the mouse and it goes busy. The ready cursor will return if I come back 30 minutes later, but it will go busy again the second I move the mouse.

    Responding to your points as best I can:

    - No code activities have properties, only InArguments and the default generic out argument. The arguments are often custom types. The action all takes place in the Execute override. For example, one CodeActivity wraps operations on my custom ConfigurationSection, another might dig around with OperationContext. The code body might include a single instantiation and method invocation which triggers a long execution path across multiple assemblies. This is code which runs perfectly in a standard WCF environment.

    - I would not say my workflows are very small or very large or very deep - say around 50 activities. It is very hard to upload something to repro the problem.

    - I have installed SP1 which I thought was RTM not beta. It made no difference.

    - I am quite willing to accept that the source problem is probably of my making. My gripe is that the designer does its thang without a progress bar or any status messages and with no ability to cancel whatever the **** it is doing. And why does it have to do whatever the **** it is doing every time I move my mouse or temporarily tab away? And why, to simply render a few basic shapes on the design surface, does it have to repeatedly do whatever the **** it is doing anyway, because even when it does work it is too slow? Compiling the whole project only takes 30 seconds. And when a w/f does work OK, at some point later it will get screwed up and will be very difficult to recover.


    Dick Page
    Monday, March 21, 2011 7:33 PM
  • Thanks for the added detail. And extra thanks for trying out SP1.

    I'd like to think it shouldn't be the many assembly dependencies or the custom Execute() logic. Certainly Execute() shouldn't be touched at design time. Multiple assemblies are harder to rule out, since they can affect VB expression evaluation and so on...

    Given the sensitivity to drilling down or expanding activities that you describe, I'm suspicious that it could be either layout related, or related to some kind of recursive exploration of the workflow tree and the corresponding model tree constructed by the designer at design time. In the latter case, the reason it might not trigger immediately is that the model tree is lazily generated in response to showing the relevant parts of the workflow.

    One other thing we can use to narrow it down, if you have a few more minutes to spend, is to attach VS to itself as a debugger and get a stack trace using Microsoft public symbols. Here's a how-to adapted from a previous post on performance issues (in that case they were due to layout issues with deep nesting). I've asked people to do this before, and so far it has given us some stack traces pointing to layout performance problems with VS2010 RTM that we were able to trace to a known bug. It could be worth doing again in case it's something else, like the recursion issues.

    2) Attach to Process, select the hung instance of VS (devenv.exe). Attach for debugging managed code, or maybe mixed managed and native (hopefully we do not need to debug in the native code...).
    3) Click 'pause' icon to force break, wait for process to break, VS enters Debugging mode
    4) Open the Debug->Threads window
    5) Look for the main thread of VS, it should be at top of the threads list. This should be the GUI thread, and the one which is not responding
    6) Switch to that thread by double-clicking it
    7) Show the call stack window
    8) Check out what dlls are on the stack. Is there anything that sounds interesting? (For workflow designer could be dlls with Activities in the name, but not necessarily)
    9) Right-click interesting looking dlls and choose download public symbols from Microsoft, this may take a while, but hopefully you start seeing some function names on the call stack window.

    Notes:

    Note 1 - if the hang is a deadlock, or a wait on an asynchronous API which is taking a long time, it's possible you will see some kernel wait functions at the top of the stack. In the deadlock case the call stack may be hard to get, but anyway see what we can see.
    Note 2 - if the hang is an infinite loop, it's possible you will need to continue and force break multiple times until you get an idea of which stack frame is the interesting one where the infinite loop is happening.

    Tim

    Tuesday, March 22, 2011 6:29 AM
  • Hi Tim

    The freezing up seems to be caused in the body of the Pick when I drop in various combinations of sequence, parallel and other built-in activities. This sounds like the bug you describe, which I somehow missed in my trawl of the forum.

    The problem is not completely deterministic. I can make it work reasonably well one moment, then at some point the w/f will collapse. This morning I managed to repro the problem by creating a new w/f with one Pick and two branches and a few delay or writeline activities (no code activities). I was able to consistently repro the problem several times in my search for the single causal factor.
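
    A workflow of roughly this shape would match the description above. It is only a sketch under assumptions: the class name, display names, durations and text are made up, and the real repro file was not posted.

    ```xml
    <!-- Hypothetical repro shape: one Pick, two branches, only built-in activities. -->
    <Activity x:Class="Repro.PickWorkflow"
        xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
      <Pick>
        <PickBranch DisplayName="Branch1">
          <PickBranch.Trigger>
            <Delay Duration="00:00:05" />
          </PickBranch.Trigger>
          <!-- The branch action is the content of PickBranch. -->
          <Sequence>
            <WriteLine Text="Branch 1 fired" />
            <Delay Duration="00:00:01" />
          </Sequence>
        </PickBranch>
        <PickBranch DisplayName="Branch2">
          <PickBranch.Trigger>
            <Delay Duration="00:00:10" />
          </PickBranch.Trigger>
          <WriteLine Text="Branch 2 fired" />
        </PickBranch>
      </Pick>
    </Activity>
    ```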

    I then ran the debugger on a hanging demo as you suggested and got very little from it. Something about clr.waitingtocreatInstance or some such.  I tried to repeat the debug session but all my demos were suddenly behaving and even my main application w/f which was moribund all day yesterday is suddenly functional without changing anything. It's almost as if attaching the debugger made everything work.

    Note nothing has changed in the environment. I installed VS2010 SP1 the day it was released. So I am confused but up and running again for now. I shall see how things go and report back.

    Spoke too soon. My app w/f has just frozen up again. Debugger shows Main thread is running MS.Internal.DeferredElement(TreeState.GetCoreParent( ...  Hit continue. W/f still frozen. Hit Pause. Now showing VisualTreeHelper.GetParent on main thread. Several worker threads showing Build Engine Loop in Sleep Wait Join. The main thread seems to be active because every time I break it shows a different method. No call stack available. External code. White screen of death. Looks like she ain't never comin back. Recycle VS. I sense that things improve if the app is built before viewing the w/f... Yep. Now I can view it perfectly. Everything rendered fast. Scroll around some more. Nope, it's gone again.

    I'll post more when I establish something.


    Dick Page
    Tuesday, March 22, 2011 10:59 AM
  • OK it seems I have the same problem reported at http://social.msdn.microsoft.com/Forums/sr-Latn-CS/wfprerelease/thread/1f20c37f-1b9e-4b6d-904f-a6b518e4187a 

    My Code Activity issue looks like a red herring.

    Sorry for opening an unnecessary thread. However, I can confirm that SP1 made no discernible difference.

    But the idea that the designer won't work if the w/f levels are "too deep", when all I'm doing is adding enough levels to make a Pick Activity work, is a bit hard to stomach. Particularly when the StateMachine was scrapped in WF4, forcing me to load up the Pick.


    Dick Page
    Tuesday, March 22, 2011 12:32 PM
  • Thanks for all of the investigations. It's good to get the additional information. Sounds like maybe it isn't deep nesting after all, but some kind of pathological WPF layout scenario. I'll forward this information to the WF designer team with a nag to see whether they can get it fixed in SP1 final.
    Tim
    Tuesday, March 22, 2011 6:35 PM