Large Xaml Loadtime and WorkflowApplication.BeginRun

  • Question

  • Hi All,

    We tried working with a very large WF application (around 100K lines of XAML).

    The load time is considerably high: about 10 seconds to load the XAML, compared with a few milliseconds for a 100-line XAML file. The load time appears to be a linear function of the number of lines.

    We measured the load time with a Stopwatch, from the call to .BeginRun until the async result is received.

    Also, .BeginRun returns only after the async callback has been called (doesn't that defeat the idea of an async callback?).

    On an 8-CPU machine we tried to work around this (.BeginRun not returning immediately) by using Parallel.For to load 8 instances simultaneously (1 per core). That just made things worse: instead of one instance loading every 10 seconds, all 8 instances finished loading after 83 seconds.
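
    For readers who want to reproduce the measurement described above, here is a minimal sketch of one way to take it. It is not the poster's actual code. The WriteLine activity is only a stand-in for the large XAML workflow, and the class and variable names are made up for illustration. It records both when BeginRun hands control back and when the async callback fires, so the two can be compared.

```csharp
using System;
using System.Activities;
using System.Activities.Statements;
using System.Diagnostics;
using System.Threading;

class BeginRunTiming
{
    static void Main()
    {
        // Assumption: a trivial stand-in for the large XAML workflow.
        Activity definition = new WriteLine { Text = "first activity reached" };

        WorkflowApplication app = new WorkflowApplication(definition);
        ManualResetEvent callbackFired = new ManualResetEvent(false);
        long callbackMs = 0;

        Stopwatch sw = Stopwatch.StartNew();
        app.BeginRun(ar =>
        {
            // Complete the async operation and note when the callback ran.
            app.EndRun(ar);
            callbackMs = sw.ElapsedMilliseconds;
            callbackFired.Set();
        }, null);

        // Time at which BeginRun handed control back to the caller.
        long returnedMs = sw.ElapsedMilliseconds;

        callbackFired.WaitOne();
        Console.WriteLine("BeginRun returned after {0} ms", returnedMs);
        Console.WriteLine("Callback fired after {0} ms", callbackMs);
    }
}
```

    If the two numbers are close and both large, the load work is happening on the calling thread before BeginRun returns.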

    And now to the question part …

    Is there a way to overcome this? We use WF as a workflow engine that serves as an extension development tool for our application, and we have no control over what the end user may do.

    Is it possible to preload the XAML, shorten the load time, or load in phases (for example, loading the XAML as we progress through the flow)?

    Is there a way to avoid the synchronous load of a workflow application when using .Run/.BeginRun? (Spawning our own threads and using WorkflowInvoker might do the job, but then we lose a lot of the async abilities built into the workflow engine.)

    Edit:

    After checking the WorkflowInvoker object, the result is the same: the so-called async methods BeginInvoke/InvokeAsync return only once the XAML load is done, so that is a dead end.

    Is it possible to create separate instances of the workflow engine? (I will now try spawning threads myself, although I don't think it will help, because Parallel.For did not help either.)

     

     

    Any thoughts will be more than welcome,

    Cheers,

    Avi


    Monday, March 14, 2011 3:05 PM


All replies

  • I'm not suggesting this is a complete answer by any means, but you might want to read up on workflow scheduler behavior in these threads. They should at least explain why BeginRun is sometimes synchronous and sometimes asynchronous.

    http://social.msdn.microsoft.com/Forums/en-US/wfprerelease/thread/8669fa35-1259-420d-9e83-679bdb02d6ab
    http://social.msdn.microsoft.com/Forums/en/windowsworkflowfoundation/thread/36acbc02-b6d9-4d8d-ac32-ff74532449b7
    http://social.msdn.microsoft.com/Forums/en/wfprerelease/thread/86545e82-91eb-4183-bf09-5771c3cf2ec6

    As far as fully preloading the workflow definition from XAML before you do Run() or Invoke(), I think that calling WorkflowInspectionServices.CacheMetadata() should do the trick. I don't expect this makes loading itself faster.
    (Note that CacheMetadata() is being called anyway before your activity actually gets to execute, this is just theoretically a way to frontload the work.)
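
    A minimal sketch of what frontloading with CacheMetadata() might look like, assuming the compiled Workflow1 type that appears in the sample code elsewhere in this thread. The Preload class and LoadAndCache method are hypothetical names. As noted above, this only moves the work earlier; it is not expected to make the load itself faster.

```csharp
using System.Activities;

static class Preload
{
    // Hypothetical helper: build the definition and walk its tree up front,
    // for example at application startup, before any Run() or Invoke() call.
    public static Activity LoadAndCache()
    {
        Activity definition = new Workflow1();

        // Runs the metadata caching pass that would otherwise happen
        // on the first execution of this definition.
        WorkflowInspectionServices.CacheMetadata(definition);

        return definition;
    }
}
```

    The returned definition can then be kept around and handed to WorkflowInvoker or WorkflowApplication when a run is actually requested.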

    Also, if you are always using the same workflow definition, you might want to look into recycling or sharing workflow instances.

    Tim

    • Marked as answer by Avi Kadosh Thursday, March 17, 2011 2:31 PM
    Tuesday, March 15, 2011 8:22 AM
  • You could also look at compiling your Xaml workflow into a type and using the workflow definition from the compiled assembly.
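
    To illustrate the difference being suggested, here is a rough sketch of the two ways of obtaining a workflow definition. The file name "Workflow1.xaml" is a placeholder, and Workflow1 is the compiled type from the sample project discussed in this thread. The sketch makes no claim about how much time either path takes; that would have to be measured.

```csharp
using System.Activities;
using System.Activities.XamlIntegration;

static class DefinitionSources
{
    // Loose XAML: the file is read and parsed at run time.
    // "Workflow1.xaml" is a placeholder path.
    public static Activity FromLooseXaml()
    {
        return ActivityXamlServices.Load("Workflow1.xaml");
    }

    // Compiled XAML: the workflow was built into the assembly as a type
    // (via x:Class), so it is created like any other CLR object.
    public static Activity FromCompiledType()
    {
        return new Workflow1();
    }
}
```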

    Steve Danielson [Microsoft]
    This posting is provided "AS IS" with no warranties, and confers no rights.
    Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

    • Marked as answer by Avi Kadosh Thursday, March 17, 2011 2:31 PM
    Tuesday, March 15, 2011 2:22 PM
  • Thanks for the answer.

    I did some more tests. Let me start by explaining what I did.

    I built a workflow of about 10K lines of XAML.

    The workflow does basically nothing: there is an "If" activity at the beginning with the condition "True". The Then branch prints a stopwatch reading to the screen, and the Else branch is just another 10K lines of XAML that are never reached.

    My XAML looks like this (I removed the 10K lines):

    <Activity mc:Ignorable="sap" x:Class="WorkflowConsoleApplication23.Workflow1" xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities" xmlns:av="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:mv="clr-namespace:Microsoft.VisualBasic;assembly=System" xmlns:mva="clr-namespace:Microsoft.VisualBasic.Activities;assembly=System.Activities" xmlns:s="clr-namespace:System;assembly=mscorlib" xmlns:s1="clr-namespace:System;assembly=System" xmlns:s2="clr-namespace:System;assembly=System.Xml" xmlns:s3="clr-namespace:System;assembly=System.Core" xmlns:s4="clr-namespace:System;assembly=System.ServiceModel" xmlns:sa="clr-namespace:System.Activities;assembly=System.Activities" xmlns:sad="clr-namespace:System.Activities.Debugger;assembly=System.Activities" xmlns:sap="http://schemas.microsoft.com/netfx/2009/xaml/activities/presentation" xmlns:scg="clr-namespace:System.Collections.Generic;assembly=System" xmlns:scg1="clr-namespace:System.Collections.Generic;assembly=System.ServiceModel" xmlns:scg2="clr-namespace:System.Collections.Generic;assembly=System.Core" xmlns:scg3="clr-namespace:System.Collections.Generic;assembly=mscorlib" xmlns:sd="clr-namespace:System.Diagnostics;assembly=System" xmlns:sd1="clr-namespace:System.Diagnostics;assembly=System.Core" xmlns:sd2="clr-namespace:System.Data;assembly=System.Data" xmlns:sd3="clr-namespace:System.Diagnostics;assembly=mscorlib" xmlns:sl="clr-namespace:System.Linq;assembly=System.Core" xmlns:st="clr-namespace:System.Text;assembly=mscorlib" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
      <x:Members>
        <x:Property Name="num" Type="InArgument(x:Int32)" />
        <x:Property Name="sw" Type="InArgument(sd:Stopwatch)" />
      </x:Members>
      <sap:VirtualizedContainerService.HintSize>504,246</sap:VirtualizedContainerService.HintSize>
      <mva:VisualBasic.Settings>Assembly references and imported namespaces for internal implementation</mva:VisualBasic.Settings>
      <If Condition="True" sad:XamlDebuggerXmlReader.FileName="C:\Users\avik\documents\visual studio 2010\Projects\WorkflowConsoleApplication23\WorkflowConsoleApplication23\Copy of Workflow1.xaml" sap:VirtualizedContainerService.HintSize="464,206">
        <If.Then>
          <WriteLine sap:VirtualizedContainerService.HintSize="225,100" Text="[num.ToString() + &quot; : &quot; + sw.ElapsedMilliseconds.ToString()]" />
        </If.Then>
        <If.Else>
          <Sequence sap:VirtualizedContainerService.HintSize="214,100">
            <sap:WorkflowViewStateService.ViewState>
              <scg3:Dictionary x:TypeArguments="x:String, x:Object">
                <x:Boolean x:Key="IsExpanded">True</x:Boolean>
              </scg3:Dictionary>
            </sap:WorkflowViewStateService.ViewState>
          </Sequence>
        </If.Else>
      </If>
    </Activity>

    I built an application that spawns threads and runs the workflow on each of them using WorkflowInvoker.Invoke. To my understanding, each workflow should run on its own core, so the load time should stay the same up to the number of cores I have.

    I then used an 8-core machine and spawned 1, 2, 4, and 8 threads.

    To my understanding, on an 8-core machine there shouldn't be a real time gap between these scenarios, as each thread runs on its own core.

    However, the results were surprisingly different from what I expected:

    with 1 thread the result was 500ms

    with 2 threads the result was 850ms

    with 4 threads the result was 1500ms

    with 8 threads the result was 3300ms

    I checked that all 8 cores were very busy during that time, so I believe the CLR set the thread affinity right.

    It seems there is a lock of some kind when invoking a new workflow, so that only 1 workflow can be loaded at a time.

    I attached the sample code for the test in case someone wants to try to reproduce it; the test project itself is also available on my SkyDrive:

    http://cid-2e76486bd25e9d3b.office.live.com/self.aspx/WF%20Sample%20share

     

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Activities;
    using System.Activities.Statements;
    using System.Diagnostics;
    using System.Threading;

    namespace WorkflowConsoleApplication23
    {
        // Runs one workflow invocation on its own thread.
        class wfthread
        {
            public wfthread(int index, Stopwatch sw)
            {
                this.threadindex = index;
                this.sw = sw;
            }

            public void ParameterizedThreadStart(Object obj)
            {
                // Block until the main thread releases all workers at once.
                ManualResetEvent ev = (ManualResetEvent)obj;
                ev.WaitOne();

                // Each thread creates its own workflow definition.
                Workflow1 wf = new Workflow1();

                // Input arguments declared in the XAML: "num" and "sw".
                Dictionary<string, object> dic = new Dictionary<string, object>();
                dic.Add("num", threadindex);
                dic.Add("sw", sw);

                WorkflowInvoker wfi = new WorkflowInvoker(wf);
                wfi.Invoke(dic);
            }

            public int threadindex { get; set; }
            public Stopwatch sw { get; set; }
        }

        class Program
        {
            static Stopwatch sw;

            static void Main(string[] args)
            {
                sw = new Stopwatch();
                ManualResetEvent ev = new ManualResetEvent(false);

                // Start 8 worker threads; each one waits on the event.
                for (int i = 0; i < 8; ++i)
                {
                    wfthread wft = new wfthread(i, sw);
                    Thread t = new Thread(wft.ParameterizedThreadStart);
                    t.Start(ev);
                }

                // Start timing, then release every worker simultaneously.
                sw.Start();
                ev.Set();

                Console.ReadLine();
            }
        }
    }

     

    Tuesday, March 15, 2011 6:25 PM
  • I modified your app to use the technique I showed in my blog post WF4 Performance Tip–Cache Activities

    It brought the 8-thread result down to 1300ms on my laptop, which is slower than your machine.

    Workflow1 wf = new Workflow1();
     
    is an expensive line of code.  You don't have to create a new one every time.  Think of Workflow1 like a CLR type.  WorkflowInvoker uses it like a type to create a new instance so if you create one and use it over and over again you won't have to pay the cost of XAML serialization over and over again.
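
    A minimal sketch of the caching idea described above, assuming the Workflow1 type from the sample project. The WorkflowCache class and its Run method are hypothetical names chosen for illustration; this is not the code from the blog post. The definition is created once and every caller reuses it.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;

static class WorkflowCache
{
    // Created on first use and then shared by all callers.
    // Lazy<T> is thread-safe by default, so only one Workflow1 is built.
    private static readonly Lazy<Activity> Definition =
        new Lazy<Activity>(() => new Workflow1());

    // Each call runs a fresh workflow instance from the shared definition.
    public static IDictionary<string, object> Run(IDictionary<string, object> inputs)
    {
        return WorkflowInvoker.Invoke(Definition.Value, inputs);
    }
}
```

    In the test app, each thread would call WorkflowCache.Run(dic) instead of constructing its own Workflow1 and WorkflowInvoker.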


    Sr. Program Manager, AppFabric Development Platform (WF/WCF) http://blogs.msdn.com/rjacobs http://www.twitter.com/ronljacobs
    • Marked as answer by Avi Kadosh Thursday, March 17, 2011 2:31 PM
    Tuesday, March 15, 2011 9:59 PM
  • Hi Ron,

    Thanks for your reply, it was a real pleasure meeting you at Tech-Ed Berlin.

    It seems that the link is broken. Did you mean this link: http://blogs.msdn.com/b/rjacobs/archive/2011/02/12/wf4-performance-tip-cache-activities.aspx ?

    I will do some tests. However, when I checked it against the 100K-line XAML from the original post, the "new" took about 1 second and the "initial invoking" about 8 seconds (by "initial invoking" I mean the time from when .Invoke is called until the first activity is executed).

    So all in all it takes about 9 seconds. When running 8 threads simultaneously, one should expect more or less the same performance on an 8-core machine; however, as I stated in my original post, it took about 82 seconds.

    I will do some more checks and let you know.

    Thanks,

    Avi

    Wednesday, March 16, 2011 7:05 AM
  • Hi All,

    I have some new conclusions I want to share.

    Taking Ron's advice and calling "new Workflow1" only once, then passing the reference around, did lower the response time drastically. Moreover, it made .Invoke take the same time for any number of threads.

    That made me a little curious, so I reverted the change to check the impact of "new Workflow1" on the overall timing (3300ms). I ran only "new Workflow1" on a number of threads equal to the number of cores. The impact was minimal: "new Workflow1" took only 50ms.

    So the new operator is relatively expensive (as Lieutenant Commander Data once indicated, 50ms is almost an eternity in machine terms :) ), but compared with the original result (3300ms) it has a minimal effect in this case.

    Here are the results on an 8-core machine with 8 threads:

    · calling new Workflow1 and then .Invoke on each thread: 3300ms (steady result, all threads ended at the same time)

    · calling only new Workflow1 on each thread, without calling .Invoke: 50ms

    · calling new Workflow1 on the main thread and then .Invoke on each thread: 500ms (steady result, all threads ended at the same time)

    And now something interesting:

    · calling new Workflow1 on the main thread and invoking it once on the main thread took 500ms; invoking it again on the 8 threads took 0ms

    In my opinion, when .Invoke is called (the same goes for .Run or .BeginRun of the WorkflowApplication object) on a new object that has not been cached yet, the caching mechanism takes some kind of lock.

    Something is going on in the first .Invoke of a new object that does not seem to work right: the initial caching of a workflow application does not appear to make use of the multi-core architecture.
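
    The last scenario in the list above could be sketched roughly as follows. This is an illustrative reconstruction, not the actual test code: the WarmUpDemo class and the Args helper are hypothetical, and Workflow1 with its "num" and "sw" arguments comes from the sample project posted earlier in the thread.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class WarmUpDemo
{
    static void Main()
    {
        // One definition, created once on the main thread and shared.
        Activity definition = new Workflow1();
        Stopwatch sw = Stopwatch.StartNew();

        // First invocation on the main thread pays the one-time load cost.
        WorkflowInvoker.Invoke(definition, Args(-1, sw));

        // Reset the clock, then invoke the same definition from 8 threads.
        sw.Restart();
        List<Thread> threads = new List<Thread>();
        for (int i = 0; i < 8; i++)
        {
            int index = i; // capture a per-iteration copy for the lambda
            Thread t = new Thread(() => WorkflowInvoker.Invoke(definition, Args(index, sw)));
            threads.Add(t);
            t.Start();
        }
        foreach (Thread t in threads)
        {
            t.Join();
        }
    }

    // Hypothetical helper that builds the input arguments Workflow1 declares.
    static Dictionary<string, object> Args(int num, Stopwatch sw)
    {
        return new Dictionary<string, object> { { "num", num }, { "sw", sw } };
    }
}
```

    Each invocation prints its thread index and the elapsed milliseconds, so the warm-up cost and the per-thread cost show up as separate lines.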

    I will try to put everything into sample code later on.

     

    Cheers,

    Avi

    • Edited by Avi Kadosh Wednesday, March 16, 2011 4:02 PM rephrase
    • Marked as answer by Avi Kadosh Thursday, March 17, 2011 5:31 AM
    • Unmarked as answer by Avi Kadosh Thursday, March 17, 2011 5:40 AM
    Wednesday, March 16, 2011 12:01 PM
  • · calling new workflow1 in the main thread then invoking it once on the main thread took 500ms, invoking it again on the 8 threads took 0ms ...

    I think this is fully explained by the way the workflow loads the XAML implementation body. It doesn't load the XAML implementation body until the first time you invoke it. But after that, it will not need to load the XAML again.
    Tim

    Wednesday, March 16, 2011 9:50 PM
  • Hi All,

    Thanks a lot for bearing with me this far.

    As for the time it takes to load a large XAML file: there is nothing one can do, that is simply how long the initial run takes. Thanks to Ron and Tilovell, the caching mechanism makes it tolerable.

    However, while investigating this I think I came across some strange behavior in a multi-core environment. I will continue the discussion in

    http://social.msdn.microsoft.com/Forums/en-US/wfprerelease/thread/45f4814d-a643-4656-9e8b-2b945819cdbc

    as it is a little bit off topic.

    Thursday, March 17, 2011 2:30 PM