AppFabric Caching restart takes more than 5 minutes


  • Hi,

    Environment: AppFabric v1.1 on Windows 2008 R2 (MultiCore NUMA, 64GB RAM), SQL Server 2008 R2 store provider. Single cluster (small) and only one host in cluster.

    AppFabric is very slow to restart: it takes more than 5 minutes. I noticed a few warnings in the log (see the error below). After I configured the host so that it is not a lead host (by default it is), the errors went away, but the startup time is the same.

    When the AppFabric cache size is reduced from the default of 32GB to 4GB or 1GB, the restart time drops to 30 seconds. Why is that? With 4GB, allocated memory is 350MB. With the default setting (32GB), the cache jumps to 4GB of allocated memory very quickly, within a few seconds, but startup is still slow (5 minutes).

    Errors in log:

    {3e43dc3c000000000000000000000000} failed to refresh lookup table, exception: {Microsoft.Fabric.Common.OperationCompletedException: Operation completed with an exception ---> Microsoft.Fabric.Federation.RoutingException: The target node explicitly aborted the operation
       --- End of inner exception stack trace ---
       at Microsoft.Fabric.Common.OperationContext.End()
       at Microsoft.Fabric.Federation.FederationSite.EndRoutedSendReceive(IAsyncResult ar)
       at Microsoft.Fabric.Data.ReliableServiceManager.EndRefreshLookupTable(IAsyncResult ar)}



    • Edited by SreckoT, February 13, 2012 12:27
    February 13, 2012 12:23


  • Hi Srecko,

    The warnings you are getting while starting the cluster are benign and should not lead to any issues, including with the startup time.

     "After setting in configuration that it is not leadhost (by default it is), the errors gone, but the startup time is the same."

    I am a bit surprised by this. If there are no lead hosts, the cluster should not start. Can you please verify this? I am assuming you are restarting using the administrative cmdlets. Is that right?

    Now, coming to the startup duration: the amount of memory should not have a major impact on the startup time. How are you measuring the startup time? Is it the time for the service to reach the running state, or for all nodes to be shown as healthy?




    February 14, 2012 8:30
  • Hi Arun,

    Thanks again ;-).

    This is a single-host configuration: only localhost is in the cluster, and no other nodes have been added (for now).

    There are no lead hosts because SQL Server (the store provider) is managing the cluster (<partitionStoreConnectionSettings leadHostManagement="false" />).

    The restart is done using PowerShell, i.e. the administrative cmdlets Restart-CacheCluster and Restart-CacheNode.

    Startup duration is the problem, and it correlates directly with memory. A restart with a cache size of 32GB takes 5:30 min, and with 2GB it takes 0:30 min. Note that only the cache size is changed when I observe this behaviour, but I must mention again that this is a NUMA architecture server. Restart time is measured using the EventLog (Microsoft-Windows-Application Server-System Services), from the cache stop event to the cache started event.
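
    As a rough check on those two measurements, the arithmetic can be sketched in Python. The helper names below are hypothetical, and the straight-line comparison between two data points is only an assumption for illustration, not a claim about how AppFabric actually scales:

```python
def mmss_to_seconds(text: str) -> int:
    """Convert a 'minutes:seconds' string such as '5:30' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_per_gb(size_a_gb: float, time_a: str,
                   size_b_gb: float, time_b: str) -> float:
    """Extra restart time per extra GB of cache size between two runs.

    Assumes (for illustration only) that restart time changes linearly
    with the configured cache size between the two measurements.
    """
    delta_seconds = mmss_to_seconds(time_a) - mmss_to_seconds(time_b)
    return delta_seconds / (size_a_gb - size_b_gb)


# The two restarts reported above: 32GB took 5:30, 2GB took 0:30.
print(seconds_per_gb(32, "5:30", 2, "0:30"))  # 10.0
```

    That works out to roughly 10 extra seconds of restart time per additional GB of configured cache size between those two runs.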

    Best Regards,


    • Edited by SreckoT, February 14, 2012 14:45
    February 14, 2012 14:44
  • Hi Srecko,

    NUMA is not a tested scenario. Your host is large (64 GB) and we do a bunch of allocations during startup, which could be the cause of the delay on NUMA.



    February 17, 2012 8:00
  • Srecko,

    We would like to understand your scenario and use case better. Can you share some details about how you are using the cache with us? You can email me at pragya dot agarwal at microsoft dot com.



    February 17, 2012 8:23
  • As Pragya mentioned, when the service starts, a lot of heap is preallocated (or taken from the OS). This preallocation phase can take a long time, and the time is proportional to the amount of memory that can be preallocated.

    To verify whether this is the root cause of the startup delay, you can watch the DistributedCacheService.exe process on the Task Manager Processes tab, or memory usage on the Task Manager Performance tab.
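
    If you would rather log the numbers than watch Task Manager, the same check can be sketched in Python. This is only a minimal sketch: `watch_until_stable` and its parameters are hypothetical names, and the `sample_mb` callable that actually reads the memory of DistributedCacheService.exe is left to you (for example, a wrapper around a process-inspection library):

```python
import time
from typing import Callable, List, Optional, Tuple


def watch_until_stable(sample_mb: Callable[[], float],
                       interval_s: float = 1.0,
                       stable_samples: int = 5,
                       tolerance_mb: float = 1.0,
                       max_samples: int = 600,
                       sleep: Callable[[float], None] = time.sleep
                       ) -> List[Tuple[float, float]]:
    """Poll sample_mb() and return (nominal_elapsed_seconds, memory_mb) pairs.

    Polling stops once `stable_samples` consecutive readings each differ
    from the previous one by at most `tolerance_mb`, or after `max_samples`.
    Elapsed time is counted in polling intervals, not measured wall-clock.
    """
    readings: List[Tuple[float, float]] = []
    previous: Optional[float] = None
    unchanged = 0
    elapsed = 0.0
    for _ in range(max_samples):
        current = sample_mb()
        readings.append((elapsed, current))
        if previous is not None and abs(current - previous) <= tolerance_mb:
            unchanged += 1
            if unchanged >= stable_samples:
                break  # memory has stopped growing
        else:
            unchanged = 0  # still growing (or first reading)
        previous = current
        sleep(interval_s)
        elapsed += interval_s
    return readings
```

    If preallocation is the cause, the readings should keep climbing for most of the slow startup and level off at about the time the service reports that it has started.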


    Laxmi Narsimha Rao Oruganti


    March 1, 2012 8:07
  • @Srecko:

    By restart time, did you mean the startup time or the shutdown/stopping time?

    I sometimes have this issue as well on my Windows 7 SP1 x64 machine (8GB). I notice that it is the shutdown that sometimes takes a few minutes to complete, and I'm not sure why. The startup itself is almost instant.

    April 1, 2012 5:05