AppFabric Cache leaking memory?


  • March 1, 2011 11:54
     
     

    Hi, I have a simple scenario: 3 freshly installed (virtual) Windows 2008 Enterprise servers with AppFabric caching, 3 GB RAM each. The cache is used by 3-4 web sites as a session store, through the session provider/AppFabric client for Windows Server 2003. In the session we store a few copies of the viewstate.

    There is excessive memory usage, probably a memory leak, up to the point where memory is exhausted and the hosts start throttling and the clients get errors.

    Eviction is disabled. The first try was without expiration too, but with expiration the behavior is the same.

    Here is a sample create-cache command:

    New-Cache -CacheName LocalProduction -Eviction none -Expirable true -TimeToLive 30 -Secondaries 1

    The sum of data across all caches is never more than 100 MB, but the memory usage of the cache service rises to about 2 GB. All available memory is then used up, the service starts to evict objects (as seen in perfmon), some of the hosts go into a throttled state, and the clients get errors.

    What is the best way to use caching as a session store? I cannot imagine a more usual/standard way of using it than mine...

    I disabled the session provider on the sites and left the cluster alone overnight. This morning the memory usage was about the same, 1.3-1.5 GB per node, with no objects in the cache. It was still working.

    After restarting the nodes one by one, memory usage dropped without loss of data...

    With minimal usage through the day it is high again, about 800 MB per node, and these are the statistics now:

    Size         : 15195514
    ItemCount    : 12
    RegionCount  : 106
    RequestCount : 21074
    MissCount    : 703

     

All replies

  • March 1, 2011 18:17
     
     

    Can you use the Export-CacheClusterConfig command and post the contents of that file here? You can change the machine names if necessary. I am interested in seeing the cluster configuration.

    You are still seeing the memory use by the caching service even when your web application is not using it? And there are items in the cache despite this lack of use, or did you re-enable session state on the web app? 

    Can you also post the datacacheclient section of the web.config here as well as the session state provider section here? Hopefully we'll spot something.

    One final question, are these all Windows 2008 Enterprise servers or higher, which is the requirement for High Availability?

    Thanks!

    Jason Roth

  • March 1, 2011 19:47
     
     

    Yes, I still have high memory usage after no one is using the cache any more. I have now stopped using it for the production sites and test only on one of the training sites.

    Configuration: all cache nodes are Windows 2008 Enterprise with AppFabric x64 6.1 installed, all in the domain.
    There are load-balanced web servers on the same nodes, and also web servers on Windows 2003 outside the domain that use the 2003 client for AppFabric, so security is disabled.

    I ran a load test a few hours ago: a few thousand separate GETs of the login page on one of the test sites that is configured to use the cache as a session store, so each request creates a new session. This way I managed to reach these stats:
    Size         : 60968054
    ItemCount    : 2612
    RegionCount  : 1876
    RequestCount : 59396
    MissCount    : 4201
    Memory usage, however, didn't go much higher. It was about 800 MB across all 3 nodes and stayed about the same.

    Now there are no items in the cache (they have expired), but the stats are
    Size         : 0
    ItemCount    : 0
    RegionCount  : 2048
    RequestCount : 144674
    MissCount    : 13786

    Memory usage is about 300-400 MB on each of the 3 nodes.

    Could the region count be the cause of the leaking memory?

    There are 8 sites that I want to use this cache, and I've created 8 caches. Their names have been changed.

    Cluster config
    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
        <configSections>
            <section name="dataCache" type="Microsoft.ApplicationServer.Caching.DataCacheSection, Microsoft.ApplicationServer.Caching.Core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
        </configSections>
        <dataCache size="Small">
            <caches>
                <cache consistency="StrongConsistency" name="AProduction"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="ATraining"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="BProduction"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="BTraining"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="CProduction"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="CTraining"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="default">
                    <policy>
                        <eviction type="Lru" />
                        <expiration defaultTTL="10" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="DProduction"
                    secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
                <cache consistency="StrongConsistency" name="DTraining" secondaries="1">
                    <policy>
                        <eviction type="None" />
                        <expiration defaultTTL="30" isExpirable="true" />
                    </policy>
                </cache>
            </caches>
            <hosts>
                <host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
                    hostId="388141538" size="1023" leadHost="true" account="SOF\CBVELOCITY001$"
                    cacheHostName="AppFabricCachingService" name="CBVELOCITY001"
                    cachePort="22233" />
                <host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
                    hostId="415715938" size="1023" leadHost="false" account="SOF\CBVELOCITY002$"
                    cacheHostName="AppFabricCachingService" name="CBVELOCITY002"
                    cachePort="22233" />
                <host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
                    hostId="1724892651" size="1023" leadHost="false" account="SOF\CBVELOCITY003$"
                    cacheHostName="AppFabricCachingService" name="CBVELOCITY003"
                    cachePort="22233" />
            </hosts>
            <advancedProperties>
                <partitionStoreConnectionSettings leadHostManagement="false" />
                <securityProperties mode="None" protectionLevel="None">
                    <authorization>
                        <allow users="NetworkService" />
                    </authorization>
                </securityProperties>
            </advancedProperties>
        </dataCache>
    </configuration>

    Sample web site config

     <dataCacheClient>
      <securityProperties mode="None" protectionLevel="None" />
      <hosts>
        <host name="CBVELOCITY001" cachePort="22233" />
        <host name="CBVELOCITY002" cachePort="22233" />
        <host name="CBVELOCITY003" cachePort="22233" />
      </hosts>
     </dataCacheClient>

     <sessionState mode="Custom" customProvider="AppFabricCacheSessionStoreProvider">
      <providers>
       <add name="AppFabricCacheSessionStoreProvider" type="Microsoft.ApplicationServer.Caching.DataCacheSessionStoreProvider" cacheName="AProduction" sharedId="AProduction"/>
      </providers>
     </sessionState>

  • March 2, 2011 9:20
     
     

    The .NET GC does not collect memory unless necessary, i.e. memory freed by the process may still appear as part of the process size (or in stats): http://msdn.microsoft.com/en-us/library/ee787088.aspx#conditions_for_a_garbage_collection. The unexpected element is that this results in throttling errors. How did you calculate the 100 MB size of your objects (take the serialized size into account)?

    You can check whether forcing garbage collection (using Invoke-CacheGC) brings down the process memory (forcing collection is not really recommended; use this for diagnosis only).

    The presence of regions is not related to the memory leak, as one region object doesn't occupy more than a few KB of memory. System regions, once created, are not removed in a running cluster.
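
    To put the "few KB per region" remark in perspective, here is a rough back-of-the-envelope sketch in Python. The 4 KB per-region figure is an assumption picked only for illustration (the reply does not give an exact number); the region count of 2048 is the one reported earlier in this thread.

```python
# Assumption for illustration only: the thread says "a few KB" per region,
# so 4 KB is a hypothetical stand-in, not a measured value.
ASSUMED_KB_PER_REGION = 4


def region_overhead_mb(region_count, kb_per_region=ASSUMED_KB_PER_REGION):
    """Estimate the memory taken by region objects, in MB."""
    return region_count * kb_per_region / 1024


# 2048 is the RegionCount reported after all items had expired.
print(region_overhead_mb(2048))  # 8.0
```

    Under that assumption the regions would account for single-digit megabytes, far below the 300-400 MB per node that was observed.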

     

  • March 2, 2011 9:38
     
     

    I tried Invoke-CacheGC two days ago. It did not have any effect. I just ran it again: nothing again.

    Today I have stopped everything from using the cache in order to run clean tests, but before that I want to try to clear it.

    The cache size I'm talking about is the one from Get-CacheStatistics.

    Now I've run this script:

    $caches = Get-Cache
    foreach($cache in $caches) { Write-Host $cache.CacheName (Get-CacheStatistics $cache.CacheName).Size }

    Everything is zero!

    Memory usage on each host is about 400 MB. I am talking about the private memory of DistributedCacheService.exe as seen with Process Explorer.

    I also tried clearing the cache with this script:

    http://social.msdn.microsoft.com/Forums/en/velocity/thread/b2635103-b7ff-4bfd-992e-063047026ee4

    Again, nothing happened, and the size on one of the hosts even rose a little.

  • March 2, 2011 12:54
     
     

    Is it easy to try the test on those servers again (confirm that they are using 400 MB still across the cluster and don't restart the machines or services prior to the second run of the test)? Previously you went up to 800MB and then went back down to 400. If you ran the test again and it again returned to 400, then we could conclude that the caching service is maintaining a hold on memory without leaking. But if you go up to 1200 MB across the cluster and back down to 800MB with no activity, then there is definitely a leak.

    Jason Roth
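
    The reasoning in this reply can be sketched as a small check over idle-memory readings taken after each test run. This is only an illustration in Python: the function name and the 50 MB tolerance are hypothetical, and the sample readings are the figures from the reply (400 MB returning to 400 MB, versus 400 MB growing to 800 MB).

```python
def classify_idle_baselines(baselines_mb, tolerance_mb=50):
    """Classify idle memory readings taken after successive test runs.

    Returns "leak" when every reading is higher than the previous one by
    more than tolerance_mb, and "retained" otherwise (the service holds on
    to memory but does not keep growing).
    """
    if len(baselines_mb) < 2:
        raise ValueError("need at least two idle measurements")
    growth = [later - earlier
              for earlier, later in zip(baselines_mb, baselines_mb[1:])]
    if all(step > tolerance_mb for step in growth):
        return "leak"
    return "retained"


# Idle memory across the cluster after the first and second test run.
print(classify_idle_baselines([400, 400]))  # retained
print(classify_idle_baselines([400, 800]))  # leak
```

    Two readings are the bare minimum; more runs make the verdict more trustworthy, since a single jump can also come from a one-time allocation.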

  • March 2, 2011 13:28
     
     

    They are still at 400 MB without any usage.

    Chronologically: while testing on a lightly loaded site, the memory usage was an acceptable 400-500 MB. Still strange, but...

    When I enabled all production sites to use the cache, the first try was with no eviction and no expiration. At about 1 GB per node I gave up and recreated the caches with expiration enabled: "New-Cache -CacheName AProduction -Eviction none -Expirable true -TimeToLive 30 -Secondaries 1"

    Then within one day we hit 1.5 GB of memory per node and got to the point where I received an error saying there is contention on the cluster, or something like that, and one of the nodes was in a throttled state with some values under NoWriteQuorum. The sites are not heavily loaded: about 10-20 requests per second and 200-300 users in total.

    Overnight, without any use, the memory usage stayed the same until the next day. I restarted the nodes one by one; it seems I didn't lose the sessions, and the memory usage dropped to about 50 MB per node.

    I left the lightly loaded site working with it, and within one day we got to 800 MB per node. It dropped to 400, but I can't remember exactly; it could have been because of a restart.

    Now I've created a process dump on one of the nodes and am trying to see what is in memory with WinDbg. Some help with this would be greatly appreciated :)

     

    Regards,

    Bojidar Alexandrov

  • March 2, 2011 15:40
     
     

    Here is some info from the process dump. Its size is 500 MB.

     

    0:000> kb
    RetAddr           : Args to Child                                                           : Call Site
    000007fe`fdca10ac : 00000000`000702d8 00000000`77a2c674 00000000`00000000 00000000`002fead8 : ntdll!ZwWaitForSingleObject+0xa
    000007fe`ff53affb : 00000000`ffffffff 000007fe`ff53344c 00000000`00000000 00000000`000001e4 : KERNELBASE!WaitForSingleObjectEx+0x79
    000007fe`ff539d61 : 00000000`00e75bb0 00000000`000001e4 00000000`00000000 00000000`00000000 : sechost!ScSendResponseReceiveControls+0x13b
    000007fe`ff539c16 : 00000000`002fec78 00000000`000004d3 00000000`00000001 000007fe`00000000 : sechost!ScDispatcherLoop+0x121
    000007fe`f9ec17c7 : 00000000`f89ffc8f 00000000`00e83e40 00000000`778c0000 00000000`00000000 : sechost!StartServiceCtrlDispatcherW+0x14e
    *** WARNING: Unable to verify checksum for System.ServiceProcess.ni.dll
    000007fe`f7e8b15b : 00000000`00e84050 00000000`002fed48 000007fe`f7e65960 0000d7dd`c18484f5 : clr!DoNDirectCall__PatchGetThreadCall+0x7b
    000007fe`f7e8cf1f : 00000000`00e84050 00000000`00000000 00000000`00e84068 00000000`00000000 : System_ServiceProcess_ni+0x2b15b
    000007ff`00170249 : 00000000`bfff7ae8 00000000`bfff7b10 00000000`bfff7b10 00000000`bfff7b10 : System_ServiceProcess_ni+0x2cf1f
    000007fe`f9f010b4 : 00000000`002feef0 000007fe`f9ec4e15 ffffffff`fffffffe 00000000`002ff410 : 0x7ff`00170249
    000007fe`f9f011c9 : 000007ff`000440f0 00000000`00000001 00000000`00000000 00000000`00000000 : clr!CallDescrWorker+0x84
    000007fe`f9f01245 : 00000000`002feff8 00000000`00000000 00000000`002ff000 00000000`002ff1d8 : clr!CallDescrWorkerWithHandler+0xa9
    000007fe`fa001675 : 00000000`00000000 00000000`002ff1e8 00000000`00000000 00000000`00000000 : clr!MethodDesc::CallDescr+0x2a1
    000007fe`fa0017ac : 00000000`000bebd0 00000000`000bebd0 00000000`00000000 00000000`00000000 : clr!ClassLoader::RunMain+0x228
    000007fe`fa001562 : 00000000`002ff9e0 00000000`00000200 00000000`000d4ed0 00000000`00000200 : clr!Assembly::ExecuteMainMethod+0xac
    000007fe`fa003dd6 : 00000000`00000000 00000000`01220000 00000000`00000000 00000000`00000000 : clr!SystemDomain::ExecuteMainMethod+0x452
    000007fe`fa003cf3 : 00000000`01220000 00000000`00000000 00000000`00000000 00000000`00000000 : clr!ExecuteEXE+0x43
    000007fe`fa087365 : 00000000`000bebd0 ffffffff`ffffffff 00000000`00000000 00000000`00000000 : clr!CorExeMainInternal+0xc4
    000007fe`fa8d3309 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`002ffcf8 : clr!CorExeMain+0x15
    000007fe`fa835b21 : 000007fe`fa087350 000007fe`fa8d32c0 00000000`00000000 00000000`00000000 : mscoreei!CorExeMain+0x41
    00000000`778df56d : 000007fe`fa8d0000 00000000`00000000 00000000`00000000 00000000`00000000 : mscoree!CorExeMain_Exported+0x57
    00000000`77a12cc1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
    00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d


    0:000> k
    Child-SP          RetAddr           Call Site
    00000000`002fe978 000007fe`fdca10ac ntdll!ZwWaitForSingleObject+0xa
    00000000`002fe980 000007fe`ff53affb KERNELBASE!WaitForSingleObjectEx+0x79
    00000000`002fea20 000007fe`ff539d61 sechost!ScSendResponseReceiveControls+0x13b
    00000000`002feb10 000007fe`ff539c16 sechost!ScDispatcherLoop+0x121
    00000000`002fec20 000007fe`f9ec17c7 sechost!StartServiceCtrlDispatcherW+0x14e
    00000000`002fec70 000007fe`f7e8b15b clr!DoNDirectCall__PatchGetThreadCall+0x7b
    00000000`002fed10 000007fe`f7e8cf1f System_ServiceProcess_ni+0x2b15b
    00000000`002fedd0 000007ff`00170249 System_ServiceProcess_ni+0x2cf1f
    00000000`002fee60 000007fe`f9f010b4 0x7ff`00170249
    00000000`002feea0 000007fe`f9f011c9 clr!CallDescrWorker+0x84
    00000000`002feee0 000007fe`f9f01245 clr!CallDescrWorkerWithHandler+0xa9
    00000000`002fef60 000007fe`fa001675 clr!MethodDesc::CallDescr+0x2a1
    00000000`002ff190 000007fe`fa0017ac clr!ClassLoader::RunMain+0x228
    00000000`002ff3e0 000007fe`fa001562 clr!Assembly::ExecuteMainMethod+0xac
    00000000`002ff690 000007fe`fa003dd6 clr!SystemDomain::ExecuteMainMethod+0x452
    00000000`002ffc40 000007fe`fa003cf3 clr!ExecuteEXE+0x43
    00000000`002ffca0 000007fe`fa087365 clr!CorExeMainInternal+0xc4
    00000000`002ffd10 000007fe`fa8d3309 clr!CorExeMain+0x15
    00000000`002ffd50 000007fe`fa835b21 mscoreei!CorExeMain+0x41
    00000000`002ffd80 00000000`778df56d mscoree!CorExeMain_Exported+0x57
    00000000`002ffdb0 00000000`77a12cc1 kernel32!BaseThreadInitThunk+0xd
    00000000`002ffde0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


    0:000> ~
    .  0  Id: 524.528 Suspend: 0 Teb: 000007ff`fffdd000 Unfrozen
       1  Id: 524.52c Suspend: 0 Teb: 000007ff`fffdb000 Unfrozen
       2  Id: 524.530 Suspend: 0 Teb: 000007ff`fffd9000 Unfrozen
       3  Id: 524.534 Suspend: 0 Teb: 000007ff`fffd7000 Unfrozen
       4  Id: 524.538 Suspend: 0 Teb: 000007ff`fffd5000 Unfrozen
       5  Id: 524.53c Suspend: 0 Teb: 000007ff`fffd3000 Unfrozen
       6  Id: 524.540 Suspend: 0 Teb: 000007ff`fffae000 Unfrozen
       7  Id: 524.57c Suspend: 0 Teb: 000007ff`fffa6000 Unfrozen
       8  Id: 524.608 Suspend: 0 Teb: 000007ff`fffa2000 Unfrozen
       9  Id: 524.720 Suspend: 0 Teb: 000007ff`fffa0000 Unfrozen
      10  Id: 524.81c Suspend: 0 Teb: 000007ff`fffa8000 Unfrozen
      11  Id: 524.82c Suspend: 0 Teb: 000007ff`ffeec000 Unfrozen
      12  Id: 524.868 Suspend: 0 Teb: 000007ff`ffee4000 Unfrozen
      13  Id: 524.86c Suspend: 0 Teb: 000007ff`ffee2000 Unfrozen
      14  Id: 524.bec Suspend: 0 Teb: 000007ff`fffac000 Unfrozen
      15  Id: 524.9b4 Suspend: 0 Teb: 000007ff`ffeee000 Unfrozen
      16  Id: 524.930 Suspend: 0 Teb: 000007ff`fffaa000 Unfrozen
      17  Id: 524.16c0 Suspend: 0 Teb: 000007ff`fffa4000 Unfrozen
      18  Id: 524.103c Suspend: 0 Teb: 000007ff`ffed8000 Unfrozen
      19  Id: 524.fa0 Suspend: 0 Teb: 000007ff`ffee8000 Unfrozen
      20  Id: 524.d04 Suspend: 0 Teb: 000007ff`ffedc000 Unfrozen
      21  Id: 524.738 Suspend: 0 Teb: 000007ff`ffeda000 Unfrozen
      22  Id: 524.db0 Suspend: 0 Teb: 000007ff`ffee0000 Unfrozen
      23  Id: 524.c88 Suspend: 0 Teb: 000007ff`ffede000 Unfrozen
      24  Id: 524.1a14 Suspend: 0 Teb: 000007ff`ffeea000 Unfrozen


    0:000> !heap -s
    LFH Key                   : 0x0000007e434f22d6
    Termination on corruption : ENABLED
              Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                                (k)     (k)    (k)     (k) length      blocks cont. heap
    -------------------------------------------------------------------------------------
    Virtual block: 0000000001740000 - 0000000001740000 (size 0000000000000000)
    Virtual block: 00000000019a0000 - 00000000019a0000 (size 0000000000000000)
    0000000000070000 00000002    3584   3376   3584    249   114     3    2      0   LFH
    0000000000010000 00008000      64      4     64      1     1     1    0      0     
    00000000004f0000 00001002    1088    208   1088      9     7     2    0      0   LFH
    0000000000600000 00041002     512      8    512      3     1     1    0      0     
    00000000007f0000 00001002    1088    216   1088      4     6     2    0      0   LFH
    00000000007c0000 00001002    1088    216   1088     17     8     2    0      0   LFH
    0000000000940000 00041002     512     16    512      3     1     1    0      0     
    0000000000720000 00001002    1088    264   1088     17     8     2    0      0   LFH
    00000000008a0000 00041002     512    148    512      2     2     1    0      0   LFH
    0000000000db0000 00041002     512    184    512      4     4     1    0      0   LFH
    0000000002b40000 00001002      64     16     64     11     2     1    0      0     
    0000000002d20000 00001002     512    176    512      6    12     1    0      0   LFH
    0000000003510000 00001002     512    176    512      4     4     1    0      0   LFH
    0000000002ca0000 00001002      64      8     64      3     1     1    0      0     
    0000000002c40000 00001002      64      8     64      3     1     1    0      0     
    0000000003770000 00011002     512      8    512      3     2     1    0      0     
    0000000002b90000 00001002     512      8    512      3     2     1    0      0     
    -------------------------------------------------------------------------------------


    0:000> !heap -stat -h 0000000000070000
     heap @ 0000000000070000
    group-by: TOTSIZE max-display: 20
        size     #blocks     total     ( %) (percent of total busy bytes)
        258000 2 - 4b0000  (71.92)
        210 218 - 45180  (4.14)
        cc48 2 - 19890  (1.53)
        a0 1f8 - 13b00  (1.18)
        5f8 31 - 12478  (1.10)
        30 578 - 10680  (0.98)
        20 653 - ca60  (0.76)
        40 320 - c800  (0.75)
        c010 1 - c010  (0.72)
        103c a - a258  (0.61)
        10d1 9 - 9759  (0.57)
        700 13 - 8500  (0.50)
        108 80 - 8400  (0.49)
        8058 1 - 8058  (0.48)
        e0 7b - 6ba0  (0.40)
        100 64 - 6400  (0.37)
        6010 1 - 6010  (0.36)
        5c80 1 - 5c80  (0.35)
        1b00 3 - 5100  (0.30)
        ff8 5 - 4fd8  (0.30)

     

    0:000> !heap -flt s 258000
        _HEAP @ 70000
                  HEAP_ENTRY Size Prev Flags            UserPtr UserSize - state
            0000000001740030 25800 0000  [00]   0000000001740040    258000 - (busy VirtualAlloc)
            00000000019a0030 25800 5800  [00]   00000000019a0040    258000 - (busy VirtualAlloc)
        _HEAP @ 10000
        _HEAP @ 4f0000
        _HEAP @ 600000
        _HEAP @ 7f0000
        _HEAP @ 7c0000
        _HEAP @ 940000
        _HEAP @ 720000
        _HEAP @ 8a0000
        _HEAP @ db0000
        _HEAP @ 2b40000
        _HEAP @ 2d20000
        _HEAP @ 3510000
        _HEAP @ 2ca0000
        _HEAP @ 2c40000
        _HEAP @ 3770000
        _HEAP @ 2b90000

    0:000> !heap -p -a 0000000001740030
        address 0000000001740030 found in
        _HEAP @ 70000
                  HEAP_ENTRY Size Prev Flags            UserPtr UserSize - state
            0000000001740030 25800 0000  [00]   0000000001740040    258000 - (busy VirtualAlloc)

     
    0:000> !heap -p -a 00000000019a0030
        address 00000000019a0030 found in
        _HEAP @ 70000
                  HEAP_ENTRY Size Prev Flags            UserPtr UserSize - state
            00000000019a0030 25800 0000  [00]   00000000019a0040    258000 - (busy VirtualAlloc)

  • March 2, 2011 18:01
     
     

    This may be an issue where you'll have to open a support incident to get it debugged completely. However, here are some ideas:

    1. Since you're familiar with the debugger, try opening your dump file and doing the following:

    .loadby sos mscorwks

    !dumpheap -stat

    2. This loads the .NET commands for the debugger. The second command shows an output of all of the managed objects, both count and size. You can see whether it is .NET objects taking up the majority of the memory. If so, maybe the type of .NET object is a clue. You could post the top culprits here if it shows anything.

    Jason Roth

  • March 3, 2011 5:28
     
     

    My experience with WinDbg is almost one entire half day :)
    I tried this and successfully loaded SOS manually, but on every SOS command I get a message that mscorwks is not loaded, so I wonder whether it's not a .NET app?
    I will open a support case next week (I'm on vacation till then); please tell me where to go and how to start.

    What puzzles me is that I use the cache in the most default way; there is nothing exceptional or special in my case, and these are clean, newly installed virtual machines... Either it was not tested at all (which I don't believe) or there is some misconfiguration on my side causing it to act strangely. I just cannot spot it.

    Are there other reports of excessive memory usage?

  • March 8, 2011 6:37
     
     

    I have just run a clean test.

    Freshly restarted cluster with 3 nodes. Nothing in the cache. Windows 2008 Enterprise, 3 GB memory.

    Memory usage was (nodes 1, 2, 3) 94 MB, 72 MB, 27 MB. Why not equal?

    There are two load-balanced sites on the 1st and 2nd nodes using the cache as a session store. In the session we store the viewstate, passing only a GUID in the page.

    A test app plays back a recorded login to the site, making 1000 consecutive logins. I left this running overnight.

    This morning the cache stats are

    Size         : 0
    ItemCount    : 0
    RegionCount  : 1254
    RequestCount : 11089
    MissCount    : 1008

    Memory usage was (nodes 1, 2, 3) 180 MB, 133 MB, 87 MB.

    Is this normal?

  • March 8, 2011 7:50
     
     

    I am doing another test with more requests today, but while the test is running I have another question.

    I just restarted the 3 hosts and am starting with 119 MB, 118 MB, 119 MB memory usage (Private Bytes of DistributedCacheService.exe as seen with Process Explorer).

    Cache config

    CacheName            : LocalTraining
    TimeToLive           : 30 mins
    CacheType            : Partitioned
    Secondaries          : 1
    IsExpirable          : True
    EvictionType         : None
    NotificationsEnabled : False

    Cache stats

    Size         : 271010296
    ItemCount    : 214
    RegionCount  : 210
    RequestCount : 2316
    MissCount    : 111

    Memory usage 400 MB, 360 MB, 370 MB

    So we have a 270 MB cache (540 MB in total with a secondary copy), and the difference in total memory usage is 770 MB?

    I am going to leave this test running, then wait for everything to expire, and will post the memory usage again.

    ... later

    Size         : 292318846
    ItemCount    : 228
    RegionCount  : 352
    RequestCount : 4215
    MissCount    : 194

    Memory usage is 813 MB, 538 MB, 544 MB, so the total difference is 1500 MB more memory used for a 292 MB cache (580 MB with secondaries)!

    ...final results

    Peak cache size was about 600 MB; peak memory usage was about 1 GB per node (3 nodes).

    Now the cache has expired:

    Size         : 0
    ItemCount    : 0
    RegionCount  : 1626
    RequestCount : 54452
    MissCount    : 1675

    Memory usage 535 MB, 615 MB, 488 MB.
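
    The arithmetic in this post can be reproduced with a short Python sketch. It takes the per-node Private Bytes before and after the load, plus the Size value from Get-CacheStatistics, and reports how much extra process memory was used relative to the reported cache data. Two assumptions are made here: the MB figures are treated as 1024*1024-byte megabytes, and the reported Size is used as-is, without deciding whether it already includes secondary copies. The function name is hypothetical.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: the posted MB values are binary megabytes


def memory_overhead(baseline_mb, loaded_mb, cache_size_bytes):
    """Compare growth in process memory with the reported cache size.

    baseline_mb and loaded_mb are per-node Private Bytes readings in MB.
    Returns (extra process memory in MB, cache size in MB, ratio of the two).
    """
    delta_mb = sum(loaded_mb) - sum(baseline_mb)
    cache_mb = cache_size_bytes / BYTES_PER_MB
    return delta_mb, cache_mb, delta_mb / cache_mb


# First measurement of this post: three nodes after restart and under load.
delta, cache, ratio = memory_overhead(
    baseline_mb=[119, 118, 119],
    loaded_mb=[400, 360, 370],
    cache_size_bytes=271010296,
)
print(f"extra process memory: {delta} MB")
print(f"reported cache size:  {cache:.0f} MB")
print(f"ratio:                {ratio:.1f}x")
```

    Tracking this ratio across runs, rather than the raw process size, makes it easier to see whether the overhead grows with the data or stays roughly proportional.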

     

  • March 8, 2011 11:23
     
     

    I have done more tests, this time without "high availability", i.e. secondaries=0.

    Starting memory usage: 154 MB, 124 MB, 68 MB (again not equal after a restart of the cluster).

    After the test run (1000 sessions, each with a few requests) and a cool-down of 10 min,

    we have these cache stats:

    Size         : 189127101
    ItemCount    : 1000
    RegionCount  : 632
    RequestCount : 36986
    MissCount    : 1000

    And memory usage is

    298 MB, 190 MB, 168 MB. So we have a difference of 310 MB of memory overhead to store a 180 MB cache without secondaries!!!

    After expiration

    Size         : 0
    ItemCount    : 0
    RegionCount  : 632
    RequestCount : 37986
    MissCount    : 1000
    Memory usage 260 MB, 183 MB, 168 MB. About 250 MB of garbage remains... Still far better than the previous test, but...

    Same test with secondaries

    Size         : 378160882
    ItemCount    : 2000
    RegionCount  : 1234
    RequestCount : 36942
    MissCount    : 1000

    Memory usage 552 MB, 473 MB, 351 MB: 1376 MB of memory for a 378 MB cache!!

    ...cache expired

    Size         : 0
    ItemCount    : 0
    RegionCount  : 1234
    RequestCount : 38942
    MissCount    : 1000

    Memory usage 480 MB, 437 MB, 383 MB...

    Another bigger test:
    Expiration was disabled, but the items expired anyway: not right after the iisreset on the web sites that used them, but gradually over time.

    Cache size was about 1 GB (incl. secondaries); memory usage was 1 GB per host (3 in total).

    After some time one of the hosts released the memory and dropped to 230 MB; the other two dropped to 720 MB and stayed there for more than an hour...
    ...14 hours later, without any use:

    Memory usage 452 MB, 718 MB, 181 MB...

    Is this a production-ready product? Does anyone really use it in a production environment? How can it be used as a session store successfully?
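
    One thing the two 1000-session runs in this post allow checking is whether the reported Size already counts secondary copies. A rough Python sketch, using only the statistics posted above (the function name is made up for illustration):

```python
def bytes_per_item(size_bytes, item_count):
    """Average reported size of one cached item, in bytes."""
    return size_bytes / item_count


# Get-CacheStatistics values from the two runs in this post.
without_secondaries = {"size": 189127101, "items": 1000}
with_secondaries = {"size": 378160882, "items": 2000}

per_item_plain = bytes_per_item(without_secondaries["size"],
                                without_secondaries["items"])
per_item_ha = bytes_per_item(with_secondaries["size"],
                             with_secondaries["items"])
size_ratio = with_secondaries["size"] / without_secondaries["size"]

print(f"per item, secondaries=0: {per_item_plain:.0f} bytes")
print(f"per item, secondaries=1: {per_item_ha:.0f} bytes")
print(f"Size ratio between the runs: {size_ratio:.2f}")
```

    If the per-item size is about the same in both runs while ItemCount and Size both double, the reported Size most likely already includes the secondary copies, in which case doubling it again would overstate the data held. This is an inference from the posted numbers, not something confirmed in the thread.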

     

  • March 9, 2011 13:18
     
     

    Thanks for the detailed tests. I do know that the caching service can use large amounts of memory compared to the amount of data being cached. Normally this is due to garbage collection, as we have discussed. But I know that you have tried using Invoke-CacheGC without effect. I would still continue to run this command to rule this out when you are testing.

    People are using this in production for session state. We have had reports of the caching service using a lot of memory due to the garbage collection issue. There is one other possibility that we're investigating about the WCF buffer pool. Is there any chance that you changed any of the default buffer sizes in the cluster configuration or client configuration files?

    I'm going to do some more research and try to get back to you on this.

    Jason Roth

  • March 9, 2011 14:27
     
     

    Today I started by uninstalling AppFabric, then did a new install on only two of the hosts with a new config database. Everything is default except for disabling security, as seen in the posted config file.

    The servers are Windows Enterprise x64, and I use the cache client for Windows 2003 with a web site on .NET 2.0; we are planning a migration to .NET 4.0 in the coming months.

    Today, after a load test and waiting for the cache items to expire, the two hosts used 800 MB of memory. I tried Invoke-CacheGC again and noticed a small drop in memory usage on one of the hosts. I invoked it a few more times and memory usage stabilized at 700 MB. Then I pasted 100 invokes, but nothing changed. It's very strange that the memory is not released more than 30 min after the last use, but that is not a problem if it gets reused in normal usage. I issued Restart-CacheHost on one of the two hosts: it started as usual with 120-130 MB memory usage but rose to 200 MB shortly after. The other host stayed at 700 MB. The reported cache size is still zero; only the default regions are left. So did it synchronize something?

     

  • March 9, 2011 14:50
     
     

    The caching service assumes that it is running in a dedicated environment, so there may be times when it doesn't release memory as an optimization. Also, .NET has something called the Large Object Heap that does not get compacted when you invoke the garbage collector. What are the typical sizes of objects that you're storing in the cache? If you're looking at just the size of the data, the serialized object size would be much greater.

    The real question is what happens on multiple runs. If you run your test once, memory goes up and does not come back down. But if you wait until the cache is at 0 items and run the test again, does the memory go up by an equal amount? And what about a third test? It would be interesting to see whether you have a bona fide leak (which would be shown by the memory increasing until physical memory is exhausted), or whether the service is just reserving memory for its own use and will eventually reuse or free it depending on the memory of the box.

    Also, I'm trying to investigate the WCF buffer pool question. I'll have more information for you soon.

    Thanks.

    Jason

  • March 9, 2011 15:10
     
     

    That is exactly what I am doing now. At the start of the second test memory usage actually drops by about 100 MB and stays there for some time, but then goes up again. I will run more tests after these items expire until I eventually hit the same errors I got in production.

    Here is the memory usage at the start of the test:

    http://img577.imageshack.us/f/memoryusage.png/

    With this artificial test, sessions are 200 KB each as seen from the cache stats; I didn't measure them manually. The last 10 copies of the viewstate will be kept, so I expect sessions to average 1 MB each. I am trying to reduce page size by keeping viewstate in the session and the session in the cache store.

  • March 10, 2011 15:27
     
     

    I have tried 3-4 test runs, each with about 1 GB of data (including secondaries), with a delay between them so that items expire. Peak memory usage is about 1.3-1.5 GB for each of the two hosts, so I did not succeed in reproducing the problem where memory is exhausted (it happens at about the 1.8 GB mark). The next test was the same one but at low speed, so that it completes in 4-5 hours instead of 20-30 minutes and resembles the production load: not a big amount of data, but continuous. Cache size did not exceed 100 MB, and memory usage was about 1 GB per host. I have not restarted the hosts. Once, one of the hosts temporarily dropped to 500 MB usage.

    It seems that memory is not lost... at least not all of it. The theory about fragmentation sounds likely, but I would still call it a problem because it's just too much. I tried Velocity before (a beta version, before it was merged into AppFabric) and it used no more than 200-300 MB in our environment, but we stopped using it because of random errors.

    I am not sure if this fragmentation/memory hunger depends on the number of caches, because in production I've set up 4-5 different caches for the different sites. Would it be better for them to use one cache and differ by SharedID only?

    Tomorrow morning I am going to enable the production sites to use the cache again and will add 2 more hosts, so the cluster will consist of 4 hosts. I hope that will help and that I won't have to touch it during the weekend :)

     

  • March 14, 2011 7:04
     
     

    For 3 days now almost all of the production sites have been using it, and memory is stable at 1-1.1 GB per host (4 nodes). I've reduced the number of named caches to 2 and used SharedIDs instead.

    Actual cache usage is not more than 150-200 MB including secondaries!!

  • March 14, 2011 16:21
     
     

    On 2 of the nodes there is also a web server that uses about 800 MB, so available memory is lower (3 GB total). At 75-80% memory usage, with the cache service at 800-1100 MB, there are errors in the application because of sessions missing from the cache.

    Also, in the "Application Server-System Services" Operational log there are constant errors about low memory.

    Service available memory low - Cache private bytes percent {26} Cache working set percent {25} Cache data size percent {0} Available memory percent {21} CLR Generation2 count {165} Released memory percent {0}.
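To compare these events across hosts it can help to pull the counters out of the message text. The sketch below assumes every counter appears in the form `name {value}`, as in the line quoted above; the function name is made up for illustration and is not part of any AppFabric tooling.

```python
import re

def parse_low_memory_event(message):
    """Return {counter name: integer value} for each 'name {value}' pair."""
    # Lazily match a run of letters, digits and spaces followed by {digits}.
    pairs = re.findall(r"([A-Za-z0-9 ]+?)\s*\{(\d+)\}", message)
    return {name.strip(): int(value) for name, value in pairs}

event = (
    "Service available memory low - Cache private bytes percent {26} "
    "Cache working set percent {25} Cache data size percent {0} "
    "Available memory percent {21} CLR Generation2 count {165} "
    "Released memory percent {0}."
)
counters = parse_low_memory_event(event)
# Private bytes are at 26% while the cache data itself rounds to 0%.
print(counters["Cache private bytes percent"], counters["Cache data size percent"])
```

Parsed this way, the event makes the mismatch easy to see: the service reports its own private bytes at 26% of memory while the cached data is reported as 0%.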

     

    There are just too many problems to use this in production, so I've given up on it.

  • March 14, 2011 17:52
     
     

    I'm really sorry for your frustration. For the size of your cache, the memory does seem extremely high. If you want to send me any additional information offline, my Microsoft email is jroth at microsoft dotcom. In particular, I'd be interested in the code that you use to add information to the cache (unless it is only through session state). Or if you had a place where I could download the dump file you collected, maybe we could get a better understanding of the memory in that file. I was looking through your old posts, and I see that in one test your average session state size was approximately 180 KB.

    With this said, I know you are planning to drop AppFabric and use another solution. So if you don't want to provide further information, which would take more of your time, I understand.

    If I find new information, I'll post it to this thread for you or others.

    Jason Roth

  • March 14, 2011 18:32
     
     

    We use the cache only as a session store with its session provider, so reproducing the problem should not be hard. Sessions have an average size of 200 KB; we store the last few viewstates.

    At first I planned to use 2 virtual servers for high availability with 1.5 GB each. First problem: it wants Windows Enterprise! Well, this is a marketing thing; we have the right to use whatever version we need on our virtualized servers, so... OK.

    Next problem: it needs to be in the domain to use the SQL provider!? It's in the DMZ and we needed to open some holes.

    Next: it turned out that when one of the nodes is stopped, the other one throws errors on put saying there is no write quorum, which is ridiculous. I know that it is then not highly available, but let it be somewhat available and just hide this error.

    Then I added a third (and later a fourth) node. I planned to use 1.5 GB memory per node, but because of the high memory usage I increased it to 3 GB per host. So from the originally planned 2 hosts with 3 GB total I've ended up with 4 hosts and 12 GB memory, and it's still not enough!

    I've found our SA ID, but I can't see why I should open a case...

    I will see tomorrow what more info I can provide you. Next week I'm going to evaluate ScaleOut's server, and I hope to resolve our problems without more surprises.

  • March 15, 2011 12:09
     
     

    I know you are done testing, but here is a hotfix that we're testing out with some customers that report high memory usage: http://support.microsoft.com/kb/983182. This is a WCF hotfix, but it could be related to the memory issues because AppFabric makes use of WCF. Even if you decide not to test, I wanted to put this in the thread for others to try as a test. Thanks.

    Jason Roth

  • March 15, 2011 13:16
     
     

    Thanks, I've got it and can try to test it on the test environment in the following days.

    Today I found that one of the sites still uses it and there was 8 MB in the cache, so I decided to make and examine a new process dump, 1.2 GB in size, from one of the hosts. This time I succeeded in loading SOS and running some more commands. It seems .NET memory is 200 MB, but I cannot see where the rest of it is. There was an error on !address -summary with WinDbg 6.12, so I used 6.11, which completed it without error, but the numbers seem strange to me.

    If you want, I can send you a link to the dump.


    Here are some results:

    0:000> !eeheap -gc
    Number of GC Heaps: 2
    ------------------------------
    Heap 0 (00000000000c3460)
    generation 0 starts at 0x00000000806a8880
    generation 1 starts at 0x0000000080687608
    generation 2 starts at 0x000000007fff0068
    ephemeral segment allocation context: none
             segment             begin         allocated  size
    000000007fff0000  000000007fff0068  0000000084b47bf8  0x4b57b90(79002512)
    Large object heap starts at 0x00000000ffff0068
             segment             begin         allocated  size
    00000000ffff0000  00000000ffff0068  000000010229d4f8  0x22ad490(36361360)
    Heap Size:       Size: 0x6e05020 (115363872) bytes.
    ------------------------------
    Heap 1 (00000000000c60d0)
    generation 0 starts at 0x00000000c0b056e0
    generation 1 starts at 0x00000000c053d3b0
    generation 2 starts at 0x00000000bfff0068
    ephemeral segment allocation context: none
             segment             begin         allocated  size
    00000000bfff0000  00000000bfff0068  00000000c46f3020  0x4702fb8(74461112)
    Large object heap starts at 0x000000010fff0068
             segment             begin         allocated  size
    000000010fff0000  000000010fff0068  0000000113fccd30  0x3fdccc8(66964680)
    Heap Size:       Size: 0x86dfc80 (141425792) bytes.
    ------------------------------
    GC Heap Size:    Size: 0xf4e4ca0 (256789664) bytes.


    0:000> !HeapStat
    Heap             Gen0         Gen1         Gen2          LOH
    Heap0        71955320       135800      6911392     36361360
    Heap1        62839104      6062896      5559112     66964680
    Total       134794424      6198696     12470504    103326040

    Free space:                                                 Percentage
    Heap0        65713192          104          592     16258936  SOH: 83% LOH: 44%
    Heap1        55104752          504          104     42714536  SOH: 74% LOH: 63%
    Total       120817944          608          696     58973472
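The percentages in the !HeapStat output can be reproduced from the two tables. The sketch below assumes that SOH means Gen0 + Gen1 + Gen2 and that the tool truncates rather than rounds; both are inferences from the numbers, not something stated in the thread.

```python
# (Gen0, Gen1, Gen2, LOH) in bytes, copied from the !HeapStat output.
size = {
    "Heap0": (71955320, 135800, 6911392, 36361360),
    "Heap1": (62839104, 6062896, 5559112, 66964680),
}
free = {
    "Heap0": (65713192, 104, 592, 16258936),
    "Heap1": (55104752, 504, 104, 42714536),
}

def free_percent(size_row, free_row):
    """Return (SOH free %, LOH free %) using integer truncation."""
    soh = sum(free_row[:3]) * 100 // sum(size_row[:3])
    loh = free_row[3] * 100 // size_row[3]
    return soh, loh

for heap in size:
    print(heap, free_percent(size[heap], free[heap]))

# Share of the whole GC heap that is free space.
total_size = sum(sum(row) for row in size.values())
total_free = sum(sum(row) for row in free.values())
print("total free %:", total_free * 100 // total_size)
```

Under those assumptions the per-heap results match the SOH/LOH figures printed by the debugger, and the total comes out at roughly 70% of the managed heap being free blocks rather than live objects.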


    0:000> !address -summary
                                      
    Failed to map Heaps (error 80004005)

    --- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    Free                                    162      7ff`5194f000 (   7.997 Tb)           99.97%
    <unclassified>                          288        0`a4c9d000 (   2.575 Gb)  94.48%    0.03%
    Image                                  1012        0`092da000 ( 146.852 Mb)   5.26%    0.00%
    Stack                                    75        0`006cb000 (   6.793 Mb)   0.24%    0.00%
    TEB                                      25        0`00032000 ( 200.000 kb)   0.01%    0.00%
    NlsTables                                 1        0`00023000 ( 140.000 kb)   0.00%    0.00%
    CsrSharedMemory                           1        0`00005000 (  20.000 kb)   0.00%    0.00%
    ActivationContextData                     1        0`00004000 (  16.000 kb)   0.00%    0.00%
    PEB                                       1        0`00001000 (   4.000 kb)   0.00%    0.00%

    --- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    MEM_PRIVATE                             295        0`a44ea000 (   2.567 Gb)  94.21%    0.03%
    MEM_IMAGE                              1075        0`094b2000 ( 148.695 Mb)   5.33%    0.00%
    MEM_MAPPED                               34        0`00d05000 (  13.020 Mb)   0.47%    0.00%

    --- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    MEM_FREE                                162      7ff`5194f000 (   7.997 Tb)           99.97%
    MEM_RESERVE                             140        0`8eda9000 (   2.232 Gb)  81.90%    0.03%
    MEM_COMMIT                             1264        0`1f8f8000 ( 504.969 Mb)  18.10%    0.01%

    --- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    PAGE_READWRITE                          431        0`15a7e000 ( 346.492 Mb)  12.42%    0.00%
    PAGE_EXECUTE_READ                       106        0`067fe000 ( 103.992 Mb)   3.73%    0.00%
    PAGE_READONLY                           314        0`02348000 (  35.281 Mb)   1.26%    0.00%
    PAGE_WRITECOPY                          248        0`00e6c000 (  14.422 Mb)   0.52%    0.00%
    PAGE_EXECUTE_READWRITE                   87        0`00337000 (   3.215 Mb)   0.12%    0.00%
    PAGE_EXECUTE_WRITECOPY                   38        0`000f7000 ( 988.000 kb)   0.03%    0.00%
    PAGE_READWRITE|PAGE_GUARD                39        0`00096000 ( 600.000 kb)   0.02%    0.00%
    PAGE_EXECUTE                              1        0`00004000 (  16.000 kb)   0.00%    0.00%

    --- Largest Region by Usage ----------- Base Address -------- Region Size ----------
    Free                                      1`1fff0000      7fd`d2a00000 (   7.991 Tb)
    <unclassified>                            0`c66f4000        0`398fc000 ( 920.984 Mb)
    Image                                   7fe`f30ac000        0`01211000 (  18.066 Mb)
    Stack                                     0`02460000        0`000fc000 (1008.000 kb)
    TEB                                     7ff`ffed8000        0`00002000 (   8.000 kb)
    NlsTables                               7ff`fffb0000        0`00023000 ( 140.000 kb)
    CsrSharedMemory                           0`7efe0000        0`00005000 (  20.000 kb)
    ActivationContextData                     0`00030000        0`00004000 (  16.000 kb)
    PEB                                     7ff`fffdf000        0`00001000 (   4.000 kb)


    0:000> !address -summary
     ProcessParametrs 0000000000071cc0 in range 0000000000070000 00000000000f0000
     Environment 0000000000071320 in range 0000000000070000 00000000000f0000

    -------------------- Usage SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots) Pct(Busy)   Usage
       a29e2000 ( 2664328) : 00.03%    93.24%    : RegionUsageIsVAD
       7ff5194f000 (8587076924) : 99.97%    00.00%    : RegionUsageFree
        92da000 (  150376) : 00.00%    05.26%    : RegionUsageImage
        1900000 (   25600) : 00.00%    00.90%    : RegionUsageStack
          32000 (     200) : 00.00%    00.01%    : RegionUsageTeb
        10b2000 (   17096) : 00.00%    00.60%    : RegionUsageHeap
              0 (       0) : 00.00%    00.00%    : RegionUsagePageHeap
           1000 (       4) : 00.00%    00.00%    : RegionUsagePeb
              0 (       0) : 00.00%    00.00%    : RegionUsageProcessParametrs
              0 (       0) : 00.00%    00.00%    : RegionUsageEnvironmentBlock
           Tot: 7ffffff0000 (8589934528 KB) Busy: 00000000ae6a1000 (2857604 KB)

    -------------------- Type SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots)  Usage
       7ff5194f000 (8587076924) : 99.97%   : <free>
        94b2000 (  152264) : 00.00%   : MEM_IMAGE
         d05000 (   13332) : 00.00%   : MEM_MAPPED
       a44ea000 ( 2692008) : 00.03%   : MEM_PRIVATE

    -------------------- State SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots)  Usage
       1f8f8000 (  517088) : 00.01%   : MEM_COMMIT
       7ff5194f000 (8587076924) : 99.97%   : MEM_FREE
       8eda9000 ( 2340516) : 00.03%   : MEM_RESERVE

    Largest free region: Base 000000011fff0000 - Size 000007fdd2a00000 (8580802560 KB)
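Both versions of !address -summary print sizes in hex, which makes the "strange" numbers easier to read after conversion. This is a minimal sketch using only values that appear in the output above; the helper name is illustrative.

```python
MB = 1024 * 1024

def hex_to_mb(hex_size):
    """Convert a hex byte count as printed by !address into megabytes."""
    return int(hex_size, 16) / MB

commit_mb = hex_to_mb("1f8f8000")    # MEM_COMMIT
reserve_mb = hex_to_mb("8eda9000")   # MEM_RESERVE
gc_heap_mb = 256789664 / MB          # "GC Heap Size" from !eeheap -gc

print(f"committed: {commit_mb:.1f} MB")
print(f"reserved:  {reserve_mb / 1024:.3f} GB")
print(f"GC heap:   {gc_heap_mb:.1f} MB")
print(f"committed outside the GC heap: {commit_mb - gc_heap_mb:.1f} MB")
```

Read this way, most of the busy address space in this dump is reserved rather than committed, and the managed heap accounts for a bit under half of the committed part. What occupies the remainder cannot be determined from these summaries alone.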


    0:000> !dumpheap -stat
    ------------------------------
    Heap 0
    total 0 objects
    ------------------------------
    Heap 1
    total 0 objects
    ------------------------------
    total 0 objects
    Statistics:
                  MT    Count    TotalSize Class Name
    **************** last few
    000007fef90301c8     3387       135480 System.Collections.ArrayList
    000007ff003d2d28     1412       135552 Microsoft.ApplicationServer.Caching.OMRegionStats
    000007ff003d2548     2824       135552 Microsoft.ApplicationServer.Caching.OMRegionProperties
    000007ff00290b88     1419       136224 Microsoft.ApplicationServer.Caching.DMHashContainer
    000007ff0028b6e0     2456       137536 Microsoft.ApplicationServer.Caching.BaseDirectoryNode
    000007fef9036c88     1966       173008 System.Signature
    000007fef904cce0     1697       203640 System.Reflection.RuntimeParameterInfo
    000007ff003895b0     4432       212736 Microsoft.ApplicationServer.Caching.VelocityReplicationOperation
    000007ff0038c628     1412       259808 Microsoft.ApplicationServer.Caching.OMRegion
    000007fef9028268    12630       303120 System.RuntimeTypeHandle
    000007fef902c7d8     5322       313968 System.Int32[]
    000007ff003afab8    10752       344064 Microsoft.Fabric.Data.Replication.SecondaryReplica+ReplicationOperationContainer
    000007fef9031ea8      243       410640 System.Collections.Hashtable+bucket[]
    000007fef8216250     3218       411904 System.Diagnostics.ProcessInfo
    000007fef9027618     3795       425040 System.Reflection.RuntimeMethodInfo
    000007ff0028f9c0     9988       639232 Microsoft.ApplicationServer.Caching.DMOperationCallBack
    000007ff003dc028    12672       709632 Microsoft.Fabric.Data.Replication.ReplicaReplicationContainer
    000007ff003d4538    13952       781312 Microsoft.Fabric.Data.Replication.StoreOperationContainer
    000007fef821c190     2999       815728 System.Diagnostics.NtProcessInfoHelper+SystemProcessInformation
    000007ff0038a1f8     4479      1074960 Microsoft.ApplicationServer.Caching.RequestBody
    000007fef905cf80     2880      1528848 System.Int64[]
    000007fef821f438    38413      1843824 System.Diagnostics.ThreadInfo
    000007fef9026960    36039      2631856 System.String
    000007fef8209b98    32809      3149664 System.Diagnostics.NtProcessInfoHelper+SystemThreadInformation
    000007fef902ae68    25317      3669296 System.Object[]
    000007fef9030bb0     1980     52123528 System.Byte[]
    000000000009a3a0      262    179792720      Free
    Total 320016 objects
    Fragmented blocks larger than 0.5 MB:
                Addr     Size      Followed by
    0000000080d04268    1.9MB 0000000080ef4218 System.Byte[]
    0000000080f16260    1.7MB 00000000810bfcc0 System.Byte[]
    00000000810bfdc0    4.1MB 00000000814daeb8 System.ServiceModel.Channels.OverlappedIOCompleteCallback
    00000000814e50b8    0.7MB 000000008159c488 System.Runtime.SynchronizedPool`1+GlobalPool[[System.Byte[], mscorlib]]
    000000008159c6e8   39.0MB 0000000083c971c8 System.Byte[]
    0000000083c979e0   14.7MB 0000000084b44fe8 System.Byte[]
    00000000c1260e90    8.4MB 00000000c1abf070 System.Byte[]
    00000000c1ac1088   44.2MB 00000000c46ed018 System.ServiceModel.Channels.OverlappedIOCompleteCallback


    0:000> !dumpheap -mt 000000000009a3a0
    ------------------------------
    Heap 0
             Address               MT     Size
    ****
    0000000080c91748 000000000009a3a0   240072 Free
    0000000080cce238 000000000009a3a0   212744 Free
    0000000080d04268 000000000009a3a0  2031536 Free
    0000000080ef6230 000000000009a3a0    43264 Free
    0000000080f02c58 000000000009a3a0    71152 Free
    0000000080f16260 000000000009a3a0  1743456 Free
    00000000810bfdc0 000000000009a3a0  4305144 Free
    00000000814dcfe0 000000000009a3a0    24496 Free
    00000000814e50b8 000000000009a3a0   750544 Free
    000000008159c6e8 000000000009a3a0 40872672 Free
    0000000083c979e0 000000000009a3a0 15390216 Free
    0000000084b47000 000000000009a3a0     2504 Free
    00000000ffff0068 000000000009a3a0       24 Free
    00000000ffff2080 000000000009a3a0       24 Free
    00000000ffff24b8 000000000009a3a0     7136 Free
    00000000ffff6098 000000000009a3a0       24 Free
    00000000ffffa090 000000000009a3a0       24 Free
    0000000100002048 000000000009a3a0       24 Free
    0000000100002880 000000000009a3a0     6112 Free
    0000000100005080 000000000009a3a0     4064 Free
    0000000100015f80 000000000009a3a0       24 Free
    0000000100017fb8 000000000009a3a0       24 Free
    000000010003bfc8 000000000009a3a0       24 Free
    0000000100080030 000000000009a3a0       24 Free
    0000000100080048 000000000009a3a0  1032352 Free
    000000010047c118 000000000009a3a0  1048624 Free
    000000010067c160 000000000009a3a0  2229040 Free
    00000001008bc4a8 000000000009a3a0       48 Free
    00000001009bc4f0 000000000009a3a0  1442320 Free
    0000000100c1c718 000000000009a3a0  1705592 Free
    0000000100ddcda8 000000000009a3a0  1705352 Free
    000000010117d348 000000000009a3a0  1573440 Free
    000000010137d5a0 000000000009a3a0   262240 Free
    00000001017dd648 000000000009a3a0   394752 Free
    0000000101a3dc60 000000000009a3a0   783744 Free
    0000000101b1d1f8 000000000009a3a0  1966800 Free
    0000000101d7d4e0 000000000009a3a0  1966824 Free
    0000000101f7d7e0 000000000009a3a0   130280 Free
    total 0 objects
    ------------------------------
    Heap 1
             Address               MT     Size
    ***
    00000000c123ee60 000000000009a3a0       48 Free
    00000000c1260e90 000000000009a3a0  8774112 Free
    00000000c1ac1088 000000000009a3a0 46317456 Free
    00000000c46ef140 000000000009a3a0     1408 Free
    00000000c46f17e8 000000000009a3a0     2472 Free
    00000000c46f23c0 000000000009a3a0      904 Free
    000000010fff0068 000000000009a3a0       24 Free
    0000000110010098 000000000009a3a0       24 Free
    00000001100700e0 000000000009a3a0       24 Free
    0000000110090110 000000000009a3a0       24 Free
    00000001100b0140 000000000009a3a0       24 Free
    00000001100d0170 000000000009a3a0       24 Free
    00000001100f01a0 000000000009a3a0       24 Free
    00000001101101d0 000000000009a3a0       24 Free
    0000000110130200 000000000009a3a0       24 Free
    0000000110150230 000000000009a3a0       24 Free
    0000000110170260 000000000009a3a0       24 Free
    0000000110a69eb0 000000000009a3a0       24 Free
    0000000110a89ee0 000000000009a3a0       24 Free
    0000000110aa9f10 000000000009a3a0       24 Free
    0000000110aa9f28 000000000009a3a0   130760 Free
    0000000110ce9e20 000000000009a3a0       24 Free
    0000000110d09e50 000000000009a3a0       24 Free
    0000000110d29e80 000000000009a3a0       24 Free
    0000000110d29e98 000000000009a3a0   130952 Free
    0000000110e69e68 000000000009a3a0       24 Free
    0000000110e89e98 000000000009a3a0       24 Free
    0000000110ea9ec8 000000000009a3a0       24 Free
    0000000110ea9ee0 000000000009a3a0   130976 Free
    0000000110f09eb0 000000000009a3a0       24 Free
    0000000110f29ee0 000000000009a3a0       24 Free
    0000000110f29ef8 000000000009a3a0  6160888 Free
    000000011154a108 000000000009a3a0  1048912 Free
    000000011168a270 000000000009a3a0  6948616 Free
    0000000111daa990 000000000009a3a0       24 Free
    0000000111e2a9c0 000000000009a3a0   262552 Free
    0000000111f2ab88 000000000009a3a0  2619808 Free
    00000001123aa558 000000000009a3a0 15734424 Free
    00000001134abc08 000000000009a3a0   131120 Free
    00000001134ebc50 000000000009a3a0  9310408 Free
    total 0 objects
    ------------------------------
    total 0 objects
    Statistics:
                  MT    Count    TotalSize Class Name
    000000000009a3a0      262    179792720      Free
    Total 262 object

    0:000> !dumpobj 0xc1ac1088
    Free Object
    Size:        46317456(0x2c2bf90) bytes

    0:000> !dumpobj 0xc1ac1088+2c2bf90
    Name:        System.ServiceModel.Channels.OverlappedIOCompleteCallback
    MethodTable: 000007fef31f4198
    EEClass:     000007fef2cbd510
    Size:        64(0x40) bytes
    File:        C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.ServiceModel\v4.0_4.0.0.0__b77a5c561934e089\System.ServiceModel.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    000007fef9025ab8  4000074        8        System.Object  0 instance 00000000803ab088 _target
    000007fef9025ab8  4000075       10        System.Object  0 instance 0000000000000000 _methodBase
    000007fef9033540  4000076       18        System.IntPtr  1 instance      7fef3037ed0 _methodPtr
    000007fef9033540  4000077       20        System.IntPtr  1 instance                0 _methodPtrAux
    000007fef9025ab8  4000078       28        System.Object  0 instance 0000000000000000 _invocationList
    000007fef9033540  4000079       30        System.IntPtr  1 instance                0 _invocationCount
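The `!dumpobj <address>+<size>` trick used here works because the object that follows a free block starts exactly at the block's address plus its size. A short sketch of that arithmetic, using the two free blocks inspected in this post (the function name is illustrative):

```python
def next_object_address(free_block_addr, free_block_size):
    """Address of the object that immediately follows a free block."""
    return free_block_addr + free_block_size

# Free block at 0xc1ac1088, size 46317456 (0x2c2bf90) bytes.
print(hex(next_object_address(0xC1AC1088, 46317456)))

# Free block at 0x8159c6e8, size 40872672 (0x26faae0) bytes.
print(hex(next_object_address(0x8159C6E8, 40872672)))
```

The two results, 0xc46ed018 and 0x83c971c8, are the same addresses listed in the "Followed by" column of the fragmented blocks table, so the objects pinning those large free regions can be looked up either way.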

     

    0:000> !gcroot 0xc1ac1088+2c2bf90
    Note: Roots found on stacks may be false positives. Run "!help gcroot" for
    more info.
    Scan Thread 0 OSTHread 528
    RSP:2fed78:Root:  00000000bfff5928(System.Object[])->
      00000000bfff5950(Microsoft.ApplicationServer.Caching.VelocityWindowsService)->
      000000007fff80b8(Microsoft.ApplicationServer.Caching.ServiceLayer)->
      00000000c0045c00(Microsoft.ApplicationServer.Caching.DistributedObjectManager)->
      00000000c00e8378(Microsoft.ApplicationServer.Caching.CASPerfCounter)->
      00000000c0056b38(Microsoft.Fabric.Data.ReliableServiceManager)->
      00000000c005d0b8(Microsoft.Fabric.Data.PARA.PartitionAgent)->
      00000000c005d238(Microsoft.Fabric.Data.Replication.Replicator)->
      0000000080402730(System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]])->
      0000000080402908(System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]][])->
      00000000804016a0(Microsoft.Fabric.Data.Replication.ReplicationSession)->
      00000000804031b8(Microsoft.Fabric.Common.Timer)->
      00000000c005a838(Microsoft.Fabric.Common.NormalPriorityTimerQueue)->
      00000000802641a0(Microsoft.Fabric.Common.Timer)->
      0000000080263d38(Microsoft.Fabric.Common.TcpOutputSession)->
      00000000c005eb38(Microsoft.Fabric.Common.TcpTransportFactory)->
      00000000c0061390(System.ServiceModel.Channels.TcpChannelFactory`1[[System.ServiceModel.Channels.IDuplexSessionChannel, System.ServiceModel]])->
      00000000c00614b0(System.ServiceModel.Channels.CommunicationObjectManager`1[[System.ServiceModel.Channels.IChannel, System.ServiceModel]])->
      00000000c00614e8(System.Collections.Hashtable)->
      000000008021f818(System.Collections.Hashtable+bucket[])->
      00000000803ac158(System.ServiceModel.Channels.ClientFramingDuplexSessionChannel)->
      00000000803ab1e0(System.ServiceModel.Channels.BufferedConnection)->
      00000000803ab088(System.ServiceModel.Channels.SocketConnection)->
      00000000c46ed018(System.ServiceModel.Channels.OverlappedIOCompleteCallback)
    Scan Thread 4 OSTHread 538
    Scan Thread 7 OSTHread 57c
    Scan Thread 9 OSTHread 720
    RSP:31aef38:Root:  00000000c00e8778(Microsoft.ApplicationServer.Caching.MemoryPressureMonitor)->
      00000000c00e8738(Microsoft.ApplicationServer.Caching.ThrottleDelegate)->
      00000000c0045c00(Microsoft.ApplicationServer.Caching.DistributedObjectManager)->
      00000000c00e8378(Microsoft.ApplicationServer.Caching.CASPerfCounter)->
      00000000c0056b38(Microsoft.Fabric.Data.ReliableServiceManager)->
      00000000c005d0b8(Microsoft.Fabric.Data.PARA.PartitionAgent)->
      00000000c005d238(Microsoft.Fabric.Data.Replication.Replicator)->
      0000000080402730(System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]])->
      0000000080402908(System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]][])->
      00000000804016a0(Microsoft.Fabric.Data.Replication.ReplicationSession)->
      00000000804031b8(Microsoft.Fabric.Common.Timer)->
      00000000c005a838(Microsoft.Fabric.Common.NormalPriorityTimerQueue)->
      00000000802641a0(Microsoft.Fabric.Common.Timer)->
      0000000080263d38(Microsoft.Fabric.Common.TcpOutputSession)->
      00000000c005eb38(Microsoft.Fabric.Common.TcpTransportFactory)->
      00000000c0061390(System.ServiceModel.Channels.TcpChannelFactory`1[[System.ServiceModel.Channels.IDuplexSessionChannel, System.ServiceModel]])->
      00000000c00614b0(System.ServiceModel.Channels.CommunicationObjectManager`1[[System.ServiceModel.Channels.IChannel, System.ServiceModel]])->
      00000000c00614e8(System.Collections.Hashtable)->
      000000008021f818(System.Collections.Hashtable+bucket[])->
      00000000803ac158(System.ServiceModel.Channels.ClientFramingDuplexSessionChannel)->
      00000000803ab1e0(System.ServiceModel.Channels.BufferedConnection)->
      00000000803ab088(System.ServiceModel.Channels.SocketConnection)->
      00000000c46ed018(System.ServiceModel.Channels.OverlappedIOCompleteCallback)
    RSP:31aef50:Root:  00000000c00e8778(Microsoft.ApplicationServer.Caching.MemoryPressureMonitor)->
      00000000c00e8738(Microsoft.ApplicationServer.Caching.ThrottleDelegate)->
      00000000c0045c00(Microsoft.ApplicationServer.Caching.DistributedObjectManager)->
      00000000c00e8378(Microsoft.ApplicationServer.Caching.CASPerfCounter)->
      00000000c0056b38(Microsoft.Fabric.Data.ReliableServiceManager)->
      00000000c005d0b8(Microsoft.Fabric.Data.PARA.PartitionAgent)->
      00000000c005d238(Microsoft.Fabric.Data.Replication.Replicator)->
      0000000080402730(System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]])->
      0000000080402908(System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]][])->
      00000000804016a0(Microsoft.Fabric.Data.Replication.ReplicationSession)->
      00000000804031b8(Microsoft.Fabric.Common.Timer)->
      00000000c005a838(Microsoft.Fabric.Common.NormalPriorityTimerQueue)->
      00000000802641a0(Microsoft.Fabric.Common.Timer)->
      0000000080263d38(Microsoft.Fabric.Common.TcpOutputSession)->
      00000000c005eb38(Microsoft.Fabric.Common.TcpTransportFactory)->
      00000000c0061390(System.ServiceModel.Channels.TcpChannelFactory`1[[System.ServiceModel.Channels.IDuplexSessionChannel, System.ServiceModel]])->
      00000000c00614b0(System.ServiceModel.Channels.CommunicationObjectManager`1[[System.ServiceModel.Channels.IChannel, System.ServiceModel]])->
      00000000c00614e8(System.Collections.Hashtable)->
      000000008021f818(System.Collections.Hashtable+bucket[])->
      00000000803ac158(System.ServiceModel.Channels.ClientFramingDuplexSessionChannel)->
      00000000803ab1e0(System.ServiceModel.Channels.BufferedConnection)->
      00000000803ab088(System.ServiceModel.Channels.SocketConnection)->
      00000000c46ed018(System.ServiceModel.Channels.OverlappedIOCompleteCallback)
    Scan Thread 10 OSTHread 81c
    Scan Thread 12 OSTHread 868
    RSP:412e598:Root:  00000000c005ea28(Microsoft.Fabric.Common.HighPriorityTimer)->
      00000000c00573b0(Microsoft.Fabric.Common.HighPriorityTimerQueue)->
      00000000c03fac08(Microsoft.Fabric.Common.HighPriorityTimer)->

    ....

    0:000> !dumpobj 0x8159c6e8
    Free Object
    Size:        40872672(0x26faae0) bytes
    0:000> !dumpobj 0x8159c6e8+26faae0
    Name:        System.Byte[]
    MethodTable: 000007fef9030bb0
    EEClass:     000007fef8bb2480
    Size:        2072(0x818) bytes
    Array:       Rank 1, Number of elements 2048, Type Byte
    Element Type:System.Byte
    Content:     ...........V...s...a.V.D.....:http://schemas.microsoft.com/mb/2004/09/rp/message/genericB).....B...129434366544687500B...1294354
    Fields:
    None

    0:000> !gcroot 0xc1ac1088+2c2bf90
    Note: Roots found on stacks may be false positives. Run "!help gcroot" for
    more info.
    Scan Thread 0 OSTHread 528
    RSP:2fed78:Root:  00000000bfff5928(System.Object[])->
      00000000bfff5950(Microsoft.ApplicationServer.Caching.VelocityWindowsService)->
      000000007fff80b8(Microsoft.ApplicationServer.Caching.ServiceLayer)->
      00000000c0045c00(Microsoft.ApplicationServer.Caching.DistributedObjectManager)->
      00000000c00e8378(Microsoft.ApplicationServer.Caching.CASPerfCounter)->
      00000000c0056b38(Microsoft.Fabric.Data.ReliableServiceManager)->
      00000000c005d0b8(Microsoft.Fabric.Data.PARA.PartitionAgent)->
      00000000c005d238(Microsoft.Fabric.Data.Replication.Replicator)->
      0000000080402730(System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]])->
      0000000080402908(System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Microsoft.Fabric.Data.Replication.ReplicationSession, Microsoft.WindowsFabric.Data]][])->
      00000000804016a0(Microsoft.Fabric.Data.Replication.ReplicationSession)->
      00000000804031b8(Microsoft.Fabric.Common.Timer)->
      00000000c005a838(Microsoft.Fabric.Common.NormalPriorityTimerQueue)->
      00000000802641a0(Microsoft.Fabric.Common.Timer)->
      0000000080263d38(Microsoft.Fabric.Common.TcpOutputSession)->
      00000000c005eb38(Microsoft.Fabric.Common.TcpTransportFactory)->
      00000000c0061390(System.ServiceModel.Channels.TcpChannelFactory`1[[System.ServiceModel.Channels.IDuplexSessionChannel, System.ServiceModel]])->
      00000000c00614b0(System.ServiceModel.Channels.CommunicationObjectManager`1[[System.ServiceModel.Channels.IChannel, System.ServiceModel]])->
      00000000c00614e8(System.Collections.Hashtable)->
      000000008021f818(System.Collections.Hashtable+bucket[])->
      00000000803ac158(System.ServiceModel.Channels.ClientFramingDuplexSessionChannel)->
      00000000803ab1e0(System.ServiceModel.Channels.BufferedConnection)->
      00000000803ab088(System.ServiceModel.Channels.SocketConnection)->
      00000000c46ed018(System.ServiceModel.Channels.OverlappedIOCompleteCallback)
    Scan Thread 4 OSTHread 538
    Scan Thread 7 OSTHread 57c
    Scan Thread 9 OSTHread 720
    RSP:31aef38:Root:  00000000c00e8778(Microsoft.ApplicationServer.Caching.MemoryPressureMonitor)->
      00000000c00e8738(Microsoft.ApplicationServer.Caching.ThrottleDelegate)->
      00000000c0045c00(Microsoft.ApplicationServer.Caching.DistributedObjectManager)->

    ....


    0:000> !locks

    Scanned 1263 critical sections

     

    0:000> !threads
    ThreadCount:      19
    UnstartedThread:  0
    BackgroundThread: 15
    PendingThread:    0
    DeadThread:       3
    Hosted Runtime:   no
                                               PreEmptive                                                   Lock
           ID  OSID        ThreadOBJ     State GC       GC Alloc Context                  Domain           Count APT Exception
       0    1   528 00000000000bebd0      a020 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA
       4    2   538 00000000000c9070      b220 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA (Finalizer)
       7    4   57c 0000000000ec2ec0   3009220 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA (Threadpool Worker)
       9    6   720 0000000000f06c60   3009220 Enabled  00000000c125e848:00000000c125ee78 00000000000b2e30     0 MTA (Threadpool Worker)
      10    7   81c 00000000031f3670   100a220 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA (Threadpool Worker)
      12    a   868 0000000003220650   200b220 Enabled  0000000080c1dec0:0000000080c1fdd8 00000000000b2e30     0 MTA
      13    b   86c 0000000003275400   200b220 Enabled  0000000080c71860:0000000080c72d08 00000000000b2e30     0 MTA
      14   19   bec 00000000032db590      b220 Enabled  00000000c0d3d228:00000000c0d3d6e0 00000000000b2e30     0 MTA
      17   1e  16c0 00000000032df520   1009220 Enabled  0000000080c8b538:0000000080c8cd08 00000000000b2e30     0 MTA (Threadpool Worker)
      18   28  103c 00000000032da060   1009220 Enabled  00000000c123eee0:00000000c1240e78 00000000000b2e30     0 MTA (Threadpool Worker)
      19    3   fa0 00000000032d9950   1009220 Enabled  00000000c121fe60:00000000c1220e78 00000000000b2e30     0 MTA (Threadpool Worker)
      20   10   d04 00000000032d8b30   a009220 Enabled  0000000080bf8950:0000000080bf9da8 00000000000b2e30     0 MTA (Threadpool Completion Port)
      21   24   738 00000000032dee10   1009220 Enabled  00000000c125eee0:00000000c1260e78 00000000000b2e30     0 MTA (Threadpool Worker)
      22   25   db0 00000000032fade0      b220 Enabled  0000000080c2c110:0000000080c2de08 00000000000b2e30     0 MTA
      23   18   c88 00000000032ddff0      b220 Enabled  0000000080c5c440:0000000080c5de08 00000000000b2e30     0 MTA
    XXXX   26       00000000032de700   8019820 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA (Threadpool Completion Port)
    XXXX    5       00000000032dc3b0   8019820 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 MTA (Threadpool Completion Port)
    XXXX   23       00000000032d7d10   8019820 Enabled  0000000000000000:0000000000000000 00000000000b2e30     0 Ukn (Threadpool Completion Port)
      24   17  1a14 00000000032dcac0   8009220 Enabled  00000000c119c0f8:00000000c119ce78 00000000000b2e30     0 MTA (Threadpool Completion Port)

     

  • March 16, 2011 10:04
     
     
    Unfortunately the patch didn't help. The results were better in testing: with a 1 GB cache, memory usage was about 800 MB per host, and at the start of the next day it had dropped to 300 MB per host. However, after 2 hours of production use we hit 800 MB per host, or 3 GB total memory usage across 4 hosts, for a 70 MB cache! Invoke-CacheGC dropped this to about 2 GB in total, and we stopped using the cache in production.
  • March 22, 2011 12:33
     
     

    I'm sorry for your experience here. Other customers are using this successfully in production, but I am guessing that a lot of those customers are using dedicated cache servers rather than placing the cache service on web/application servers. I think that is your scenario here, if I am reading this thread correctly. When the caching service is the only service on the machine, it can use a lot of extra memory without ever hitting low-memory conditions. Even so, I'm going to continue to look into this and to look for other customers who come through our support channels with evidence of any other problems that may have contributed to what you saw.

    Jason Roth

  • March 24, 2011 13:50
     
     Answered

    At first I planned to use 2 virtual servers with 1.5 GB each for high availability. First problem: it requires Windows Enterprise! Well, this is a marketing thing; we have the rights to use whatever version we need on our virtualized servers, so... OK.

    Next problem: it needs to be in the domain to use the SQL provider!? It's in the DMZ, so we needed to open some holes.

    Next, it turned out that when one of the nodes is stopped, the other one throws errors on put saying that there is no write quorum, which is ridiculous. I know that it is then not highly available, but let it be somewhat available and just hide this error.

    Then I added a third (and later a fourth) node. I had planned to use 1.5 GB of memory per node, but because of the high memory usage I increased it to 3 GB per host. So from the originally planned 2 hosts with 3 GB total, I ended up with 4 hosts and 12 GB of memory, and it's still not enough!

    I've found our SA ID, but I can't see why I should open a case...

    I will see tomorrow what more info I can provide you. Next week I'm going to evaluate ScaleOut's server, and I hope to resolve our problems without further surprises.

    For comparison, I have evaluated ScaleOut's server. It does not need a central configuration store (i.e. SQL Server or a Windows share), does not need Windows Enterprise, and does not need a domain, so we were able to configure it to work in an isolated virtual network. It works with 2 nodes with high availability; when only one is running nothing bad happens, and when both are back it duplicates the data as if nothing had happened. It uses 200 MB per node on 2 nodes in our production environment, whereas AppFabric caching used 4 nodes with 1.5 GB used on each one and wanted more...

    I will look at it again if and when AppFabric becomes competitive.

    • Marked as answer by BAlexandrov, March 24, 2011 13:50
  • May 9, 2011 8:58
     
     

    It seems we have the same memory leak issue when using AppFabric as the caching service in a production environment. We have 2 AppFabric 6.1 servers on virtual machines running Windows Server 2008 R2 Enterprise 64-bit, with two 2.53 GHz processors and 3 GB of memory.

    The cache store configuration is:
    CacheName: SessionStore
    TimeToLive: 30 mins
    CacheType: Partitioned
    Secondaries: 0
    IsExpirable:True
    EvictionType:None
    NotificationsEnabled:True
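
    For reference, a cache with the settings listed above could be created roughly as follows. This is a sketch rather than the exact command that was used: it assumes the AppFabric caching administration module is installed on the machine where it runs, and CacheType is left out because, as far as I know, New-Cache has no parameter for it.

    ```powershell
    # Load the caching cmdlets and connect to the cluster (assumes a configured cache host).
    Import-Module DistributedCacheAdministration
    Use-CacheCluster

    # TimeToLive is in minutes; Secondaries 0 means no high-availability copies.
    New-Cache -CacheName SessionStore -TimeToLive 30 -Expirable true `
              -Eviction None -NotificationsEnabled true -Secondaries 0

    # Print the resulting configuration to compare with the list above.
    Get-CacheConfig -CacheName SessionStore
    ```

    Note that with eviction set to None the service cannot discard items under memory pressure, so only expiration (and later garbage collection) can bring memory back down.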

    We had a test program that simulates a total of 30,000 concurrent sessions, meaning that old sessions are removed as new sessions are added. We ran the program for about 6 hours; the memory usage of the caching service rose to 2.53 GB and was not freed even after I stopped the program. We only store session objects in the cache, and each one is very small.

    The statistics are:
    Size: 26211589
    ItemCount: 29487
    RegionCount: 1
    RequestCount: 7617154
    MissCount: 2205039

    Why does the AppFabric caching service use so much memory?

    Thanks,
    Kriss 

  • April 16, 2012 1:15
     
     

    I am seeing the SAME behaviour.

    My scenario is pretty simple:

    1. One server, Windows 7, shared location for storage (XML)

    2. Nothing in AppServer

    3. Takes 600+ MB by default, and more ...

    HostName = RIZWAN-LT.vsg.vsgsolutions.com
    -------------------------

        NamedCache = Horizon
            Healthy               = 50.00
            UnderReconfiguration  = 0.00
            NotPrimary            = 0.00
            InadequateSecondaries = 0.00
            Throttled             = 0.00

        NamedCache = default
            Healthy               = 50.00
            UnderReconfiguration  = 0.00
            NotPrimary            = 0.00
            InadequateSecondaries = 0.00
            Throttled             = 0.00

    PS C:\Windows\system32> Get-CacheStatistics Horizon

    Size              : 0
    ItemCount         : 0
    RegionCount       : 0
    RequestCount      : 0
    ReadRequestCount  : 0
    WriteRequestCount : 0
    MissCount         : 0
    IncomingBandwidth : 0
    OutgoingBandwidth : 0

    Memory Taken: 662,372 K ...

    Any reason for this?

    Velocity CTP 3 uses 67 K at start and then grows with usage... We were expecting the production version of AppFabric to be better, but it seems that it starts with too much memory...

    Any ideas?

    Thanks,

  • April 19, 2012 4:05
     
     

    Hi folks,

    Thanks for sharing your experiences. Here are a couple of points to keep in mind when evaluating the memory consumption for Velocity:

    1. With the 1.1 release, we pre-allocate the memory required, which gives better efficiency and performance later on. This is half of the memory available on the box.

    2. The cache service is a managed process, so memory is not necessarily released at the point where objects are deleted, but at the point where .NET performs a GC.

    Given that caches are latency sensitive, any situation that causes the cache to start paging out needs to be avoided. We therefore recommend that you deploy the cache on a dedicated set of servers. If the amount of cache needed is smaller, you can reduce the cache size by changing the following setting:

    Setting: Cache size (MB), the total space allocated for storing data on the cache host.
    XML location: the size attribute in the host element. The host element is a child of the hosts element.
    How to change: assigned at installation time. Reconfigure this setting with the CacheSize parameter of the Set-CacheHostConfig command. View this setting with the Get-CacheHostConfig command.

    You can get more details here: http://msdn.microsoft.com/en-us/library/hh351231.aspx
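
    As a rough sketch of how that could look in practice: the host name CacheHost1, the port 22233 and the 512 MB size below are placeholders, not values from this thread. To my knowledge the host has to be stopped before its configuration can be changed, so treat the stop/start steps as an assumption to verify against the documentation.

    ```powershell
    # Load the caching cmdlets and connect to the cluster.
    Import-Module DistributedCacheAdministration
    Use-CacheCluster

    # Show the current settings for the host, including its size in MB.
    Get-CacheHostConfig -HostName CacheHost1 -CachePort 22233

    # Assumption: the host must be stopped while its configuration is changed.
    Stop-CacheHost -HostName CacheHost1 -CachePort 22233

    # CacheSize is given in MB (placeholder value).
    Set-CacheHostConfig -HostName CacheHost1 -CachePort 22233 -CacheSize 512

    Start-CacheHost -HostName CacheHost1 -CachePort 22233
    ```

    Stopping a host drops the data it holds unless the cache has secondaries on other hosts, so on a live cluster this is best done one host at a time.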

    thanks,
    Prashant

  • June 21, 2012 18:35
     
     

    Hi Guys,

    I would recommend trying the following:

    1. Preset the cache memory to less than 50% of the available memory on each host if it also runs a web or application server.

    2. If you use regions, create one at the application level, or clear/delete the region after each session.

    3. A dedicated cache server is preferable with AppFabric.

    4. Try Memcached; no domain is required if running in the DMZ.

    5. Wait for AppFabric 2.0 to come out.

    thanks,

    jason,

  • June 26, 2012 15:13
     
     

    We're experiencing similar behavior with a 1 GB cluster size and two machines in the cluster with 6 GB of RAM. When viewing DistributedCacheService.exe in Task Manager, it has grown to over 4.5 GB with only one 20 MB object in the cache. When I run the Invoke-CacheGC command through PowerShell, the memory allocated to the process instantly drops to ~400 MB and then grows slowly from there.
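
    One way to put numbers on that drop is to read the service's working set before and after the collection. This is only a sketch: it has to run on the cache host itself, the 5-second pause is an arbitrary choice, and the working set reported by Get-Process will not necessarily match the column shown in Task Manager.

    ```powershell
    # Load the caching cmdlets and connect to the cluster.
    Import-Module DistributedCacheAdministration
    Use-CacheCluster

    # Working set of the local cache service process, in bytes.
    $before = (Get-Process -Name DistributedCacheService).WorkingSet64

    # Ask the cache hosts in the cluster to run a garbage collection.
    Invoke-CacheGC

    # Give the process a moment to release memory (arbitrary delay).
    Start-Sleep -Seconds 5
    $after = (Get-Process -Name DistributedCacheService).WorkingSet64

    "{0:N0} MB -> {1:N0} MB" -f ($before / 1MB), ($after / 1MB)
    ```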