Corrupt Cache or Cache Host?

  • Question

  • Hello,

    I've found an interesting issue. I am trying to get an object from cache. I connect to AppFabric just fine; to prove that I do, I put an item into the default cache and get it back.

    When I try to get something from a named cache, it fails with the exception below. The named cache exists, as I can see in the output of the Get-Cache command, and the cache statistics show that it is empty. An empty cache should not result in this exception.

    The only thing I can conclude is that either the cache is corrupt or the host that this cache is on is corrupt. I know I can move past this error by removing the cache and creating a new one, as I've done that for another cache.

    I want to know why it is corrupt, how it got to be corrupt and what can be monitored to see if it is corrupt.

    Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0001>:There is a temporary failure. Please retry later.
       at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody)
       at Microsoft.ApplicationServer.Caching.DataCache.InternalGet(String key, DataCacheItemVersion& version, String region)
       at Microsoft.ApplicationServer.Caching.DataCache.Get(String key)
       at TargetCorp.Pharmacy.Common.Cache.VelocityLookAsideCache`2.LoadEntityFromCache(TPrimaryIndex primaryIndex) in C:\PolicyAdmin\Projects\MultiThreading\ConsoleApplication1

     

    Thank you


    Mohammad Faridi.
    Tuesday, March 1, 2011 5:31 PM

All replies

  • How many machines do you have in the cluster? Does this error go away after some time? We generally see it when there are intermittent connectivity issues, or high server load. It is possible that you are able to connect to the default cache because your request was being served by a different node.

    If you can monitor the performance counters on the servers (AppFabric Caching:Host as well as AppFabric Caching:Cache, for client requests, data size, etc.), they can provide more insight into what's happening.

    Also, what's your scenario? Do you use features like high availability, notifications, named regions, tags, or local cache?

    Thursday, March 3, 2011 12:17 PM
  • Hi Pragya,

    There are 4 machines in the cluster. The cluster has been up for days, and the issue will not go away until I remove the cache and create a new one. The client can connect to the default cache.

    We don't use any of the high availability, notifications, named regions, tags, or local cache features. Unfortunately, because this was impairing testing, I had to remove and recreate the caches. Since then, I have not seen the error with my simple test of retrieving an item from cache. Before that, the test would return items from the default cache but fail to return items from the cache that was having issues. The statistics showed that the cache was fine, and so did the cluster health. The cache had been created a while ago but did not have any items in it.

    I did not monitor the performance counters. If it happens again I will monitor the perf counters.

    Thank you


    Mohammad Faridi.
    Thursday, March 3, 2011 3:39 PM
  • Hi Pragya,

    I have the same situation again. What exactly do you want me to monitor?

     

    Thanks


    Mohammad Faridi.
    Thursday, March 10, 2011 12:14 AM
  • Hi Mohammad,

    This error occurs when there is key-level contention. I experienced the same thing when using the CTP version. The only solution that worked for me was to increase retryCount.

    The default value of retryCount is 5. If the client does not get the item within this count, it logs this error.

    For session state, you can set the retryCount attribute value. If you are experiencing this error while accessing the cache through code, you have to write custom logic that retries the access to the item.

    Hope this helps.
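
    For the code path, the custom retry logic could look roughly like the sketch below. It is written in Python only to keep it short and self-contained. `TemporaryCacheFailure`, `get_with_retry`, and the `fetch` callable are hypothetical names standing in for a `DataCacheException` carrying ERRCA0017 and a call such as `DataCache.Get(key)`. The 5 attempts and the linearly growing delay are assumptions for illustration, not AppFabric behavior.

```python
import time


class TemporaryCacheFailure(Exception):
    """Hypothetical stand-in for a cache exception that means 'retry later'."""


def get_with_retry(fetch, key, retry_count=5, delay_seconds=0.1, sleep=time.sleep):
    """Call fetch(key), retrying while it raises TemporaryCacheFailure.

    fetch        -- any callable that takes a key and returns the cached item
    retry_count  -- total number of attempts (assumed value, tune as needed)
    delay_seconds -- base delay; the wait grows linearly with each attempt
    sleep        -- injectable so tests can skip the real wait
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")

    for attempt in range(1, retry_count + 1):
        try:
            return fetch(key)
        except TemporaryCacheFailure:
            # Out of attempts: let the caller see the original failure.
            if attempt == retry_count:
                raise
            # Back off a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
```

    Only the temporary failure is retried; any other exception propagates immediately. A loop like this can ride out short-lived contention, but it would not help if the cache keeps failing until it is removed and recreated.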


    Thanks, Hitesh
    • Proposed as answer by edhickey Tuesday, March 29, 2011 6:14 PM
    Tuesday, March 22, 2011 2:43 PM