ErrorCode<ERRCA0017>:SubStatus<ES0007>:There is a temporary failure. Please retry later. (The request failed because the server is in throttled state.)
-
Friday, July 02, 2010 12:21 AM
I'm having an extremely difficult time attempting to get AppFabric working. I keep getting a bunch of these "temporary failure" DataCacheExceptions. I've been able to work past a few (some appear randomly??) but this one happens every single time I try to create a region. I've tried restarting the cache cluster, restarting the service and rebooting the server. Nothing seems to help. Ideas?
All Replies
-
Tuesday, July 06, 2010 3:06 PM
This does sound confusing. I have a few questions for you:
1. Are you using the release version of Windows Server AppFabric or are you on one of the Beta releases?
2. Is anything else using the cache when you get this error? When you restart the cache, is your call to CreateRegion() one of the first things that happens on the cache?
3. Can you send me the output from the Get-CacheClusterHealth Windows PowerShell command? See http://msdn.microsoft.com/en-us/library/ff718177.aspx for more information about using commands like this.
I'll try to help you troubleshoot this issue. Throttling indicates a lack of memory on one or more of the cache hosts (servers) in your cluster. If you just restarted the cluster before your test, then that seems like an incorrect error. But we'll need to look at the evidence more closely.
Thanks.
Jason Roth
-
Thursday, July 08, 2010 1:42 PM
I have the same exception.
Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0007>:There is a temporary failure. Please retry later. (The request failed because the server is in throttled state.)
at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody)
at Microsoft.ApplicationServer.Caching.DataCache.InternalAdd(String key, Object value, TimeSpan timeout, DataCacheTag[] tags, String region)
at Microsoft.ApplicationServer.Caching.DataCache.Add(String key, Object value)
at SimpleTest.AppFabricCacheProvider.Add(String key, Object value) in D:\work\appfabric\TestApp\SimpleTest\SimpleTest\AppFabricCacheProvider.cs:line 29
at SimpleTest.TestAdd.Start() in D:\work\appfabric\TestApp\SimpleTest\SimpleTest\TestAdd.cs:line 43
at SimpleTest.TestBase.Run() in D:\work\appfabric\TestApp\SimpleTest\SimpleTest\TestBase.cs:line 30
at SimpleTest.TestRunner.Run() in D:\work\appfabric\TestApp\SimpleTest\SimpleTest\TestRunner.cs:line 60
at SimpleTest.Program.Main() in D:\work\appfabric\TestApp\SimpleTest\SimpleTest\Program.cs:line 14
It occurs when I try to insert about 500 DataSets into the cache. Before that, I inserted 1000 DataTables and no exception was thrown.
-
Thursday, July 08, 2010 2:04 PM
When you get this error, look at the result from Get-CacheClusterHealth in Windows PowerShell. Here is an example:
Cluster health statistics
=========================
HostName = CacheServer1
-------------------------
NamedCache = default
Healthy = 0.00
UnderReconfiguration = 0.00
NotPrimary = 0.00
NoWriteQuorum = 0.00
Throttled = 25.00
NamedCache = Cache1
Healthy = 0.00
UnderReconfiguration = 0.00
NotPrimary = 0.00
NoWriteQuorum = 0.00
Throttled = 25.00
HostName = CacheServer2
-------------------------
NamedCache = Cache1
Healthy = 25.00
UnderReconfiguration = 0.00
NotPrimary = 0.00
NoWriteQuorum = 0.00
Throttled = 0.00
NamedCache = default
Healthy = 25.00
UnderReconfiguration = 0.00
NotPrimary = 0.00
NoWriteQuorum = 0.00
Throttled = 0.00
Unallocated named cache fractions
---------------------------------
You can see in this output that the cache cluster has determined that CacheServer1 is throttled (ignore the actual percentage shown and just focus on which categories the numbers are in, which in this case is Throttled). This means that the available physical memory on CacheServer1 has reached a critically low level. You can verify whether this is correct by looking at Perfmon on CacheServer1 to compare the available MB of memory with the total amount of memory. If the available memory is in the 5-15% range of the total, then that's the reason.
Jason Roth
- Proposed As Answer by JasonRothMicrosoft Employee Thursday, July 15, 2010 11:54 AM
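For anyone who wants to automate the check described above, here is a minimal sketch in Python. It assumes you have already captured the text output of Get-CacheClusterHealth (for example, by redirecting it to a file) and that it follows the layout of the example in this thread. The function name `throttled_hosts` and the `SAMPLE` text are illustrative only, not part of AppFabric.

```python
import re

def throttled_hosts(health_output):
    """Map each host name to the named caches that report a nonzero Throttled value."""
    result = {}
    host = cache = None
    for raw in health_output.splitlines():
        # Only "Name = value" lines matter; headings and separator lines are skipped.
        match = re.match(r"^(\w+)\s*=\s*(.+)$", raw.strip())
        if not match:
            continue
        key, value = match.groups()
        if key == "HostName":
            host = value
        elif key == "NamedCache":
            cache = value
        elif key == "Throttled" and host and cache and float(value) > 0:
            result.setdefault(host, []).append(cache)
    return result

# Abbreviated copy of the example output shown in this thread.
SAMPLE = """\
HostName = CacheServer1
-------------------------
NamedCache = default
Healthy = 0.00
Throttled = 25.00
NamedCache = Cache1
Healthy = 0.00
Throttled = 25.00
HostName = CacheServer2
-------------------------
NamedCache = Cache1
Healthy = 25.00
Throttled = 0.00
"""

print(throttled_hosts(SAMPLE))
```

An empty result means no host reported a nonzero Throttled value at the moment the command was run; it says nothing about earlier or later states.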
-
Thursday, July 15, 2010 7:14 AM
You are right. This exception is thrown if usage of physical memory is more than 95%.
-
Tuesday, August 10, 2010 6:06 PM
For anyone reading this post in the future with the error above, there is a new Windows Server AppFabric Deployment and Management Guide that contains a troubleshooting section. This problem and possible solutions are included in this topic:
Throttling Troubleshooting (Windows Server AppFabric Caching)
http://msdn.microsoft.com/en-us/library/ff921030.aspx
Jason
-
Friday, October 08, 2010 12:37 AM
Hi Jason,
I've run into this exact issue, and thanks for the details. I've also noticed that the more RAM you put in the machine, the more free RAM remains, in absolute terms, at the point where throttling kicks in. Throttling is triggered at a percentage of free RAM, i.e. 5% on and 10% off. So on a box with 1GB of RAM these values are small. On a machine with 32GB of RAM they are large, so even though you may have 1.6GB of RAM free, caching is throttled.
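To put numbers on this observation, here is a small Python sketch that converts the 5% and 10% figures into absolute amounts of free memory. The percentages are the ones reported in this thread rather than documented constants, and the function name is made up for illustration.

```python
def free_memory_threshold_mb(total_ram_gb, percent_free):
    """Free memory (in MB) that corresponds to a given percentage of total RAM."""
    return total_ram_gb * 1024 * percent_free / 100

# Compare a small host with the 16GB and 32GB hosts discussed in this thread.
for total_gb in (1, 16, 32):
    on = free_memory_threshold_mb(total_gb, 5)    # assumed "throttle on" level
    off = free_memory_threshold_mb(total_gb, 10)  # assumed "throttle off" level
    print(f"{total_gb:>2} GB host: throttles near {on:.0f} MB free, "
          f"clears near {off:.0f} MB free")
```

On the 32GB host the 5% level works out to roughly 1.6GB, which matches the figure above.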
The issues I'm finding specifically to throttling:
1) Is there a way to detect that AppFabric Caching is throttled *before* making the call to Add(xxx) in the cache, rather than getting an exception? Then, for example, you could clean up the cache, memory, etc. beforehand.
2) Can you clarify a little around local caching and memory consumption?
From what I understand, we can have a SQL Cache with a local (in memory) cache on a node.
(I'm currently using just a local cache).
I'm assuming the cache is a certain size and an 'expiration' type policy exists. What I would *love* to see is that when throttling turns on, the cache continues to work (there may just be fewer items in there) and the cache is aware of the limited memory. Currently I just get exceptions every time I try to access the cache, saying:
"Please retry later. (The request failed because the server is in throttled state.)"
Is it possible to turn this behaviour off?
Many thanks,
Mick.
Mick Badran - http://blogs.breezetraining.com.au/mickb
-
Friday, October 08, 2010 8:49 PM
Perhaps this was working in August, but right now it's not working for me. I am still getting the same message. I also checked the status of the cache cluster and found that neither server in the cluster was throttled.
Does anyone know why?
Perhaps Jason has already left for the long weekend, eh, leaving us all alone to figure out AppFabric, the behemoth that it is! :)
Thanks and Regards - Gagan
-
Tuesday, October 12, 2010 12:49 PM
I'm back. :) Let me try to address the questions:
- First, this is a good observation that throttling happens with a large amount of memory left on a high-end machine. I don't think there is a way to configure this, but maybe the throttling algorithm could be changed in the future to take it into account. We're working on capacity planning recommendations. Unless you need 32GB of RAM on a single box, it might be better to go with 16GB of RAM on two machines. This unusable memory due to throttling thresholds is one reason. Another reason is that garbage collection will have more and more memory to clean when it is triggered with a 32GB machine. This could result in noticeable pauses due to GC. On machines with less memory, these pauses are not as significant. Although there is work being done to address these GC pauses, 16GB seems to be a more advisable maximum memory on a single cache host unless you have memory requirements for regions (which reside on a single cache host) that exceed this.
- There is no API now to check for throttling. However, you could try to call the Get-CacheClusterHealth command. I'd have to run an experiment to do this and interpret the results. If any numbers showed up as "Throttled", you would know that your cluster is in a throttled state. Do you want me to work on this code sample? Would this be helpful?
- There's no way to turn off the exceptions when the cache is in a throttled state. However, there are mechanisms that should help prevent throttling. For example, if you have a cache that is evictable, then items will be evicted to free up memory and resolve the throttled state. So in that sense, the cache does continue to be operational, but you'll get more cache misses that require you to repopulate the cache. Expired items are also evicted to free up memory. So throttling shouldn't be a continuous condition unless you: have other applications using up memory, have a non-evictable cache, or have a high rate of adding items to the cache so that you're constantly caching and evicting. Generally, throttling is a bad scenario that you don't want to ignore.
- If you're getting throttling errors consistently, can you use Perfmon to check the available memory on the box and other applications that are using memory? Is all of the memory being consumed by the DistributedCacheService.exe?
Throttling should be a condition that can be resolved. It might mean that you need to scale out to more servers. Or it might just require a closer look at what processes are using memory on the throttled host. But if you have more than 10% available physical memory on the machine, you should not be getting this message. Available memory should actually get down closer to 5% before throttling starts, but if your process has a lot of memory paged out, then it might throttle at 10%.
Jason
-
Thursday, October 14, 2010 12:00 AM
Hi Jason, thanks for the great response.
Your suggested RAM limit of around 16GB per host seems reasonable (as we move more and more into the x64 world, this is commonplace).
I'll nut out a few more different cache settings with this current scenario and see what transpires.
The issue I have with observing the free 'RAM' on a box to see whether throttling should apply is that several services will consume as much RAM as they can 'for later use' and will give it back only if the O/S needs it (what 'needs it' means is open to the interpretation of each service's implementation).
For example, left unchecked, services such as SQL Server and the Exchange Store services will do exactly this.
So generally the memory levels are pretty tight if these guys are running on the box. I'm seeing that with these on the same box, AppFabric throttling is triggered more often than not, even with a pretty basic local cache.
Typically these hosts are also configured for 'Server side' GC mode rather than workstation, i.e. the process/service will lazily release memory back to the O/S. This compounds the issue.
I don't think it's off the map to be able to take advantage of local caching (hence AppFabric Cache) on hosts in this sort of scenario.
If we look at the previous alternatives (Enterprise Cache, ASP.NET cache, etc.), these were simple implementations that worked in these types of cases.
I get the high end enterprise distributed caching model with partitions etc, but at this point in time - I have to ask is AppFabric Cache being too clever and not catering for these simple cases??
(I'm envisaging all Servers will have AppFabric Windows Server installed as part of their base installation, catering for all WCF/WF deployments AND there's a cache feature that can be employed by apps also)
Cheers,
Mick.
Mick Badran - http://blogs.breezetraining.com.au/mickb
-
Thursday, October 14, 2010 10:38 AM
Mick,
These are good observations/questions. You're right that AppFabric Caching is not going to work well with services (like SQL, etc) that take and then keep as much memory as possible. I'm more familiar with SQL Server, and I know that it is very egocentric when it comes to memory. It feels that it should be the main reason the machine exists. Services like this often don't work well with any other server applications/services unless they are configured. For SQL Server, you can configure max server memory (http://support.microsoft.com/kb/321363). Other applications may have similar ways to restrict their memory use. This could help the situation with caching.
Perhaps in the simple scenario your question of whether AppFabric caching is being too clever is valid. But from the cache cluster's perspective, the only way that it can respond quickly enough to be useful is when the entire cache is in memory. The minute it has to be paged out, you lose a lot of performance.
I'll see if I have some time soon to write up some code that will programmatically look for throttling conditions. Currently, the best way to do this is to handle the exception and then take action.
Jason
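As a rough illustration of the "handle the exception and then take action" approach, here is a minimal sketch in Python rather than .NET. `CacheError` is a hypothetical stand-in for DataCacheException, `add` is whatever callable wraps your cache's Add call, and the retry counts and delays are arbitrary assumptions. The only detail taken from this thread is the ErrorCode<ERRCA0017>:SubStatus<ES0007> pair in the message text.

```python
import re
import time

class CacheError(Exception):
    """Hypothetical stand-in for DataCacheException; carries only the message text."""

_CODES = re.compile(r"ErrorCode<(\w+)>:SubStatus<(\w+)>")

def is_throttled(error):
    """True when the message carries the ERRCA0017 / ES0007 pair seen in this thread."""
    match = _CODES.search(str(error))
    return bool(match) and match.groups() == ("ERRCA0017", "ES0007")

def add_with_retry(add, key, value, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call add(key, value); back off and retry only when the error is throttling."""
    for attempt in range(attempts):
        try:
            return add(key, value)
        except CacheError as error:
            # Re-raise anything that is not throttling, and give up on the last attempt.
            if not is_throttled(error) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff

# Demo with a fake cache call that is throttled twice and then succeeds.
calls = {"count": 0}

def flaky_add(key, value):
    calls["count"] += 1
    if calls["count"] < 3:
        raise CacheError("ErrorCode<ERRCA0017>:SubStatus<ES0007>:"
                         "There is a temporary failure. Please retry later.")
    return "stored"

print(add_with_retry(flaky_add, "k", "v", sleep=lambda seconds: None))
```

Retrying only buys time for eviction or for memory to be freed. If the host stays throttled, the last attempt still raises, and the memory checks described earlier in this thread are the next step.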
-
Monday, November 08, 2010 9:13 PM
Jason,
In your example above, CacheServer1 is 'Throttled' but CacheServer2 is not. Why would AppFabric throw an exception when trying to add? Why would it not add the new item to CacheServer2 instead of throwing? I would think that if I had a cluster of 10 nodes and 1 is throttled, a simple add should not throw but just store the item on an available node. What am I missing?
Josh
-
Monday, November 08, 2010 9:15 PM
@joshua,
That might be because the server that's throttled is your lead host. Have you tried changing your clusterconfig.xml to point your lead host to the other server?
Please post back if this solves your problem. I am also interested to know about this.
thanks
Thanks and Regards - Gagan
-
Monday, November 15, 2010 8:11 PM
Also, when you use named regions, the region is assigned to one of the cache hosts. If you added an item to that region and the cache server for that region was throttled, there would be no way to resolve this other than to throw an exception. I'm not sure if that is what was happening here, but it is a possibility and good to keep in mind when troubleshooting.
Jason
-
Monday, November 29, 2010 11:14 AM
I got the same error, but when I ran the Get-CacheClusterHealth command the statistics looked fine. When I looked in Task Manager, the W3WP process was taking 200MB+ of memory; I am not sure why. I reset IIS and tried again, and the cache started working.
Before the IISReset I tried restarting the cache services and ran the Stop-CacheCluster and Start-CacheCluster commands, but that didn't work. Any idea what was happening in IIS?
Abdul Rafay - MVP & MCTS BizTalk Server
blog: http://abdulrafaysbiztalk.wordpress.com/
Please indicate "Mark as Answer" if this post has answered the question.
-
Tuesday, November 01, 2011 3:48 PM
Jason,
Thanks for your posts on this topic. They have been helpful in resolving the throttling issue. However, I have the same question as Joshua M. I have two servers defined for my cache cluster. One of them was throttled, but the other was not. Why didn't AppFabric roll over to the non-throttled server instead of throwing the throttled error? Is there a configuration setting that I need to adjust?
Thanks,
Dave Ehrlich

