AppFabric 1.1 Cache - Problem with Start-CacheCluster
-
Friday, May 11, 2012 12:57 PM
I have the following environment:
8 Load-balanced Web servers running Windows Server 2008 R2 x64 Standard
2 Clustered database servers running Windows Server 2008 R2 x64 Enterprise and Sql Server 2008 R2 Enterprise
All servers are in a domain and in the same VLAN without any port restrictions between them.
I am using a domain account that is a local admin on all servers, and I am running PowerShell as administrator.
I installed, configured and tested AppFabric Cache on the first server, and it worked.
The cache cluster configuration provider is Sql Server.
Then I installed and configured AppFabric Cache on the second server and tried to start the cache cluster with Start-CacheCluster, but after 5 minutes the following entries appear in Event Viewer on both servers:
Log Name: Application
Source: .NET Runtime
Date: 10/05/2012 18:54:51
Event ID: 1026
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: XXXXXXXXXXX
Description:
Application: DistributedCacheService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.ApplicationServer.Caching.DataCacheException
Stack:
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.StartServiceCallback(System.Object)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
and
Log Name: Application
Source: Application Error
Date: 10/05/2012 18:54:52
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: XXXXXXXXXXXX
Description:
Faulting application name: DistributedCacheService.exe, version: 1.0.4632.0, time stamp: 0x4eafeccf
Faulting module name: KERNELBASE.dll, version: 6.1.7601.17651, time stamp: 0x4e21213c
Exception code: 0xe0434352
Fault offset: 0x000000000000cacd
Faulting process id: 0x9f60
Faulting application start time: 0x01cd2ef65b24cc93
Faulting application path: C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: c4cc0d1b-9aea-11e1-b7db-d4bed9b1a0bd
Can someone help me with this? I have searched the web a lot and found nothing conclusive, and I have been stuck on this for hours.
Thank you.
EDIT:
Here is the output when I attempt to start the cluster:
PS C:\Windows\system32> Start-CacheCluster
Start-CacheCluster : ErrorCode<ERRCAdmin003>:SubStatus<ES0001>:Time-out occurred on net.tcp://host2:22233.
At line:1 char:19
+ Start-CacheCluster <<<<
    + CategoryInfo          : NotSpecified: (:) [Start-CacheCluster], DataCacheException
    + FullyQualifiedErrorId : ERRCAdmin003,Microsoft.ApplicationServer.Caching.Commands.StartCacheClusterCommand

HostName : CachePort      Service Name            Service Status Version Info
--------------------      ------------            -------------- ------------
host1:22233               AppFabricCachingService UP             3 [3,3][1,3]
host2:22233               AppFabricCachingService STARTING       3 [3,3][1,3]

After 5 minutes, here is the output of Get-CacheHost:
PS C:\Windows\system32> Get-CacheHost

HostName : CachePort      Service Name            Service Status Version Info
--------------------      ------------            -------------- ------------
host1:22233               AppFabricCachingService UP             3 [3,3][1,3]
host2:22233               AppFabricCachingService DOWN           3 [3,3][1,3]
Also, here is my cluster configuration exported through "Export-CacheClusterConfig -File c:\temp\clusterconfig.xml"
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="dataCache" type="Microsoft.ApplicationServer.Caching.DataCacheSection, Microsoft.ApplicationServer.Caching.Core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
  </configSections>
  <dataCache size="Medium">
    <caches partitionCount="256">
      <cache consistency="StrongConsistency" name="default" minSecondaries="0">
        <policy>
          <eviction type="Lru" />
          <expiration defaultTTL="10" isExpirable="true" />
        </policy>
      </cache>
    </caches>
    <hosts>
      <host replicationPort="22236" arbitrationPort="22235" clusterPort="22234" hostId="226475190" size="8185" leadHost="true" account="Domain\User" cacheHostName="AppFabricCachingService" name="host1" cachePort="22233" />
      <host replicationPort="22236" arbitrationPort="22235" clusterPort="22234" hostId="2055987555" size="8185" leadHost="false" account="Domain\User" cacheHostName="AppFabricCachingService" name="host2" cachePort="22233" />
    </hosts>
    <deploymentSettings>
      <deploymentMode value="RoutingClient" />
    </deploymentSettings>
  </dataCache>
</configuration>
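For anyone comparing host entries across a larger cluster, the per-host ports can be pulled out of an exported configuration with a few lines of script. The following is only a sketch in Python (an assumption; any XML-capable tool would do). It relies on the element and attribute names visible in the export above, `list_host_ports` is a hypothetical helper name, and the sample string is a trimmed copy of the hosts section rather than the full file.

```python
import xml.etree.ElementTree as ET

# Port attributes that appear on each <host> element in the export.
PORT_ATTRS = ("cachePort", "clusterPort", "arbitrationPort", "replicationPort")


def list_host_ports(xml_text):
    """Return {host name: {port attribute: port number}} for every <host>."""
    root = ET.fromstring(xml_text)
    return {
        host.get("name"): {attr: int(host.get(attr)) for attr in PORT_ATTRS}
        for host in root.iter("host")
    }


# Trimmed copy of the <hosts> section from the exported configuration.
SAMPLE = """
<configuration><dataCache size="Medium"><hosts>
  <host name="host1" cachePort="22233" clusterPort="22234"
        arbitrationPort="22235" replicationPort="22236" />
  <host name="host2" cachePort="22233" clusterPort="22234"
        arbitrationPort="22235" replicationPort="22236" />
</hosts></dataCache></configuration>
"""

for name, ports in list_host_ports(SAMPLE).items():
    print(name, ports)
```

To run it against a real export, read the file in binary mode and pass the bytes to `list_host_ports`, so that the parser applies the encoding declared in the XML header.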
- Edited by Pedro Lima Friday, May 11, 2012 4:54 PM
All Replies
-
Friday, May 11, 2012 1:37 PM
Hi Pedro,
We need to know the exact DataCacheException in order to help you. Can you check the events under the log name "Microsoft-Windows-Application Server-System Services/Admin"?
Also, please check if there could be any connectivity or firewall issues between the machines.
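One way to check connectivity is to try opening a TCP connection to each AppFabric port from the other cache host. The sketch below is in Python purely for illustration and assumes Python 3 is available on the machine running the check. The port numbers are the ones listed in the cluster configuration in the first post, and `port_is_open` and `check_hosts` are hypothetical helper names.

```python
import socket

# Cache, cluster, arbitration and replication ports from the cluster
# configuration posted in this thread.
APPFABRIC_PORTS = (22233, 22234, 22235, 22236)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_hosts(hosts, ports=APPFABRIC_PORTS, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): port_is_open(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Calling `check_hosts(["host1", "host2"])` from each server in turn gives a quick picture of which ports answer. A port only answers while the caching service on the target host is listening, so a closed port on a host that is DOWN does not by itself indicate a firewall problem.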
Thanks,
Bharath
-
Friday, May 11, 2012 1:47 PM
Thanks for your response, Bharath. Here is what you asked for.
Also, there are no port restrictions between the servers.
Log Name: Microsoft-Windows-Application Server-System Services/Admin
Source: Microsoft-Windows Server AppFabric Caching
Date: 11/05/2012 10:39:56
Event ID: 111
Task Category: (1)
Level: Error
Keywords:
User: NETWORK SERVICE
Computer: xxxxxxxxxxxxxxxx
Description:
AppFabric Caching service crashed with exception {Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<UnspecifiedErrorCode>:SubStatus<ES0001>:ErrorCode<ERRService0001>:SubStatus<ES0001>:Service initialization failed. No user action required. ---> Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRService0001>:SubStatus<ES0001>:Service initialization failed. No user action required. ---> Microsoft.Fabric.Common.OperationCompletedException: Operation completed with an exception ---> System.TimeoutException: The operation has timed out.
   --- End of inner exception stack trace ---
   at Microsoft.Fabric.Common.OperationContext.End()
   at Microsoft.Fabric.Common.SharedCommunicationObject.EndOpen(IAsyncResult result)
   at Microsoft.Fabric.Federation.FederationSite.EndOpen(IAsyncResult result)
   at Microsoft.Fabric.Data.ReliableServiceManager.EndOpen(IAsyncResult ar)
   at Microsoft.ApplicationServer.Caching.DOMNode..ctor(Int32 id, String displayFriendlyNodeId, Int32 port, EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, ReliableServiceProvider dataStore, ServiceResolverBase& client)
   --- End of inner exception stack trace ---
   at Microsoft.ApplicationServer.Caching.DOMNode..ctor(Int32 id, String displayFriendlyNodeId, Int32 port, EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, ReliableServiceProvider dataStore, ServiceResolverBase& client)
   at Microsoft.ApplicationServer.Caching.DistributedObjectManager..ctor(EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, WcfServerChannel channel)
   at Microsoft.ApplicationServer.Caching.DistributedObjectManager.GetInstance(EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, WcfServerChannel channel)
   at Microsoft.ApplicationServer.Caching.ServiceLayer.ServiceStart(Boolean deleteTkt)
   at Microsoft.ApplicationServer.Caching.DataCacheServiceBase.ServiceStart(ServiceConfigurationManager scm, Boolean deleteTkt)
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.StartService(Boolean deleteTKT)
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.OnStart(String[] args)
   --- End of inner exception stack trace ---
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.ThrowCallback(Object exception)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()}. Check debug log for more information
I have edited the main post to provide more information.
- Edited by Pedro Lima Friday, May 11, 2012 4:52 PM
-
Monday, June 11, 2012 6:17 PM
We got the exact same exception in the following environment:
9 Load-balanced Web servers running Windows Server 2008 R2 x64 Standard
2 Clustered database servers running Windows Server 2008 R2 x64 Enterprise and Sql Server 2008 R2 Enterprise
3 AppFabric Cache servers in a cluster running Windows Server 2008 R2 x64 Enterprise
- 2 AppFabric servers run fine.
- 1 AppFabric server was running fine until last week. Then it started throwing the same exception that Pedro Lima reported.
All servers are in a domain and in the same VLAN with port restrictions between them.
I am using a domain account that is a local admin on all servers, and I am running PowerShell as administrator.
The cache cluster configuration provider is Sql Server.
PowerShell :
PS C:\Windows\system32> start-cachehost APPFABRICSV3 22233 -Debug -Verbose

HostName : CachePort      Service Name            Service Status Version Info
--------------------      ------------            -------------- ------------
APPFABRICSV3:22233        AppFabricCachingService STARTING       3 [3,3][1,3]

Start-CacheHost : Error occurred while performing the operation on host APPFABRICSV3:22233 : ErrorCode<ERRCAdmin003>:SubStatus<ES0001>:Time-out occurred on net.tcp://APPFABRICSV3:22233.
At line:1 char:16
+ start-cachehost <<<< APPFABRICSV3 22233 -Debug -Verbose
    + CategoryInfo          : NotSpecified: (:) [Start-CacheHost], DataCacheException
    + FullyQualifiedErrorId : ERRCAdmin003,Microsoft.ApplicationServer.Caching.Commands.StartCacheHostCommand

PS C:\Windows\system32> get-cachehost

HostName : CachePort      Service Name            Service Status Version Info
--------------------      ------------            -------------- ------------
APPFABRICSV1:22233        AppFabricCachingService UP             3 [3,3][1,3]
APPFABRICSV2:22233        AppFabricCachingService UP             3 [3,3][1,3]
APPFABRICSV3:22233        AppFabricCachingService UNKNOWN        3 [3,3][1,3]

PS C:\Windows\system32> get-cachehostconfig APPFABRICSV3 22233

HostName        : APPFABRICSV3
ClusterPort     : 22234
CachePort       : 22233
ArbitrationPort : 22235
ReplicationPort : 22236
Size            : 4095 MB
ServiceName     : AppFabricCachingService
HighWatermark   : 99%
LowWatermark    : 90%
IsLeadHost      : True

PowerShell log
Host APPFABRICSV1 is Reachable.,DistributedCache.CacheAdmin,Verbose,2012-6-7 11:13:44.439
Command Start-CacheHost Parameters: APPFABRICSV3, 22233, -100: Time=06/07/2012 15:15:07,DistributedCache.AdminPS,Verbose,2012-6-7 11:15:07.456
Host APPFABRICSV3 is Reachable.,DistributedCache.CacheAdmin,Verbose,2012-6-7 11:15:07.487
Host APPFABRICSV3 is Reachable.,DistributedCache.CacheAdmin,Verbose,2012-6-7 11:16:07.675
EventLog Applications
Faulting application name: DistributedCacheService.exe, version: 1.0.4632.0, time stamp: 0x4eafeccf
Faulting module name: KERNELBASE.dll, version: 6.1.7601.17651, time stamp: 0x4e21213c
Exception code: 0xe0434352
Fault offset: 0x000000000000cacd
Faulting process id: 0xfec
Faulting application start time: 0x01cd44c1f9696d61
Faulting application path: C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: 59912a24-b0b6-11e1-b78c-0050569400cb
EventLog Applications
Application: DistributedCacheService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.ApplicationServer.Caching.DataCacheException
Stack:
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.StartServiceCallback(System.Object)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
EventLog System
The AppFabric Caching Service service terminated unexpectedly. It has done this 6 time(s).
EventLog Microsoft-Windows-Application
AppFabric Caching service crashed with exception {Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<UnspecifiedErrorCode>:SubStatus<ES0001>:ErrorCode<ERRService0001>:SubStatus<ES0001>:Service initialization failed. No user action required. ---> Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRService0001>:SubStatus<ES0001>:Service initialization failed. No user action required. ---> Microsoft.Fabric.Common.OperationCompletedException: Operation completed with an exception ---> System.TimeoutException: The operation has timed out.
   --- End of inner exception stack trace ---
   at Microsoft.Fabric.Common.OperationContext.End()
   at Microsoft.Fabric.Common.SharedCommunicationObject.EndOpen(IAsyncResult result)
   at Microsoft.Fabric.Federation.FederationSite.EndOpen(IAsyncResult result)
   at Microsoft.Fabric.Data.ReliableServiceManager.EndOpen(IAsyncResult ar)
   at Microsoft.ApplicationServer.Caching.DOMNode..ctor(Int32 id, String displayFriendlyNodeId, Int32 port, EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, ReliableServiceProvider dataStore, ServiceResolverBase& client)
   --- End of inner exception stack trace ---
   at Microsoft.ApplicationServer.Caching.DOMNode..ctor(Int32 id, String displayFriendlyNodeId, Int32 port, EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, ReliableServiceProvider dataStore, ServiceResolverBase& client)
   at Microsoft.ApplicationServer.Caching.DistributedObjectManager..ctor(EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, WcfServerChannel channel)
   at Microsoft.ApplicationServer.Caching.DistributedObjectManager.GetInstance(EndpointID[] urisDOM, ServiceConfigurationManager configurationManager, WcfServerChannel channel)
   at Microsoft.ApplicationServer.Caching.ServiceLayer.ServiceStart(Boolean deleteTkt)
   at Microsoft.ApplicationServer.Caching.DataCacheServiceBase.ServiceStart(ServiceConfigurationManager scm, Boolean deleteTkt)
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.StartService(Boolean deleteTKT)
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.OnStart(String[] args)
   --- End of inner exception stack trace ---
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.ThrowCallback(Object exception)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()}. Check debug log for more information
-
Tuesday, June 12, 2012 5:22 PM
We fixed the issue by restoring a backup of our faulty server and by restarting the cluster.
- Edited by Francis Grignon Tuesday, June 12, 2012 5:23 PM

