SP site becomes unresponsive unless Application Pool is recycled

  • Question

  • Every day, a random SP site becomes inaccessible to users connected through one WFE. Sometimes it is a whole site collection, sometimes a specific subsite. The common symptom is high CPU usage by the w3wp.exe process (AppPool) on the affected WFE. After a couple of minutes of waiting, users get the error message "Something went wrong .....". Looking up the correlation ID, it seems the request is waiting, possibly on the DB; but when I check the DB and App servers, there is no spike in system resources and no error is logged.

    06/27/2014 12:49:54.26	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	General	ajji6	High	Unable to write SPDistributedCache call usage entry.	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.47	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Database	ahjqp	High	[Forced due to logging gap, cached @ 06/27/2014 12:49:54.27, Original Level: Verbose] SQL connection time: 0.0927492181268848 for Data Source=DBServer;Initial Catalog=DBName;Integrated Security=True;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Asynchronous Processing=True;Connect Timeout=15;Application Name=SharePoint[w3wp][2][DBName]	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.47	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Files	ak8dj	High	UserAgent not available, file operations may not be optimized.    at Microsoft.SharePoint.SPFileStreamManager.CreateCobaltStreamContainer(SPFileStreamStore spfs, ILockBytes ilb, Boolean copyOnFirstWrite, Boolean disposeIlb)     at Microsoft.SharePoint.SPFileStreamManager.SetInputLockBytes(SPFileInfo& fileInfo, SqlSession session, PrefetchResult prefetchResult)     at Microsoft.SharePoint.CoordinatedStreamBuffer.SPCoordinatedStreamBufferFactory.CreateFromDocumentRowset(Guid databaseId, SqlSession session, SPFileStreamManager spfstm, Object[] metadataRow, SPRowset contentRowset, SPDocumentBindRequest& dbreq, SPDocumentBindResults& dbres)     at Microsoft.SharePoint.SPSqlClient.GetDocumentContentRow(Int32 rowOrd, Object ospFileStmMgr, SPDocumentBindRequest& dbreq, SPDocumentBindResults& dbres)     at Microsoft.SharePoint.Library.SPRequestInternalClass.GetFileAndMetaInfo(String bstrUrl, Byte bPageView, Byte bPageMode, Byte bGetBuildDependencySet, String bstrCurrentFolderUrl, Int32 iRequestVersion, Byte bMainFileRequest, Boolean& pbCanCustomizePages, Boolean& pbCanPersonalizeWebParts, Boolean& pbCanAddDeleteWebParts, Boolean& pbGhostedDocument, Boolean& pbDefaultToPersonal, Boolean& pbIsWebWelcomePage, String& pbstrSiteRoot, Guid& pgSiteId, UInt32& pdwVersion, String& pbstrTimeLastModified, String& pbstrContent, UInt32& pdwPartCount, Object& pvarMetaData, Object& pvarMultipleMeetingDoclibRootFolders, String& pbstrRedirectUrl, Boolean& pbObjectIsList, Guid& pgListId, UInt32& pdwItemId, Int64& pllListFlags, Boolean& pbAccessDenied, Guid& pgDocid, Byte& piLevel, UInt64& ppermMask, Object& pvarBuildDependencySet, UInt32& pdwNumBuildDependencies, Object& pvarBuildDependencies, String& pbstrFolderUrl, String& pbstrContentTypeOrder, Guid& pgDocScopeId)     at Microsoft.SharePoint.Library.SPRequestInternalClass.GetFileAndMetaInfo(String bstrUrl, Byte bPageView, Byte bPageMode, Byte 
bGetBuildDependencySet, String bstrCurrentFolderUrl, Int32 iRequestVersion, Byte bMainFileRequest, Boolean& pbCanCustomizePages, Boolean& pbCanPersonalizeWebParts, Boolean& pbCanAddDeleteWebParts, Boolean& pbGhostedDocument, Boolean& pbDefaultToPersonal, Boolean& pbIsWebWelcomePage, String& pbstrSiteRoot, Guid& pgSiteId, UInt32& pdwVersion, String& pbstrTimeLastModified, String& pbstrContent, UInt32& pdwPartCount, Object& pvarMetaData, Object& pvarMultipleMeetingDoclibRootFolders, String& pbstrRedirectUrl, Boolean& pbObjectIsList, Guid& pgListId, UInt32& pdwItemId, Int64& pllListFlags, Boolean& pbAccessDenied, Guid& pgDocid, Byte& piLevel, UInt64& ppermMask, Object& pvarBuildDependencySet, UInt32& pdwNumBuildDependencies, Object& pvarBuildDependencies, String& pbstrFolderUrl, String& pbstrContentTypeOrder, Guid& pgDocScopeId)     at Microsoft.SharePoint.Library.SPRequest.GetFileAndMetaInfo(String bstrUrl, Byte bPageView, Byte bPageMode, Byte bGetBuildDependencySet, String bstrCurrentFolderUrl, Int32 iRequestVersion, Byte bMainFileRequest, Boolean& pbCanCustomizePages, Boolean& pbCanPersonalizeWebParts, Boolean& pbCanAddDeleteWebParts, Boolean& pbGhostedDocument, Boolean& pbDefaultToPersonal, Boolean& pbIsWebWelcomePage, String& pbstrSiteRoot, Guid& pgSiteId, UInt32& pdwVersion, String& pbstrTimeLastModified, String& pbstrContent, UInt32& pdwPartCount, Object& pvarMetaData, Object& pvarMultipleMeetingDoclibRootFolders, String& pbstrRedirectUrl, Boolean& pbObjectIsList, Guid& pgListId, UInt32& pdwItemId, Int64& pllListFlags, Boolean& pbAccessDenied, Guid& pgDocid, Byte& piLevel, UInt64& ppermMask, Object& pvarBuildDependencySet, UInt32& pdwNumBuildDependencies, Object& pvarBuildDependencies, String& pbstrFolderUrl, String& pbstrContentTypeOrder, Guid& pgDocScopeId)     at Microsoft.SharePoint.SPWeb.GetWebPartPageContent(Uri pageUrl, Int32 pageVersion, PageView requestedView, HttpContext context, Boolean forRender, Boolean includeHidden, Boolean mainFileRequest, 
Boolean fetchDependencyInformation, Boolean& ghostedPage, String& siteRoot, Guid& siteId, Int64& bytes, Guid& docId, UInt32& docVersion, String& timeLastModified, Byte& level, Object& buildDependencySetData, UInt32& dependencyCount, Object& buildDependencies, SPWebPartCollectionInitialState& initialState, Object& oMultipleMeetingDoclibRootFolders, String& redirectUrl, Boolean& ObjectIsList, Guid& listId)     at Microsoft.SharePoint.ApplicationRuntime.SPRequestModuleData.FetchWebPartPageInformationForInit(HttpContext context, SPWeb spweb, Boolean mainFileRequest, String path, Boolean impersonate, Boolean& isAppWeb, Boolean& fGhostedPage, Guid& docId, UInt32& docVersion, String& timeLastModified, SPFileLevel& spLevel, String& masterPageUrl, String& customMasterPageUrl, String& webUrl, String& siteUrl, Guid& siteId, Object& buildDependencySetData, SPWebPartCollectionInitialState& initialState, String& siteRoot, String& redirectUrl, Object& oMultipleMeetingDoclibRootFolders, Boolean& objectIsList, Guid& listId, Int64& bytes)     at Microsoft.SharePoint.ApplicationRuntime.SPRequestModuleData.GetFileForRequest(HttpContext context, SPWeb web, Boolean exclusion, String virtualPath)     at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.InitContextWeb(HttpContext context, SPWeb web)     at Microsoft.SharePoint.WebControls.SPControl.SPWebEnsureSPControl(HttpContext context)     at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.GetContextWeb(HttpContext context)     at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.PostResolveRequestCacheHandler(Object oSender, EventArgs ea)     at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()     at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)     at System.Web.HttpApplication.PipelineStepManager.ResumeSteps(Exception error)     at System.Web.HttpApplication.BeginProcessRequestNotification(HttpContext context, 
AsyncCallback cb)     at System.Web.HttpRuntime.ProcessRequestNotificationPrivate(IIS7WorkerRequest wr, HttpContext context)     at System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr rootedObjectsPointer, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)     at System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr rootedObjectsPointer, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)     at System.Web.Hosting.UnsafeIISMethods.MgdIndicateCompletion(IntPtr pHandler, RequestNotificationStatus& notificationStatus)     at System.Web.Hosting.UnsafeIISMethods.MgdIndicateCompletion(IntPtr pHandler, RequestNotificationStatus& notificationStatus)     at System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr rootedObjectsPointer, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)     at System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr rootedObjectsPointer, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.47	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Monitoring	b4ly	High	Leaving Monitored Scope (GetFileAndMetaInfo). Execution Time=214.831982835807	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.47	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Monitoring	b4ly	High	Leaving Monitored Scope (GetWebPartPageContent). Execution Time=215.150319384168	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.47	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Monitoring	b4ly	High	Leaving Monitored Scope (PostResolveRequestCacheHandler). Execution Time=218.581761089747	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.51	w3wp.exe (0x167DC)	0x35758	Web Content Management	Publishing	7fz3	Medium	Setting [Display] as the FormContext.FormMode for the current page	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.52	w3wp.exe (0x167DC)	0x35758	Web Content Management	Publishing	7fz3	Medium	Setting [Display] as the FormContext.FormMode for the current page	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.52	w3wp.exe (0x167DC)	0x35758	Web Content Management	Publishing	7fz3	Medium	Setting [Display] as the FormContext.FormMode for the current page	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.58	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Upgrade	aiaih	High	[Forced due to logging gap, cached @ 06/27/2014 12:49:54.57, Original Level: Verbose] desiredVersion: {0}	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.58	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Database	8acb	High	[Forced due to logging gap, Original Level: VerboseEx] Reverting to process identity	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.71	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	Upgrade	aiaih	High	[Forced due to logging gap, cached @ 06/27/2014 12:49:54.63, Original Level: Verbose] desiredVersion: {0}	89069f9c-4105-d058-3f0e-a278a9dbf46c
    06/27/2014 12:49:54.71	w3wp.exe (0x167DC)	0x35758	SharePoint Foundation	General	g3ql	High	[Forced due to logging gap, Original Level: Verbose] GetUriScheme(siteurl that was requested)	89069f9c-4105-d058-3f0e-a278a9dbf46c
    


    We are able to fix it temporarily by recycling the AppPool on the affected WFE, but we have been unable to find the root cause or a solution. The best bet looks like opening a support case, but I wanted to give this a shot first, or at least try to identify the source of the issue.

    Env: PROD, SP Server 2013  Topology: 2 WFE - 2 App - 2 DB (SQL - AOAG) 
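
When the entries for a correlation ID are sparse, it can help to pull the "Leaving Monitored Scope" timings out of the ULS file and sort them. The following is a minimal Python sketch, assuming the log is tab-separated with the nine columns visible in the excerpt above and that Execution Time is reported in milliseconds. The function names and the 100 ms cutoff are illustrative choices, not part of SharePoint.

```python
import re

# Column order as seen in the excerpt above (tab-separated):
# timestamp, process, thread, area, category, event ID, level, message, correlation ID.
FIELDS = ("timestamp", "process", "thread", "area", "category",
          "event_id", "level", "message", "correlation")

SCOPE_RE = re.compile(
    r"Leaving Monitored Scope \((?P<scope>.+)\)\. Execution Time=(?P<ms>[\d.]+)"
)


def parse_uls_line(line):
    """Split one tab-separated ULS line into a dict, or return None."""
    parts = [p.strip() for p in line.rstrip("\r\n").split("\t")]
    if len(parts) != len(FIELDS):
        return None  # e.g. a wrapped stack-trace fragment
    return dict(zip(FIELDS, parts))


def slow_scopes(lines, correlation_id, min_ms=100.0):
    """Return (scope, milliseconds) pairs for one correlation ID, slowest first."""
    found = []
    for line in lines:
        entry = parse_uls_line(line)
        if entry is None or entry["correlation"].lower() != correlation_id.lower():
            continue
        match = SCOPE_RE.search(entry["message"])
        if match and float(match.group("ms")) >= min_ms:
            found.append((match.group("scope"), float(match.group("ms"))))
    return sorted(found, key=lambda pair: pair[1], reverse=True)


# Usage: the three monitored-scope entries from the excerpt above.
CORRELATION = "89069f9c-4105-d058-3f0e-a278a9dbf46c"
PREFIX = ["06/27/2014 12:49:54.47", "w3wp.exe (0x167DC)", "0x35758",
          "SharePoint Foundation", "Monitoring", "b4ly", "High"]
MESSAGES = [
    "Leaving Monitored Scope (GetFileAndMetaInfo). Execution Time=214.831982835807",
    "Leaving Monitored Scope (GetWebPartPageContent). Execution Time=215.150319384168",
    "Leaving Monitored Scope (PostResolveRequestCacheHandler). Execution Time=218.581761089747",
]
sample = ["\t".join(PREFIX + [message, CORRELATION]) for message in MESSAGES]

for scope, ms in slow_scopes(sample, CORRELATION):
    print(f"{scope}: {ms:.1f} ms")
```

On the excerpt above, every captured scope finishes in roughly 0.2 seconds, which by itself would not account for a wait of a couple of minutes. Under these assumptions, that suggests the stall happens somewhere these entries do not cover.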



    MK Sin

    Wednesday, July 2, 2014 9:32 PM

Answers

All replies

  • ULS log details are incomplete.

    http://technet.microsoft.com/en-us/library/hh407293%28v=office.15%29.aspx

    Application pools recycle when memory limits are exceeded (SharePoint 2013)

    Was any new WSP deployed in the farm?

    Was any new customization done in the farm?

    Do we see heavy CPU/memory utilization in the farm?


    If this helped you resolve your issue, please mark it Answered

    • Proposed as answer by Sundar Narasiman Saturday, July 12, 2014 4:07 AM
    • Unproposed as answer by Mo Key Wednesday, July 16, 2014 7:22 PM
    Thursday, July 3, 2014 6:59 AM
    Moderator
  • Looks like a resource-hungry process starts...

    How many Site collections in the web app and also in the farm?

    In your site collection, do you have any customization? Do you have large lists or libraries? Are any workflows running on the sites?

    Did you change the List View Threshold from the default (5000) to a higher value?

    I experienced the same problem a couple of months ago, when a user tried to browse a list with more than 100K items through a view that pulled all records. Interestingly, the List View Threshold had been set to 100K, against all standard practice. The app got stuck, and we also noticed high CPU/memory usage on the DB server.

    To fix the issue we had to kill the app pool.

    I would recommend analyzing your site collections: check the number of list items, running workflows, the workflow history lists, and any large operations.
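
The list-size check can be done offline once an inventory of lists has been exported. The sketch below is in Python and assumes a CSV export with the hypothetical columns `SiteUrl`, `ListTitle` and `ItemCount`. How that export is produced is outside the sketch, and the sample rows are made up for illustration.

```python
import csv
import io

DEFAULT_LIST_VIEW_THRESHOLD = 5000  # default value mentioned in the reply above


def lists_over_threshold(inventory_csv, threshold=DEFAULT_LIST_VIEW_THRESHOLD):
    """Return (site_url, list_title, item_count) for lists above threshold, largest first."""
    reader = csv.DictReader(io.StringIO(inventory_csv))
    flagged = [
        (row["SiteUrl"], row["ListTitle"], int(row["ItemCount"]))
        for row in reader
        if int(row["ItemCount"]) > threshold
    ]
    return sorted(flagged, key=lambda item: item[2], reverse=True)


# Usage with made-up inventory rows (hypothetical sites and lists).
INVENTORY = """SiteUrl,ListTitle,ItemCount
/sites/hr,Documents,420
/sites/finance,Workflow History,18250
/sites/ops,Tickets,100000
"""

for site_url, list_title, item_count in lists_over_threshold(INVENTORY):
    print(f"{site_url} / {list_title}: {item_count} items")
```

Workflow history lists are included in the sample on purpose, since they tend to grow unnoticed.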


    Please remember to mark your question as answered &Vote helpful,if this solves/helps your problem. ****************************************************************************************** Thanks -WS MCITP(SharePoint 2010, 2013) Blog: http://wscheema.com/blog

    Thursday, July 3, 2014 7:10 AM
    Moderator
  • @Inderjeet, in the ULS log, those were the only entries I found when I searched by correlation ID. We haven't introduced any changes, either deployments or configuration changes. Memory is constant; w3wp (AppPool) takes up about 2 GB. System memory is normally 75-80% consumed. During the issue, CPU consumption climbs to 40-50% and stays around that number until the AppPool is recycled.

    @Waqas, we have 34 site collections, each with its own content database. We do not have large lists; maybe a few with 5000 list items. But I can rule this out, as we have seen the issue occur on site collections and sites with quite small lists and libraries. The same applies to workflows: there are a few Designer and OOTB workflows, but they run on very small lists. During the issue, only the CPU for the AppPool spikes, to around 40-50%, so it's not as if resources are completely exhausted. As a matter of fact, requests going through the same server to other sites and site collections are processed without issue. Even the affected sites and site collections are served fine from the other WFE.

    Any idea on capturing what this process is doing during the issue?

    Thanks,

    MK


    MK Sin

    Thursday, July 3, 2014 3:27 PM
  • Are the other site collections in the same web app or a different one?

    I can tell you one thing: we had a bad workflow, only a few lines long, that brought down all workflows at the farm level. You have to check all the workflows and make sure nothing is stuck...

    I think you need PSSdiag to identify the issue, because there is no logging while something is stuck. MSFT support always uses this tool to identify such issues.

    I would say, move the site collections into their own web app with a new app pool, then test the behaviour. Please perform this in the test farm.



    Thursday, July 3, 2014 3:59 PM
    Moderator
  • All the site collections belong to the same web application. As of now there is no pattern pointing to any specific site collection. We do have a replica environment (close to the same configuration), but the issue is strictly scoped to the PROD environment.

    I will look into PSSdiag tool. 

    Thank you.


    MK Sin

    Thursday, July 3, 2014 5:26 PM
  • Hi,

    Can you try this to capture the issue?

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;919792 


    Thanks and Regards, Parth

    Friday, July 11, 2014 9:04 PM
  • The best way to diagnose this is to capture the relevant Performance Monitor (PerfMon) counters for SharePoint. The PerfMon counters for SharePoint are listed here; capturing them should give a clue as to which hardware resource is running short.

    http://technet.microsoft.com/en-us/library/ff758658(v=office.15).aspx
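
Once a counter log has been captured and saved as CSV, the periods of sustained high CPU can be picked out with a short script. This is a minimal Python sketch. It assumes the usual PerfMon CSV layout (timestamp in the first column, one counter path per remaining column) and a counter such as `Process(w3wp)\% Processor Time`. The server name `WFE1`, the sample values, and the 40% / 3-sample settings are made up for illustration.

```python
import csv
import io


def find_column(header, needle):
    """Index of the first counter column whose path contains `needle`."""
    for index, name in enumerate(header):
        if index > 0 and needle.lower() in name.lower():
            return index
    raise KeyError(needle)


def sustained_high(rows, column, threshold, min_samples):
    """Yield (start_time, end_time, sample_count) for each run of at least
    min_samples consecutive samples whose value in `column` is >= threshold."""
    run = []
    for row in rows:
        try:
            value = float(row[column])
        except (ValueError, IndexError):
            value = None  # a blank or missing sample ends the current run
        if value is not None and value >= threshold:
            run.append(row[0])
        else:
            if len(run) >= min_samples:
                yield run[0], run[-1], len(run)
            run = []
    if len(run) >= min_samples:
        yield run[0], run[-1], len(run)


# Usage with made-up samples (hypothetical server name and values).
SAMPLE = r'''"(PDH-CSV 4.0) (Eastern Daylight Time)(240)","\\WFE1\Process(w3wp)\% Processor Time"
"06/27/2014 12:49:00.000"," "
"06/27/2014 12:49:15.000","12.5"
"06/27/2014 12:49:30.000","46.0"
"06/27/2014 12:49:45.000","48.2"
"06/27/2014 12:50:00.000","44.9"
"06/27/2014 12:50:15.000","9.1"
'''

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)
cpu_column = find_column(header, r"Process(w3wp)\% Processor Time")
runs = list(sustained_high(reader, cpu_column, threshold=40.0, min_samples=3))
for start, end, count in runs:
    print(f"High CPU from {start} to {end} ({count} samples)")
```

The start and end times of each run can then be matched against the ULS log for the same window on the affected WFE.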


    Please mark the replies as answers if they help or unmark if not.

    Saturday, July 12, 2014 4:11 AM
  • @Parth & @SundarNarasiman, thank you for the suggestions. Just waiting for the issue to occur again. I will post back with my findings.

    MK Sin

    Wednesday, July 16, 2014 7:31 PM