Answered SQL server clustering

  • Wednesday, January 30, 2013 1:51 AM
     
     

    We have an active/passive cluster running SQL Server 2000 Enterprise (SP4) on Windows Server 2003 Enterprise. On one node, the SQL Server resource fails to come online. When I run "sqlservr.exe -c" from a command prompt on that node, SQL Server starts successfully. The Windows event log shows:

    OnlineThread: ResUtilsStartResourceService failed (status 5b4). Status 5b4 means the operation returned because the timeout period expired.

    What else should I check to troubleshoot this issue? Please advise.
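
    The status in the event message is hexadecimal. As a minimal sketch, the snippet below converts it to the decimal Win32 error code, which can then be looked up with "net helpmsg" on Windows. The lookup table and the function name describe_status are illustrative and hold only the one entry needed here.

```python
# Status value copied from the event log message in the question.
STATUS_HEX = "5b4"

# Illustrative lookup table with a single entry; not a full list of Win32 errors.
KNOWN_WIN32_ERRORS = {
    1460: "ERROR_TIMEOUT: This operation returned because the timeout period expired.",
}


def describe_status(status_hex: str) -> str:
    """Convert a hex status string to decimal and attach a description if known."""
    code = int(status_hex, 16)
    text = KNOWN_WIN32_ERRORS.get(code, "unknown (try 'net helpmsg <code>' on Windows)")
    return f"0x{code:X} = {code}: {text}"


print(describe_status(STATUS_HEX))
```

    The timeout code only says that the service did not report as started within the time the resource DLL allowed. It does not say why, so the logs are still needed.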


All Replies

  • Wednesday, January 30, 2013 2:37 AM
    Answerer
     
     Answered

    Hello Fred,

    Check the SQL Server error log. If the SQL Server resource is failing to come online, the engine's error log should tell you why. I would also check the Application, Security, and System logs in Event Viewer (on Windows 2003) for any security or system errors that could be correlated with the failure.

    It might also help to check the cluster log, which may show whether a resource dependency is causing the failure.
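
    When the logs are long, a quick first-pass filter can narrow things down. The following is only a rough sketch: the keyword pattern and the helper name find_suspect_lines are assumptions, and the sample lines are made up rather than taken from a real SQL Server or cluster log.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical keyword list; tune it to what your logs actually contain.
PATTERN = re.compile(r"\b(error|fail(ed|ure)?|timeout|denied)\b", re.IGNORECASE)


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that matches PATTERN."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


# Made-up sample lines for illustration only, not real log output.
sample = [
    "server starting up",
    "could not open device: access denied",
    "resource online",
    "operation failed with status 5b4",
]

hits = find_suspect_lines(sample)
for number, text in hits:
    print(number, text)
```

    To use it on a real log, pass an open file object instead of the sample list, opened with whatever encoding that log file uses.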

    -Sean


    Sean Gallardy | Blog | Twitter

  • Wednesday, January 30, 2013 11:51 AM
     
     Answered

    Presumably SQL Server is failing to come online on the second node after a failover. In case you are not aware, the FCI will only run on one node at any one time.

    Check all the logs Sean mentions. The most useful is probably the cluster log, which should at least tell you which resource is failing to come online.

    If downtime is not an issue, you can also take your running instance offline on the good node, fail over to the other node, and bring the storage online first. Then bring the other FCI resources online one at a time in a logical order, so that each dependency comes online before the resource that depends on it (e.g. the SQL Server service before SQL Agent). This will also show you where the failure is occurring in your FCI.

    For instance, a common cause of problems (certainly in older versions of Windows and SQL Server) was storage failing to come online on other nodes. The technique above gives you a more granular online operation than bringing the entire FCI online at once.
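
    The "dependencies first" ordering described above is a topological sort. A minimal sketch, assuming a hypothetical dependency map (the resource names and their dependencies are examples only, so check the real ones in Cluster Administrator), could look like this:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical map: resource -> set of resources it depends on.
# The names and dependencies are assumptions for illustration.
dependencies = {
    "Disk": set(),
    "IP Address": set(),
    "Network Name": {"IP Address"},
    "SQL Server": {"Disk", "Network Name"},
    "SQL Agent": {"SQL Server"},
}

# static_order() yields each resource only after everything it depends on.
online_order = list(TopologicalSorter(dependencies).static_order())

for step, resource in enumerate(online_order, start=1):
    print(f"{step}. bring online: {resource}")
```

    Bringing resources online one at a time in this order means the first one that fails is the one to investigate.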


    Regards,
    Mark Broadbent.

    Contact me through (twitter|blog|SQLCloud)


  • Thursday, January 31, 2013 9:05 AM
    Moderator
     
     

    Hi Fred,

    Please refer to the following article to troubleshoot this issue:

    http://www.sqldbadiaries.com/2011/01/25/sql-server-cluster-resource-doesnt-come-online-service-control-stop-before-startup/

    Regards,

    Fanny Liu
    TechNet Community Support

  • Sunday, February 03, 2013 11:23 PM
     
     

    Hi there,

    Make sure all shared disks are connected.

    If you run the cluster validation test, it may show whether anything is wrong with the cluster.

    From your message, I guess that a disk is not available and SQL Server may be trying to access it. I think I have seen this message in that situation, though it may not be your case.

    Make sure your Windows cluster is configured correctly.

    Thanks

    kumar