LUNs in a clustered environment get moved or do not get mounted at all


  • We have 4 SQL Servers running in a clustered environment, with SAN storage as the shared disk.
    By default, SQL Server 1 has the 3 LUNs mounted, i.e. E: (SQL data), F: (DTC), and Q: (Quorum). At times we find the LUNs
    scattered across 3 servers, e.g. E: on SQL Server 2, F: on SQL Server 3, and Q: on SQL Server 4. When we reboot servers 2, 3, and 4, all the
    LUNs get mounted on server 1 and the instance comes up. Once we found that the LUNs were not mounted on any of
    the 4 servers, so we shut down all the servers and restarted them, and all the LUNs got mounted on server 1. I am new to the
    environment, and the people here say that this happens periodically, about once every 4 to 5 months. What could
    be the reason, and how can we prevent this from happening in the future?

    Saturday, October 19, 2013 5:52 PM

All replies

  • Have you checked the Event Viewer logs for the times when these scenarios occurred? Please share the logs. Also, contact your Windows and storage teams to scan the disks and check the status from their side. There must be some issue on the storage side.

    Monday, October 21, 2013 3:15 AM
  • We noticed that the server was down at 7:30 AM. The server is running Windows Server 2008 R2. That morning some Windows Updates had been
    downloaded and the Start button was prompting for a shutdown. The events shown as errors or warnings in the Windows log prior to the failure
    are listed below:

    Service: MSDTC$02a26710-b9d3-4110-8b26-42e1a0f2d57f is still running. Attempt to cleanup the service has failed:3.05 am
    MSsqlServerOlapsService stopped
    An error occurred while writing a trace event to the file, \\?\E:\OLAP\Log\FlightRecorderCurrent.trc.
    [sqsrvres] SvcTerminate: Service is stopped.
    [sqsrvres] OnlineThread: asked to terminate while waiting for QP.
    SQLVDI: Loc=SignalAbort. Desc=Client initiates abort. ErrorCode=(0). Process=6568. Thread=5464. Client. Instance=. VD=Global\VNBU0-6568-5464-1379603264_SQLVDIMemoryName_0.
    Application or service 'McAfee Rogue System Sensor' could not be restarted.
    Application or service 'McAfee Rogue System Sensor' could not be shut down.
    Application or service 'McAfee Rogue System Sensor' could not be restarted.
    Windows detected your registry file is still in use by other applications or services. The file will be unloaded now. The applications or services that hold your registry file may not function properly afterwards. 
    Would be blocked by port blocking rule (rule is in warn-only mode) (Anti-virus Standard Protection:Prevent mass mailing worms from sending mail).
    initerrlog: Could not open error log file 'E:\MSSQL10_50.MSSQLSERVER\MSSQL\Log\ERRORLOG'. Operating system error = 3(The system cannot find the path specified.).
    Faulting application name: bpcd.exe, version: 7.500.12.207, time stamp: 0x4f321f01
    Faulting module name: libnbbase.dll, version: 7.500.12.207, time stamp: 0x4f32188b
    Exception code: 0xc0000005
    Fault offset: 0x00000000000185ac
    Faulting process id: 0xc5c
    Faulting application start time: 0x01cec57cd7054e6a
    Faulting application path: C:\Program Files\Veritas\NetBackup\bin\bpcd.exe
    Faulting module path: C:\Program Files\Veritas\NetBackup\bin\libnbbase.dll
    Report Id: 1e8aaa8c-3170-11e3-8eea-14feb511e9be
    [sqsrvres] CheckServiceAlive: Service is dead 

    Monday, October 21, 2013 9:03 AM