none
Warning: Fatal error 823 occurred

    Question

  • Hopefully I chose the correct forum....

    On a SQL 2008 64 bit (10.0.5500.0) I got this error messages three time with interval of two minutes:

    Warning: Fatal error 823 occurred at May 11 2013  2:38PM. Note the error and time, and contact your system administrator.
    Error: 21, Severity: 24, State: 1.

    The error messages were during a server restart. The server was restarted after there was a problem accessing the drives. The drives are in a SAN. While adding additional disk space, the drives became unavailable to the server. Once the drives became availalbe/visible to the server then it was when the server was restarted.

    I am assuming "Error: 21, Severity: 24, State: 1." is part of the same issue even though is in a different line but created exactly a the same time.

    After that initial start up of the server, this error message is no longer coming up.

    Now, I found this text for "MSSQLSERVER_823"

    "This error is often the result of a hardware error, but may be caused by the device driver."

    in this link:

    http://msdn.microsoft.com/en-us/library/aa337267(v=sql.100).aspx

    So, my questions are:

    What truly caused this error?

    How do I know if is no longer an issue?

    Thanks before hand for your help and guidance.


    Paulino

    Sunday, May 12, 2013 3:50 PM

Answers

  • >What truly caused this error?

    Probably the same storage problem that caused the drives to become unavailable to the server before.

    >How do I know if is no longer an issue?

    This tool: sqliosim is useful to test out your storage to ensure it is working correctly for SQL Server.  Also ensure that you have the latest drivers and firmware for your HBAs.

    David

     



    David http://blogs.msdn.com/b/dbrowne/

    • Marked as answer by Paulino PP Monday, May 13, 2013 1:08 PM
    Sunday, May 12, 2013 4:39 PM

All replies

  • >What truly caused this error?

    Probably the same storage problem that caused the drives to become unavailable to the server before.

    >How do I know if is no longer an issue?

    This tool: sqliosim is useful to test out your storage to ensure it is working correctly for SQL Server.  Also ensure that you have the latest drivers and firmware for your HBAs.

    David

     



    David http://blogs.msdn.com/b/dbrowne/

    • Marked as answer by Paulino PP Monday, May 13, 2013 1:08 PM
    Sunday, May 12, 2013 4:39 PM
  • In additions to David's post, I would recommend that you run DBCC CHECKDB on all databases, to make sure that they are healthy.

    Errors like this one are not to be taken lightly. If they come out of the blue, there is all reason to move new hardware quickly. In this case, it seems that the error occurred with a special operation on the SAN, why there is less reason to panic. Nevertheless, monitor the situation and if the error occurs again, you are in trouble.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, May 12, 2013 5:27 PM
  • Run following command for all your databases

    DBCC CHECKDB('DatabaseName') WITH NO_INFOMSGS, ALL_ERRORMSGS

    If you see something like 

    CHECKDB found 0 allocation errors and 8 consistency errors in database. 

    numbers may vary then follow below procedure for database which gave above error

    --Put the database into single user mode
    ALTER DATABASE [DatabaseName] SET SINGLE_USER WITH rollback immediate
    --Check the erors and fix any issues found (that you can)
    DBCC CHECKDB ('DatabaseName', REPAIR_REBUILD)
    --Put the database back into multiuser mode
    ALTER DATABASE [DatabaseName] SET MULTI_USER WITH NO_WAIT

    If you do not receive any error message...its good you can ask you Storage team or whoever responsible to check the SAN.May be some Firmware

    upgrade is required.


    Soldier..Sir we are surrounded from all sides by enemy.. Major: Good, we can attack in any direction Thats attitude..

    Sunday, May 12, 2013 5:49 PM
    Moderator
  • As for what to do if DBCC reports errors, Books Online has this note:

    Use the REPAIR options only as a last resort. To repair errors, we recommend restoring from a backup. Repair operations do not consider any of the constraints that may exist on or between tables. If the specified table is involved in one or more constraints, we recommend running DBCC CHECKCONSTRAINTS after a repair operation. If you must use REPAIR, run DBCC CHECKDB without a repair option to find the repair level to use. If you use the REPAIR_ALLOW_DATA_LOSS level, we recommend that you back up the database before you run DBCC CHECKDB with this option.

    I highly recommend this strategy. Supposedly you have fairly good understanding on when this error occurred, why it should be possible to restore to a point in time just before the troubles started. Provided that you have the proper backups, of course.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, May 12, 2013 6:24 PM
  • David, thanks for your feedback. I helps me belief it was the issue with the SAN.

    I've checked the logs since then and I see no more errors like this. So, I will go with that.

    I am not technical at all, so after reading on sqliosim, that is way beyond my technical skills. I will just monitor the logs and if no longer see the error message for a while, the I will just conclude it was the initial SAN issue.

    Thanks so much for such quick and helpful response.


    Paulino

    Monday, May 13, 2013 1:14 PM
  • Erland,

    The scheduled CHECKDB routine I have for the server ran just fine after a few hours of the incident, so, databases are healthy. Thanks for the advice!


    Paulino

    Monday, May 13, 2013 1:15 PM
  • Ok.  Make sure you are taking frequent backups of your databases.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Monday, May 13, 2013 1:41 PM
  • Thanks for the advice

    Paulino

    Monday, May 13, 2013 2:33 PM
  • You should also run DBCC CHECKCONSTRAINTS to be really sure.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Monday, May 13, 2013 3:13 PM