locked
Database Mirroring & DR Exercise RRS feed

  • Question

  • SQL 2008

    Production Server - principal database

    DR Server - mirror database and witness

    The issue I am having is that when we cut the link to our production network for the DR exercise the principal database shows as disconnected/in recovery and the frontend production application does not work 100%.  The DR frontend works connected to the mirror database but users on the production network can not use the production frontend because of the principal database being in recovery mode.

    My question is what steps should I be performing to insure the production database and mirror database are both available when the link is cut to our production network.

    Thanks!

    Thursday, June 14, 2012 8:18 PM

Answers

  • Hi!

    Basic thing about mirroring: You have the Witness server in place, so you avoid data loss. The basic rule is: If the principal and the mirror can no longer see each other on the network, the one that can see the Witness Server will become the new principal, the other one will asume the role of mirror. This is what happened in your test with the DR site having the Witness. So to keep your production online the Witness needs to be in the main site.

    Now... To do a DR test (or bring DR online in a real DR scenario) you do not per se break the mirror. What you do is connect manually to the SQL Server in the DR site and issue an "ALTER DATABASE <dbname> SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS" command. With this you basically tell the passive node that it should forget about the Witness and just assume that he has to be the primary now. Once you issue this command the DB will come online and you can connect to it.

    After bringing the link back up between the sites your Mirrors will get into a "Suspended" state. In your case both servers are online, being "Principal/Suspended", because one was online with the Witness intact and the other was forced online by you. In this scenario you have to kill one and restore it from a backup of the other one. (As you said.) In a real DR scenario where your primary data center had been completely down the primary would have lost its primary role by then and you would end up with the old primary being "Mirror/Suspended" while the DR site would be "Principal/Suspended". In this scenario you could issue a "ALTER DATABASE <dbname> SET PARTNER RESUME" command to force the changes on the DR server to be replicated back to the primary site and everything that had happened within the primary site to be dismissed.

    Theoretically (at least when following documentation...) it should in your test scenario be possible to issue the SET PARTNER RESUME command on your primary database, forcing the DR site to drop everything it had done and resync with the primary, but I have never tried that, so I can't really recommend it...

    Hope that helps

    Lucifer

    • Marked as answer by Maggie Luo Sunday, June 24, 2012 2:42 PM
    Friday, June 15, 2012 4:44 PM

All replies

  • Well, what you are describing is a fail over scenario.  you have to use automatic client redirect  with mirroring.

    See MSDN for details regarding automatic client redirection with mirroring and supported data access protocols.

    For e.g. if you are using .net you can use failover partner attribute in the connection string, this will cause client application to connect to your DR automatically (which just became principal) after failover.


    • Edited by Chirag Shah Thursday, June 14, 2012 9:33 PM
    Thursday, June 14, 2012 9:33 PM
  • Hi Ribcas!

    Bears one question for me: Where did you put your Witness server?

    The thing is: If the Witness server is in the DR site your principal has to go down because it no longer sees any of the others. And that should be like this... In the end, if both DBs are online at the same time you can't reconnect them afterwards... This is then a case of "Split Brain" where you would run into a chance of dataloss if you are not carefull...

    So if I got your scenario correct the way would be: Have your Witness in the primary facility and force the DR site online manually. That would ensure the primary is still working, and it would only trash your DR copy of the data...

    Lucifer

    Thursday, June 14, 2012 9:35 PM
  • The witness server is on the DR server.  We can move it to the production server, that will not be an issue.

    However, when you say force the DR site online manually . . . what exactly do you mean.  Sorry I do not setup mirroring much so I am learning as I go.

    I was thinking that we can have the mirroring setup to be synchronous but the day before the DR exercise remove the mirroring.  I am not worried that there will be any data changes that night. After the DR exercise we can then make a backup of production and restore it to DR and setup mirroring again.

    The issue here is that I understand the mirror server becomes the principal server when we cut the link to our production network.  However, the production network is still up during the DR exercise so the users don't and shouldn't be connected to our DR network.  In a true DR, the users will be but not during our DR exercises.

    Thanks

    Friday, June 15, 2012 4:14 PM
  • Hi!

    Basic thing about mirroring: You have the Witness server in place, so you avoid data loss. The basic rule is: If the principal and the mirror can no longer see each other on the network, the one that can see the Witness Server will become the new principal, the other one will asume the role of mirror. This is what happened in your test with the DR site having the Witness. So to keep your production online the Witness needs to be in the main site.

    Now... To do a DR test (or bring DR online in a real DR scenario) you do not per se break the mirror. What you do is connect manually to the SQL Server in the DR site and issue an "ALTER DATABASE <dbname> SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS" command. With this you basically tell the passive node that it should forget about the Witness and just assume that he has to be the primary now. Once you issue this command the DB will come online and you can connect to it.

    After bringing the link back up between the sites your Mirrors will get into a "Suspended" state. In your case both servers are online, being "Principal/Suspended", because one was online with the Witness intact and the other was forced online by you. In this scenario you have to kill one and restore it from a backup of the other one. (As you said.) In a real DR scenario where your primary data center had been completely down the primary would have lost its primary role by then and you would end up with the old primary being "Mirror/Suspended" while the DR site would be "Principal/Suspended". In this scenario you could issue a "ALTER DATABASE <dbname> SET PARTNER RESUME" command to force the changes on the DR server to be replicated back to the primary site and everything that had happened within the primary site to be dismissed.

    Theoretically (at least when following documentation...) it should in your test scenario be possible to issue the SET PARTNER RESUME command on your primary database, forcing the DR site to drop everything it had done and resync with the primary, but I have never tried that, so I can't really recommend it...

    Hope that helps

    Lucifer

    • Marked as answer by Maggie Luo Sunday, June 24, 2012 2:42 PM
    Friday, June 15, 2012 4:44 PM
  • Thank you for the explanation.  I understand it a lot better now and will try what you stated.  Thanks again.
    Friday, June 15, 2012 7:55 PM
  • Sure thing. If something should still be unclear before/during/after your test please don't hesitate to start a post.

    Oh, and please mark a post as Answer if it did in fact answer your question.

    Friday, June 15, 2012 9:57 PM