Always On (2016) behavior during (simulated) unplanned outage

  • Question

  • We configured a basic availability group last week between db1 and db2 in synchronous-commit mode, without automatic failover. Before the planned failover the state was normal: no warnings or errors, and the application was happily attached. We then did a planned failover so that db1 became the primary and db2 the standby, and the displays showed that both were happy to do so and were synchronized. We started our apps against the new primary node (db1); everything attached and was happy.

    We then simulated a hard network outage by pulling the network cables out of the standby server. The primary reported that it was no longer synchronized with the standby (to be expected), but then all the apps were immediately ejected from the primary database. Is that behavior to be expected?

    I looked around for documentation on this behavior and couldn't really find anything, so I thought I'd reach out to a forum of people that are nice and not condescending like the self-appointed Gods present on the Oracle forums. :) Is there any case other than a shutdown that would eject the apps from the primary database?

    This is what my testers reported; I wasn't there this weekend when they tested failover (I was playing golf...there, I said it!). Thanks for any input that you have.
    Monday, June 1, 2020 8:22 PM

All replies

  • Hi AllanChase99,

    Do you mean that after disconnecting the primary and secondary replicas, all applications connected to the primary replica are disconnected? This is not the expected behavior of Always On.

    Do you use the AG listener to connect to SQL Server, or do you specify only the instance name of the primary replica in the application's connection string? After the failover, did you modify the instance name?

    If you specify the instance name, make sure it points to the current primary replica. With this approach, however, every time an Always On failover occurs you need to redirect the application to the new primary replica by modifying its connection string, which is very inconvenient.

    To let applications connect to the primary replica transparently, without being affected by a failover, it is recommended to use an AG listener.
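
    As a rough illustration of the difference between the two connection styles, here is a minimal Python sketch that only builds ODBC connection strings. The listener name `ag-listener`, the database name `AppDb`, the port, the driver version, and the helper `build_connection_string` are all assumptions made for this example; they are not taken from your environment.

```python
def build_connection_string(server, database, use_listener=True):
    """Build an ODBC connection string for SQL Server.

    'server' is either an AG listener name or a replica instance name.
    All names passed in here are hypothetical placeholders.
    """
    parts = [
        "Driver={ODBC Driver 17 for SQL Server}",  # assumed driver version
        f"Server=tcp:{server},1433",               # assumed default port
        f"Database={database}",
        "Trusted_Connection=yes",
    ]
    if use_listener:
        # Commonly recommended when connecting through an AG listener.
        parts.append("MultiSubnetFailover=Yes")
    return ";".join(parts) + ";"


# Through the listener: the string stays the same after a failover.
via_listener = build_connection_string("ag-listener", "AppDb")

# Directly to a replica: must be edited whenever the primary changes.
via_instance = build_connection_string("db1", "AppDb", use_listener=False)

print(via_listener)
print(via_instance)
```

    With the listener form, the application keeps one connection string regardless of which replica is currently primary; with the instance form, `db1` would have to be changed to `db2` by hand after a failover.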

    Best Regards,
    Cris


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Tuesday, June 2, 2020 2:35 AM
  • >Is there any case other than a shutdown that would eject the apps from the primary database?

    I'm going to make an assumption here, which might be wrong. The assumption is that with the basic availability group, you're using Windows Server Failover Clustering and have 2 nodes.

    If the above is correct, when the network cable was pulled, the cluster lost quorum. You should see an error to that effect in the SQL Server errorlog and additional data in the cluster log. If that's the case, you'll want to set up a witness if you haven't already.
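
    To make the vote counting concrete, here is a simplified Python sketch of plain majority quorum. It is only a model under stated assumptions: each node and the witness hold one vote, the witness stays reachable from the surviving node, and dynamic quorum adjustments that newer Windows Server versions can apply are ignored. The function names are made up for this example.

```python
def has_quorum(votes_reachable, total_votes):
    """Strict majority: more than half of all configured votes."""
    return 2 * votes_reachable > total_votes


def surviving_side_has_quorum(node_count, witness, nodes_lost):
    """Check whether the nodes that remain connected still hold a majority.

    Simplified model: one vote per node, one vote for the witness, and the
    witness is assumed to stay reachable from the surviving side.
    """
    witness_vote = 1 if witness else 0
    total = node_count + witness_vote
    reachable = (node_count - nodes_lost) + witness_vote
    return has_quorum(reachable, total)


# Two nodes, no witness, one node unplugged: 1 of 2 votes, no majority.
print(surviving_side_has_quorum(2, witness=False, nodes_lost=1))

# Two nodes plus a witness, one node unplugged: 2 of 3 votes, majority.
print(surviving_side_has_quorum(2, witness=True, nodes_lost=1))
```

    In this model the two-node cluster without a witness cannot keep a majority once either node drops off the network, which would take the availability group offline on the primary as well.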

    >so I thought I'd reach out to a forum of people that are nice and not condescending like the self appointed Gods present on the Oracle forums. :)

    :P I know what you're talking about and you made me laugh!


    The views, opinions, and posts do not reflect those of my company and are solely my own. No warranty, service, or results are expressed or implied.

    Tuesday, June 2, 2020 2:03 PM
    Answerer
  • Thank you for the response, I greatly appreciate it. Okay, to answer your first question: yes, once we pulled the network cable out of the standby (basically disconnecting the two), our apps went red on our status monitor. Ohhh, as for the AG listener, I'll check that; I wasn't aware that you could run without a listener, so that thought didn't occur to me. I'm used to running Oracle databases all over the place here and everything has a listener. Great point, I will check that. We didn't modify anything post failover. Our application is a little different (and yes, feel free to cringe because we do weird things). We have one primary (local) and one standby (remote, 1000 miles away), no real concept of local failover to a local database, and we will not run the app from the remote standby instance. On the remote instance we aren't running apps, just receiving data to apply. I know, it's strange, but we have all been driven to do strange things with small budgets and strict schedules. I'm actually headed over to our lab now to have a really good look at the SQL Server logs. --Allan
    Tuesday, June 2, 2020 3:12 PM
  • Hi Allan,

    Thanks for your reply.

    It looks like you are using an AG listener, so my previous reply does not apply. As Sean said, what is your failover cluster quorum configuration?

    If the cluster contains an even number of voting nodes, you should configure a disk witness or a file share witness. Otherwise, in a two-node cluster with no witness configured, the failure of one node can cause the cluster to stop running.

    For the detailed error message, check the Windows logs, CLUSTER.LOG, or the SQL Server errorlog.
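
    One way to check the current vote layout is the `sys.dm_hadr_cluster_members` view in SQL Server. The Python sketch below only interprets rows that you have already fetched from it; the helper `witness_recommended`, the dictionary row shape, and the sample rows are assumptions for illustration, not output from a real cluster.

```python
# Query to run against the instance (for example from SSMS or a DB client).
CLUSTER_MEMBERS_QUERY = """
SELECT member_name, member_type_desc, member_state_desc, number_of_quorum_votes
FROM sys.dm_hadr_cluster_members;
"""


def witness_recommended(rows):
    """Return True when node votes are even and no witness member exists.

    'rows' is a list of dicts keyed by the column names in the query above.
    """
    node_votes = sum(
        r["number_of_quorum_votes"]
        for r in rows
        if r["member_type_desc"] == "CLUSTER_NODE"
    )
    has_witness = any(r["member_type_desc"] != "CLUSTER_NODE" for r in rows)
    return node_votes % 2 == 0 and not has_witness


# Hypothetical rows for a two-node cluster without a witness.
sample_rows = [
    {"member_name": "db1", "member_type_desc": "CLUSTER_NODE",
     "member_state_desc": "UP", "number_of_quorum_votes": 1},
    {"member_name": "db2", "member_type_desc": "CLUSTER_NODE",
     "member_state_desc": "UP", "number_of_quorum_votes": 1},
]

print(witness_recommended(sample_rows))
```

    If the view lists only the two nodes with one vote each, that matches the even-vote case described above.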

    Best Regards,
    Cris


    Wednesday, June 3, 2020 2:07 AM
  • Hi Allan,

    Is there any update on this case? Was your issue resolved?

    If you have resolved your issue, please mark the useful reply as answer. This can be beneficial to other community members reading the thread.

    In addition, if you have any other questions, please feel free to ask.
    Thanks for your contribution.

    Best regards,
    Cris


    Thursday, June 4, 2020 2:14 AM