SQL Azure availability

  • Question

  • In the SQL Azure SLA I read that there's an uptime of 99.9%. In the presentations I see mentions of DB replicas and automatic failover.

    My question is: why doesn't that add up to 100% availability?

    I would expect that if my master DB is not accessible, the load balancer automatically switches to replica 1. If replica 1 fails, replica 2 will be used. If each DB has an uptime of about 99%, it will probably never happen that all 3 DBs are down at the same time, which leads to virtually 100% uptime.

    Or doesn't it work like that?
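
    The arithmetic behind this expectation can be sketched as follows. This is only a rough check: it assumes replica failures are statistically independent, which is exactly the assumption that shared causes (a data center outage, a bad patch rolled out to every copy) would break. The function name `combined_availability` is made up for illustration.

```python
def combined_availability(per_replica: float, replicas: int) -> float:
    """Availability of a group that counts as "up" while at least one
    replica is up. ASSUMPTION: replica failures are independent."""
    if not 0.0 <= per_replica <= 1.0:
        raise ValueError("per_replica must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The group is down only when every replica is down at the same time.
    return 1.0 - (1.0 - per_replica) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} replica(s) at 99% each: {combined_availability(0.99, n):.6%}")
```

    Under the independence assumption three 99% replicas come out very close to, but still below, 100%. Any failure mode that hits all copies at once pulls the real figure down from there.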

    Tuesday, December 21, 2010 11:05 AM

Answers

  • I'd recommend reading the following article regarding high availability calculations: http://en.wikipedia.org/wiki/High_availability

    There are various reasons why Microsoft can't currently provide an SLA higher than 99.9% uptime, such as natural disasters, hardware failures, hardware/software patches/upgrades, etc.

    However, you can take some specific measures to ensure that your application/data is more highly available. An example of that would be to grab a bag of popcorn and watch Lev Novik's presentation at the PDC entitled "How to Build Scale-Out Database Solutions on SQL Azure":  http://player.microsoftpdc.com/Session/591d586f-3732-4bff-8ee2-857f27d74df4

    Anyway, I think that 99.9% uptime is pretty reasonable for the first RDBMS in the cloud! 8.76 hours of downtime max per year isn't a terrible number. Also, that's the worst case scenario. I'm certain the guys working in the data centers are aiming for 5 nines. 
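
    For reference, the 8.76 hour figure is just 0.1% of a year. A minimal sketch of that calculation, assuming a 365-day year and a yearly measurement window (the SLA itself may define its window differently); `max_downtime_hours` is an illustrative name:

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def max_downtime_hours(availability_percent: float) -> float:
    """Worst-case downtime per year allowed by a given uptime percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99, 99.999):
        hours = max_downtime_hours(sla)
        print(f"{sla}% uptime: up to {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

    Each extra nine cuts the allowed downtime by a factor of ten, so five nines leaves only a few minutes per year.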


    -Ira Bell
    • Proposed as answer by Ira Bell, Nimbo Tuesday, December 21, 2010 5:06 PM
    • Marked as answer by Mog Liang Tuesday, January 4, 2011 4:04 AM
    Tuesday, December 21, 2010 5:06 PM

All replies

  • Actual availability and what the lawyers will agree to put in marketing materials rarely coincide. :D
    Tuesday, December 21, 2010 2:00 PM
  • Hi Ira,

    If the primary instance goes down, how long does it take to spin up the second instance?

    Thanks
    Seshu

    Thursday, March 10, 2011 9:26 PM
  • There are actually three instances of your SQL Azure database, always running and kept in sync in real time (transactions are committed across all three simultaneously). So the better question is how long it takes before connection attempts are re-routed to one of the secondary copies. Nothing official has been stated on this, but it should only be a matter of a few seconds at most.
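
    If re-routing does take a few seconds, the application has to tolerate failed connection attempts in the meantime. A minimal retry sketch, not an official recommendation: `call_with_retry` and its parameters are hypothetical, and it assumes the database call raises Python's built-in `ConnectionError` (real drivers raise their own exception types, which would need to be caught instead).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    first_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on ConnectionError with exponential backoff.

    ASSUMPTION: the failure surfaces as ConnectionError; swap in the
    exception type your database driver actually raises.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries, let the caller see the error
            sleep(delay)  # give the re-routing time to complete
            delay *= 2  # wait longer before each further attempt
    raise AssertionError("unreachable")
```

    With the defaults this waits 0.5, 1, 2 and 4 seconds between the five attempts, which covers an interruption of a few seconds with some margin.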
    Friday, March 11, 2011 8:18 PM