General Discussion: RESOLVED: Windows Azure Outage

  • Saturday, March 14, 2009 5:29 PM - Steve Marx (MSFT, Moderator)
     
    UPDATE [3/17/09 7:44PM PDT]: Summary of what happened and what we're doing about it is on the Windows Azure blog: http://blogs.msdn.com/windowsazure/archive/2009/03/18/the-windows-azure-malfunction-this-weekend.aspx
    UPDATE [8:24PM PDT]: This issue is resolved.  Windows Azure is operating normally again.

    UPDATE [3:36PM PDT]: We've identified and verified a recovery process that we're just now applying throughout the cloud.  ETA is five hours to complete recovery.  I'll post again when everything's back to normal.

    Windows Azure is currently experiencing an outage.  We are investigating but do not yet have an ETA for a resolution.  A large number of deployments are currently offline, and are slow to restart.

    What is affected: Applications may be unreachable or in "stopped" or "initializing" states for long periods of time.

    When the outage began: About 10:30pm PST last night.

    Who is affected: Potentially anyone currently running an application in Windows Azure.


    We will post updates to this thread throughout the day as we investigate and resolve the outage.  There is currently no ETA for a fix.

All Replies

  • Saturday, March 14, 2009 5:51 PM - willvv
     
    Hi

    Good to know it's not just me, although it started earlier for me, around 5 p.m. PST.

    Application Id: 000000004400D96E

    Private deployment #s:

    40e00cdf53ff465fb207febc4cecdcdf
    f1da08d5c34b461db743408c8a0a5b50

    They have been initializing for almost 20 hours.
  • Saturday, March 14, 2009 6:01 PM - Karthik-K
     
    Any idea when the issue will be solved? I'm giving a demo on Windows Azure tomorrow!! Please help!!
    _______________________
    Karthik.K
    Microsoft Student Partner
  • Saturday, March 14, 2009 10:35 PM - Steve Marx (MSFT, Moderator)
     
    UPDATE: We've identified and verified a recovery process that we're just now applying throughout the cloud.  ETA is five hours to complete recovery.  I'll post again when everything's back to normal.
  • Saturday, March 14, 2009 11:33 PM - Roger Jennings
     
    Steve,

    If the outage began at 10:30 PM and it's now ~11:30 PM (UTC) the next day, that's more than a 24-hour outage, disregarding the five hours to roll out the recovery.

    Not so good,

    --rj

    P.S. My two Azure demo apps (blobs http://bit.ly/gxjjP and tables http://bit.ly/oifH) are back up and running.

    OakLeaf Blog
  • Saturday, March 14, 2009 11:40 PM - David Yack (MVP)
     
    Things seem to be starting to come alive: we stopped and restarted one of ours, and it is back up now.

    Will look forward to hearing more about what happened to cause the downtime

    Dave
    .NET MVP / Microsoft Regional Director
  • Sunday, March 15, 2009 3:24 AM - Steve Marx (MSFT, Moderator)
     

    This issue is resolved.  Windows Azure is operating normally again.  If you experience any new issues, please start a new forum thread so our team can investigate.

  • Sunday, March 15, 2009 3:26 AM - David Yack (MVP)
     
    Great, thanks for the update, Steve.

    When can we expect to hear some more details about what happened?
    .NET MVP / Microsoft Regional Director
  • Sunday, March 15, 2009 11:07 AM - Steve Marx (MSFT, Moderator)
     

    When the whole team's back in the office on Monday, I'm sure we'll do a root cause analysis to understand exactly what went wrong and what we need to do to ensure it doesn't happen again.

    Once we have that sorted out, I'll put together a summary.  (Probably won't get that out until after the MIX conference next week.)

  • Monday, March 16, 2009 9:30 PM - Roger Jennings
     
    Steve,

    If you don't want to spend all your time at MIX 09 answering questions about the outage, I'd suggest posting the explanation before Wednesday.

    Cheers,

    --rj
    OakLeaf Blog
  • Wednesday, March 18, 2009 2:44 AM - Steve Marx (MSFT, Moderator)
     

    I just posted a summary of what happened and what we're doing about it to the Windows Azure blog: http://blogs.msdn.com/windowsazure/archive/2009/03/18/the-windows-azure-malfunction-this-weekend.aspx.

  • Wednesday, March 18, 2009 12:40 PM - Rick Razzano
     
    Steve Marx said:

    When the whole team's back in the office on Monday, I'm sure we'll do a root cause analysis to understand exactly what went wrong and what we need to do to ensure it doesn't happen again.

    Once we have that sorted out, I'll put together a summary.  (Probably won't get that out until after the MIX conference next week.)

    Wow... I can see you are really taking this service interruption seriously. Quite the urgent response to a major event.

    Outage? Down for 22 hours?  No problem, when Ted gets back from getting coffee and once we've had Jenny's office birthday party and after that, if we don't decide to take a long lunch then go home early, maybe we'll get around to figuring out what brought down the system for all that time.  Then again, I'm sure it can wait.  Oooh, developer conference!

    When you're ready to support real enterprises, let us know.  Until then, you might just want to stick to hosting data for blogs about animals with hats on and fantasy football.