Answered Frequent role restarts since the leap-day issue

  • Tuesday, March 13, 2012 1:04 AM
     
     

    We have noticed that since the 2/29 issue, many of our deployments show individual role aborts/restarts at seemingly random times during the day (several per day, in fact), even deployments that had been stable since mid-January. Is there some rolling upgrade process still going on in the North-Central data center? I can appreciate that machines are getting updated as needed, but I really think that in addition to "Last abort time" you need to display an "Abort Reason" that (somehow) indicates whether the restart was an operations-required bounce or a code/deployment (health) issue.

    Thoughts? This is _very_ unsettling...

All Replies

  • Tuesday, March 13, 2012 1:51 AM
     
     

    Hello.

    According to the Service Dashboard (https://www.windowsazure.com/en-us/support/service-dashboard/), there were some issues with the North Central data center, but what you describe is indeed not normal. Can you contact Windows Azure support? http://www.windowsazure.com/support

  • Tuesday, March 13, 2012 8:13 PM
     
     

    Bumping this thread...

    If we don't hear anything from the MSFT community here, we will have to contact support - we're just using the forums as a first check. What we want to know is whether any known rolling upgrades are occurring - I don't much care what they are, just whether they are happening. If they are _not_ happening (aside from normal OS patching, etc.), we definitely have a problem - and a very serious one, since previously stable deployments are experiencing the same aborts/restarts.

    Thanks!

  • Tuesday, March 27, 2012 3:29 AM
     
     Answered

    Hi SagerCat,

    As far as I know, the leap day issue no longer has any impact. Do you still see the recycling problem with your deployments? Here are a few things you can try:

    1. If you have configured a remote desktop connection, log on and check the event logs and the C:\logs\** files. You might find some traces of the restarts.

    2. Have you tried re-imaging the instances? Some of my customers were able to solve their problem by doing that.

    3. If the problem is still going on, I strongly suggest raising a service ticket (http://www.windowsazure.com/en-us/support/contact/). Leave the deployments running, and provide your subscription ID, deployment ID, the symptom, and the exact time of the restarts (with time zone). The support team will then help you check both your VM and the hosting environment.
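
    For the "exact time (with time zone)" part of step 3, a small script can help avoid ambiguity when copying abort times into a ticket. This is only a minimal sketch: the function name, the input format, and the sample time and UTC offset are all hypothetical, not something the portal or the support team prescribes.

    ```python
    from datetime import datetime, timedelta, timezone

    def format_abort_time(local_time: str, utc_offset_hours: float) -> str:
        """Render a local 'YYYY-MM-DD HH:MM' time with its offset, plus the UTC equivalent."""
        # Hypothetical helper: the offset is supplied by hand rather than looked up.
        tz = timezone(timedelta(hours=utc_offset_hours))
        local = datetime.strptime(local_time, "%Y-%m-%d %H:%M").replace(tzinfo=tz)
        utc = local.astimezone(timezone.utc)
        return f"{local.isoformat()} (UTC: {utc.isoformat()})"

    # Sample values are made up for illustration only.
    print(format_abort_time("2012-03-12 14:30", -5))
    ```

    Including both the local time with its offset and the UTC equivalent should make it easier to match the restart against logs kept in either form.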

    -Emma