We created a service request (112032206193148) for an issue that occurred on Wednesday 3/21/2012 from 5:35 PM CST to 6:30 PM CST in the Chicago data center. None of our roles could
be reached during this period until, at 6:30 PM CST, we redeployed our package so that it would be assigned to new servers. Based on an analysis of errors reported from our servers, all roles were unable to communicate with blob storage, table storage, and SQL Azure.
Exception stack traces show thousands of time-outs during the outage, and our software was down hard, severely affecting our customers.
Azure customer support is 'looking into it', but because the problem has resolved, they show no sense of urgency about identifying its cause. We have to report to our customers and leadership team that Azure is reliable, and it is very
difficult to do this when we have no official explanation from Microsoft regarding the cause of the outage.
Has anyone seen this behavior before, where a role loses all connectivity to blob storage, table storage, and SQL Azure at the same time? And for an hour? What could cause this? I can't rule out a bug in our server code, but
it is suspicious that we lost connectivity across all roles at once.