Storage certificate expired?

    Question

  • So is it just me, or did the HTTPS certificate for Azure Storage just expire? 

    Might want to fix that, ASAP. It also wouldn't hurt to put a sticky note on someone's monitor so they remember to update that before it expires next time.

    UPDATE:

    After reading through most of the replies here, the best solution seems to be changing your configuration through the management portal. Change all of your storage connection strings to use HTTP instead of HTTPS, EXCEPT for the Diagnostics connection string. That one must remain as HTTPS (thanks to @Matt in Cambridge for that detail).
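
    For anyone scripting that change, here is a minimal sketch (not an official tool) of the rewrite described above. It assumes your settings are available as name/value pairs. The class and method names are made up for illustration, the diagnostics setting name is the one used by the diagnostics code later in this thread, and it leaves the Diagnostics connection string on HTTPS, per the detail reported further down.

```csharp
using System;
using System.Collections.Generic;

public static class StorageProtocolWorkaround
{
    // Setting name as it appears in the diagnostics code later in this thread.
    public const string DiagnosticsSetting =
        "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString";

    // Returns a copy of the settings with "DefaultEndpointsProtocol=https"
    // switched to "http" everywhere except the diagnostics setting.
    public static Dictionary<string, string> DowngradeToHttp(
        IDictionary<string, string> settings)
    {
        var result = new Dictionary<string, string>();
        foreach (var pair in settings)
        {
            bool isDiagnostics = pair.Key == DiagnosticsSetting;
            result[pair.Key] = isDiagnostics
                ? pair.Value                    // diagnostics stays on HTTPS
                : ReplaceProtocol(pair.Value);
        }
        return result;
    }

    private static string ReplaceProtocol(string connectionString)
    {
        const string https = "DefaultEndpointsProtocol=https";
        const string http = "DefaultEndpointsProtocol=http";
        int index = connectionString.IndexOf(https, StringComparison.OrdinalIgnoreCase);
        if (index < 0)
        {
            return connectionString; // already HTTP, or not a storage string
        }
        return connectionString.Substring(0, index) + http
             + connectionString.Substring(index + https.Length);
    }
}
```

    Passing in a setting such as a hypothetical `DataConnectionString` returns the same string with `http` as the protocol, while the diagnostics entry comes back unchanged. Remember to switch everything back to HTTPS once the certificate is fixed.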

    • Edited by Brian Reischl, February 23, 2013 12:11 AM: Adding update with solution
    February 22, 2013 8:56 PM

Answers

  • The following line of code will ignore the expired SSL certificate exceptions.  Put this line of code at the beginning of your WebRole.OnStart method and also in your Global.asax's Application_Start method:

    // Temporarily not care about Microsoft's ridiculous SSL certificate expiration.
    
    ServicePointManager.ServerCertificateValidationCallback = (a, b, c, d) => true;

    You can continue to use SSL endpoints to Azure Storage, including the Diagnostics configuration.  The main problem you will face is deploying this code to the cloud, because Service Management is down.  If you were fortunate enough to set up Remote Desktop, you can remote into each instance and copy/paste deploy the solution.

    To do this, you must create a cspkg by right-clicking on the Azure Project and selecting 'Package'.  This will build a cspkg file for you and open the folder in Windows Explorer.  Open the cspkg with 7-Zip (or another zip utility) and then open the .cssx file inside this archive (it should be the largest file).  Navigate to the sitesroot folder and copy/paste its contents into E:\sitesroot on your web role instances.  Apply an application pool recycle in IIS and you should be up and running with the new line of code you added.  I would *not* suggest an iisreset, as this requires another service (World Wide Web Publishing Service) to be started before your application will work.

    An even lengthier version of the delegate can be used to allow a one day grace period on expiration dates:

    ServicePointManager.ServerCertificateValidationCallback = (sender, cert, chain, sslPolicyError) =>
    {
        if (sslPolicyError == SslPolicyErrors.None)
        {
            return true;
        }
        else if (sslPolicyError == SslPolicyErrors.RemoteCertificateChainErrors)
        {
            var expDate = DateTime.Parse(cert.GetExpirationDateString());
            if (expDate < DateTime.Now && expDate.AddDays(1) >= DateTime.Now)
            {
                // TODO: Warn about SSL certificate expiration.
                return true;
            }
        }
        return false;
    };

    Please note: This delegate applies to *all* SSL certificate validations.  Use with care.
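
    Because the delegate is global, one way to limit the damage is to scope the exception. The following is only a sketch under a few assumptions: that storage endpoints end in `.core.windows.net`, that `sender` arrives as an `HttpWebRequest` (the callback rejects the connection if it does not), and that expiry should be the only chain problem tolerated. The class name, `ShouldAccept`, and `Install` are illustrative names, not part of any SDK, and the check may be stricter than the delegate above.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

public static class ScopedCertificateGrace
{
    // Assumption: Azure Storage hosts end with this suffix.
    private const string StorageSuffix = ".core.windows.net";
    private static readonly TimeSpan Grace = TimeSpan.FromDays(1);

    // Pure decision logic, kept separate so it can be tested without a network.
    public static bool ShouldAccept(string host, SslPolicyErrors errors,
        X509ChainStatusFlags chainFlags, DateTime notAfterUtc, DateTime nowUtc)
    {
        if (errors == SslPolicyErrors.None) return true;
        if (errors != SslPolicyErrors.RemoteCertificateChainErrors) return false;
        if (host == null ||
            !host.EndsWith(StorageSuffix, StringComparison.OrdinalIgnoreCase))
        {
            return false; // never relax validation for other hosts
        }
        // Expiry must be the only problem reported for the chain.
        if (chainFlags != X509ChainStatusFlags.NotTimeValid) return false;
        // Accept only while inside the one-day grace period after expiry.
        return notAfterUtc < nowUtc && nowUtc <= notAfterUtc + Grace;
    }

    public static void Install()
    {
        ServicePointManager.ServerCertificateValidationCallback =
            (sender, cert, chain, errors) =>
            {
                if (errors == SslPolicyErrors.None) return true;
                var request = sender as HttpWebRequest;
                if (request == null || cert == null || chain == null) return false;
                // Combine the status flags of the whole chain into one value.
                var flags = chain.ChainStatus.Aggregate(
                    X509ChainStatusFlags.NoError, (acc, s) => acc | s.Status);
                var notAfter = new X509Certificate2(cert).NotAfter.ToUniversalTime();
                return ShouldAccept(request.RequestUri.Host, errors, flags,
                    notAfter, DateTime.UtcNow);
            };
    }
}
```

    Call `ScopedCertificateGrace.Install()` in the same places as the one-liner. Certificates for any other host still go through normal validation.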

    February 23, 2013 12:12 AM
  • Just restored my application by using Remote Desktop to connect to the service instances and uploading a patch there. I wasn't able to alter the configuration directly on the servers, but I deployed a patched DLL that has a hardcoded storage connection string with an 'http' endpoint.

    Hope that helps you or somebody else find a quick solution.

    February 22, 2013 10:27 PM
  • Thanks Alex, that's super helpful.

    I was able to restore my application by modifying the connection strings in the Windows Azure portal and changing "DefaultEndpointsProtocol=https" to "DefaultEndpointsProtocol=http" here:

    windows.azure.com -> Cloud Services -> (app name) -> Configure -> Connection strings section.

    Hope this helps someone else.



    February 22, 2013 10:45 PM

All replies

  • No sh*t. We've got several applications down, thanks to this.
    February 22, 2013 8:58 PM
  • If your app is set up to accept reconfiguration on the fly, you might be able to flip to using HTTP instead of HTTPS. If not, you're pretty much up a creek. Doing new deployments seems to use storage under the covers, so you can't right now. 
    February 22, 2013 9:00 PM
  • Wasn't it about a year ago that they had cert issues too? I think the leap year caught them.
    February 22, 2013 9:07 PM
  • THANK YOU! We'll give that a try wherever possible.
    February 22, 2013 9:11 PM
  • ETA on fix please.

    Seems like a pretty rookie mistake.

    February 22, 2013 9:15 PM
  • This is also affecting BLOB storage (not just Table storage)
    February 22, 2013 9:16 PM
  • Yes, I have the same issue: I am able to connect to the storage account without SSL, but with SSL it fails.
    • Edited by LongHorn7, February 22, 2013 9:25 PM
    February 22, 2013 9:19 PM
  • Most of our apps are screwed up now! 

    WHAT'S NEXT? All compute instances die because someone at the data center switched them off?

    February 22, 2013 9:19 PM
  • Get the following error when configuring the DefaultEndpointsProtocol of the Storage Connection in the Management Portal.

    Failed to update the configuration for the production deployment of cloud service <servicename>.

    The server encountered an internal error. Please retry the request.

    Come on, MS. This is the second time in a week we have had an outage!

    February 22, 2013 9:27 PM
  • This is unacceptable, I'm supposed to release an enterprise app on this platform?

    Imagine how many phone calls I would have gotten by now from very angry customers.

    Sad...

    February 22, 2013 9:29 PM
  • We're entirely down too.  US South Central.  It's only the SSL endpoint for us as well.
    February 22, 2013 9:33 PM
  • Down here too. Microsoft???? You let the certificate expire???

    Wayne

    February 22, 2013 9:39 PM
  • I feel sorry for the person at Microsoft who will have to deal with this.  Looks like someone must be regenerating a new certificate and figuring out how to deploy it platform-wide right about now.  Pretty amazing that the entire Windows Azure storage platform is offline globally.
    February 22, 2013 9:45 PM
  • Seems our application is down as well; however, this seems to be affecting our entire application at this point, not just blob/table storage. We also can't seem to access the portal currently. What is going on, Microsoft? Can we get some techs on this now, please?
    February 22, 2013 9:45 PM
  • Same issues here; we have production systems affected (US and Europe). We noticed this before the Dashboard complained about it. I spoke with Premier Support and they did not have an ETA yet. It seems ironic: just last week we were notified about updating our certificates, as they would be expiring in the next 3 months, and MSFT did not have an alert for their own certificates expiring??

    February 22, 2013 10:26 PM
  • That's a nifty trick. 

    You may also be able to reconfigure via RDP. The config files are stored in c:\config. But I'm not sure if changes will be picked up on the fly or not. 

    February 22, 2013 10:30 PM
  • Editing my config via the portal caused my instances to recycle until they eventually reported as corrupt, requiring the deployment to be deleted and re-created. I have no idea if this is repeatable but you should be careful.

    February 22, 2013 11:04 PM
  • Storage is currently experiencing a worldwide outage impacting HTTPS operations (SSL traffic). Status of affected services will be updated in the table below. We have identified the root cause and are validating the recovery options before implementing them. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

    February 22, 2013 11:04 PM
  • Switching to http helps, but for "blob.Exists", I'm getting a Forbidden error.
    February 22, 2013 11:09 PM
  • One more thing to note - after changing the diagnostics connection string to HTTP, the Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitor.Start() method throws an Exception that it's not a secure connection - so best to leave that one as HTTPS.  I have the diagnostic initialization code isolated in its own try/catch and its own thread, so my app is still able to start (albeit with no diagnostics).  I imagine if someone has put this code in the main code path, this might result in an unhandled exception and the role may never come online.






    February 22, 2013 11:23 PM
  • Good tip. How do you do that Matt? Where is the diagnostic initialization code?
    • Edited by tofutim, February 22, 2013 11:26 PM
    February 22, 2013 11:25 PM
  • This is what I've written for my web roles.  In WebRole.cs:

            // Requires: using System; using System.Threading;
            //           using Microsoft.WindowsAzure.Diagnostics;

            public override bool OnStart()
            {
                // Start diagnostics on a thread-pool thread so that a slow or
                // failing storage call cannot block or crash role startup.
                ThreadPool.QueueUserWorkItem(StartDiagnostics);
                return base.OnStart();
            }

            public void StartDiagnostics(object state)
            {
                try
                {
                    var transferPeriod = TimeSpan.FromMinutes(2);
                    var sampleRate = TimeSpan.FromSeconds(30);
                    var config = DiagnosticMonitor.GetDefaultInitialConfiguration();

                    config.Logs.ScheduledTransferPeriod = transferPeriod;
                    config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
                    config.Logs.BufferQuotaInMB = 10;

                    config.DiagnosticInfrastructureLogs.ScheduledTransferPeriod = transferPeriod;
                    config.DiagnosticInfrastructureLogs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
                    config.DiagnosticInfrastructureLogs.BufferQuotaInMB = 10;

                    config.WindowsEventLog.DataSources.Add("Application!*");
                    config.WindowsEventLog.DataSources.Add("System!*");
                    config.WindowsEventLog.ScheduledTransferPeriod = transferPeriod;
                    config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
                    config.WindowsEventLog.BufferQuotaInMB = 10;

                    // Performance counters to sample. The ASP.NET entries only
                    // make sense in a web role.
                    string[] counters =
                    {
                        @"\ASP.NET Applications(__Total__)\Requests/Sec",
                        @"\ASP.NET v4.0.30319\Requests Queued",
                        @"\ASP.NET v4.0.30319\Requests Rejected",
                        @"\ASP.NET v4.0.30319\Request Execution Time",
                        @"\ASP.NET v4.0.30319\Request Wait Time",
                        @"\ASP.NET\Application Restarts",
                        @"\.NET CLR Exceptions(_Global_)\# Exceps Thrown / sec",
                        @"\Processor(_Total)\% Processor Time",
                        @"\Memory\Available Bytes",
                    };
                    foreach (var counter in counters)
                    {
                        config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
                        {
                            CounterSpecifier = counter,
                            SampleRate = sampleRate,
                        });
                    }
                    config.PerformanceCounters.ScheduledTransferPeriod = transferPeriod;
                    config.PerformanceCounters.BufferQuotaInMB = 10;

                    Microsoft.WindowsAzure.Diagnostics.CrashDumps.EnableCollection(true);
                    DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
                }
                catch (Exception)
                {
                    // Deliberately swallowed so the role keeps running without
                    // diagnostics; log this exception if you want.
                }
            }

    With the above setup, even if the StartDiagnostics method takes a long time or throws an exception, the app is not affected, other than diagnostic data not being saved.

    Note, you must remove the ASP.NET perf counters if you want to use this code with a worker role.




    February 22, 2013 11:38 PM
  • I am also unable to change my Cloud Services to use HTTP instead of HTTPS in the storage connection string as a workaround.  Apparently the SSL certificate used to publish a cloud service through Visual Studio has expired as well.

    Someone is going to get fired over this one.  How do you forget to renew a certificate, especially when you own your own certificate authority?

    February 22, 2013 11:54 PM
  • Thank god we have a DR plan and ditched Azure, or we would still be down. Any ETA on this amateur-like mistake? Also, why only 2 years? Plus, why can 1 cert take down everything? I would think they would have multiple issued certs with different parameters. It seems like if someone were able to take over this cert, they would have access to everything. Or worse yet, what if it expires? Oh yeah, we know what happens then.

    AzureCloud eq 'Vapor'

    February 22, 2013 11:59 PM
  • I too noticed that publishing from VS is broken, but if your connection strings are stored in the ServiceConfiguration.cscfg file, I can confirm that a valid workaround is to modify them in the windows.azure.com portal.  My app is now back online, with the exception of the diagnostics being down (see above).


    February 23, 2013 12:08 AM
  • I bet a lot of people here have accidentally let an SSL cert expire, or nearly done so. I know I have. It's easy to forget, right? It's an amateur mistake, but it happens. You end up with some egg on your face, add a calendar reminder for next year, and move on.

    But it's now been almost six and a half hours. The last progress update on the dashboard is that you're "evaluating accelerated repair options" and hope to have a fix in two hours. Seriously? An eight-plus hour, near-total outage is an "accelerated repair"? I, by myself, a one man band, have fixed this problem in under two hours. You're Microsoft! You've got your own intermediate certificate authority! It's not like you needed to get purchasing authority for a new certificate - it should've taken about six seconds! And it's not like you could really break SSL any worse, right? So quit evaluating options to maybe fix it someday and just deploy the certificate already! 

    At this point all I can really say is: W.T.F.

    </rant>

    February 23, 2013 3:01 AM
  • If anyone is interested to read about the leap year bug from last year: http://blogs.msdn.com/b/windowsazure/archive/2012/03/09/summary-of-windows-azure-service-disruption-on-feb-29th-2012.aspx

    And while we're just hanging out waiting for the Azure SSL to come back to life, here's an interesting read from Wired's 2006 interview with Gary McKinnon: http://www.wired.com/techbiz/it/news/2006/06/71182?currentPage=all

    ... and a good interview with Bill Gates on Charlie Rose when Win95 and good ol' dial-up networking were hot (really puts Azure in perspective): http://www.youtube.com/watch?v=M1EsIusQJQM


    February 23, 2013 4:39 AM
  • Might be back in business? I just got an application to upload...
    February 23, 2013 5:28 AM
  • Yup!  All of my storage is now working properly in South Central US as of 12:33AM EST.  Thanks Microsoft folks for the late hours getting this done; can't imagine it was a simple recovery process.

    February 23, 2013 5:34 AM
  • I think it's all fixed now, but yeah, 6 hours is a bit of a wait if you've built your enterprise app on this platform. I think they did a good job getting things back online, although it did take a while; weigh that against the number of servers/instances/etc. out there.

    Here's the Register's article on the outage:

    http://www.theregister.co.uk/2013/02/22/azure_problem_that_should_never_happen_ever/

    Hopefully this is a wake up call to audit *everything* and set 1-2 week pre-reminders so this never happens again.

    February 23, 2013 2:39 PM
  • Well, I think this is still a temporary fix: the cert in place now is only valid through July 31, 2013. So they will probably work on replacing it again in the coming weeks, or if they forget, maybe we'll experience another outage at that time... :)

    February 23, 2013 4:30 PM
  • Wow, is it for real? I can't believe Redmond forgot to renew the certificates.

    Gulab Prasad,
    gulab@exchangeranger.com

    February 23, 2013 6:01 PM
  • I located a post on a different thread where you can monitor for upcoming certificate expirations using System Center Operations Manager (SCOM):

    - Go to the authoring pane\Management Pack Templates
    - Right click and select New Web Application.
    - Put in the appropriate name, watcher node information, and https:// URL.
    - Click the Customize checkbox to go into the advanced configuration/recording screen.
    - Double click on the web request item that was created by the wizard.
    - Click on the Custom Error tab
    - Insert an item for the "base page" as "Days to Expiry"
    - Put in the operator (like Less than)  and the number of days before expiration in the value field.
    - Save and apply the changes.

    I really hope we see improvement in this area. Just a heads up that the next expiration is now 156 days from today (7/31/2013), unless I'm reading this wrong.
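
    The day count is easy to double-check in code. This is just a small sketch: the class and method names are made up for illustration, and the warning threshold is whatever lead time you want for renewals.

```csharp
using System;

public static class CertificateExpiry
{
    // Whole days from 'today' until 'expires' (negative once expired).
    public static int DaysUntilExpiry(DateTime expires, DateTime today)
    {
        return (expires.Date - today.Date).Days;
    }

    // True when the certificate expires within 'warningDays' days.
    public static bool NeedsRenewal(DateTime expires, DateTime today, int warningDays)
    {
        return DaysUntilExpiry(expires, today) <= warningDays;
    }
}
```

    With the dates from this post, `CertificateExpiry.DaysUntilExpiry(new DateTime(2013, 7, 31), new DateTime(2013, 2, 25))` evaluates to 156, so the figure above checks out.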

    If I were responsible for the platform, directly after the first outage on leap year I would have probably asked for someone to provide me a spreadsheet with all of the certificates responsible for each respective Azure service and have it sorted by their expiration date. The Azure platform is so incredibly fantastic and it's such a tragedy that the naysayers will use events like this to badmouth Microsoft and Azure. It's 2013 and the great race is on. There's absolutely no excuse for a schoolboy mistake like this. Let's step our game up, guys.


    -Ira Bell



    February 25, 2013 5:30 PM
  • Here's a sample C# console application which will return the expiration date of a certificate for a given URL (many thanks to Wade Wegner for mentioning System.Net.ServicePoint):

    using System;
    using System.Net;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine(GetSSLExpiryDate());
                Console.ReadLine();
            }

            public static string GetSSLExpiryDate()
            {
                string url = "https://manage.windowsazure.com/";
                var request = (HttpWebRequest)WebRequest.Create(url);

                // The request has to complete before ServicePoint.Certificate
                // is populated; dispose the response to release the connection.
                using (request.GetResponse())
                {
                }

                var certificate = request.ServicePoint.Certificate;
                return certificate != null
                    ? certificate.GetExpirationDateString()
                    : string.Empty;
            }
        }
    }

    Running it prints the expiration date string of that site's certificate.


    -Ira Bell

    February 25, 2013 11:29 PM