Slow Blob Retrieval

  • Question

  • I am finding that, occasionally throughout the day, retrieving a small 224 KB blob takes an inordinately long time (a minute) and then becomes fast again (sub-second). In one example, the blob had been written 2.5 minutes earlier, in case that matters. As I understand it, Microsoft's SLA says this operation should take at most 2 to 4 seconds.
     
    This C# application runs on 3 small-vmsize instances (deployed as an uploaded .cspkg file, not a VM role). It uses the Storage Client Library to talk to Windows Azure Storage (containers and blobs). It also uses SQL Azure, but I collect timings for the SQL access and the blob retrieval separately, so I can rule out SQL Azure as the culprit.
    Here are the lines of code that are slowing down:
     
        /// <summary>
        /// Copies blob to a specified file path (i.e. on a LocalResource) if the blob exists. 
        /// Returns true if the blob was downloaded, false otherwise. 
        /// False is returned if the blob does not exist (as opposed to an exception).
        /// </summary>
        public static bool TryDownloadToFile(this CloudBlob blob, string outputFilePath)
        {
          try
          {
            blob.FetchAttributesEx(); //Populate properties. Throws AzureBlobNotFoundException    
            blob.DownloadToFile(outputFilePath);
            return true;
          }
          catch (AzureBlobNotFoundException)
          {
            return false;
          }
        }
    
     
    The machine is not particularly busy when this happens.   The timings I have are recorded by the application, so it is not a JIT compilation issue or similar.   
    The application personalizes images and then caches the result as a blob so that other running instances can serve them up quickly.  It is written as a handler. 
    I would be happy to provide further details.
    Tuesday, July 12, 2011 11:49 PM

Answers

  • Another thing to note is that the Storage Client has a default exponential backoff retry policy. If the client isn't busy but the storage account or blob partition is, the client could be receiving a Server Busy response, which would cause it to back off by approximately 3 seconds, then 30 seconds, then 90 seconds, and so on. This would all be absorbed inside the method itself. If you are trying to debug a timing issue like this, I would suggest setting the retry policy to NoRetry to eliminate this possibility and directly handling any exceptions that are thrown.
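
    To make the suggestion concrete, here is a minimal sketch of timing a single, un-retried attempt and surfacing its exception to the caller. The names AttemptResult, AttemptTimer and RunOnce are hypothetical helpers invented for illustration; they are not part of the Storage Client Library, and the helper itself uses only System types.

```csharp
using System;
using System.Diagnostics;

// Hypothetical result type: the outcome of one attempt with no retries.
public sealed class AttemptResult
{
    public bool Succeeded;
    public TimeSpan Elapsed;
    public Exception Error; // null when the attempt succeeded
}

public static class AttemptTimer
{
    // Runs the operation exactly once and reports its duration and any exception,
    // so a slow or failed call is visible to the caller instead of hidden in retries.
    public static AttemptResult RunOnce(Action operation)
    {
        if (operation == null) throw new ArgumentNullException("operation");

        var result = new AttemptResult();
        var timer = Stopwatch.StartNew();
        try
        {
            operation();
            result.Succeeded = true;
        }
        catch (Exception ex)
        {
            // Deliberately broad for diagnostics; narrow the catch in production code.
            result.Error = ex;
        }
        timer.Stop();
        result.Elapsed = timer.Elapsed;
        return result;
    }
}
```

    Assuming the 1.x Microsoft.WindowsAzure.StorageClient API, the download could be wrapped as AttemptTimer.RunOnce(() => blob.DownloadToFile(outputFilePath, new BlobRequestOptions { RetryPolicy = RetryPolicies.NoRetry() })). With retries disabled, a Server Busy response should show up in Error right away rather than as a long pause inside DownloadToFile.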

     

    joe

    • Marked as answer by Wenchao Zeng Wednesday, July 20, 2011 5:34 AM
    Tuesday, July 19, 2011 4:42 PM

All replies

  • Hi Bayardw,

    If this issue only occurs when the code is deployed to the cloud, I'd suggest installing Fiddler (with "Decrypt HTTPS traffic" enabled) on the VM instance via RDP and capturing the raw HTTP requests for investigation.

    By the way, to test the average blob download time, I created a console application using the following code and ran it via RDP on the service instance:

    using System;
    using System.IO;
    using System.Reflection;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class DownloadBlob
    {
        static void Main(string[] args)
        {
            var storageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=<removed>;AccountKey=<removed>");
            var blobStorage = storageAccount.CreateCloudBlobClient();

            // The blob size is 5 MB.
            CloudBlob blob = blobStorage.GetBlobReference("files/WindowsAzureProject1.cspkg");

            // Save the download next to the running executable.
            string targetPath = Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "result.zip");

            // Time only the download itself.
            DateTime startTime = DateTime.Now;

            TryDownloadToFile2(blob, targetPath);

            Console.WriteLine((DateTime.Now - startTime).TotalSeconds.ToString() + " seconds");
            Console.ReadLine();
        }

        // Must be static so that it can be called from the static Main method.
        public static bool TryDownloadToFile2(CloudBlob blob, string outputFilePath)
        {
            blob.DownloadToFile(outputFilePath);
            return true;
        }
    }

    After running this console application about 10 times on a small instance, downloading the 5.8 MB blob took between 1 second and 20 seconds.

    Thanks.


    Wenchao Zeng
    Please mark the replies as answers if they help or unmark if not.
    If you have any feedback about my replies, please contact msdnmg@microsoft.com.
    Microsoft One Code Framework
    Wednesday, July 13, 2011 9:12 AM
  • Hi,

    I will mark the reply as answer. If you find it no help, please feel free to unmark it and follow up.

    Thanks.


    Wenchao Zeng
    Wednesday, July 20, 2011 5:34 AM
  • That makes complete sense. I will code that up and try it out. Your advice was very timely. Thank you!
    Friday, July 22, 2011 6:38 PM