How long does it take to find a file in Azure Blob storage when the number of blobs is large?

  • Question

  • I have one container that stores the Word files, images, and audio files for my project. All blob names are stored in the database at upload time. When a user logs in, I download all the files assigned to that user. If I keep only one container, does that affect the search speed?

    Because of some functional issues, I can split them into at most three containers:

    1) wordfile

    2) image

    3) audio

    Could you please suggest a solution?

    Monday, April 18, 2011 12:20 PM

All replies

  • As I recall, a container for blobs is essentially a partition boundary. As such, all objects retrieved from a single container are subject to the same published partition-level performance limits (500 transactions per second, for example). By splitting them up, you can distribute this load more effectively.

    However, if your objects have a specific one-to-one relationship with a user, you may want to consider using the user as the partition ID, which lets you distribute the load by user. Furthermore, depending on what you're trying to accomplish, you may want to consider a table that uses the user as the partition key and stores metadata about each of the blobs you have on file. Then you can query that table for the user's metadata and only hit blob storage once the user actually requests a specific blob.
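
    A minimal sketch of the metadata-table idea, using an in-memory list in place of a real table so that it runs on its own. The `BlobMetadataRow` class, its `ContentType` and `SizeInBytes` fields, and the `BlobMetadataIndex.ForUser` helper are hypothetical names chosen for illustration; only the keying (user as partition key, one row per blob) comes from the suggestion above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical metadata row: one row per uploaded blob.
public class BlobMetadataRow
{
    public string PartitionKey { get; set; } // the user ID (assumption: it is unique)
    public string RowKey { get; set; }       // the blob name
    public string ContentType { get; set; }  // illustrative metadata field
    public long SizeInBytes { get; set; }    // illustrative metadata field
}

public static class BlobMetadataIndex
{
    // Returns the rows for one user, ordered by blob name.
    // Against a real table this would be a query on a single partition key.
    public static List<BlobMetadataRow> ForUser(IEnumerable<BlobMetadataRow> rows, string userId)
    {
        return rows.Where(r => r.PartitionKey == userId)
                   .OrderBy(r => r.RowKey, StringComparer.Ordinal)
                   .ToList();
    }
}

public static class MetadataDemo
{
    public static void Main()
    {
        var rows = new List<BlobMetadataRow>
        {
            new BlobMetadataRow { PartitionKey = "user1", RowKey = "report.doc", ContentType = "application/msword", SizeInBytes = 20480 },
            new BlobMetadataRow { PartitionKey = "user2", RowKey = "photo.jpg", ContentType = "image/jpeg", SizeInBytes = 51200 },
            new BlobMetadataRow { PartitionKey = "user1", RowKey = "avatar.jpg", ContentType = "image/jpeg", SizeInBytes = 4096 }
        };

        // At login, show the user's file list from metadata alone;
        // blob storage is only contacted when a specific file is opened.
        foreach (BlobMetadataRow row in BlobMetadataIndex.ForUser(rows, "user1"))
        {
            Console.WriteLine("{0} ({1}, {2} bytes)", row.RowKey, row.ContentType, row.SizeInBytes);
        }
    }
}
```

    The point of the design is that login only reads small metadata rows, and each blob download is deferred until the user asks for that file.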

    • Proposed as answer by Wenchao Zeng Wednesday, April 20, 2011 3:58 AM
    Monday, April 18, 2011 1:16 PM
  • In your container, name each blob with the user's username (if it is unique) or, better, the user's ID, followed by "/" and the blob name.

    When the user logs in, you then download only the files in the blob directory named after that unique username or ID.

    Take a look at the CloudBlobDirectory members here: http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.cloudblobdirectory_members.aspx

     

    So if my unique name is testuser, the blob names will be:

    testuser/someblob.jpg

    testuser/someaudio.vmw
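
    A minimal sketch of this naming scheme, assuming user IDs never contain "/". The `UserBlobNames` class and its methods are hypothetical helpers, and the in-memory filter only stands in for a server-side listing of the user's blob directory (the CloudBlobDirectory class linked above).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UserBlobNames
{
    // Builds "<userId>/<fileName>", so every blob of a user shares one prefix.
    public static string Build(string userId, string fileName)
    {
        if (string.IsNullOrEmpty(userId) || userId.Contains("/"))
            throw new ArgumentException("userId must be non-empty and must not contain '/'.", "userId");
        if (string.IsNullOrEmpty(fileName))
            throw new ArgumentException("fileName must be non-empty.", "fileName");
        return userId + "/" + fileName;
    }

    // Selects the names that belong to one user. The trailing "/" matters:
    // without it, the prefix "testuser" would also match "testuser2/...".
    public static List<string> ForUser(IEnumerable<string> blobNames, string userId)
    {
        string prefix = userId + "/";
        return blobNames.Where(n => n.StartsWith(prefix, StringComparison.Ordinal)).ToList();
    }
}

public static class PrefixDemo
{
    public static void Main()
    {
        var names = new List<string>
        {
            UserBlobNames.Build("testuser", "someblob.jpg"),
            UserBlobNames.Build("testuser2", "other.jpg"),
            UserBlobNames.Build("testuser", "someaudio.vmw")
        };

        // Only the two blobs under "testuser/" are selected.
        foreach (string name in UserBlobNames.ForUser(names, "testuser"))
        {
            Console.WriteLine(name);
        }
    }
}
```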

    Monday, April 18, 2011 1:17 PM
  • Thank you for your suggestion. My blob names are unique. Suppose I have 1 lakh (100,000) small images in the container and want to download one specific image. Is that slower than downloading one specific image from a container that holds only 500 images? Thanks in advance.

    To retrieve the image I am using the following code:

        // Requires: using System; using Microsoft.WindowsAzure; using Microsoft.WindowsAzure.StorageClient;
        string sAccountName = "abcdtest";
        string sAccountKey = "some key"; // placeholder; the real account key is a Base64 string

        StorageCredentialsAccountAndKey accountAndKey = new StorageCredentialsAccountAndKey(sAccountName, sAccountKey);

        // The second argument is useHttps; false means the blob URI uses HTTP
        CloudStorageAccount account = new CloudStorageAccount(accountAndKey, false);

        var container = account.CreateCloudBlobClient().GetContainerReference("abcd");

        // Reference the blob directly by name; this does not list or search the container
        CloudBlob blob = container.GetBlobReference("user1_123456.doc");

        // Create a read-only policy, valid from 10 minutes ago until 15 minutes from now
        SharedAccessPolicy policy = new SharedAccessPolicy();
        policy.Permissions = SharedAccessPermissions.Read;
        policy.SharedAccessStartTime = DateTime.UtcNow - TimeSpan.FromMinutes(10);
        policy.SharedAccessExpiryTime = DateTime.UtcNow + TimeSpan.FromMinutes(15);

        // Create the signature (a query string) for the blob from the policy
        string sSignature = blob.GetSharedAccessSignature(policy);

        // Append the signature to the blob URI to form the signed URL
        string sSignedUrl = blob.Uri.AbsoluteUri + sSignature;

    • Edited by Ninds001 Monday, April 18, 2011 1:50 PM
    Monday, April 18, 2011 1:38 PM
  • The answer is no. If you download an image directly by the blob's full path, performance does not degrade as the number of blobs in the container grows.
    • Proposed as answer by Wenchao Zeng Wednesday, April 20, 2011 3:58 AM
    Monday, April 18, 2011 1:47 PM
  • However, there is a throughput limit per container. If you try to download multiple blobs from the same container simultaneously, you may run into performance issues, so plan accordingly.
    Monday, April 18, 2011 2:02 PM
  • No, I am not downloading blobs simultaneously. I am using a foreach loop and downloading the blobs one by one. Thank you so much for your helpful suggestion.


    • Edited by Ninds001 Tuesday, April 19, 2011 6:02 AM spelling mistake
    Tuesday, April 19, 2011 5:48 AM
  • Just to clear a few things up: there is no throughput limit per container. There used to be one back in 2008, when we first provided our public CTP, but we got rid of that restriction in 2009. Instead, the throughput scalability targets are defined in terms of the overall storage account, as well as per blob.

     

    The partition key for the blob namespace is “container name + blob name” (basically the blob name).   This allows us to split the blob namespace across different servers and scale out the access across servers for blobs within a container.   So you can put all of your blobs within a single container and get the same scalability and throughput as if you used many containers.
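
    A minimal sketch of why one container does not pin all blobs to one server. It assumes, for illustration only, that the key space is divided into contiguous ranges. The split points, the `BlobPartitionSketch` class, and the "/" separator are made up, and the real service chooses and moves its own boundaries.

```csharp
using System;

public static class BlobPartitionSketch
{
    // Partition key as described above: container name + blob name.
    // The "/" separator is an assumption made for readability.
    public static string PartitionKey(string container, string blobName)
    {
        return container + "/" + blobName;
    }

    // Returns the index of the first range whose exclusive upper bound is
    // greater than the key, or upperBounds.Length for the final open range.
    // upperBounds must be sorted in ordinal order.
    public static int RangeIndex(string key, string[] upperBounds)
    {
        for (int i = 0; i < upperBounds.Length; i++)
        {
            if (string.CompareOrdinal(key, upperBounds[i]) < 0)
            {
                return i;
            }
        }
        return upperBounds.Length;
    }
}

public static class PartitionDemo
{
    public static void Main()
    {
        // Hypothetical split points, both inside the single container "abcd".
        string[] upperBounds = { "abcd/m", "abcd/t" };

        string[] blobNames = { "image_001.jpg", "report.doc", "user1_123456.doc" };
        foreach (string blobName in blobNames)
        {
            string key = BlobPartitionSketch.PartitionKey("abcd", blobName);
            Console.WriteLine("{0} -> range {1}", key, BlobPartitionSketch.RangeIndex(key, upperBounds));
        }
    }
}
```

    With these made-up boundaries the three blobs fall into three different ranges even though they share a container, which is the property that lets requests within one container be served by different servers.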

     

    Please see the end of the following blog post to see the scalability targets for Blobs, Tables and Queues, and what the partition keys are:

    http://blogs.msdn.com/b/windowsazurestorage/archive/2010/05/10/windows-azure-storage-abstractions-and-their-scalability-targets.aspx

     

    In addition, see the following post that describes the high level architecture and some of these concepts:

    http://blogs.msdn.com/b/windowsazurestorage/archive/2010/12/30/windows-azure-storage-architecture-overview.aspx

     

     

    Thanks,

    Brad

    • Marked as answer by Wenchao Zeng Monday, April 25, 2011 3:49 AM
    Saturday, April 23, 2011 8:57 AM
  • Thanks Brad. Majorly helpful. :)
    Monday, April 25, 2011 12:45 PM