Explanation of why rehydrating a few large blobs from archive is better than rehydrating many small blobs


  • Can someone explain the technical reasons behind the statement "... Large blob sizes are strongly recommended for optimal performance. Rehydrating several small blobs concurrently may add additional time." made in the following document:

    Furthermore, what are the practical impacts of rehydrating many small blobs concurrently? For example, if I were to rehydrate 1000 blobs of 1 GB each, what would the performance difference (namely, the wall-time difference) be compared with rehydrating a single 1000 GB blob? Are the 1000 small blobs rehydrated in parallel or serially? Is the effective throughput of rehydrating 1 TB of data lower if it is distributed among 1000 blobs rather than held in 1 large blob?



    Friday, 13 July 2018 16:28

All replies

  • Hi Robert,

    During the move to Archive storage, many small blobs are packed into a few large objects. If many of these large objects must be retrieved but only a small number of blobs are being moved back to an active tier, efficiency suffers. Larger objects reduce the amount of wasted effort. Efficiency also improves when many small blobs are moved and happen to be packed together. That said, do you have a specific scenario where rehydration time is not meeting expectations? Our guideline is that rehydration typically takes less than 15 hours.
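
    The packing effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of how Azure Storage actually lays out archived data: the fixed pack size of 100 GB, the in-order packing, and the names `read_efficiency`, `PACK_SIZE_GB`, and `TOTAL_BLOBS` are all hypothetical. It compares how much data would have to be read from archive for 1000 co-located 1 GB blobs, 1000 scattered 1 GB blobs, and a single 1000 GB blob.

```python
# Toy model only: the pack size and layout below are assumptions made for
# illustration, not documented Azure Storage behavior.
PACK_SIZE_GB = 100     # hypothetical size of one packed archive object
TOTAL_BLOBS = 10_000   # hypothetical number of 1 GB blobs in the archive


def read_efficiency(requested_ids, blob_size_gb, pack_size_gb=PACK_SIZE_GB):
    """Return requested bytes divided by bytes read from archive.

    Blobs are assumed to be packed in ID order into fixed-size packed
    objects, and retrieving any blob requires reading its whole pack.
    A blob larger than a pack is treated as occupying its own object.
    """
    blobs_per_pack = max(1, pack_size_gb // blob_size_gb)
    # Every pack that holds at least one requested blob must be read in full.
    packs_touched = {blob_id // blobs_per_pack for blob_id in requested_ids}
    gb_read = len(packs_touched) * blobs_per_pack * blob_size_gb
    gb_wanted = len(requested_ids) * blob_size_gb
    return gb_wanted / gb_read


# 1000 small blobs that happen to sit next to each other in the archive.
colocated = read_efficiency(range(1000), blob_size_gb=1)

# 1000 small blobs spread evenly across the archive (every 10th blob).
scattered = read_efficiency(range(0, TOTAL_BLOBS, 10), blob_size_gb=1)

# One large 1000 GB blob.
single_large = read_efficiency([0], blob_size_gb=1000)

print(f"co-located small blobs: {colocated:.2f}")
print(f"scattered small blobs:  {scattered:.2f}")
print(f"single large blob:      {single_large:.2f}")
```

    In this model the scattered case reads far more data than was requested, while the co-located case and the single large blob read only what is needed. The real ratio depends on packing details that are not stated in this thread.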


    Klaas, Azure Storage 

    klaas [Principal PM Manager @Microsoft]

    Friday, 13 July 2018 23:12
  • Checking in to see if the above response helped to answer your query. Let us know if there are still any additional issues we can help with.
    1 hour and 53 minutes ago