Blob copying very slow between two different containers in the same storage account

    Question

  • Hi

    I'm pretty new to Azure and am getting stuck in wherever I can to learn more about the available services.  Something that has quickly raised its head is copying blobs from one place to another.  From general discussion on various forums, blob copy speed depends heavily on the physical locations of the data centers that the source and destination storage accounts belong to.  I decided to start by copying one blob from what should be a local location to the same local location.  To do this I did the following...

    • created a new storage account in North Europe and set it as locally redundant (because I don't need anything more for the purposes of my tests)
    • spun up a basic VM (standard storage, not premium) and had its disk created in "container1" in the new storage account (OS disk, 46 GB used of 127 GB)
    • created "container2", empty, as the intended target for the blob copy
    • did some research and adapted a PowerShell script written for copying blobs across storage accounts, which also appears to work fine for copying within the same storage account

    I initiated the copy with Start-CopyAzureStorageBlob and then checked on progress intermittently using Status and BytesCopied from Get-AzureStorageBlobCopyState.  It went pretty slowly: the entire operation took approximately 45 minutes to complete.  Are there specific nuances of blob copying that I might simply not be aware of?
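The progress check described above can be turned into a rough time-remaining estimate. The following is only a minimal sketch in Python, not part of the original script: the function name and parameters are hypothetical, and it assumes that two BytesCopied readings and the blob's total size in bytes have already been obtained from the copy state.

```python
def estimate_remaining_seconds(bytes_copied_then, bytes_copied_now,
                               total_bytes, interval_seconds):
    """Estimate seconds left from two BytesCopied samples.

    All arguments are hypothetical inputs: two BytesCopied readings taken
    interval_seconds apart, plus the blob's total size in bytes.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Average copy rate between the two samples, in bytes per second.
    rate = (bytes_copied_now - bytes_copied_then) / interval_seconds
    if rate <= 0:
        # No progress between samples, so no estimate is possible.
        return None
    return (total_bytes - bytes_copied_now) / rate


GIB = 1024 ** 3
# Illustrative numbers only: 1 GiB more copied in 60 s, 44 GiB still to go.
remaining = estimate_remaining_seconds(1 * GIB, 2 * GIB, 46 * GIB, 60)
print(f"about {remaining / 60:.0f} minutes remaining")
```

Because the asynchronous copy runs on spare capacity, the rate between two samples may not hold for the rest of the copy, so an estimate like this is indicative at best.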

    Feedback would be appreciated, thanks in advance.

    Friday, February 5, 2016 8:23 PM

Answers

  • IFNGPS,

    "Start-CopyAzureStorageBlob" is asynchronous by nature, and so is AzCopy, which by default uses server-side asynchronous copy. Between storage accounts, the asynchronous copy blob operation runs in the background using spare bandwidth capacity, so there is no SLA in terms of how fast a blob will be copied.

    You can force a synchronous copy by specifying the "/SyncCopy" parameter for AzCopy, which ensures that the copy operation gets a consistent speed.

    However, it is important to note that AzCopy performs the synchronous copy by downloading the blobs from the specified source to local memory and then uploading them to the Blob storage destination. Performance will therefore also depend on network conditions between the location where AzCopy is being run and the Azure data center. Also note that /SyncCopy might generate additional egress cost compared to asynchronous copy. The recommended approach is to run it on an Azure VM in the same region as your source storage account to avoid egress cost.

    For more details, please take a look at "Transfer data with the AzCopy Command-Line Utility" and "Microsoft Azure Storage Performance and Scalability Checklist".

    Hope this helps.

    Sriprasad



    Wednesday, February 10, 2016 8:39 PM

All replies

  • Hi,

    For best performance when copying blobs within the same storage account or to other storage accounts, use AzCopy.

    https://azure.microsoft.com/en-in/documentation/articles/storage-use-azcopy/

    https://www.opsgility.com/blog/windows-azure-powershell-reference-guide/copying-vhds-blobs-between-storage-accounts/

    Hope this helps.

    Girish Prajwal

    Saturday, February 6, 2016 12:58 PM
    Moderator
  • Hi Girish

    I created a new container in the same storage account and used AzCopy to transfer the same source blob from container A to the new container, but the speed was almost exactly the same as using the Start-CopyAzureStorageBlob method. I wrote my statement following the syntax under "Copy a blob within a storage account"...

    AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer1 /Dest:https://myaccount.blob.core.windows.net/mycontainer2 /SourceKey:key /DestKey:key /Pattern:abc.txt

    ...which is what gave me the slow speed.  The transfer rate peaked out at approx 18 MBps and took almost 45 mins to complete, more or less identical results to my existing copy method.
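As a cross-check on these figures, the average rate implied by the 45-minute duration can be worked out from the disk sizes given in the original question. This is only a back-of-the-envelope sketch in Python with a hypothetical helper name. It assumes the whole copy took exactly 45 minutes, and it tries both the used size (46 GB) and the full size (127 GB), since the thread does not say how much data was actually moved.

```python
def average_rate_mb_per_s(size_bytes, minutes):
    """Average throughput in MB/s (1 MB = 1,000,000 bytes)."""
    return size_bytes / (minutes * 60) / 1_000_000


MINUTES = 45  # assumed exact; the post says "almost 45 mins"

# The post does not say whether "GB" is decimal or binary, so try both.
used_decimal = average_rate_mb_per_s(46 * 1000**3, MINUTES)
used_binary = average_rate_mb_per_s(46 * 1024**3, MINUTES)
full_binary = average_rate_mb_per_s(127 * 1024**3, MINUTES)

print(f"46 GB used, decimal units: {used_decimal:.1f} MB/s")
print(f"46 GB used, binary units:  {used_binary:.1f} MB/s")
print(f"127 GB full disk:          {full_binary:.1f} MB/s")
```

The used-size figures come out at roughly 17 to 18 MB/s, which is in line with the observed rate of approximately 18 MBps, whereas the full-disk figure is around 50 MB/s. That would be consistent with only the used portion of the disk having been transferred, though the thread does not confirm this.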

    I'm beginning to wonder if this is simply as fast as data transfers in the North Europe location can go. My reasoning is that I have reduced my test to measuring only the speed between two containers in the same storage account. I haven't yet done the same test in a different region.

    Thanks for your input.

    Edit: Sincere apologies, I made the noob mistake of not checking which storage account key values were in use, and because I had used the same container names in my different storage accounts, my script didn't error out. My timings were actually for a copy from "storage account 1\copyfromcontainer" to "storage account 2\copytocontainer".  To *correctly* summarise...

    storage account 1 + 2, both in North Europe and set as locally redundant

    transfer from "storage account 1\copyfromcontainer" to "storage account 1\copytocontainer", instant

    transfer from "storage account 1\copyfromcontainer" to "storage account 2\copytocontainer", 45 minutes

    My query is still the same though. Given that copy operations between storage accounts in the same region are supposed to be much quicker than copying between different regions, is this 18 MBps figure the best I can hope for? Has anyone else tried something similar and achieved faster results?


    • Edited by IFNGPS Saturday, February 6, 2016 3:31 PM new / corrected information
    Saturday, February 6, 2016 2:51 PM
  • Hi Sriprasad

    I've considered doing this for when I need the transfer to run as fast as possible: spinning up an Azure VM with sufficient storage in the same region as the storage accounts and running the AzCopy /SyncCopy operation from there.  Hopefully this will yield a faster copy than the asynchronous background copy currently does.

    Thanks for your input.

    Thursday, February 11, 2016 5:35 PM