AZCOPY.EXE - uploading thousands of small files, how to manage very occasional timeout errors

    Question

  • I'm using AZCOPY to upload many thousands of small files per day into Azure Blob storage.

    This is a largely very successful process, but we occasionally miss a single file, and this is difficult to manage.

    For example, I'm uploading approximately 2000 files per hour, each file is usually about 50K - 100K.

    In a given day, approximately 24,000 files are uploaded successfully, and we often receive just a single failure per day (sometimes 0, sometimes 1, very rarely a few more). This is the error:

    The transfer failed: The client could not finish the operation within specified timeout.

    This error takes about 15 minutes to arise, which is difficult to manage if we have uploaded the other 2000 files within a few seconds.

    So two questions really:

    1 - How to make this AZCOPY upload process more reliable?

    2 - If we are to get a failure, how to make it fail faster than 15 minutes?

    I only have a 5 minute window each hour to complete my upload process, so I need to upload (usually in 10 - 20 seconds), generate any failures, and then retry the failure, inside the window.
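    Since AzCopy itself has no switch to give up early, one option is to drive the upload from a wrapper that enforces its own per-attempt deadline and retries inside the window. A minimal sketch in Python (the function name, the deadline, and the retry count are illustrative assumptions, not AzCopy features — the hung process is simply killed and restarted):

    ```python
    import subprocess

    def run_with_deadline(cmd, deadline_s, max_retries):
        """Run cmd, killing it after deadline_s seconds.

        Retries up to max_retries times; returns 0 on success,
        1 if every attempt failed or timed out.
        """
        for attempt in range(1, max_retries + 1):
            try:
                result = subprocess.run(cmd, timeout=deadline_s)
                if result.returncode == 0:
                    return 0
            except subprocess.TimeoutExpired:
                pass  # treat a hung attempt as a failure and retry
        return 1

    # Hypothetical usage with placeholder arguments:
    # run_with_deadline(
    #     ["AzCopy", "/source:[source]", "/dest:[dest]", "/DestKey:[key]", "/y", "/s"],
    #     deadline_s=60, max_retries=3)
    ```

    With a 60-second deadline and three attempts, the worst case stays well inside a 5-minute window instead of waiting 15 minutes for AzCopy's own timeout.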

    Thanks!

    Wednesday, 4 July 2018 16:54

All replies

  • Use the latest version of AzCopy for reliable performance (the current version, 7.1.0, is up to date).

    We have written a very simple batch file that resumes AzCopy when a file transfer fails.

    You can save it as a batch file and change the AzCopy command as you like.

    Please let us know if it works for you and if you need further assistance.

    :START
    AzCopy /source:[source] /dest:[dest] /pattern:[pattern] /DestKey:[key] /y /s
    if %errorlevel% NEQ 0 goto START

    Wednesday, 4 July 2018 19:31
    Moderator
  • Thank you. I have upgraded AzCopy (I was on quite an old version), and I will look at modifying my script.

    My main concern is that the timeout takes so long (15 minutes) to raise the error; retrying at that point is really too late.

    I am also testing with reduced concurrency in AzCopy (my default was 16; I have reduced it to 8 using /NC), and I hope to see improved stability.

    Friday, 6 July 2018 16:20
  • AzCopy times out after 15 minutes. For each chunk of a file, we retry up to 10 times if the chunk transfer fails with a timeout. If the file still fails after these retries, we record it in the checkpoint, and you can resume the transfer of that file from the checkpoint. I would suggest you leave your feedback here.


    Monday, 9 July 2018 19:35
    Moderator
  • Currently, AzCopy retries up to 10 times, for a maximum of 15 minutes, when a request fails with a network or server issue. This is a best-effort attempt to make the transfer succeed.

    In your case, however, the transfer needs to fail fast, within 5 minutes, and AzCopy currently cannot do that. We do have an issue tracking a customizable retry policy in DMLib (AzCopy is based on DMLib): https://github.com/Azure/azure-storage-net-data-movement/issues/125. We are still discussing it; once the retry policy is exposed, users will be able to choose between failing fast and retrying as long as possible to make the transfer succeed.

    When a transfer fails, resuming AzCopy with the same command line until the return code is 0 will eventually make the transfer succeed.

    Tuesday, 10 July 2018 06:17
  • Checking in to see if the above response helped to answer your query. Let us know if there are still any additional issues we can help with.

    Thursday, 12 July 2018 19:06
    Moderator