AZCOPY.EXE - uploading thousands of small files, how to manage very occasional timeout errors

  • Question

  • I'm using AZCOPY to upload many thousands of small files per day into Azure Blob storage.

    This is a largely very successful process, but we occasionally miss a single file, and this is difficult to manage.

    For example, I'm uploading approximately 2000 files per hour, each file is usually about 50K - 100K.

    In a given day, approximately 24,000 files are uploaded successfully, and we often receive just a single failure per day (sometimes 0, sometimes 1, very rarely a few more). This is the error:

    The transfer failed: The client could not finish the operation within specified timeout.

    This error takes about 15 minutes to arise, which is difficult to manage if we have uploaded the other 2000 files within a few seconds.

    So two questions really:

    1 - How to make this AZCOPY upload process more reliable?

    2 - If we are to get a failure, how to make it fail faster than 15 minutes?

    I only have a 5 minute window each hour to complete my upload process, so I need to upload (usually in 10 - 20 seconds), generate any failures, and then retry the failure, inside the window.

    Thanks!

    Wednesday, July 4, 2018 4:54 PM

All replies

  • Use the latest version of AzCopy for reliable performance (as of today, version 7.1.0 is the current release).

    We have written a very simple batch file that resumes AzCopy when a file transfer fails.

    You can save it as a batch file and change the AzCopy command as you like.

    Please let us know if it works for you and if you need further assistance.

    :START
    AzCopy /Source:[source] /Dest:[dest] /Pattern:[pattern] /DestKey:[key] /Y /S
    if %errorlevel% LSS 0 goto START

    Wednesday, July 4, 2018 7:31 PM
  • Thank you, I have upgraded AzCopy (I was on quite an old version), and I will look at modifying my script.

    My main concern is the timeout takes so long (15 mins) before the error, then doing a retry at that point is really too late.

    I am also testing with reduced concurrency in AzCopy (my default was 16; I have reduced it to 8 using /NC), and I hope to see improved stability.
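    For reference, a minimal sketch of the command I'm testing with /NC capping concurrency (the path, account, container, and key below are placeholders, not my real values):

    rem Hypothetical command: /NC:8 caps AzCopy at 8 concurrent operations
    AzCopy /Source:"C:\upload" /Dest:"https://[account].blob.core.windows.net/[container]" /DestKey:[key] /NC:8 /Y /S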

    Friday, July 6, 2018 4:20 PM
  • AzCopy times out after 15 minutes. For each chunk of a file, we retry up to 10 times if the chunk transfer fails with a timeout. If the file still fails after those retries, we record it in the checkpoint (journal), and you can resume the transfer of that file from the checkpoint. I would suggest you leave your feedback here.
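    As a rough sketch of the resume flow, assuming the /Z option for the journal folder (the paths and key below are placeholders):

    rem Hypothetical example: /Z keeps the transfer journal in C:\azcopy-journal
    AzCopy /Source:"C:\upload" /Dest:"https://[account].blob.core.windows.net/[container]" /DestKey:[key] /Z:"C:\azcopy-journal" /Y /S
    rem Re-running the exact same command after a failure resumes from that journal

    With /Y suppressing the confirmation prompts, the rerun should pick up the incomplete files recorded in the journal instead of starting over.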


    Monday, July 9, 2018 7:35 PM
  • Now AzCopy will retry up to 10 times, for a maximum of about 15 minutes, when a request fails with a network or server issue. This is a best effort to make the transfer succeed.

    But this case needs the transfer to fail fast, within 5 minutes, and AzCopy currently can't do that. We do have an issue tracking a customizable retry policy in DMlib (AzCopy is based on DMlib): https://github.com/Azure/azure-storage-net-data-movement/issues/125. We are still discussing it; if the retry policy is opened up, users will be able to choose whether to fail fast or to try as hard as possible to make the transfer succeed.
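    In the meantime, one possible workaround outside AzCopy itself is an external watchdog that kills the process after 5 minutes and lets a retry loop take over. This is only a sketch, and it assumes AzCopy.exe is the only AzCopy process running on the machine:

    rem Hypothetical watchdog: launch AzCopy in the background, poll every 10 seconds,
    rem and kill it if it is still running after 300 seconds (5 minutes)
    start "" /B AzCopy.exe /Source:[source] /Dest:[dest] /Y /S
    set /a waited=0
    :WAIT
    timeout /t 10 /nobreak >nul
    set /a waited+=10
    tasklist /fi "imagename eq AzCopy.exe" | find /i "AzCopy.exe" >nul
    if errorlevel 1 goto DONE
    if %waited% LSS 300 goto WAIT
    taskkill /im AzCopy.exe /f >nul 2>&1
    :DONE

    Note that start /B discards AzCopy's exit code, so after a kill the caller has to re-run the same command and let the journal resume the transfer.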

    And when a transfer fails, resuming AzCopy with the same command line until the return code is 0 will eventually make the transfer succeed.
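    A minimal sketch of such a loop, bounded at a placeholder cap of 5 attempts so it cannot spin forever:

    rem Sketch: re-run AzCopy with the same command line until it returns 0,
    rem giving up after 5 attempts (the cap is a placeholder)
    set /a attempts=0
    :RETRY
    AzCopy /Source:[source] /Dest:[dest] /DestKey:[key] /Y /S
    if %errorlevel% EQU 0 goto DONE
    set /a attempts+=1
    if %attempts% LSS 5 goto RETRY
    :DONE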

    Tuesday, July 10, 2018 6:17 AM
  • Checking in to see if the above response helped to answer your query. Let us know if there are still any additional issues we can help with.

    Thursday, July 12, 2018 7:06 PM
  • So I created a batch file with: 

    :START
    AzCopy.exe /Source:"C:\Users\s2sco\Documents\Outlook Files\import1" /Dest:"TheSASURL" /V:"c:\temp\logs\azcopy1.log" /Y /s
    if %errorlevel% LSS 0 goto START

    But when I run it, for some reason the executed command ends up with a stray %2 in the middle of the SAS URL. I checked that it is correct in the batch file, but the output on the cmd line has a %2 in the middle of it, and I get this error:

    [2019/11/24 21:40:43][ERROR] The syntax of the command is incorrect. Invalid SAS token in parameter "Dest". Value of "se" is invalid in SAS token.

    Which is true because the SAS URL has an extra two characters in it. How do I stop this from happening?

    Monday, November 25, 2019 2:46 AM
  • @MattyChix Do you run AzCopy in a script like a *.bat file? "%2" has a specific meaning in *.bat (it expands to the second batch parameter), so you need to double every "%" in the SAS.

    And make sure any encoded characters are converted back to ":" where they belong (for example "%3A"), or the SAS will get access denied.
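    A minimal sketch of the escaping, assuming a SAS whose se value contains percent-encoded colons (the account, container, date, and signature are placeholders):

    rem In a .bat file, "%2F" is read as batch parameter %2 followed by "F", which mangles the SAS.
    rem Doubling every percent sign makes cmd pass it through literally:
    AzCopy.exe /Source:"C:\upload" /Dest:"https://[account].blob.core.windows.net/[container]?sv=2018-03-28&se=2019-11-30T00%%3A00%%3A00Z&sig=[signature]" /Y /S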

    BTW, which AzCopy version do you use? The SAS issue might also be caused by a client library version mismatch between AzCopy and the client library used to generate the SAS.

    If the issue still persists, I would request that you create a new forum thread under the Storage Service area, since we would like to follow up there. Also, please share a screenshot of the error message.

    Kindly let us know if the above helps or if you need further assistance on this issue.

    Tuesday, November 26, 2019 7:03 AM