Async CopyBlob fails with "403: Copy failed when reading the source"

  • Question

  • Hi there!

    We use the CopyBlob functionality to copy files between storage accounts, passing a URL with a shared access signature (SAS) as the copy source.

    The problem:

    Though the CopyBlob request is accepted and starts executing, checking the copy status fails after some time with

    403: Copy failed when reading the source

    Our tests showed that this is caused by expiration of the source URL.

    If I use a private URL (with a shared access signature) to download the file, the URL's expiration is checked only once, when the request is made, and then I have plenty of time to download all the content. So why does CopyBlob (which runs entirely on the Azure side) behave differently?

    With CopyBlob working this way, we have to guess how long the Azure server will take to copy a particular file (and generate a shared access signature with a long enough lifetime). The CopyBlob documentation doesn't define any time frame within which the copy operation is guaranteed to complete, so in the worst case we would have to generate a source URL with an unlimited lifetime.

    In my opinion the permission check should be done only once (during the CopyBlob REST API call), and then copying should proceed regardless of how long it takes.

    Is this a bug or a feature, and in the latter case, how should we handle it properly?

    Sincerely,

    IP

    Monday, March 24, 2014 1:34 PM

All replies

  • Hi,

    How did you write the URL? Did you use a timeout, like this?

    https://myaccount.blob.core.windows.net?comp=list&timeout=20
    

    You can refer to this link: http://msdn.microsoft.com/en-us/library/windowsazure/dd179431.aspx

    If there is any new information, please let me know.

    Regards,

    Will



    Tuesday, March 25, 2014 9:41 AM
  • No, we do not specify timeouts. Does the timeout affect the asynchronous CopyBlob request somehow?

    I think the question was slightly misunderstood. The problem is not that some individual request fails, but that the asynchronous copy process itself fails. The logs below describe in detail what is happening.

    1. We initiate a CopyBlob task for a 304 MB file from srcaccount to dstaccount. As the x-ms-copy-source we specify a presigned URL with a lifetime of 1 minute (enough to start the async task for testing purposes).

    PUT /dst/Windows6.1-KB968211-x64-RefreshPkg.msu HTTP/1.1
    x-ms-copy-source: https://srcacc.blob.core.windows.net/src/Windows6.1-KB968211-x64-RefreshPkg.msu?ss=2014-03-25T13%3A13%3A38Z&se=2014-03-25T13%3A14%3A36Z&sp=r&sr=b&sig=d3tn6iHJS2JViC25bADlXZhhjEFimEiyp%2BGSic8G%2FmQ%3D
    x-ms-date: Tue, 25 Mar 2014 13:13:38 GMT
    x-ms-version: 2013-08-15
    Authorization: SharedKey dstacc:t7umNwAHbx36ca+4LvZWF1tLtE/yJEzulYQO38yZvsY=
    Content-Length: 0

    202: Accepted
    x-ms-request-id: 6adcd35a-8d1f-4f32-af52-8b4e7c646f2e
    x-ms-version: 2013-08-15
    x-ms-copy-id: bc3ce350-da98-4204-ba5f-24e5922ce135
    x-ms-copy-status: pending
    Date: Tue, 25 Mar 2014 13:13:38 GMT
    ETag: "0x8D116432C75C8BF"
    Last-Modified: Tue, 25 Mar 2014 13:13:40 GMT
    Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0


    2. Then the only thing we need to do is check the copy status periodically.

    2.1. First status check (most irrelevant headers are omitted):

    HEAD /dst/Windows6.1-KB968211-x64-RefreshPkg.msu HTTP/1.1

    200: OK
    x-ms-copy-id: bc3ce350-da98-4204-ba5f-24e5922ce135
    x-ms-copy-source: https://srcacc.blob.core.windows.net/src/Windows6.1-KB968211-x64-RefreshPkg.msu?ss=2014-03-25T13%3A13%3A38Z&se=2014-03-25T13%3A14%3A36Z&sp=r&sr=b&sig=d3tn6iHJS2JViC25bADlXZhhjEFimEiyp%2BGSic8G%2FmQ%3D
    x-ms-copy-status: pending
    x-ms-copy-progress: 0/318432502
    Date: Tue, 25 Mar 2014 13:13:52 GMT


    2.2. The next check reports that the copy is well under way (note the x-ms-copy-progress header):

    HEAD /dst/Windows6.1-KB968211-x64-RefreshPkg.msu HTTP/1.1

    200: OK
    x-ms-copy-id: bc3ce350-da98-4204-ba5f-24e5922ce135
    x-ms-copy-source: https://srcacc.blob.core.windows.net/src/Windows6.1-KB968211-x64-RefreshPkg.msu?ss=2014-03-25T13%3A13%3A38Z&se=2014-03-25T13%3A14%3A36Z&sp=r&sr=b&sig=d3tn6iHJS2JViC25bADlXZhhjEFimEiyp%2BGSic8G%2FmQ%3D
    x-ms-copy-status: pending
    x-ms-copy-progress: 201326592/318432502
    Date: Tue, 25 Mar 2014 13:14:29 GMT


    2.3. Then at some point the copy status becomes 'failed', with the reason given in the 'x-ms-copy-status-description' header:

    HEAD /dst/Windows6.1-KB968211-x64-RefreshPkg.msu HTTP/1.1

    200: OK
    x-ms-request-id: ca148b28-71b0-433e-ae77-426889fce407
    x-ms-copy-id: bc3ce350-da98-4204-ba5f-24e5922ce135
    x-ms-copy-source: https://srcacc.blob.core.windows.net/src/Windows6.1-KB968211-x64-RefreshPkg.msu?ss=2014-03-25T13%3A13%3A38Z&se=2014-03-25T13%3A14%3A36Z&sp=r&sr=b&sig=d3tn6iHJS2JViC25bADlXZhhjEFimEiyp%2BGSic8G%2FmQ%3D
    x-ms-copy-status: failed
    x-ms-copy-status-description: 403 AuthenticationFailed "Copy failed when reading the source."
    x-ms-copy-progress: 268435456/318432502
    x-ms-copy-completion-time: Tue, 25 Mar 2014 13:14:37 GMT
    Date: Tue, 25 Mar 2014 13:14:36 GMT

    As you can see, the copy process, though successfully started, failed in the middle of the operation just because the initial presigned URL expired.
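
    As a non-authoritative sketch (again assuming the destination URL carries a valid SAS token, since SharedKey signing is omitted), the periodic status check could be written like this:

    import time
    import requests

    def wait_for_copy(dst_url, interval=5.0):
        """Poll the destination blob until x-ms-copy-status leaves 'pending'."""
        while True:
            resp = requests.head(dst_url, headers={"x-ms-version": "2013-08-15"})
            resp.raise_for_status()
            status = resp.headers.get("x-ms-copy-status")
            print(status, resp.headers.get("x-ms-copy-progress"))
            if status != "pending":
                # 'success', 'failed', or 'aborted'; the description header
                # explains failures such as the 403 shown in the log above.
                return status, resp.headers.get("x-ms-copy-status-description")
            time.sleep(interval)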

    The question is:

    Is this expected behavior of the Azure server, or a bug?

    If this is expected behavior, how long should the source file URL live? The documentation doesn't give any predefined time frame, and I would rather not generate time-unlimited URLs, for security reasons.





    Tuesday, March 25, 2014 1:56 PM
  • SAS is designed so that once it expires, any further request returns unauthorized. Authorization is checked on every request, not just the first one. Indeed, it can be difficult to figure out how long the expiration time should be, not only for copying blobs but for other scenarios as well (such as granting a specific user access to the data for some period). However, if authorization were checked only once, the expiration time would be essentially useless: a user could keep accessing the blob even after the SAS expired, as long as he had made at least one request before it expired.
    Thursday, March 27, 2014 7:34 AM
  • Yes, SAS is checked on each request, but it should be checked only at the moment of the request!

    If you make another request after expiration, it must fail. But a request made before expiration can still download the file even if the URL expires during the download.

    The problem with CopyBlob is that the SAS was successfully verified at the moment of the request. After that, performing the copy is the Azure server's job.

    After initiating CopyBlob I make no further requests with the SAS. It doesn't matter whether I check the async copy status or not; the copy fails after some time on the Azure server side.

    Compare the following cases, assuming I generated a URL that lives for 2 minutes:

    1) If I simply download the file over a slow connection (64 Kb/s), I can keep going for hours; the file will eventually be downloaded even though the URL expired 2 minutes after the request was initiated. (You can test this with any download manager, for example by limiting the number of connections to 1 and the download speed to 64 Kb/s.)

    2) If I try CopyBlob, it fails after 2 minutes, and there is nothing I can do about it.

    Don't you find this curious? CopyBlob should run to completion once it has been successfully initiated.

    Thursday, March 27, 2014 11:08 AM
  • Copy blob may look like a single operation, but it actually ends up issuing multiple requests. To download a large blob (as part of the copy), it is not a good idea to fetch everything in a single request. Instead, the recommended approach is to divide the huge blob into multiple ranges (pieces) and download several of them, say 10, concurrently; when one piece finishes, start downloading the 11th, and if a piece fails, re-download just that piece rather than the whole blob. With this design, copying a large blob requires many requests, and each request must be authenticated.
    In general, you need to estimate how long the target operation (be it copying a blob, letting someone else access the blob, or something else) will take to complete, and set the expiration time somewhat larger than the estimate, as in the sketch below.
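
    A minimal Python sketch of this estimate-and-pad approach; the assumed throughput, safety factor, and minimum window are illustrative guesses to tune, not documented values:

    from datetime import datetime, timedelta, timezone

    def sas_expiry_for_copy(blob_size_bytes,
                            assumed_bytes_per_sec=10 * 1024 * 1024,  # guessed
                            safety_factor=4,
                            minimum=timedelta(minutes=15)):
        """Pick a SAS expiry comfortably beyond the estimated copy duration."""
        estimated = timedelta(seconds=blob_size_bytes / assumed_bytes_per_sec)
        return datetime.now(timezone.utc) + max(estimated * safety_factor, minimum)

    # The 304 MB blob from the logs above gets at least the 15-minute floor.
    print(sas_expiry_for_copy(318432502))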
    Friday, March 28, 2014 6:49 AM
  • This is not a question about how to speed up the download or copy process. It's a question about the Blob API.

    The API call should either return an error if the permissions are insufficient, or run to completion.

    The GetBlob API call checks the permissions once and then provides you with the blob data regardless of subsequent security changes. It will never stop sending data because your URL expired mid-transfer. It can fail for other reasons (network, internal error, etc.), and for a new API call you would have to provide a new security token.

    CopyBlob is a single API call, and I expect only a single security check, performed before the process starts.

    I don't need to know how it's implemented internally. It could be implemented by obtaining a security token (like a file handle) on start and then using it for a million internal requests. That token lives in a server-side component; the client never sees or uses it, so it's not a security hole. A new CopyBlob request would still require a new token.

    It's like your local filesystem: security is checked only when you open the file to get a handle. Once you hold the handle and it isn't closed, you are allowed to work with it (read, write, etc.) even if the permissions change. You can perform read/write operations asynchronously, and they will work even after the security settings have changed to deny you any access.
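
    To make the analogy concrete, here is a small Python illustration for a Unix-like system (the file path is hypothetical):

    import os

    path = "/tmp/handle_demo.txt"
    with open(path, "w") as f:
        f.write("hello")

    f = open(path)             # the permission check happens here, at open()
    os.chmod(path, 0o000)      # revoke all permissions afterwards
    print(f.read())            # still succeeds: the handle predates the change
    f.close()

    os.chmod(path, 0o644)      # restore permissions so cleanup works
    os.remove(path)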

    As for estimation, that is exactly what I would like to avoid: it leads to generating long-lived SAS tokens. Today's blob size limits are hundreds of GB, but tomorrow they will be TBs. Should an ad-hoc SAS be generated to last several days? That contradicts the recommendations for ad-hoc SAS usage.

    Friday, March 28, 2014 11:21 AM
  • Hi,


    From a support perspective this is really beyond what we can do here in the forums. If you cannot find your answer here or on your own, consider opening a support case with us. Visit this link to see the various support options available to better meet your needs: http://support.microsoft.com/default.aspx?id=fh;en-us;offerprophone.


    Regards, Jun Zh - MSFT, Microsoft Online Community Support

    Tuesday, April 1, 2014 9:06 AM