How to Upload Large Files (100 to 2000 MB) to a Web Role, then to Blob Storage

  • Question

  • I want to have people upload large files to my web role, which I will then turn around and upload to Azure Blob Storage using my own credentials.  I don't want the person (client) to go directly to Azure Blob Storage because I don't want that client to have my Azure credentials. I want the file to come to a web role first, and then I'll push it to blob storage myself.  That is, I will authenticate the user myself, get the large file (a byte array, I assume), then use the Azure Blob Storage API to push it to blob storage.

    My client will always be a .NET application.  I'd need to be able to show progress as the file is pushed to the web role.  Also, can the web role handle a byte array of 2 GB?  Any suggestions on how to architect and implement this would be appreciated.


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Thursday, September 9, 2010 5:35 AM


All replies

  • Hi Peter,

    One way to implement this is to divide the input file into chunks of bytes, receive those chunks in your web role, and then transfer them as blocks to Azure Blob Storage. Rob Gillen has an excellent post about how to achieve this: http://weblogs.asp.net/rgillen/archive/2010/04/26/external-file-upload-optimizations-for-windows-azure.aspx (a rough sketch of the block-by-block upload appears after this reply).

    If you're using the ASP.NET file upload control to upload these large files, you may want to try Telerik's file upload control, which I believe displays progress as the file gets uploaded. With ASP.NET you also need to address the maximum request size, a setting you can change in web.config (the httpRuntime maxRequestLength attribute).

    I'm just curious as to what you mean by "My client will always be a .Net Application", and why in particular you are interested in uploading the file to a web role rather than a worker role.

    Hope this helps.

    Thanks

    Gaurav Mantri

    Cerebrata Software

    http://www.cerebrata.com

    • Marked as answer by Peter Kellner Thursday, September 9, 2010 1:55 PM
    Thursday, September 9, 2010 6:11 AM
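
    A minimal sketch of the block-by-block upload Gaurav describes, assuming the v1 Windows Azure StorageClient library (Microsoft.WindowsAzure.StorageClient); the connection string, container name, blob name and 4 MB block size are illustrative assumptions, not values from this thread:

    // Sketch only: upload a large local file to a block blob in 4 MB blocks,
    // then commit the block list. Progress can be reported per block.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class BlockUploader
    {
        static void UploadInBlocks(string connectionString, string containerName,
                                   string blobName, string filePath)
        {
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
            CloudBlobContainer container =
                account.CreateCloudBlobClient().GetContainerReference(containerName);
            container.CreateIfNotExist();
            CloudBlockBlob blob = container.GetBlockBlobReference(blobName);

            const int blockSize = 4 * 1024 * 1024;   // 4 MB per block
            List<string> blockIds = new List<string>();
            byte[] buffer = new byte[blockSize];

            using (FileStream file = File.OpenRead(filePath))
            {
                int blockNumber = 0;
                int bytesRead;
                while ((bytesRead = file.Read(buffer, 0, blockSize)) > 0)
                {
                    // Block ids must be base64 strings of equal length within a blob.
                    string blockId = Convert.ToBase64String(
                        Encoding.UTF8.GetBytes(blockNumber.ToString("d6")));
                    using (MemoryStream blockData = new MemoryStream(buffer, 0, bytesRead))
                    {
                        blob.PutBlock(blockId, blockData, null);
                    }
                    blockIds.Add(blockId);
                    blockNumber++;
                    // Report progress here, e.g. bytes sent so far vs. file.Length.
                }
            }

            blob.PutBlockList(blockIds);   // commit the uploaded blocks as the blob
        }
    }
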
  • In addition to what Gaurav has rightly stated, I'll add my two cents.

    Could you consider using Shared Access Signatures? They give you finer-grained control over how you let your users access blobs and containers, and you can control the type of access and its duration programmatically.

    This would also save a lot of ingress bandwidth on the web role and is a more efficient mechanism, since the middleman is taken out. (A sketch of a client uploading directly with a shared access signature follows this reply.)

     

    • Marked as answer by Peter Kellner Thursday, September 9, 2010 5:48 PM
    Thursday, September 9, 2010 7:50 AM
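
    To illustrate the "middleman taken out" point: once your service hands the client a shared access signature, the client can PUT the blob straight to storage over the REST API. A minimal sketch, assuming a hypothetical blobUrl and sasToken supplied by the service; note that a single Put Blob request was limited to 64 MB at the time, so a 2 GB file would still need the block-based approach shown earlier:

    // Sketch only: PUT a file directly to blob storage using a shared access
    // signature (sasToken starts with "?"). blobUrl and sasToken are hypothetical.
    using System;
    using System.IO;
    using System.Net;

    class SasUploadClient
    {
        static void UploadWithSas(string blobUrl, string sasToken, string filePath)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(blobUrl + sasToken);
            request.Method = "PUT";
            request.Headers.Add("x-ms-blob-type", "BlockBlob");   // required by Put Blob

            using (FileStream file = File.OpenRead(filePath))
            {
                request.ContentLength = file.Length;
                using (Stream body = request.GetRequestStream())
                {
                    byte[] buffer = new byte[64 * 1024];
                    int read;
                    while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        body.Write(buffer, 0, read);
                        // Progress could be reported here as bytes written vs. file.Length.
                    }
                }
            }

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("Upload returned {0}", response.StatusCode);   // expect Created (201)
            }
        }
    }
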
  • SarangKulkarni,

    I like your suggestion and I think it would work well.  I've read the article you list above, but I can't find any information (at least not obvious to me) on how to create the actual signature. I assume I need to use my Azure credentials to create the "signedidentifier", but I can't figure out how to do that and don't see any examples.

    Can you point me at a sample that uses Shared Access Signatures?

    Thanks,


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Thursday, September 9, 2010 5:51 PM
  • Peter -

    I did a blog post on access control for blobs that shows how to create a shared access signature.

    • Marked as answer by Peter Kellner Thursday, September 9, 2010 6:24 PM
    Thursday, September 9, 2010 6:08 PM
    Answerer
  • Do you have a sample project, with all the correct references, that creates access control for blobs that I could look at? That would be a big help.
    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Thursday, September 9, 2010 6:26 PM
  • Peter -

    You can find a very basic WPF project generating a shared access signature here.

    Thursday, September 9, 2010 7:24 PM
    Answerer
  • Is there any way to send the signature securely, that is, over HTTPS?  It seems like a signature that travels over HTTP could be intercepted and re-used before it expires.  Is there a best practice for how to make shared access work securely?
    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Thursday, September 9, 2010 8:25 PM
  • The signature is essentially generated out-of-band, in that you do not need to connect to the Azure storage service to generate it - all you need is the account and key. You need some secure way to get the shared access signature to the user, and then it becomes a matter of trust between you and the user. The shared access signature can be used with both HTTP and HTTPS - I don't think there is any way to mandate HTTPS.

    The following from my blog post on access control contains some advice on best practices for shared access signatures:

    Both CloudBlobContainer and CloudBlob contain two methods which can be used to generate a shared access signature for containers and blobs respectively:

    public String GetSharedAccessSignature(SharedAccessPolicy policy);
    public String GetSharedAccessSignature(SharedAccessPolicy policy, String groupPolicyIdentifier);

    GetSharedAccessSignature() generates a shared access signature, i.e. a URL query string, based on the specified SharedAccessPolicy. The second overload associates the shared access signature with a container-level access policy, which means that the shared access signature can be revoked and that it can be valid for more than one hour. Note that when a shared access signature is associated with a container-level access policy, an individual feature of the SharedAccessPolicy can appear only in the container-level access policy or in the GetSharedAccessSignature() request, not both. Microsoft suggests as a best practice that shared access signatures always be associated with a container-level access policy, precisely so they can be revoked if necessary. (A short sketch of issuing a policy-based signature follows this reply.)

    • Marked as answer by Peter Kellner Thursday, September 9, 2010 11:48 PM
    Thursday, September 9, 2010 9:04 PM
    Answerer
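
    A minimal sketch of the policy-based approach Neil describes, assuming the v1 StorageClient library; the connection string, container name and the "uploaders" policy id are illustrative assumptions:

    // Sketch only: store a container-level access policy, then issue a signature
    // that references the policy (so the signature can later be revoked).
    using System;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class SasIssuer
    {
        static string CreateContainerSasUrl(string connectionString, string containerName)
        {
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
            CloudBlobContainer container =
                account.CreateCloudBlobClient().GetContainerReference(containerName);

            // Define the container-level policy. Note that SetPermissions replaces
            // whatever policies the container already has.
            BlobContainerPermissions permissions = new BlobContainerPermissions();
            permissions.SharedAccessPolicies.Add("uploaders", new SharedAccessPolicy
            {
                Permissions = SharedAccessPermissions.Write,
                SharedAccessExpiryTime = DateTime.UtcNow.AddHours(4)
            });
            container.SetPermissions(permissions);

            // The ad hoc policy passed here is empty: features live either in the
            // stored policy or in the request, not both.
            string sasQueryString =
                container.GetSharedAccessSignature(new SharedAccessPolicy(), "uploaders");

            return container.Uri.AbsoluteUri + sasQueryString;
        }
    }
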
  • Hi Neil,

    Thanks for the thorough answer.  I'm still a little confused on the security side. I totally get that I can generate a signature that is only valid for a certain amount of time, and that I can assign that signature to a policy so I can easily revoke it.  The piece that confuses me is this.

    If I send a signature securely to a client to use to access Azure storage, it seems that, since the client has to put the parameters in a GET URI, the client has essentially broadcast its signature to the world: a malicious user who sees that traffic (on any wifi, for example) could take that signature and immediately start using it to push or pull data.  Even if the real client's request goes over HTTPS, the malicious user could issue the same request over plain HTTP and get the same data the real client just got.  Am I missing something here?

    Is there any way for the user to send the signature to Azure securely, so that no one snooping the traffic has access to it?

    I just don't get it.  I'm hoping I'm missing something simple (as is often the case).

    Thanks again for all your help (and others posting to my questions also)


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Saturday, September 11, 2010 2:25 PM
  • If I send a signature securely to a client to use to access Azure storage, it seems that, since the client has to put the parameters in a GET URI, the client has essentially broadcast its signature to the world: a malicious user who sees that traffic (on any wifi, for example) could take that signature and immediately start using it to push or pull data.

    Correct. Anyone with access to the signature can use it until it expires. A shared access signature not based on a policy is not revocable, which is presumably why authorization through such a signature is limited to one hour. It should be a best practice to give shared access signatures the strictest authorization and the shortest time limit practical - for example, I don't really think you should give them write access. (A sketch of revoking a policy-based signature follows this reply.)

    Is there any way for the user to send the signature to Azure securely, so that no one snooping the traffic has access to it?

    I don't think there is. I believe the idea with shared access signatures is to open up the data service a little bit, so that you can give limited direct access to blobs (perhaps a video stream) in a way that is not fully secure but is workable in practice. It all boils down to how valuable the data is.

    Saturday, September 11, 2010 5:46 PM
    Answerer
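
    A short sketch of the revocation point above, using the same hypothetical v1 StorageClient names as the earlier policy example: removing (or renaming) the stored container-level policy invalidates every signature that references it.

    // Sketch only: revoke all signatures issued against a container-level policy
    // by removing that policy from the container's permissions.
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class SasRevoker
    {
        static void RevokePolicy(string connectionString, string containerName, string policyId)
        {
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
            CloudBlobContainer container =
                account.CreateCloudBlobClient().GetContainerReference(containerName);

            BlobContainerPermissions permissions = container.GetPermissions();
            permissions.SharedAccessPolicies.Remove(policyId);   // e.g. "uploaders"
            container.SetPermissions(permissions);                // signatures using it now fail
        }
    }
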
  • Neil,

    I've been reading the docs again, and they show that you can send data with a PUT and have the signature be part of the blob itself.  HTTPS should then take care of security for that, so writing blobs seems secure.  I totally get that for a PUT.  But I don't understand how you can do this for a GET - I thought all that is passed in a GET is the URL itself?

    Any thoughts would be appreciated. It seems like an example would go a long way toward showing how to do this.

    Thanks again,

     

    from: http://msdn.microsoft.com/en-us/library/ee395415.aspx

    PUT:

    The request URL specifies write permissions on the pictures container for the designated interval. Note that the resource represented by the request URL is a blob, but the Shared Access Signature is specified on the container. It's also possible to specify it on the blob itself.

    GET:

    The request URL specifies read permissions on the pictures container for the designated interval. Note that the resource represented by the request URL is a blob, but the Shared Access Signature is specified on the container. It's also possible to specify it on the blob itself.


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Saturday, September 11, 2010 11:42 PM
  • I've been reading the docs again, and they show that you can send data with a PUT and have the signature be part of the blob itself.  HTTPS should then take care of security for that, so writing blobs seems secure.  I totally get that for a PUT.  But I don't understand how you can do this for a GET - I thought all that is passed in a GET is the URL itself?

    The part about the signature being part of the blob itself is wrong - the signature is always part of the URL, for PUT, DELETE and GET alike. The following example URLs are from the linked MSDN article (a small download sketch follows this reply):

    PUT http://myaccount.blob.core.windows.net/pictures/photo.jpg?st=2009-02-09T08%3a49Z&se=2009-02-10T08%3a49Z&sr=c&sp=w&si=YWJjZGVmZw%3d%3d&sig=Rcp6gQRfV7WDlURdVTqCa%2bqEArnfJxDgE%2bKH3TCChIs%3d

    DELETE http://myaccount.blob.core.windows.net/pictures/profile.jpg?st=2009-02-09T08%3a49%3a37.0000000Z&se=2009-02-10T08%3a49%3a37.0000000Z&sr=c&sp=d&si=YWJjZGVmZw%3d%3d&sig=%2bSzBm0wi8xECuGkKw97wnkSZ%2f62sxU%2b6Hq6a7qojIVE%3d

    GET http://myaccount.blob.core.windows.net/pictures/profile.jpg?st=2009-02-09&se=2009-02-10&sr=c&sp=r&si=YWJjZGVmZw%3d%3d&sig=dD80ihBh5jfNpymO5Hg1IdiJIEvHcJpCMiCMnN%2fRnbI%3d

    Sunday, September 12, 2010 8:06 AM
    Answerer
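
    Since the signature is only query parameters on the URL, a client reads the blob simply by appending it to the blob URI. A minimal, hypothetical sketch (the URL below mirrors the GET example above with a placeholder sig value):

    // Sketch only: download a blob using a read (sp=r) shared access signature.
    using System.Net;

    class SasDownloadClient
    {
        static void DownloadWithSas()
        {
            string url = "http://myaccount.blob.core.windows.net/pictures/profile.jpg"
                       + "?st=2009-02-09&se=2009-02-10&sr=c&sp=r&si=YWJjZGVmZw%3d%3d&sig=...";

            using (WebClient client = new WebClient())
            {
                // A plain HTTP GET; the signature travels in the query string.
                client.DownloadFile(url, "profile.jpg");
            }
        }
    }
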
  • It's been a long, tortuous path, but I found out the same thing through Azure support.  A very frustrating process, but I believe I can do what I want.  Creating containers and listing them, I believe, cannot be made secure.  That's still a work in progress for me.
    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Wednesday, September 29, 2010 9:11 PM