Encrypt and Decrypt Blob Storage Blocks / Invalidates My Data?

  • Question

  • Say I have a 160MB file I want to upload to blob storage.  Say I break it up into 40 4MB chunks.  If I encrypt each chunk with the AesCryptoServiceProvider and upload each chunk, then Azure will combine them into one big blob.  When I bring the chunks down (say in 20 8MB chunks), can I decrypt those with the same algorithm, and when I put them all together again, will I have a valid file?

    If not, then my question is: what is the proper way to encrypt files on someone's client and push them to Azure, and then, when I pull them down again, let the client decrypt them into a valid file?  These files may be huge, so I can't make a local copy of the file, encrypt it, and then send it up. I need to do this in place.

    Also, I know everyone's first reaction who does not know the answer will be to say "try it".  I'm really hoping for a more sound answer based on the math.
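    (For concreteness, a minimal sketch of the per-chunk round trip described above, assuming the AesCryptoServiceProvider defaults of CBC mode with PKCS7 padding; the key, IV, and chunk boundaries have to match exactly on the way back down.)

    using System;
    using System.Security.Cryptography;

    class ChunkRoundTripSketch
    {
        // Encrypt one chunk. With CBC/PKCS7 the output is the input rounded up
        // to the next 16-byte boundary, so a 4MB chunk grows by exactly 16 bytes.
        static byte[] EncryptChunk(byte[] chunk, byte[] key, byte[] iv)
        {
            using (var aes = new AesCryptoServiceProvider { Key = key, IV = iv })
            using (var enc = aes.CreateEncryptor())
                return enc.TransformFinalBlock(chunk, 0, chunk.Length);
        }

        static byte[] DecryptChunk(byte[] cipher, byte[] key, byte[] iv)
        {
            using (var aes = new AesCryptoServiceProvider { Key = key, IV = iv })
            using (var dec = aes.CreateDecryptor())
                return dec.TransformFinalBlock(cipher, 0, cipher.Length);
        }

        static void Main()
        {
            var key = new byte[32];
            var iv = new byte[16];
            using (var rng = new RNGCryptoServiceProvider()) { rng.GetBytes(key); rng.GetBytes(iv); }

            var chunk = new byte[4 * 1024 * 1024];          // one 4MB chunk
            var cipher = EncryptChunk(chunk, key, iv);      // 4MB + 16 bytes
            var roundTrip = DecryptChunk(cipher, key, iv);  // exactly 4MB again
            Console.WriteLine("{0} -> {1} -> {2}", chunk.Length, cipher.Length, roundTrip.Length);
        }
    }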

    Thanks


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Friday, November 12, 2010 2:51 AM

Answers

  • That makes sense; in your case it's fine not to use encryption since people provide their own Azure keys. In my case, I'm building a multi-tenancy app where keys will be common, so I need the extra security of encryption.

    So, Steve Marx, are you out there?  Listening??  Are there limitations on how much metadata you can assign?  If I have potentially thousands of metadata entries, will bad things happen?

    Thanks,


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider

    >That makes sense; in your case it's fine not to use encryption since people provide their own Azure keys. In my case, I'm building a multi-tenancy app where keys will be common, so I need the extra security of encryption.

    Could you elaborate on the "extra security of encryption" you're talking about? Do you mean:

    • You need access control over the blob.

    In this case please refer to:

    http://msdn.microsoft.com/en-us/library/ee395415.aspx

    • Protect the blob so that even if it's downloaded, no client can decrypt it without authentication.

    In this case I suggest you use asymmetric key algorithms. Client apps can use the public keys (retrieved from a WCF service without authentication, for revocation purposes) to encrypt and upload files, but if they want to decrypt a file they need to get the private key first. You can write a WCF service, publish it to the cloud, and let it hold the private keys. The clients then call the WCF service and pass authentication to get the private key related to the file. This lets you leverage the multiple built-in authentication mechanisms provided by WCF.

    You may also add a mechanism to revoke the private key in case it's compromised. To do so, you need to notify clients and use a worker role to decrypt the chunks with the old private key and re-encrypt them with the new public key in the background.
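    (Since RSA can only encrypt very small payloads directly, the usual way to apply this is hybrid encryption: encrypt the file or its chunks with a random AES key, then wrap that AES key with the RSA public key and store the wrapped key alongside the blob, e.g. in its metadata. A minimal sketch of the key-wrapping part; the XML key strings are whatever your service hands out:)

    using System;
    using System.Security.Cryptography;

    class HybridKeyWrapSketch
    {
        // Wrap a freshly generated AES key with an RSA public key; only the
        // holder of the private key (e.g. a WCF service behind authentication)
        // can unwrap it and decrypt the blob.
        static byte[] WrapAesKey(byte[] aesKey, string publicKeyXml)
        {
            using (var rsa = new RSACryptoServiceProvider())
            {
                rsa.FromXmlString(publicKeyXml);      // public parameters only
                return rsa.Encrypt(aesKey, true);     // OAEP padding
            }
        }

        static byte[] UnwrapAesKey(byte[] wrappedKey, string privateKeyXml)
        {
            using (var rsa = new RSACryptoServiceProvider())
            {
                rsa.FromXmlString(privateKeyXml);     // includes private parameters
                return rsa.Decrypt(wrappedKey, true);
            }
        }

        static void Main()
        {
            // The key pair would normally live in the cloud service; it is
            // generated locally here only to make the sketch runnable.
            string publicXml, privateXml;
            using (var rsa = new RSACryptoServiceProvider(2048))
            {
                publicXml = rsa.ToXmlString(false);
                privateXml = rsa.ToXmlString(true);
            }

            var aesKey = new byte[32];
            using (var rng = new RNGCryptoServiceProvider()) rng.GetBytes(aesKey);

            var wrapped = WrapAesKey(aesKey, publicXml);
            var unwrapped = UnwrapAesKey(wrapped, privateXml);
            Console.WriteLine(Convert.ToBase64String(unwrapped) == Convert.ToBase64String(aesKey));  // True
        }
    }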


    Please remember to mark the replies as answers if they help and unmark them if they provide no help. Windows Azure Platform China Blog: http://blogs.msdn.com/azchina/default.aspx
    Monday, November 15, 2010 7:21 AM

All replies

  • It is guaranteed not to work if you use different chunk sizes for encryption and decryption. You should have no problem decrypting a chunk you previously encrypted. However, I would be tempted to store the chunks separately, because you are totally hosed if you make even the slightest error in the chunk boundaries.

    Initially I thought of storing each blob in its own container, but this actually seems like a good use of the blob "directory" feature, where you name the different chunks something like:

    uploadedblob/chunk01
    uploadedblob/chunk02
    uploadedblob/chunk03
    uploadedblob/chunk04
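
    A sketch of that layout with the StorageClient library of the time (the chunk size is made up for illustration, and encryption of each chunk would happen before the upload call):

    using System;
    using System.IO;
    using Microsoft.WindowsAzure.StorageClient;

    class SeparateChunkSketch
    {
        static void UploadChunks(CloudBlobContainer container, string blobName, Stream file)
        {
            var buffer = new byte[4 * 1024 * 1024];
            int read, chunk = 0;
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                chunk++;
                var data = new byte[read];
                Array.Copy(buffer, data, read);

                // "uploadedblob/chunk01", "uploadedblob/chunk02", ... appear as a
                // virtual directory when the container is listed hierarchically.
                var blob = container.GetBlobReference(blobName + "/chunk" + chunk.ToString("D2"));
                blob.UploadByteArray(data);
            }
        }
    }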

    Friday, November 12, 2010 3:27 AM
    Answerer
  • I'm already using the directory feature to represent actual directories.  Also, I would prefer not to create hundreds of thousands of chunks, so I would like to keep the blob assembled.  It seems like lots of folks have solved this problem; anyone who has experience with this and could shed some wisdom would be appreciated.
    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Friday, November 12, 2010 3:34 AM
  • Hi Peter,

    I'm curious about why you don't want to encrypt the whole file in the first place and then break it into chunks. If I am not mistaken, if you take a 4MB chunk from a file and encrypt it, there is a possibility that its size will change. Are you somehow tracking (maybe through blob metadata) the chunks you're uploading so that you can reconstruct the blob from this metadata? Though I don't have first-hand experience with encryption, we have implemented a GZIP compression feature in our product Cloud Storage Studio: we first compress the entire file using GZIP (and put the compressed file in a temp directory), then break this compressed file into chunks (if needed) and upload it. Similarly, while downloading, we first download the entire file and then decompress it.
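    (That whole-file-first flow is easy to sketch with GZipStream; the chunk loop at the end just prints sizes as a stand-in for the actual upload, and Stream.CopyTo assumes .NET 4:)

    using System;
    using System.IO;
    using System.IO.Compression;

    class CompressThenChunkSketch
    {
        const int ChunkSize = 4 * 1024 * 1024;

        static void Run(string sourcePath)
        {
            // Compress the whole file to a temp file first...
            string tempPath = Path.GetTempFileName();
            using (var source = File.OpenRead(sourcePath))
            using (var target = File.Create(tempPath))
            using (var gzip = new GZipStream(target, CompressionMode.Compress))
                source.CopyTo(gzip);

            // ...then read the compressed file back in fixed-size chunks for upload.
            var buffer = new byte[ChunkSize];
            using (var compressed = File.OpenRead(tempPath))
            {
                int read, chunkNumber = 0;
                while ((read = compressed.Read(buffer, 0, buffer.Length)) > 0)
                    Console.WriteLine("chunk {0:D2}: {1} bytes", ++chunkNumber, read);
            }

            File.Delete(tempPath);
        }
    }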

    Hope this helps.

    Thanks

    Gaurav Mantri

    Cerebrata Software

    http://www.cerebrata.com

    Friday, November 12, 2010 8:02 AM
  • Hi Gaurav,

    Thanks for your response and for letting me know what you do.  I'm building a general-purpose sync product, and my nervousness about compressing the file locally before sending is that it would require the user to have extra space on their disk (which could be a lot).  I was not planning on compressing, though that is a good idea.

    It seems that a good choice is the AES encryption in the .NET library. I wrote a little test program where I encrypted byte arrays that are multiples of 1024 bytes in size, and it seems to add 16 bytes to every result. My current thinking is that I could add one extra block to the upload which holds all these 16-byte overages; then when I download, I download that block of 16-byte pieces first, and as I reconstruct the file (multi-threaded of course) I can access all these little 16-byte pieces, decrypt each block, and reconstruct the file.

    Thoughts? Does this make sense? Am I missing something?
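
    (For reference, the test amounts to something like this sketch; it assumes the AesCryptoServiceProvider defaults of CBC mode with PKCS7 padding, which append a full 16-byte padding block whenever the input is already a multiple of 16 bytes.)

    using System;
    using System.Security.Cryptography;

    class PaddingOverheadTest
    {
        static void Main()
        {
            using (var aes = new AesCryptoServiceProvider())    // CBC + PKCS7 by default
            {
                foreach (int size in new[] { 1024, 4096, 1024 * 1024, 4 * 1024 * 1024 })
                {
                    var plain = new byte[size];
                    byte[] cipher;
                    using (var encryptor = aes.CreateEncryptor())
                        cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);

                    // Every multiple-of-16 input grows by exactly one 16-byte padding block.
                    Console.WriteLine("{0,9} -> {1,9} (+{2})", size, cipher.Length, cipher.Length - size);
                }
            }
        }
    }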


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Friday, November 12, 2010 3:00 PM
  • Hi Peter,

    A quick question: What happens on encryption when your byte array is not a multiple of 1024 bytes? Does it still add this 16 byte overage? Here are my thoughts on one way of doing this (this is me thinking out loud :D):

    Instead of 4 MB blocks, let's say we decide to break the blob into 2 MB blocks, so essentially I will have 80 blocks for a 160 MB blob. Now, based on your test program, encryption adds some bytes (let's take 16 for the sake of argument), so each block of mine would be 2,097,168 bytes (2 MB + 16 bytes). Now what we can do is assign an ID to each block, say from 001 to 080. What I would do next is, as I start uploading the blocks, keep the block ID and the number of bytes in that block in the blob's metadata. I am recommending blob metadata because when we're downloading the blob, we would have each block's size, along with the block's position, intact with the blob. So on downloading, I would first get the blob's metadata and start iterating over it, and based on the block ID (which is the metadata key) and block size (which is the metadata value), I would know exactly how many bytes I need to decrypt.
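    In code, the upload side of that scheme might look roughly like this (a sketch against the Microsoft.WindowsAzure.StorageClient library; EncryptChunk is a placeholder for whatever AES routine is used):

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using Microsoft.WindowsAzure.StorageClient;

    class BlockMetadataSketch
    {
        const int BlockSize = 2 * 1024 * 1024;    // 2 MB plaintext blocks

        static byte[] EncryptChunk(byte[] chunk) { return chunk; }   // placeholder for the AES step

        static void Upload(CloudBlobContainer container, string blobName, Stream file)
        {
            var blob = container.GetBlockBlobReference(blobName);
            var blockIds = new List<string>();
            var buffer = new byte[BlockSize];
            int read, index = 0;

            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                index++;
                var plain = new byte[read];
                Array.Copy(buffer, plain, read);
                var cipher = EncryptChunk(plain);    // each 2 MB block becomes 2,097,168 bytes

                // Block IDs must be base64 and the same length for every block.
                var blockId = Convert.ToBase64String(Encoding.UTF8.GetBytes(index.ToString("D3")));
                blob.PutBlock(blockId, new MemoryStream(cipher), null);
                blockIds.Add(blockId);

                // Remember the encrypted size of each block, keyed by its position.
                blob.Metadata["block" + index.ToString("D3")] = cipher.Length.ToString();
            }

            blob.PutBlockList(blockIds);    // commit the blocks in order
            blob.SetMetadata();             // persist the per-block sizes
        }
    }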

    Does this make any sense? Very interesting problem by the way :)

    Hope this helps.

    Thanks

    Gaurav

    Friday, November 12, 2010 3:59 PM
  • That does nicely solve the extra-data problem. Do you know if there is any realistic limit to how many pieces of metadata you can push into a blob? I'm also thinking of adding metadata flags like AES key length, gzip, etc. Also, with gzip encryption, do you use a simple password or a key file? I'm not a key wizard but still want to follow best practices. A password does not seem strong enough.
    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Friday, November 12, 2010 4:10 PM
  • I don't think I have read anywhere about a limitation on the number of metadata items. However, since metadata is passed as request headers, there may be a limit on the number of request headers you can send as part of a request; I am not aware of one. Regarding gzip: since we compress the whole blob using GZIP, we just set its content-encoding property to gzip. Also, we only used GZIP for compression; we never really got into the gzip encryption you're talking about.
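
    (For context, with the StorageClient library that content-encoding hint is just a blob property; a sketch, with made-up blob and file names:)

    using Microsoft.WindowsAzure.StorageClient;

    class ContentEncodingSketch
    {
        // Upload an already GZIP-compressed temp file and tag it so HTTP clients
        // know to decompress it transparently.
        static void UploadCompressed(CloudBlobContainer container)
        {
            var blob = container.GetBlockBlobReference("backups/archive.gz");
            blob.UploadFile(@"C:\temp\archive.gz");
            blob.Properties.ContentEncoding = "gzip";
            blob.SetProperties();
        }
    }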

    Hope this helps.

    Thanks

    Gaurav

    Friday, November 12, 2010 8:23 PM
  • That makes sense; in your case it's fine not to use encryption since people provide their own Azure keys. In my case, I'm building a multi-tenancy app where keys will be common, so I need the extra security of encryption.

    So, Steve Marx, are you out there?  Listening??  Are there limitations on how much metadata you can assign?  If I have potentially thousands of metadata entries, will bad things happen?

    Thanks,


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Friday, November 12, 2010 8:26 PM
  • The following is documented for blobs and containers:

    The total size of the metadata, including both the name and value together, may not exceed 8 KB in size.
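
    (Rough arithmetic against the block scheme discussed above: an entry such as block042 = 2097168 is about 8 characters of name plus 7 of value, call it 15 bytes, so 80 blocks use on the order of 1.2 KB and fit easily, while thousands of entries would run into the 8 KB cap.)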

    Friday, November 12, 2010 9:29 PM
    Answerer