Will Azure Storage support compression when uploading data?

Answers

  • Hi,

    As far as I know, Azure Storage does not currently support compressing data automatically before upload. The workaround is to compress your data manually; here are some sample code snippets:

    Compress:

    // Requires: using System; using System.IO; using System.IO.Compression; using System.Text;
    // Compress the UTF-8 bytes of the text with GZip and prepend the original
    // (uncompressed) length as a 4-byte prefix, which the decompress snippet reads back.
    byte[] data = Encoding.UTF8.GetBytes(text);
    var stream = new MemoryStream();
    using (Stream ds = new GZipStream(stream, CompressionMode.Compress))
    {
        ds.Write(data, 0, data.Length);
    }
    byte[] gzipped = stream.ToArray();
    byte[] compressed = new byte[gzipped.Length + 4];
    Buffer.BlockCopy(BitConverter.GetBytes(data.Length), 0, compressed, 0, 4);
    Buffer.BlockCopy(gzipped, 0, compressed, 4, gzipped.Length);
    return compressed;

    Uncompress:

    // Decompress a byte array produced by the snippet above: the first 4 bytes
    // hold the uncompressed length, the rest is the GZip payload.
    try
    {
        if (compressedText.Length == 0)
        {
            return string.Empty;
        }
        using (MemoryStream ms = new MemoryStream())
        {
            int msgLength = BitConverter.ToInt32(compressedText, 0);
            ms.Write(compressedText, 4, compressedText.Length - 4);
            byte[] buffer = new byte[msgLength];
            ms.Position = 0;
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
            {
                zip.Read(buffer, 0, buffer.Length);
            }
            return Encoding.UTF8.GetString(buffer);
        }
    }
    catch (Exception)
    {
        return string.Empty;
    }
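
    For example, if the two snippets above are wrapped in hypothetical helper methods named CompressText and DecompressText, the round trip looks like this:

    // Hypothetical helpers wrapping the two snippets above.
    byte[] compressed = CompressText("some text to store in a blob");
    string original = DecompressText(compressed);   // back to "some text to store in a blob"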

    If you think this feature is necessary for your scenario, please post your idea as a feature request to Microsoft:

    http://www.mygreatwindowsazureidea.com/forums/34192-windows-azure-feature-voting

    Thanks for understanding.

    BR,

    Arwind


    Please mark the replies as answers if they help or unmark if not. If you have any feedback about my replies, please contact msdnmg@microsoft.com. Microsoft One Code Framework

    • Marked as answer by Arwind - MSFT Thursday, May 31, 2012 8:41 AM
    Thursday, May 24, 2012 4:06 AM

All replies

  • Adding to Arwind's answer: if you're trying to compress CSS/JS/PNG files to save on bandwidth and storage, make sure you set the blob's Content-Encoding property to GZip so that when the files are fetched by a browser (or any other device or application that understands GZIP-compressed content), they are decompressed automatically.
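
    For example, here is a minimal sketch of an upload that sets that property, assuming the .NET storage client library; the connection string, container and blob names, and the compressedCssBytes variable are placeholders:

    // Upload GZip-compressed bytes and mark them with Content-Encoding: gzip
    // so browsers decompress them transparently on download.
    CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
    CloudBlobClient client = account.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference("static-assets");
    CloudBlockBlob blob = container.GetBlockBlobReference("site.css");

    blob.Properties.ContentType = "text/css";
    blob.Properties.ContentEncoding = "gzip";
    blob.UploadFromStream(new MemoryStream(compressedCssBytes));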

    Hope this helps.

    Thanks

    Gaurav

    Thursday, May 24, 2012 4:35 AM
  • Hi Gaurav,

    What you suggested is a way to enable compression when downloading data, if I'm not mistaken. My goal is to find the fastest way to upload data, often just plain text files, to Azure. Compression is low-hanging fruit for speeding up uploads to the Azure Blob Storage service, so I want to know whether Azure Blob Storage can support it.

    Tuesday, May 29, 2012 2:43 AM
  • Not really. What I'm trying to say is that if you have uploaded a compressed text file, then in order to view that file, say in a browser, you need to tell the browser that the contents of the file are compressed using GZIP. You do that by setting the Content-Encoding property of the blob.

    To give you an example, try these two links:

    https://cerebrataqa.blob.core.windows.net/msdn-test/compressed-file-gzip-encoding.txt

    https://cerebrataqa.blob.core.windows.net/msdn-test/compressed-file-no-gzip-encoding.txt

    I compressed these two files with GZIP before uploading them using our Cloud Storage Studio tool. After they were uploaded, I removed the Content-Encoding property of the 2nd blob. When you access the two files, the first one displays the correct content while the 2nd one displays gibberish text.
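
    Setting or removing that property on a blob that is already uploaded is just a metadata update. Roughly, again assuming the .NET storage client library and a container reference already in hand:

    // Adjust Content-Encoding on an already-uploaded blob.
    CloudBlockBlob blob = container.GetBlockBlobReference("compressed-file-no-gzip-encoding.txt");
    blob.FetchAttributes();                      // load the current properties
    blob.Properties.ContentEncoding = "gzip";    // or null to remove the header
    blob.SetProperties();                        // persist the change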

    Regarding your 2nd question, I don't think the Blob Storage API would be able to do that for you, as it is a server-side API; even if it did support compression, it would happen on the server side, which defeats the whole purpose you're after. It has to be done on the client side before the file is uploaded. I think the better place to introduce this functionality would be the Storage Client library.

    Hope this helps.

    Thanks

    Gaurav

    Tuesday, May 29, 2012 3:27 AM
  • I'm asking for something similar to automatic HTTP compression. If, on the client side, I compress the data and set the Content-Type or Content-Encoding to gzip or deflate, can the Azure Blob Storage service decompress the blobs automatically for me? It seems an IIS server can have a config flag that says whether HTTP compression is supported or not; however, if I understand it right, that flag only controls compression on the server side in the HTTP response.

    Basically, I want to find a way so that we don't need to write code to compress data on each client, and more code on the server side to decompress it after it pulls the data from Blob Storage.

    Do you know if it can be done?

    Thanks

    Tuesday, May 29, 2012 6:16 AM
  • I see. I think the answer is no, because Blob Storage is just storage (and not an IIS server). It will store the data exactly the way you send it. You would need to write the code for doing this.
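
    For what it's worth, the do-it-yourself pattern is fairly small. A rough sketch, with illustrative method names and assuming the .NET storage client library plus System.IO.Compression:

    // Compress on the way up, decompress on the way down.
    void UploadCompressed(CloudBlockBlob blob, byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                gz.Write(data, 0, data.Length);
            }
            ms.Position = 0;
            blob.Properties.ContentEncoding = "gzip";   // optional, helps browsers
            blob.UploadFromStream(ms);
        }
    }

    byte[] DownloadDecompressed(CloudBlockBlob blob)
    {
        using (var compressed = new MemoryStream())
        using (var output = new MemoryStream())
        {
            blob.DownloadToStream(compressed);
            compressed.Position = 0;
            using (var gz = new GZipStream(compressed, CompressionMode.Decompress))
            {
                gz.CopyTo(output);
            }
            return output.ToArray();
        }
    }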

    Hope this helps.

    Tuesday, May 29, 2012 6:24 AM
  • Sorry, can you clarify what you mean by "decompress the blobs automatically"?  What would the result of this be?  Thanks!

    -Jeff

    Wednesday, May 30, 2012 9:56 AM
  • I'd like to wake this thread up if I may, because I think the notion of uploading a compressed package of files to Azure BLOB Storage and being able to access the files within it is very valid.

    Here's my scenario. I have ~0.5TB of .csv files that I need to upload into BLOB storage; if those files were compressed they'd be a lot smaller than 0.5TB. I would like to compress them, upload the compressed packages, and then have one of two things happen:

    1. Azure provides a mechanism for uncompressing those files once they have landed in Azure BLOB storage
    2. The Azure BLOB Storage API exposes the contents of the compressed file by abstracting away the actual compressed file. When someone requests one of the files inside the compressed package, Azure uncompresses it and delivers the uncompressed file. To my mind this is the same as what Windows Explorer does (and has done for many years).

    I believe this to be a totally valid scenario and I'm a little surprised that Azure does not support it (hopefully someone will immediately reply telling me I'm wrong and that Azure *does* support it).

    So, to a question. Does anyone know of a method for uploading compressed files into Azure BLOB Storage and uncompressing them once they arrive?

    TIA
    Jamie


    ObjectStorageHelper<T> – A WinRT utility for Windows 8 | http://sqlblog.com/blogs/jamie_thomson/ | @jamiet | About me
    Jamie Thomson

    Thursday, April 24, 2014 7:32 AM
  • Jamie,

    Azure Blob Storage still does not support it :(. Think of it this way: Azure Blob Storage is purely a storage service and has no idea about the objects you store in it; to the service, the object you store is a byte array, nothing more, nothing less. It doesn't know whether the bytes are compressed or not. The onus of compressing and decompressing lies on you (I mean the developer/consumer). If you want to store a file compressed, you compress it on your end and send it to storage. If you want to work with files that need to be decompressed first, you need to do that yourself. Web browsers are a different breed (believe it or not, they are a bit intelligent): they see the Content-Encoding header in the response stream and, if they find "gzip" in there, they automatically decompress the contents and show you the actual content.

    So long story short, it still doesn't support it and you will have to do it on your own.
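
    If it helps for the .csv scenario, the consumer-side approach can at least avoid unpacking to disk: upload each file GZip-compressed and decompress it on the fly while reading. A rough sketch, assuming the .NET storage client library and a container reference (the blob name is a placeholder):

    // Read a GZip-compressed .csv blob line by line, decompressing on the fly.
    CloudBlockBlob blob = container.GetBlockBlobReference("csv/2014-04-sales.csv.gz");
    using (Stream blobStream = blob.OpenRead())
    using (var gzip = new GZipStream(blobStream, CompressionMode.Decompress))
    using (var reader = new StreamReader(gzip))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // process one CSV row here
        }
    }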

    Hope this helps.

    Gaurav

    Thursday, April 24, 2014 7:42 AM