GZipStream decompression issue

  • Question

  • We use GZipStream to compress and decompress MemoryStream data. The decompressed data is passed to the DataContract serializer, which converts the XML to .NET objects. Recently we encountered a problem during DataContract deserialization. When we analyzed the decompressed data, it contained a bad character. What could be the reason for getting a bad character after decompression?

    public static Stream GetStreamFromBlob(MyBlob blob)
    {
        Stream originalStream = blob.GetStream(BlobMode.Open, FileAccess.Read);
        Stream decompressedStream = new MemoryStream();
        using (var zipStream = new GZipStream(
            originalStream,
            CompressionMode.Decompress))
        {
            try
            {
                zipStream.CopyTo(decompressedStream);
            }
            catch (InvalidDataException)
            {
                originalStream.Position = 0;
                originalStream.CopyTo(decompressedStream);
            }

            decompressedStream.Position = 0;
            return decompressedStream;
        }
    }

    Part of decompressed data:

    <a:anyType z:Id="i18569" i:type="b:CallIdElem%nt" xmlns:b="MyNamespace">

    It should have been:

    <a:anyType z:Id="i18569" i:type="b:CallIdElement" xmlns:b="MyNamespace">


    • Edited by Redfangz21 Tuesday, July 10, 2018 6:58 AM
    Tuesday, July 10, 2018 6:49 AM

All replies

  • Maybe the data was corrupted during compression or at an earlier stage?
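
    One way to narrow this down, sketched under the assumption that a hash of the uncompressed payload can be stored alongside the blob when it is written (the PayloadHash class and its method names are hypothetical, not part of the original code):

    ```csharp
    using System;
    using System.Security.Cryptography;

    // Hypothetical helper: records and verifies a hash of the uncompressed bytes.
    public static class PayloadHash
    {
        // Computes a SHA-256 hash of the bytes as a dash-separated hex string.
        public static string Compute(byte[] data)
        {
            using (var sha = SHA256.Create())
            {
                return BitConverter.ToString(sha.ComputeHash(data));
            }
        }

        // Returns true when the bytes still match the hash recorded earlier.
        public static bool Matches(byte[] data, string expectedHash)
        {
            return string.Equals(Compute(data), expectedHash, StringComparison.Ordinal);
        }
    }
    ```

    Calling Compute before compression and Matches after decompression would show whether the bytes changed somewhere in the compress/store/decompress path or were already wrong before compression.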

    • Edited by Viorel_MVP Tuesday, July 10, 2018 7:26 AM
    Tuesday, July 10, 2018 7:26 AM
  • We don't know what happened in the earlier stages. When this error occurred, the customer regenerated the data and everything was fine.
    Tuesday, July 10, 2018 8:12 AM
  • If the customer regenerated the file and everything worked fine, I would think that something changed in the underlying data, such that once it was regenerated the invalid data was no longer included.

    We use GZip compression pretty heavily, and our code looks very similar to yours. We've never seen the compression itself cause any invalid characters; it's most likely something in the underlying data.
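
    One detail in the question's sample may be relevant here: the bad character '%' (0x25) differs from the expected 'e' (0x65) by a single bit. A minimal sketch to check this (the BitFlipCheck class and its method are hypothetical names used only for illustration):

    ```csharp
    using System;

    public static class BitFlipCheck
    {
        // Returns the XOR of the first pair of characters that differ within
        // the common length of the two strings, or 0 if none differ there.
        public static int FirstDifferenceXor(string expected, string actual)
        {
            int length = Math.Min(expected.Length, actual.Length);
            for (int i = 0; i < length; i++)
            {
                if (expected[i] != actual[i])
                {
                    return expected[i] ^ actual[i];
                }
            }
            return 0;
        }

        public static void Main()
        {
            int xor = FirstDifferenceXor("b:CallIdElement", "b:CallIdElem%nt");
            // 'e' is 0x65 and '%' is 0x25, so the XOR is 0x40: exactly one bit differs.
            Console.WriteLine("0x{0:X2}", xor); // prints "0x40"
        }
    }
    ```

    A single flipped bit in otherwise intact output looks more like corruption of the uncompressed bytes (in storage, memory, or transfer) than a compression fault, since damage to a compressed stream usually garbles a larger region. This is an inference from the one sample shown, not something verified against the customer's data.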

    Tuesday, July 10, 2018 3:46 PM
  • Hi Redfangz21,

    >>What could be the reason for getting a bad character after decompression?

    1. GZipStream in .NET Framework 3.5 and earlier cannot be used to compress more than 4 GB of data.

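    2. The following code looks correct at first glance, but in testing it fails when the byte[] being compressed is smaller than 4 KB. Compress calls stream.ToArray() without first closing the GZipStream, so data still buffered inside the GZipStream never reaches the MemoryStream.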

    	public static byte[] Compress(byte[] data) 
    	{ 
    	    MemoryStream stream = new MemoryStream(); 
    	    GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress); 
    	    gZipStream.Write(data, 0, data.Length); 
    	
    	    return stream.ToArray(); 
    	} 
    	
    	public static byte[] Decompress(byte[] data) 
    	{ 
    	    MemoryStream stream = new MemoryStream(); 
    	
    	    GZipStream gZipStream = new GZipStream(new MemoryStream(data), CompressionMode.Decompress); 
    	
    	    byte[] bytes = new byte[4096]; 
    	    int n; 
    	    while ((n = gZipStream.Read(bytes, 0, bytes.Length)) != 0) 
    	    { 
    	        stream.Write(bytes, 0, n); 
    	    } 
    	
    	    return stream.ToArray(); 
    	}
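
    A corrected sketch of the two helpers above, assuming the failure comes from reading the MemoryStream before the GZipStream has been closed (the GZipHelper class name is hypothetical):

    ```csharp
    using System.IO;
    using System.IO.Compression;

    public static class GZipHelper
    {
        public static byte[] Compress(byte[] data)
        {
            using (var output = new MemoryStream())
            {
                // Disposing the GZipStream flushes buffered data and writes the
                // gzip footer; leaveOpen keeps the MemoryStream usable afterwards.
                using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
                {
                    gzip.Write(data, 0, data.Length);
                }
                return output.ToArray();
            }
        }

        public static byte[] Decompress(byte[] data)
        {
            using (var input = new MemoryStream(data))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                // CopyTo reads until the end of the compressed stream.
                gzip.CopyTo(output);
                return output.ToArray();
            }
        }
    }
    ```

    The key change is that ToArray() is called only after the inner using block has disposed the GZipStream, so the compressed output is complete regardless of input size.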
    

    Best Regards,

    Wendy




    Tuesday, July 17, 2018 1:00 AM
    Moderator