Assign parts of binary array to different variable types

  • Question

  • ISSUE

    I am using BinaryReader to load a log file into a byte array. Once the file's contents are in the array, how do I efficiently parse the array into variables when the log is composed of numerous fields of different types and lengths (some fields are 4-byte UNIX timestamps, some are 2-byte little-endian Int16 integers, some are a single char, etc.)?

     

    I have written nasty methods that locate and read a chunk of binary data from the middle of the array (e.g., the 4 bytes of a field that represents the number of widgets sold in New Mexico), convert the chunk to its appropriate type (e.g., from raw bytes to Int32), and then assign it to a variable (e.g., an Int32 variable called widgetsSoldInNewMexico), but the methods are written from scratch and really ugly/inefficient. I would also need a "nasty" from-scratch method for each data type. Ugh. Isn't there a better way to solve this using an array-related or other method from the library?

     

    [side note] I see many programs that parse fields of data from a log file as the file is read and assign them to variables (example: Int32 timeStampField = binaryReader.ReadInt32(); char maleOrFemale = binaryReader.ReadChar(); etc.). In essence, these programs parse the data file into correctly typed variables while the file is being read with BinaryReader. I do not want to do this - I would rather read the entire file into a single byte array and then parse, error check and process the fields of data within the array, rather than doing hundreds/thousands of rounds of reading a single field, assigning the field to its correct variable, error checking, and processing (why thrash the hard drive when I can do a single stream read into a single array and then do the parsing/error checking later, after I close the binary reader?).

     

    Thanks for your time, consideration and assistance.

    Sunday, February 3, 2008 7:36 PM

Answers

  •  Kyle French wrote:

     

    [side note] I see many programs that parse fields of data from a log file as the file is read and assign them to variables (example: Int32 timeStampField = binaryReader.ReadInt32(); char maleOrFemale = binaryReader.ReadChar(); etc.). In essence, these programs parse the data file into correctly typed variables while the file is being read with BinaryReader. I do not want to do this - I would rather read the entire file into a single byte array and then parse, error check and process the fields of data within the array, rather than doing hundreds/thousands of rounds of reading a single field, assigning the field to its correct variable, error checking, and processing (why thrash the hard drive when I can do a single stream read into a single array and then do the parsing/error checking later, after I close the binary reader?).

     

     

    You can read all of your binary data in one go, put it into a MemoryStream, and then let the BinaryReader read from that stream (by passing it to the BinaryReader constructor) rather than from the file. This way you don't need to write your own type conversions. For example:

     

    Code Snippet

    // Get binary data - in practice this would come from reading the file
    MemoryStream ms = new MemoryStream();
    BinaryWriter bw = new BinaryWriter(ms);
    bw.Write(2);      // written as a 4-byte Int32
    bw.Write(false);  // written as a 1-byte Boolean
    byte[] data = ms.ToArray();

    // Read the binary data back; a stream created over the array starts at position 0
    ms = new MemoryStream(data);
    BinaryReader r = new BinaryReader(ms);
    Console.WriteLine(r.ReadInt32());    // 2
    Console.WriteLine(r.ReadBoolean());  // False
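
    Applied to the log file from the question, the same idea might look like the sketch below. The record layout (a 4-byte UNIX timestamp, a 2-byte little-endian count, and a one-byte flag) and the names LogRecord, LogParser and ParseRecords are assumptions made up for illustration; the real layout has to come from the log's own specification. In a real program the byte array would come from something like File.ReadAllBytes(path).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type; the fields mirror the examples in the question.
public class LogRecord
{
    public DateTime Timestamp;  // from a 4-byte UNIX timestamp (seconds since 1970-01-01 UTC)
    public short WidgetsSold;   // 2-byte little-endian integer
    public char Gender;         // single ASCII byte, e.g. 'M' or 'F'
}

public static class LogParser
{
    private static readonly DateTime UnixEpoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Assumed fixed record size: 4 + 2 + 1 bytes.
    private const int RecordSize = 7;

    public static List<LogRecord> ParseRecords(byte[] data)
    {
        var records = new List<LogRecord>();
        using (var ms = new MemoryStream(data, false))  // read-only view over the array
        using (var reader = new BinaryReader(ms))
        {
            // Stop when fewer bytes remain than one complete record.
            while (ms.Length - ms.Position >= RecordSize)
            {
                var record = new LogRecord();
                // BinaryReader reads multi-byte integers as little-endian.
                record.Timestamp = UnixEpoch.AddSeconds(reader.ReadUInt32());
                record.WidgetsSold = reader.ReadInt16();
                // ReadByte rather than ReadChar: ReadChar decodes with the
                // reader's encoding and may consume more than one byte.
                record.Gender = (char)reader.ReadByte();
                records.Add(record);
            }
        }
        return records;
    }
}
```

    The file is touched only once, when the array is filled; all parsing and error checking then runs against memory, which is what the question asked for.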

     

     

     

    Mark.

    Sunday, February 3, 2008 8:57 PM
  • I agree with Mark.  The only problem with either his approach or your current one is that it requires enough memory to hold the whole file, which can be an issue for large files.  In that case you can replace the MemoryStream with a BufferedStream wrapped around the file stream: you get the same effect, but only a small portion of the file is in memory at a time.  This gives you the benefits of fast reads and the use of a reader.
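
    A minimal sketch of the BufferedStream variant follows. The file layout (nothing but consecutive 2-byte little-endian integers), the 64 KB buffer size, and the names BufferedLogReader and SumInt16Fields are assumptions chosen for illustration only.

```csharp
using System;
using System.IO;

public static class BufferedLogReader
{
    // Sums every 2-byte little-endian integer in a (hypothetical) file
    // that consists solely of such fields.
    public static long SumInt16Fields(string path)
    {
        long total = 0;
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var buffered = new BufferedStream(file, 64 * 1024))
        using (var reader = new BinaryReader(buffered))
        {
            // Track the remaining byte count so a trailing partial field
            // is skipped instead of raising EndOfStreamException.
            long remaining = file.Length;
            while (remaining >= sizeof(short))
            {
                total += reader.ReadInt16();
                remaining -= sizeof(short);
            }
        }
        return total;
    }
}
```

    The reader code is the same as with a MemoryStream; only the stream underneath changes, so disk reads happen in buffer-sized chunks rather than once per field.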

     

    In the rare cases where a reader won't work (say you have to do a checksum calculation over the raw bytes), the alternative is BitConverter.  Its methods take a byte array and a start index and convert the bytes at that position to the requested type.  You still have to track the index within the array yourself, and BitConverter uses the byte order of the machine it runs on (check BitConverter.IsLittleEndian), but otherwise it works for everything except strings.  Even readers don't handle strings well here: BinaryReader.ReadString assumes the string is length-prefixed, which is not normally the case in files written by other programs.  For fixed-length buffers and strings you'll have to read the data into a byte array and decode it yourself, using BitConverter for numeric fields and an Encoding (for example Encoding.ASCII.GetString) for text.  For strings, take the padding into account.  A C++ program, for example, is likely to pad the buffer with zeroes.  .NET strings are not null-terminated, so those zeroes end up in the string as '\0' characters, and you'll have to use something like TrimEnd('\0') to remove them.
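
    The BitConverter approach could be sketched as below. The helper names (ArrayFieldReader, ReadInt32LE, ReadInt16LE, ReadFixedString) and the choice of ASCII for text fields are assumptions for illustration; the actual encoding depends on the program that wrote the log.

```csharp
using System;
using System.Text;

public static class ArrayFieldReader
{
    // Copies 'count' bytes starting at 'offset' and, on a big-endian host,
    // reverses them so BitConverter sees the machine's native byte order.
    private static byte[] NativeOrder(byte[] data, int offset, int count)
    {
        byte[] tmp = new byte[count];
        Array.Copy(data, offset, tmp, 0, count);
        if (!BitConverter.IsLittleEndian)
            Array.Reverse(tmp);
        return tmp;
    }

    // Reads a 4-byte little-endian integer at the given offset.
    public static int ReadInt32LE(byte[] data, int offset)
    {
        return BitConverter.ToInt32(NativeOrder(data, offset, 4), 0);
    }

    // Reads a 2-byte little-endian integer at the given offset.
    public static short ReadInt16LE(byte[] data, int offset)
    {
        return BitConverter.ToInt16(NativeOrder(data, offset, 2), 0);
    }

    // Decodes a fixed-length ASCII field and drops trailing zero padding.
    public static string ReadFixedString(byte[] data, int offset, int length)
    {
        return Encoding.ASCII.GetString(data, offset, length).TrimEnd('\0');
    }
}
```

    The caller keeps a running offset and advances it by each field's size after every read, for example int widgets = ArrayFieldReader.ReadInt32LE(data, offset); offset += 4;. The copy in NativeOrder costs a small allocation per field; on a little-endian machine, calling BitConverter.ToInt32(data, offset) directly avoids it.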

     

    Michael Taylor - 2/3/08

    http://p3net.mvps.org

     

     

    Sunday, February 3, 2008 10:41 PM
