Exchange compression question

  • Question

  • Some information is missing from the DIRECT2 encoding description in [MS-OXCRPC]. [MS-DRSR] describes the same compression algorithm, and the 'Bitmask' subsection of 4.1.10.5.19.2 in [MS-DRSR] (which corresponds to section 3.1.7.2.2.1 in [MS-OXCRPC]) contains the extra text quoted below.

    I find that Exchange compression follows what's described in [MS-DRSR]. What I do not understand is the need for the extra 'end of stream' bit. This seems redundant. When you cleanly consume all the bytes from the compressed buffer, you are done. Why is there a need to check another bit?

    In the case where all 32 bits in the last bitmask have exactly been consumed, I have found a new bitmask that has the value 0xFFFFFFFF (this is as described below).

    Missing text:

    The bitmask must also contain a "1" in the bit following the last encoded element, to indicate the end of the compressed data. For example, given a hypothetical 8-bit bitmask, the string "ABCABCDEF" should be compressed as (0,0)A(0,0)B(0,0)C(3,3)D(0,0)E(0,0)F. Its bitmask would be b'00010001' (0x11). This would indicate three bytes of data, followed by metadata, followed by an additional 3 bytes, finally terminated with a "1" to indicate the end of the stream.
    The final end bit is always necessary, even if an additional bitmask has to be allocated. If the string in the above example was "ABCABCDEFG", for example, it would require an additional bitmask. It would begin with the bitmask b'00010000', followed by the compressed data, and followed by another bitmask with a "1" as the next bit to indicate the end of the stream.
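
    As an illustration of the two paragraphs quoted above, here is a minimal sketch of how the bitmasks could be built. The function name build_bitmasks and the 8-bit default width are hypothetical and only mirror the example. Filling the bits after the end marker with 1s is an assumption; it is consistent with the 0xFFFFFFFF bitmask reported in this thread, but the quoted text does not state it.

```python
def build_bitmasks(elements, width=8):
    """Return the bitmasks for a sequence of encoded elements.

    elements: iterable of booleans, False for a literal byte and True
    for match metadata. width: bits per bitmask (8 mirrors the
    hypothetical example; the wire format uses 32).
    """
    masks = []
    mask, used = 0, 0
    for is_match in elements:
        # Bits are assigned from the most significant end downwards.
        mask = (mask << 1) | (1 if is_match else 0)
        used += 1
        if used == width:
            masks.append(mask)
            mask, used = 0, 0
    # End-of-stream marker: a 1 in the bit after the last element.
    # If the previous bitmask was exactly full, used is 0 here, so a
    # whole new bitmask is emitted just to carry the marker.
    # ASSUMPTION: the bits after the marker are also set to 1.
    remaining = width - used
    mask = (mask << remaining) | ((1 << remaining) - 1)
    masks.append(mask)
    return masks


# "ABCABCDEF": three literals, one match, three literals.
print([format(m, "08b") for m in build_bitmasks([False] * 3 + [True] + [False] * 3)])
# "ABCABCDEFG": one more literal fills the first bitmask completely.
print([format(m, "08b") for m in build_bitmasks([False] * 3 + [True] + [False] * 4)])
```

    With these inputs the first call yields the single bitmask 00010001 from the quoted text, and the second yields 00010000 followed by an extra bitmask whose leading bit is the end marker.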

    Thanks,

    Hari

    Tuesday, May 24, 2011 11:48 PM

All replies

  • Hi Hari,

    Thanks for your question.

    Someone from my team will contact you to start working on this request.

    Thanks and regards,

    SEBASTIAN CANEVARI - MSFT Escalation Engineer Protocol Documentation Team
    Wednesday, May 25, 2011 12:06 AM
  • Thanks Sebastian. Here is a sample compressed input & the corresponding uncompressed output. You can see that the last 4 bytes of the input are 0xFFFFFFFF - this is a new bitmask which seems to be redundant. If the input had stopped at the 00 before the first FF things would have worked just fine.

    compressed = 47 bytes
    The last 4 bytes, 0xffffffff are the metadata bitmask
    00 d5 02 00 08 00 01 01 01 00 1e 00 10 00 01 0c
    6d 00 03 00 1c 18 00 7f 00 52 1d 7c 00 6c 38 00
    5d ff 00 6d 00 00 00 1d 00 00 00 ff ff ff ff

    uncompressed = 72 bytes
    08 00 01 01 01 00 1e 00 10 00 01 0c 6d 00 00 00
    00 00 00 00 1c 00 00 00 10 00 01 0c 6d 00 00 00
    00 00 00 00 1d 00 00 00 10 00 01 0c 6c 00 00 00
    5d 00 00 00 1c 00 00 00 10 00 01 0c 6d 00 00 00
    6d 00 00 00 1d 00 00 00
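
    For readers who want to reproduce this, the following is a minimal decompressor sketch, not a reference implementation. The name direct2_decompress is hypothetical. The match layout (a 2-byte little-endian value with a 3-bit length and a 13-bit offset, a shared length nibble, and the longer length extensions) follows one reading of the published LZ77 pseudocode and is an assumption here; only the paths exercised by the buffer above have been checked, and the 1-byte and 2-byte length extensions are untested.

```python
def direct2_decompress(data: bytes) -> bytes:
    """Decode a DIRECT2-style buffer (sketch; layout details are assumptions)."""
    out = bytearray()
    pos = 0            # read position in the compressed input
    flags = 0          # current 32-bit bitmask
    flag_count = 0     # unread bits left in the current bitmask
    nibble_pos = None  # index of a length byte whose high nibble is unused

    def take(n: int) -> int:
        """Read an n-byte little-endian integer and advance pos."""
        nonlocal pos
        if pos + n > len(data):
            raise ValueError("compressed input is truncated")
        value = int.from_bytes(data[pos:pos + n], "little")
        pos += n
        return value

    while True:
        if flag_count == 0:
            flags = take(4)
            flag_count = 32
        flag_count -= 1
        if ((flags >> flag_count) & 1) == 0:
            out.append(take(1))        # 0 bit: copy one literal byte
            continue
        if pos == len(data):
            return bytes(out)          # 1 bit with no input left: end of stream
        token = take(2)                # 1 bit: match metadata
        length = token & 0x7
        offset = (token >> 3) + 1
        if length == 7:
            if nibble_pos is None:     # first use: low nibble of a new byte
                nibble_pos = pos
                length = take(1) & 0x0F
            else:                      # second use: high nibble of that byte
                length = data[nibble_pos] >> 4
                nibble_pos = None
            if length == 15:           # untested by the sample buffer
                length = take(1)
                if length == 255:
                    length = take(2)
                    if length < 22:
                        raise ValueError("invalid extended match length")
                    length -= 22
                length += 15
            length += 7
        length += 3
        if offset > len(out):
            raise ValueError("match offset points before start of output")
        for _ in range(length):        # byte-wise so overlapping copies work
            out.append(out[-offset])


SAMPLE = bytes.fromhex(
    "00 d5 02 00 08 00 01 01 01 00 1e 00 10 00 01 0c "
    "6d 00 03 00 1c 18 00 7f 00 52 1d 7c 00 6c 38 00 "
    "5d ff 00 6d 00 00 00 1d 00 00 00 ff ff ff ff"
)
EXPECTED = bytes.fromhex(
    "08 00 01 01 01 00 1e 00 10 00 01 0c 6d 00 00 00 "
    "00 00 00 00 1c 00 00 00 10 00 01 0c 6d 00 00 00 "
    "00 00 00 00 1d 00 00 00 10 00 01 0c 6c 00 00 00 "
    "5d 00 00 00 1c 00 00 00 10 00 01 0c 6d 00 00 00 "
    "6d 00 00 00 1d 00 00 00"
)
assert direct2_decompress(SAMPLE) == EXPECTED
```

    The sketch is deliberately strict: it only returns when it meets a 1 bit with no input left, so the same buffer cut to 43 bytes raises an error instead of succeeding. That strictness is a design choice of the sketch, not something this thread establishes as required.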

    Wednesday, May 25, 2011 12:13 AM
  • It would be good to clarify that the 4-byte and 2-byte values in the compressed input are little endian. That's not obvious from reading the algorithm description.
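
    A short illustration of the byte order, using the first four bytes of the sample buffer posted in this thread. The helper names u32_le and u16_le are hypothetical.

```python
def u32_le(buf: bytes, pos: int) -> int:
    """Read a 4-byte little-endian value, e.g. a bitmask."""
    return int.from_bytes(buf[pos:pos + 4], "little")


def u16_le(buf: bytes, pos: int) -> int:
    """Read a 2-byte little-endian value, e.g. match metadata."""
    return int.from_bytes(buf[pos:pos + 2], "little")


head = bytes.fromhex("00 d5 02 00")
print(hex(u32_le(head, 0)))              # 0x2d500 (little endian)
print(hex(int.from_bytes(head, "big")))  # 0xd50200 (wrong reading)
```

    Read as little endian, the bitmask 0x0002D500 has its top 14 bits clear, which lines up with the 14 literal bytes that open the sample; the big-endian reading does not.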

    Hari

    Wednesday, May 25, 2011 12:15 AM
  • Hi Hari,

    I have an open issue on this too, so I am interested in the outcome.

    Can you confirm that what you see in reality (i.e. on the wire or with test clients) matches [MS-DRSR] rather than [MS-OXCRPC]?

    Brad

    Wednesday, May 25, 2011 1:38 AM
  • Hi Brad,

    The example buffer I posted was received from an Exchange server, so what's on the wire does match MS-DRSR.

    Having said that I have only verified that the second paragraph in the missing text posted previously matches what's on the wire - in the second example, all 8 bits in the last bitmask have been used up, so a new (extra) bitmask is allocated so that the 'end of stream' 1 bit can be output. Now in reality the mask is 32-bit, so the corresponding extra bitmask is 0xFFFFFFFF.

    In the example buffer, if the compressed input had been 43 bytes (i.e. without the extra bitmask), it would have worked fine. The first version of my code interpreted the 'end of stream' 1 bit as metadata and looked for the metadata in the input buffer, but the end of the buffer had already been reached. That caused an error which I had to investigate further, and that is how I found out about MS-DRSR.

    The 'fix' was to check for end of stream after reading the bitmask. This does work but the end of stream bit seems to only serve to create additional unnecessary work for the decompressor. Of course I may be missing something here.

    If the decompressor stopped when the output buffer filled up (with EcDoRpcExt2 you do know the uncompressed length, so you know when the output buffer has filled), you can get away with not even reading the extra bitmask, but that was not how I had written the code. To me decompression is a success only if you consume all the input bytes - which means you have to deal with the extra bitmask.

    Now going to the example in the first paragraph in the missing text, after processing the first 7 bits, you would have consumed all the input, so there is no need to even look at the 8th bit which is supposed to indicate end of stream. So I have not verified that the bit following the last processed bit in the last bitmask is 1.

    Hari

    Wednesday, May 25, 2011 5:43 AM
  • Hari,

    I am still looking into this issue and should have an answer for you shortly.

    Friday, June 10, 2011 3:15 PM
  • Hari,

    I am still investigating this issue for you.

    Tuesday, June 14, 2011 8:15 PM
  • Hari,

    When an additional bitmask does not have to be allocated, it is necessary to mark the end of the stream.

    • Proposed as answer by King Salemno Monday, June 20, 2011 3:08 PM
    Monday, June 20, 2011 3:08 PM
  • Thanks, but unfortunately it does not answer my question. I was asking why you need the end of stream bit. Let's say you don't have this bit. If the input is not garbage, you will reach the end of input cleanly, i.e. consume every byte of input. When that happens you are done. So asking the decompressor to check one more bit is unnecessary.

    Microsoft products do send data on the wire that conforms to MS-DRSR, so anyone who writes a decompressor has to deal with it. So there is no need to discuss this further.

    MS-OXCRPC obviously needs to be updated.

    Tuesday, June 21, 2011 9:54 PM
  • I have forwarded this information as a suggestion for a future release of the documentation.
    • Marked as answer by King Salemno Tuesday, July 5, 2011 4:51 PM
    Tuesday, July 5, 2011 4:51 PM