downrev.xml contents - how are shapeCheckSum and textCheckSum generated

  • Friday, January 20, 2012 22:26
     
     

    I have an assembly which reads and writes Excel files in various formats, including .XLS and .XLSX. I am fixing a bug in the support for shapes on worksheets; the fix will allow the developer to set text alignment at the paragraph level on shapes. I was able to get this working in the .XLSX format, but it appears not to be supported by the normal .XLS records, because the feature was introduced in Excel 2007. It seems that when shape features introduced in Excel 2007 are saved in an .XLS file, the shape saves what it can in its normal Escher records, and then a specific shape property is written into the shape’s msofbtTertiaryOPT Escher record. This shape property has PID 937 and is known as "metroBlob" in the Microsoft Office Drawing 97-2007 Binary Format Specification. The complex data stored in this property is a zip file which has two main parts: /drs/shapexml.xml and /drs/downrev.xml. I know how to load the former, because it appears to be almost identical to the shape format used in .XLSX files. So my question is about /drs/downrev.xml. Here is the content of one such file:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <a:downRevStg xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
                  shapeCheckSum="cPZ9Q5c2YXSj/c5Qa9AHZN==&#xA;"
                  textCheckSum="LEeYkt6dDVe=&#xA;"
                  shapeId="3"
                  fHybridRaster="0"
                  ver="1">
      <a:xlAnchor>
        <a:from row="2" col="5" rowOffset="38100" colOffset="495300"/>
        <a:to row="9" col="7" rowOffset="66675" colOffset="381000"/>
      </a:xlAnchor>
    </a:downRevStg>
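
    As a rough illustration of the layout described above, here is a minimal Python sketch that opens such a zip payload and reads the attributes of the downRevStg root element. It assumes the metroBlob bytes have already been extracted from the Escher property (that step is not shown), and that the archive entry is named drs/downrev.xml with or without a leading slash. The function names and the stand-in archive built by build_sample_blob are hypothetical and exist only to make the example runnable; it is not a real metroBlob.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed entry name inside the archive (the post writes it as /drs/downrev.xml).
DOWNREV_PART = "drs/downrev.xml"


def read_downrev_attributes(metro_blob: bytes) -> dict:
    """Return the attributes of the root element of drs/downrev.xml."""
    with zipfile.ZipFile(io.BytesIO(metro_blob)) as archive:
        # Map normalized names to stored names so a leading slash is tolerated.
        names = {name.lstrip("/"): name for name in archive.namelist()}
        xml_bytes = archive.read(names[DOWNREV_PART])
    root = ET.fromstring(xml_bytes)
    return dict(root.attrib)


def build_sample_blob() -> bytes:
    """Build a stand-in archive for demonstration only (hypothetical data)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<a:downRevStg '
        'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" '
        'shapeCheckSum="cPZ9Q5c2YXSj/c5Qa9AHZN==&#xA;" '
        'textCheckSum="LEeYkt6dDVe=&#xA;" '
        'shapeId="3" fHybridRaster="0" ver="1"/>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(DOWNREV_PART, xml)
        archive.writestr("drs/shapexml.xml", "<placeholder/>")
    return buffer.getvalue()


attrs = read_downrev_attributes(build_sample_blob())
print(attrs["shapeId"], repr(attrs["shapeCheckSum"]))
```

    The attributes carry no namespace prefix, so they can be looked up by their plain names. The checksum values may end with a line feed (the &#xA; reference), which should be stripped before decoding.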

    Before I get to my question, I’d like to point out that some of the information about this element on msdn.com seems to be incorrect. I found these articles: http://msdn.microsoft.com/en-us/library/ff530132(v=office.12).aspx, http://msdn.microsoft.com/en-us/library/ff530835(v=office.12).aspx, which both indicate that the downRevStg element has only three possible attributes and a child element named "bounds". This does not seem to be the case when the file is saved with either Excel 2007 or Excel 2010. However, the Microsoft Office Drawing 97-2007 Binary Format Specification seems to have the correct information, so I used that.

    Now to my question:

    My main concern is how to create the values for shapeCheckSum and textCheckSum. Apparently, if these are written out incorrectly, the entire metroBlob value is ignored and only the data from the normal Escher records is used to load the shape. Here are the descriptions of these attributes from the Microsoft Office Drawing 97-2007 Binary Format Specification:

    • shapeCheckSum – Checksum computed on the shape.  If the checksum doesn’t match the binary data when the file is loaded back into Office 2007, the XML data is considered out of date and the binary representation is used.
    • textCheckSum – Checksum computed on the text of the shape.  If the checksum doesn’t match the binary data when the file is loaded back into Office 2007, the XML data is considered out of date and the binary representation of the text is used.

    So how are these checksums computed, and what binary data is hashed to compute them? According to the two MSDN articles listed above, each attribute “Contains a base-64 encoded value of the MD4 hash…” This makes sense for the shapeCheckSum attribute, because it is a base-64 encoded 16-byte value. However, the textCheckSum attribute contains a base-64 encoded 8-byte value, and MD4 produces a 16-byte digest, so the textCheckSum value does not appear to be an MD4 checksum.
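
    The length argument can be checked with a short Python sketch that decodes the two attribute values from the sample file and compares their sizes with the 16-byte MD4 digest size. It only measures lengths; it does not attempt to reproduce either checksum, and the helper name decoded_length is just for illustration.

```python
import base64

MD4_DIGEST_SIZE = 16  # MD4 produces a 128-bit (16-byte) digest

# Attribute values from the sample downrev.xml; "&#xA;" is a line feed.
shape_checksum = "cPZ9Q5c2YXSj/c5Qa9AHZN==\n"
text_checksum = "LEeYkt6dDVe=\n"


def decoded_length(value: str) -> int:
    """Return the number of bytes encoded by a base-64 attribute value."""
    return len(base64.b64decode(value.strip()))


for name, value in (("shapeCheckSum", shape_checksum),
                    ("textCheckSum", text_checksum)):
    size = decoded_length(value)
    print(name, size, "bytes; matches MD4 size:", size == MD4_DIGEST_SIZE)
```

    Only shapeCheckSum decodes to a value of MD4 size; textCheckSum decodes to half that, which is consistent with the 16-byte and 8-byte lengths noted above.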

    So once again, here are the things I’d like to know:

    • What binary data is being hashed to create the values for shapeCheckSum and textCheckSum?
    • What hashing algorithm is being used to create these values?

    Thank you for your time.

All replies

  • Friday, January 20, 2012 23:43
     
     

    Hi Michael

     

    Thanks for the query. Someone from our team will get in touch with you.

     

    Thanks.


    Tarun Chopra | Escalation Engineer | Open Specifications Support Team
  • Friday, January 27, 2012 11:36
     
     

    Tarun,

    I have exactly the same issue - I need to know how these checksums are calculated. I'm a little concerned that you said someone from the team will get in touch with Michael - can this not be answered out in the open, where everyone can benefit from it?

    Thanks!

  • Friday, January 27, 2012 12:51
    Moderator
     
     

    Hi EclecticMonk,

    This will be answered on the forum, as are all forum questions.  The forum community is also welcome and encouraged to respond to questions.

    Regards,
    Mark Miller
    Escalation Engineer
    US-CSS DSC PROTOCOL TEAM


  • Friday, March 23, 2012 18:45
     
     
    Has there been any progress on this?
  • Tuesday, March 27, 2012 16:16
    Moderator
     
     

    Hi Michael/EclecticMonk,

    Thank you for your patience. This is a complex issue and request. I am indeed actively working on the issue/request with our product group. When we have gathered the information you have requested it will be posted here on the forum.

    Regards,
    Mark Miller
    Escalation Engineer
    US-CSS DSC PROTOCOL TEAM


  • Monday, April 9, 2012 13:39
    Moderator
     
     

    Hi Michael/EclecticMonk,

    Would each of you please contact me at "dochelp<at>microsoft<dot>com" (ATTN: Mark Miller)?  I have some information to share that isn't quite ready for the forums.

    Regards,
    Mark Miller
    Escalation Engineer
    US-CSS DSC PROTOCOL TEAM

  • Wednesday, June 20, 2012 12:47
     
     

    Could you please contact me, or may I also contact you?

    I also need to know this to generate DOCX files that are compatible with Word 2007, when creating a textBox. I understand the general logic of the mc: block, and of the inner ZIP archive, but I can't figure out how to compute this hash.

    Thanks,
    IceNV

    Edit: I found out that o:gfxdata wasn't necessary after all; the compatibility works.

    • Edited by IceNV Thursday, June 21, 2012 08:15