Friday, January 20, 2012 10:26 PM
I have an assembly which reads and writes Excel files in various formats, two of which are .XLS and .XLSX. I am making a bug fix to the shape support on worksheets that will allow the developer to set text alignment at a per-paragraph level on shapes. I was able to get this working in the .XLSX format, but it appears not to be supported in the normal .XLS file because the feature was introduced in Excel 2007. It seems that when shape features introduced in Excel 2007 are saved in an .XLS file, the shape saves what it can in its normal Escher records, and then a specific shape property is written into the shape’s msofbtTertiaryOPT Escher record. This shape property has PID 937 and is known as "metroBlob" in the Microsoft Office Drawing 97-2007 Binary Format Specification. The complex data stored in this property is a zip file which has two main parts: /drs/shapexml.xml and /drs/downrev.xml. I know how to load the former, because it appears to be almost identical to the shape format used in .XLSX files. So my question is about /drs/downrev.xml. Here is the content of one such file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<a:from row="2" col="5" rowOffset="38100" colOffset="495300"/>
<a:to row="9" col="7" rowOffset="66675" colOffset="381000"/>
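For readers who want to look inside such a property value themselves, the layout described above (a zip package holding /drs/shapexml.xml and /drs/downrev.xml) can be inspected with a short Python sketch. This is only an illustration: it assumes the complex data of PID 937 has already been extracted from the msofbtTertiaryOPT record into a byte string, the function names are hypothetical, and the sample blob is built locally with placeholder content rather than taken from a real .XLS file.

```python
import io
import zipfile


def list_metro_blob_parts(blob):
    """Return {part name: raw bytes} for every entry in a zip payload.

    Assumption: `blob` is the already-extracted complex data of the
    metroBlob property. Entry names are normalized to start with "/".
    """
    with zipfile.ZipFile(io.BytesIO(blob)) as package:
        return {
            "/" + name.lstrip("/"): package.read(name)
            for name in package.namelist()
        }


def build_sample_blob():
    """Build a stand-in zip in memory (placeholder content, not real data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("drs/shapexml.xml", "<placeholder/>")
        package.writestr("drs/downrev.xml", "<placeholder/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for part_name, data in list_metro_blob_parts(build_sample_blob()).items():
        print(part_name, len(data), "bytes")
```

Reading the parts this way says nothing about how the checksums are produced; it only makes the downrev.xml content easy to compare across files saved by Excel.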
Now, before I get to my question, I’d like to say that some of the information about this element on msdn.com seems to be incorrect. I found these articles: http://msdn.microsoft.com/en-us/library/ff530132(v=office.12).aspx, http://msdn.microsoft.com/en-us/library/ff530835(v=office.12).aspx, which both indicate that the downRevStg element has only three possible attributes and a child element named "bounds". This does not seem to be the case when the file is saved with either Excel 2007 or Excel 2010. However, the Microsoft Office Drawing 97-2007 Binary Format Specification file seems to have the correct information, so I used that.
Now to my question:
My main concern is in how I should create the values for shapeCheckSum and textCheckSum. Apparently, if these are written out incorrectly, the entire metroBlob value will be ignored and only the data from the normal Escher records will be used to load the shape. Here are the descriptions of these attributes based on the Microsoft Office Drawing 97-2007 Binary Format Specification:
- shapeCheckSum – Checksum computed on the shape. If the checksum doesn’t match the binary data when the file is loaded back into Office 2007, the XML data is considered out of date and the binary representation is used.
- textCheckSum – Checksum computed on the text of the shape. If the checksum doesn’t match the binary data when the file is loaded back into Office 2007, the XML data is considered out of date and the binary representation of the text is used.
So how are these checksums computed and what binary data is hashed to compute them? According to the two MSDN articles listed above, each attribute “Contains a base-64 encoded value of the MD4 hash…” This makes sense for the shapeCheckSum attribute, because it is a base-64 encoded 16 byte value. However, the textCheckSum attribute contains a base-64 encoded 8 byte value, and MD4 hashes data into a 16 byte digest, so it doesn’t seem like the textCheckSum value is an MD4 checksum.
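The length argument above can be checked mechanically: base-64 decode each attribute value and compare the byte count with the 16-byte MD4 digest size. The following Python sketch does that. The helper names are hypothetical, and the two sample values are placeholders generated locally (all-zero bytes), not checksums taken from a real downrev.xml.

```python
import base64

MD4_DIGEST_SIZE = 16  # MD4 produces a 128-bit (16-byte) digest


def decoded_length(value):
    """Return the number of bytes encoded by a base-64 attribute value."""
    return len(base64.b64decode(value, validate=True))


def could_be_md4(value):
    """True if the decoded value has the size of an MD4 digest.

    A matching length does not prove the value is an MD4 hash; a
    mismatching length rules out an untruncated MD4 digest.
    """
    return decoded_length(value) == MD4_DIGEST_SIZE


# Placeholder values with the sizes observed in the post (16 and 8 bytes).
sample_shape_checksum = base64.b64encode(bytes(16)).decode("ascii")
sample_text_checksum = base64.b64encode(bytes(8)).decode("ascii")

if __name__ == "__main__":
    print("shapeCheckSum size matches MD4:", could_be_md4(sample_shape_checksum))
    print("textCheckSum size matches MD4:", could_be_md4(sample_text_checksum))
```

An 8-byte result is consistent with either a different algorithm or a truncated digest; the check cannot tell the two apart.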
So once again, here are the things I’d like to know:
- What binary data is being hashed to create the values for shapeCheckSum and textCheckSum?
- What hashing algorithm is being used to create these values?
Thank you for your time.
Friday, January 20, 2012 11:43 PM
Thanks for the query. Someone from our team will get in touch with you.
Tarun Chopra | Escalation Engineer | Open Specifications Support Team
Friday, January 27, 2012 11:36 AM
I have exactly the same issue - I need to know how these checksums are calculated. I'm a little concerned that you said someone from the team will get in touch with Michael - can this not be answered out in the open, where everyone can benefit from it?
Friday, January 27, 2012 12:51 PM, Moderator
This will be answered on the forum as are all the forum questions. The forum community is also welcome and encouraged to respond on questions.
US-CSS DSC PROTOCOL TEAM
- Edited by Mark Miller_DSC (Microsoft Employee, Moderator), Friday, January 27, 2012 6:03 PM
Friday, March 23, 2012 6:45 PM
Has there been any progress on this?
Tuesday, March 27, 2012 4:16 PM, Moderator
Thank you for your patience. This is a complex issue and request. I am indeed actively working on the issue/request with our product group. When we have gathered the information you have requested it will be posted here on the forum.
US-CSS DSC PROTOCOL TEAM
Monday, April 9, 2012 1:39 PM, Moderator
Would each of you please contact me at "dochelp<at>microsoft<dot>com" (ATTN: Mark Miller)? I have some information to share that isn't quite ready for the forums.
US-CSS DSC PROTOCOL TEAM
Wednesday, June 20, 2012 12:47 PM
Could you please contact me, or may I also contact you?
I also need to know this to generate DOCX files that are compatible with Word 2007, when creating a textBox. I understand the general logic of the mc: block, and of the inner ZIP archive, but I can't figure out how to compute this hash.
Thanks,
Edit: I found out o:gfxdata wasn't necessary after all, the compatibility works.