none
Spreadsheet corruption issue RRS feed

  • Question

  • We are using Open XML SDK 2.0 (the CTP version, I think) in a project to import/export data from a fairly complex Excel template. Models are created from the template and filled out with user data. We created a lot of defined names which we use to strip the data out and write it back in. We have a process that opens the model (using the SDK), extracts (reads) the data items we want, then closes the file. When we open the file in Excel after this process, it reports that the file is corrupt. It is always the same page/sheet. We used the OpenXmlDiff tool to examine the before (good) and after (bad) files. After we READ the data, there are 3 cell references added out of order in the bad file. Since we are not writing to the document at this time, I don't understand how these cell references are being added. Does the SDK try to do some "optimization" when the file is opened and read? At his point it really looks like it is not something in our code.

    Any help here would be greatly appreciated as we are supposed to be going live in a week or two and this is the only issue we have currently. !! :)

    Thanks!

    G

    Thursday, July 8, 2010 11:38 PM

Answers

All replies

  • OK. It's our fault. Sort of. We started the project with sample code. I think we got it from openxmldevelop.org, but I'm not sure it originated there. Anyway, the GetCell() method inserts a cell if you try to reference it and it does not exist. So, our method that was supposed to be just reading the values in the spreadsheet was inserting some cell references. Actually, it did insert the reference. It was just not in the right place, which is causing the corruption. We can invoke the "read" process with a stream or a file reference. The change was being made to the stream but the stream is discarded at the end so the corruption was not being persisted back to the file. The other way, opening a file, extracting the data then closing the file is saving the sheet with the corrupted cell reference. We will look into changing GetCell() so that it does not insert any cell references.

    As before, I welcome any discussion on this matter!

     

    G

    Friday, July 9, 2010 4:18 PM
  • Hello,

    1. If we encounter a file corruption issue, we can use the tool OpenXMLSdkTool.exe installed under,

    C:\Program Files (x86)\Open XML SDK\V2.0\tool

    to open that document. After that, click the Validate button from the toolbar. It should give us where the document corrupts.

     

    2. I would suggest you download the Open XML SDK 2.0 release to go on the project instead of the CTP. You can download it from,

    http://www.microsoft.com/downloads/details.aspx?FamilyID=c6e744e5-36e9-45f5-8d8c-331df206e0d0&displaylang=en

    Besides the SDK, you can also download the SDK 2.0 code snippet,

    http://www.microsoft.com/downloads/details.aspx?familyid=78BEA298-A3F9-44CF-BDE0-B4F30DC986DF&displaylang=en

    In the code snippet, you can find a snippet ("Excel: Get cell value given row and column") about how to retrieve cell value from Excel spreadsheetdocument. The function name is XLGetCellValueRowCol. We can use it to replace the GetCell() function in your project.

    Check this article, http://blogs.msdn.com/b/brian_jones/archive/2009/09/17/open-xml-sdk-code-snippets.aspx

     


    Best regards,
    Ji Zhou - MSFT
    Microsoft Online Community Support


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Thursday, July 15, 2010 10:31 AM
    Moderator