OpenXML can't open a file, Word opens it just fine

    Question

  • Please take a look at the file in http://www.windwardreports.com/temp/InvalidPackage.zip

    It has a valid document.xml and Word opens the file fine. But the OpenXML SDK throws an exception:

    DocumentFormat.OpenXml.Packaging.OpenXmlPackageException occurred
      Message="The specified package is invalid. The main part is missing."
      Source="DocumentFormat.OpenXml"
      StackTrace:
           at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load()
           at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.OpenCore(Stream stream, Boolean readWriteMode)
           at DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable, OpenSettings openSettings)
           at DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable)

    Any idea why?

    thanks - dave


    Very funny video - Reporting as a Metaphor
    Monday, May 10, 2010 9:53 PM

All replies

  • Hi Dave,

    Thanks for your question.

    I opened the document you shared and found an XML comment before the "w:document" element. That is why the error message says "The main part is missing." If you remove the comment, everything should work.

    By the way, how do you generate this document? When I open it in Word, it is in "Compatibility Mode", so I wonder whether it strictly follows the Open XML format. Did it originate from a ".doc" file?

    Hope this helps. If you have any questions, please let me know.

    Thanks,

    Lu

    Tuesday, May 11, 2010 2:18 AM
  • Our program adds that comment so we can track version numbers. Where should we place it so that the OpenXML SDK does not object to it?

    Also out of curiosity, why does a comment cause an error? Shouldn't that be allowed?

    thanks - dave


    Tuesday, May 11, 2010 2:41 AM
  • Hi Dave,

    Thank you for your question.

    This appears to be an issue with 'System.IO.Packaging.Package' rather than with the OpenXML SDK. The OpenXML SDK depends on System.IO.Packaging.Package to open the zip package. When a comment is added at the beginning of document.xml, System.IO.Packaging.Package fails to get the package-level relationships (using GetRelationships()), which identify the parts contained in the zip package. In my opinion, this causes the problem you have encountered. However, it will take time to investigate the details.

    As a workaround, you may add the comment at the end of the file. I tried this locally and the package could be opened, but I am still concerned that this approach may cause other issues.
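
    The package-level relationship lookup described above can be checked without the OpenXML SDK. The following is a minimal sketch, not the SDK's actual implementation: the class name PackageInspector and the method FindMainPartTargets are hypothetical, and it assumes a Transitional Open XML document whose main part is identified by the standard officeDocument relationship type.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework; System.IO.Packaging package elsewhere
using System.Linq;

// Hypothetical helper, for illustration only.
public static class PackageInspector
{
    // Relationship type that points from the package root to the main document part
    // (Transitional Open XML).
    public const string OfficeDocumentRelationship =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Returns the target URIs of all package-level officeDocument relationships.
    // An empty array means the packaging layer found no main part.
    public static string[] FindMainPartTargets(Stream stream)
    {
        using (Package package = Package.Open(stream, FileMode.Open, FileAccess.Read))
        {
            return package.GetRelationships()
                .Where(r => r.RelationshipType == OfficeDocumentRelationship)
                .Select(r => r.TargetUri.ToString())
                .ToArray();
        }
    }
}
```

    An empty result for a file that Word opens would be consistent with the "main part is missing" error, and would suggest that the problem lies in how the packaging layer reads the relationship parts rather than in the SDK itself.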

    Thanks,

    Raymond

    • Proposed as answer by Lu Zhang Wednesday, June 02, 2010 9:52 AM
    Tuesday, May 11, 2010 10:02 AM
  • That wasn't it - I removed the comment and still get the error.

    And even with the comment Word 2007 opens the file normally (not compatibility mode) on my system.

    Any other ideas why Word is happy with this file but OpenXML SDK is not?

    thanks - dave


    Tuesday, May 11, 2010 3:13 PM
  • We have another at http://www.windwardreports.com/temp/GoodDocx.zip - Word opens it fine but OpenXML won't open it.

    Can you please tell me what is wrong and how I can get OpenXML to accept a file Word thinks is good?

    thanks - dave


    Tuesday, May 25, 2010 6:06 PM
  • I am also having this issue.

     

    The file in question opens in Word 2007 with no problems. It is also fully compliant XML according to my XML editor, oXygen.

    Tuesday, May 25, 2010 11:55 PM
  • The issue may be due to the encoding of your XML files. Make sure the files are saved in a UTF encoding rather than ANSI. Let us know if that works.
    Zeyad Rajabi (MS)
    • Proposed as answer by Lu Zhang Wednesday, June 02, 2010 9:52 AM
    Thursday, May 27, 2010 6:50 PM
  • They absolutely are all UTF - we create them using the .NET XmlWriter. These files are created using our report generation tool and we only get this problem when we convert from DOCX -> XLSX or XLSX -> DOCX. When we take DOCX, add a ton of stuff to it, and do DOCX out - it's fine.

    Would it be possible for one of the Open XML dev team to run it in a debugger where they have source access and they can see what it is choking on? I think that Open XML should handle these files as Word does.

    thanks - dave


    Thursday, May 27, 2010 7:26 PM
  • Hi Dave,

    I strongly believe Raymond and Zeyad are both right in some way.

    First, it's not an OpenXML SDK problem, it's a packaging problem. The WordprocessingDocument uses an underlying Package as its container. Opening your files via the packaging API shows that there are no relationships at package level, so no main part is found.

    Second, as Zeyad mentioned, it seems to me that at least the package .rels file and the document.xml file are ANSI.

    I don't know how Word handles these files, but in my experience it is very forgiving of invalid documents (unlike the productivity tool and Package Explorer, which both report invalid documents). If you open and save the documents in Word, the relationships will be found. So I would suggest you take a look at the conversion process.

    Friday, May 28, 2010 9:17 AM
  • Hi, David,

    Thank you for your question.

    I think a Byte Order Mark (BOM) may be the cause of this problem. Try creating the XmlWriter directly over a stream, with a new UTF8Encoding instance that has the BOM explicitly turned off. Please refer to this post for details: http://blogs.msdn.com/b/marcelolr/archive/2010/03/18/encoding-fun-with-xmlwriter-and-streamwriter.aspx
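
    A minimal sketch of that suggestion follows. The class name BomFreeXml, the method WriteSample, and the "root" element are placeholders for illustration, not part of any real document. The relevant piece is the UTF8Encoding(false) instance passed through XmlWriterSettings.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class BomFreeXml
{
    // Writes a tiny XML document to memory as UTF-8 without a byte order mark.
    public static byte[] WriteSample()
    {
        var settings = new XmlWriterSettings
        {
            // false = do not emit the EF BB BF preamble.
            // Encoding.UTF8, by contrast, does emit it.
            Encoding = new UTF8Encoding(false)
        };

        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("root"); // placeholder element
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
            return stream.ToArray();
        }
    }
}
```

    With this setting the output should begin directly with the XML declaration, so the first byte is '<' rather than 0xEF.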

    Thanks,

    Raymond

     

    • Proposed as answer by Lu Zhang Wednesday, June 02, 2010 9:52 AM
    Friday, May 28, 2010 10:19 AM
  • Hi;

    I just opened up my sample and opened every file in it. Every file starts with:

    <?xml version="1.0" encoding="utf-8"?>

    But you are right that it has the BOM and removing that makes it readable. But that leads to two big questions:

    1. We have generated DOCX/PPTX/XLSX with the BOM forever and it has worked fine elsewhere. Why does it choke for this case?

    2. Why does it not like the BOM? I think that counts as legit XML.

    Anyways, now we can get this working so thank you. But please let the dev team responsible for the packaging code know about this - I think this counts as a bug.
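
    For anyone hitting the same error, one way to check whether a package contains parts that start with a UTF-8 BOM is to scan the zip entries directly. This is only a diagnostic sketch: the class name BomScanner and the method FindEntriesWithBom are hypothetical, it assumes System.IO.Compression is available (.NET Framework 4.5 or later), and it does not explain why the packaging code rejects the BOM in this case.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class BomScanner
{
    // Returns the names of zip entries whose content starts with the UTF-8 BOM (EF BB BF).
    public static List<string> FindEntriesWithBom(Stream zipStream)
    {
        var hits = new List<string>();
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                using (Stream s = entry.Open())
                {
                    // Read up to three bytes; a single Read call may return fewer.
                    var head = new byte[3];
                    int read = 0, n;
                    while (read < 3 && (n = s.Read(head, read, 3 - read)) > 0)
                    {
                        read += n;
                    }

                    if (read == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
                    {
                        hits.Add(entry.FullName);
                    }
                }
            }
        }
        return hits;
    }
}
```

    Running this over a failing file and a working one would show which parts differ, for example whether the BOM appears in the .rels files of one but not the other.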

    thanks - dave


    Wednesday, June 02, 2010 12:56 AM
  • Hi All,

    My requirement is to open a new document based on an existing template. The code below fails with "The specified package is invalid. The main part is missing."

    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    string sourceFile = @"F:\Sinduja\Sample.dotx";
    string destinationFile = @"F:\Sinduja\Sample2.docx";
    int id = 1;

    if (!File.Exists(destinationFile))
    {
        Response.Write("File not found");
    }
    else
    {
        using (WordprocessingDocument myDoc = WordprocessingDocument.Open(destinationFile, true))
        {
            MainDocumentPart mainPart = myDoc.MainDocumentPart;

            // Find content controls whose tag value is contained in the source file path.
            List<SdtBlock> sdtList = mainPart.Document.Descendants<SdtBlock>().Where(s =>
                sourceFile.Contains(s.SdtProperties.GetFirstChild<Tag>().Val.Value)).ToList();

            if (sdtList.Count != 0)
            {
                string altChunkId = "AltChunkId" + id;
                id++;

                AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.WordprocessingML, altChunkId);

                // Dispose the source stream once its data has been copied into the part.
                using (FileStream source = File.Open(sourceFile, FileMode.Open))
                {
                    chunk.FeedData(source);
                }

                // Replace each content control with its own altChunk element;
                // a single element instance cannot be inserted in more than one place.
                foreach (SdtBlock sdt in sdtList)
                {
                    AltChunk altChunk = new AltChunk { Id = altChunkId };
                    OpenXmlElement parent = sdt.Parent;
                    parent.InsertAfter(altChunk, sdt);
                    sdt.Remove();
                }
            }
        }
    }

    Thanks in advance.



    sindhuja

    Monday, April 09, 2012 10:43 AM