docx -> single XML file (without using Word2007 app)

  • Question

  • Is there a way/tool for generating a complete xml file from a docx (aka zip) file? I know I can do this using SaveAsXML from within Word2007, but need to be able to do it *without* the app. Does the OpenXML SDK provide this capability?


    Bill Cohagan
    Wednesday, March 11, 2009 7:30 PM


All replies

  • Check out the following blog post:

    Zeyad Rajabi (MS)
    • Marked as answer by Bill Cohagan Tuesday, March 17, 2009 1:37 PM
    Thursday, March 12, 2009 5:41 PM
  • Hi Bill,
    Could you provide more details about why you need to convert a DOCX file to a complete XML file?
    I am also interested in the bigger picture of your solution. For example, are you using the SDK on the server side for document processing? Are the original DOCX files generated by the Word app? Where do you use the converted XML files?

    I would appreciate your time and effort if you can share this, thanks :)
    Tuesday, March 17, 2009 7:30 AM
  • Goolol
      We have an app that monitors a website for changes to doc files that are posted there. These files are created as doc (not docx) files. Eventually (exact time unknown) these will be docx files rather than doc files. Our app needs to crack the doc(x) content and put the information into a database. The documents are of several "types", each with a well-defined format. We are using XSLT to extract the information of interest from the document, and it is easier to run transform(s) over a single DOM than to run over several DOMs and then merge the results; thus the desire to have a complete document DOM rather than several component DOMs.

      We were of course prepared to deal with the multiple DOM case should the combining prove to be too much work. But we thought it should be straightforward (modulo learning the details of the organization) and hoped there might be an existing tool that would easily solve the problem. The link provided by Zeyad Rajabi appears to be just the Right Thing.


    Bill Cohagan
    Tuesday, March 17, 2009 1:36 PM
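
    For readers who want to see what "combining the component DOMs" can look like, here is a minimal sketch in Python using only the standard library. It is not taken from the linked blog post and does not use the OpenXML SDK. It assumes the single-file layout is the Flat OPC style (a `pkg:package` element holding one `pkg:part` per zip entry, with XML parts under `pkg:xmlData` and everything else base64-encoded under `pkg:binaryData`). The function name `docx_to_single_xml` is hypothetical, and the output has not been checked against Word itself.

    ```python
    import base64
    import io
    import zipfile
    import xml.etree.ElementTree as ET

    # Namespace assumed for the Flat OPC wrapper elements.
    PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"
    # Namespace of [Content_Types].xml inside every OPC package.
    CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


    def docx_to_single_xml(docx_bytes: bytes) -> ET.Element:
        """Merge every part of a .docx package into one pkg:package element.

        Hypothetical helper for illustration only.
        """
        ET.register_namespace("pkg", PKG_NS)
        package = ET.Element(f"{{{PKG_NS}}}package")
        with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
            # [Content_Types].xml maps file extensions and part names
            # to content types; it is consumed here, not copied as a part.
            types = ET.fromstring(archive.read("[Content_Types].xml"))
            defaults = {d.get("Extension").lower(): d.get("ContentType")
                        for d in types.findall(f"{{{CT_NS}}}Default")}
            overrides = {o.get("PartName"): o.get("ContentType")
                         for o in types.findall(f"{{{CT_NS}}}Override")}
            for name in archive.namelist():
                if name == "[Content_Types].xml" or name.endswith("/"):
                    continue
                part_name = "/" + name
                extension = name.rsplit(".", 1)[-1].lower()
                content_type = (overrides.get(part_name)
                                or defaults.get(extension, ""))
                part = ET.SubElement(package, f"{{{PKG_NS}}}part", {
                    f"{{{PKG_NS}}}name": part_name,
                    f"{{{PKG_NS}}}contentType": content_type,
                })
                data = archive.read(name)
                if content_type.endswith("xml"):
                    # XML parts are embedded as real child elements, so a
                    # single XSLT pass can reach all of them.
                    wrapper = ET.SubElement(part, f"{{{PKG_NS}}}xmlData")
                    wrapper.append(ET.fromstring(data))
                else:
                    # Images and other binary parts are base64-encoded.
                    binary = ET.SubElement(part, f"{{{PKG_NS}}}binaryData")
                    binary.text = base64.b64encode(data).decode("ascii")
        return package
    ```

    The returned element can be written out with `ET.ElementTree(package).write("out.xml", encoding="utf-8", xml_declaration=True)` and then fed to a transform as one document. If the file also needs to open in Word, it would presumably need the `<?mso-application progid="Word.Document"?>` processing instruction at the top, which this sketch does not add.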
  • Thanks for your information:)
    Friday, March 27, 2009 4:37 AM