OpenXML - C# - Accessing Embedded Document Info


  • Hi,

    I am trying to extract information about all embedded documents that reside within a Word file.

    Specifically, I am trying to determine each embedded document's name and the point in the main document where it is embedded. For example, if I read in a paragraph and an embedded document is placed right after it, I need to be able to detect that so I can assume the embedded document is related to that paragraph.

    I saw a similar post here, but there were no replies. I am hoping to get one for my issue!

    Monday, March 11, 2013 2:50 PM

All replies

  • Hi Hi-Tek,

    I don't think the embedded document's original name can be recovered.

    You can try the below steps:

    1. Create an empty document (Document A) and embed another document (Document B) in it.
    2. Change Document A's file extension to .zip and unpack it.
    3. Go into the extracted folder and open "word" -> "embeddings". You'll see Document B, but its name has been changed to Microsoft_Word_Document1.docx.
    4. In "word" -> "media", you can see Document B's icon.
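
    The manual steps above can also be done in code, since a .docx file is already a ZIP package. What follows is only a minimal sketch, assuming .NET 4.5 or later (System.IO.Compression); the class and method names are made up for illustration, and it lists whatever is stored under word/embeddings rather than recovering any original file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO.Compression;
using System.Linq;

public static class EmbeddedPartLister   // hypothetical helper, not part of any SDK
{
    // Returns the package paths of everything stored under word/embeddings/.
    public static List<string> ListEmbeddings(string docxPath)
    {
        // A .docx file is a ZIP package, so no renaming or unpacking is needed.
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            return archive.Entries
                .Where(e => e.FullName.StartsWith("word/embeddings/",
                                StringComparison.OrdinalIgnoreCase)
                            && e.Name.Length > 0)   // skip directory entries
                .Select(e => e.FullName)
                .ToList();
        }
    }
}
```

    For the test document described above, calling EmbeddedPartLister.ListEmbeddings with the path to Document A should return the renamed part, not Document B's original file name.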

    I've gone through all the folders and XML files in the extracted folder, but I cannot find an XML file that contains Document B's name. It seems that Word discards some information about the embedded document.
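
    The name aside, the position of each embedded object may still be recoverable from word/document.xml. The sketch below is an untested idea, not a confirmed solution. It assumes a Transitional-format .docx in which each embedded object appears as an o:OLEObject element whose r:id attribute points into word/_rels/document.xml.rels, and it only looks at top-level body paragraphs (not tables, headers or footers). EmbeddedObjectInfo and EmbeddedObjectLocator are names invented here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public sealed class EmbeddedObjectInfo   // hypothetical result type
{
    public int ParagraphIndex;     // zero-based index among top-level body paragraphs
    public string PrecedingText;   // text of the paragraph just before, "" if none
    public string ProgId;          // ProgID attribute, may be null
    public string Target;          // relationship target inside the package
}

public static class EmbeddedObjectLocator   // hypothetical helper
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace O = "urn:schemas-microsoft-com:office:office";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace Rel = "http://schemas.openxmlformats.org/package/2006/relationships";

    static XDocument LoadXml(ZipArchive archive, string entryName)
    {
        ZipArchiveEntry entry = archive.GetEntry(entryName);
        if (entry == null) throw new FileNotFoundException("Missing package part", entryName);
        using (Stream stream = entry.Open()) return XDocument.Load(stream);
    }

    // Concatenates the text runs of one paragraph.
    static string TextOf(XElement paragraph)
    {
        return string.Concat(paragraph.Descendants(W + "t").Select(t => t.Value));
    }

    public static List<EmbeddedObjectInfo> Locate(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            // Relationship id -> target, e.g. "embeddings/Microsoft_Word_Document1.docx".
            Dictionary<string, string> targets =
                LoadXml(archive, "word/_rels/document.xml.rels")
                    .Root.Elements(Rel + "Relationship")
                    .ToDictionary(e => (string)e.Attribute("Id"),
                                  e => (string)e.Attribute("Target"));

            List<XElement> paragraphs = LoadXml(archive, "word/document.xml")
                .Root.Element(W + "body").Elements(W + "p").ToList();

            var results = new List<EmbeddedObjectInfo>();
            for (int i = 0; i < paragraphs.Count; i++)
            {
                foreach (XElement ole in paragraphs[i].Descendants(O + "OLEObject"))
                {
                    string relId = (string)ole.Attribute(R + "id");
                    string target;
                    if (relId == null || !targets.TryGetValue(relId, out target)) continue;
                    results.Add(new EmbeddedObjectInfo
                    {
                        ParagraphIndex = i,
                        PrecedingText = i > 0 ? TextOf(paragraphs[i - 1]) : "",
                        ProgId = (string)ole.Attribute("ProgID"),
                        Target = target
                    });
                }
            }
            return results;
        }
    }
}
```

    Each result pairs an embedded part with the paragraph that contains its object and the text of the paragraph before it, which might be enough to decide which paragraph the embedded document belongs to.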

    Hope it helps. 

    Thursday, March 14, 2013 5:13 AM
  • Hi Hi-Tek

    Can you explain exactly how these documents have been embedded into the "main" document, please?

    Cindy Meister, VSTO/Word MVP, my blog

    Friday, March 15, 2013 5:30 PM