Extract document title of embedded object manually

  • Question

  • Hello everybody,

    I am looking for a way to extract embedded objects manually from an Open XML file, which is not a big deal, but I cannot find the document titles anywhere in the file.

    In this case I have a docx file with about 40 embedded files. There are Word, Excel and PDF files. I unzipped the docx file and I am able to find all embedded documents. But they all have generic names like "Microsoft_Excel_Worksheet1.xlsx", "Microsoft_Excel_Worksheet4.xlsx", "Microsoft_Word_Document2.docx", and so on...
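    The unzip-and-look step described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive tool: the function name `list_embedded_parts` is hypothetical, and it assumes the embedded files sit under `word/embeddings/` in the package, which is where Word normally places them.

```python
import zipfile


def list_embedded_parts(docx_path):
    """Return (part name, size in bytes) for every part under word/embeddings/.

    Assumption: the package follows Word's usual layout, where embedded
    documents are stored in the word/embeddings/ folder of the ZIP container.
    """
    with zipfile.ZipFile(docx_path) as package:
        return [
            (info.filename, info.file_size)
            for info in package.infolist()
            if info.filename.startswith("word/embeddings/")
            and not info.is_dir()
        ]
```

    Calling `list_embedded_parts("report.docx")` (the file name is only an example) would return the generic part names such as `word/embeddings/Microsoft_Excel_Worksheet1.xlsx` together with their sizes.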

    I did a full-text search for the document titles that I can see when I open the docx file with Word. I also looked at the metadata of the unpacked documents and tried a delta comparison after changing a document's title. Now I am stuck. Does anyone have an idea?
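    A full-text search of this kind could look roughly like the following sketch. The function name `find_text_in_package` is hypothetical. It searches the raw bytes of every part for the title encoded as UTF-8 and as UTF-16LE (binary containers often store strings in UTF-16). One limitation to keep in mind: embedded docx/xlsx parts are themselves compressed ZIP files, so their inner content is not visible to a raw byte search unless they are unpacked as well.

```python
import zipfile


def find_text_in_package(docx_path, text):
    """Return names of parts whose raw bytes contain `text`.

    The text is searched in two encodings: UTF-8 (used by the XML parts)
    and UTF-16LE (an assumption about how binary parts may store strings).
    """
    needles = {text.encode("utf-8"), text.encode("utf-16-le")}
    hits = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            data = package.read(name)  # decompressed bytes of this part
            if any(needle in data for needle in needles):
                hits.append(name)
    return hits
```

    An empty result for a title that is visible in Word would match the situation described above: the title is not stored as plain text in any directly readable part.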

    Thanks for reading,
    Stefan

    • Edited by S. Linke Thursday, March 29, 2018 9:35 AM Format
    Thursday, March 29, 2018 9:34 AM

Answers

  • Hi S. Linke,

    I searched and found that the document is embedded as a binary OLE object, even though it is an Open XML file. This is because embedding is an OLE mechanism, and OLE is binary. The caption is not stored anywhere in the file in plain text, so Word must be storing it in that binary data.

    So currently it does not appear to be possible to retrieve the actual file name of an embedded object from a Word document.
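    To check which embedded parts are actually binary OLE containers, one option is to inspect the first bytes of each part. The sketch below is only an illustration: the function name `classify_embedded_parts` is hypothetical, and it assumes the parts live under `word/embeddings/`. It relies on two well-known signatures: OLE compound files start with `D0 CF 11 E0 A1 B1 1A E1`, and ZIP-based Open XML packages start with `PK\x03\x04`.

```python
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header


def classify_embedded_parts(docx_path):
    """Map each part under word/embeddings/ to a container type.

    Returns a dict of part name -> "ole-compound-file", "zip-package",
    or "unknown", based only on the leading bytes of the part.
    """
    result = {}
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            if not name.startswith("word/embeddings/") or name.endswith("/"):
                continue
            with package.open(name) as part:
                head = part.read(8)  # enough for both signatures
            if head.startswith(OLE_MAGIC):
                kind = "ole-compound-file"
            elif head.startswith(ZIP_MAGIC):
                kind = "zip-package"
            else:
                kind = "unknown"
            result[name] = kind
    return result
```

    Parts reported as `ole-compound-file` are the ones that would need a binary OLE parser to look for any stored name or caption. This sketch does not attempt that step.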

    Reference:

    How to get the original name of embedded File from package word with c#

    Embedding Documents/Object in Word by Using the Open XML SDK

    Regards

    Deepak



    • Marked as answer by S. Linke Tuesday, April 24, 2018 8:33 AM
    Friday, March 30, 2018 5:25 AM
    Moderator