API for iterating through all types of items in a Word document

  • Question

  • I am working on an application for my company where we need to loop through all the objects in a document, including tables and paragraphs, preferably in document order, so that for a document laid out like:

    Here is a paragraph

    Here is a table

    Here is another paragraph

    I will first encounter "Here is a paragraph", then the table, then "Here is another paragraph". I cannot find any API within the Microsoft.Office.Interop.Word namespace that would give me this.

    So far it seems like I need to loop through either the paragraphs or the tables. I have thought about looping through the tables, getting all of the ranges inside the tables, then looping through the paragraphs and checking if the paragraph range is inside the table range, but that takes way too much time, especially since I do not see a way to create a hash structure for the ranges to simplify the lookup.

    Is what I want impossible?


    I figured this out (or at least found a workable solution): each Paragraph has a Range, and from a Range you can access the "Tables" property to check whether you are currently in a table. You can also access the "Cells" property to get the current cell information. This is a little clunky, but I think this strategy might work unless someone can offer me a better one.
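
    A minimal sketch of that strategy, written in Python for brevity; the property names (Paragraphs, Range, Tables, Count, Text) are the same ones exposed by Microsoft.Office.Interop.Word. It assumes `document` is a Word Document object (or anything with the same shape), the helper name `iter_items` is hypothetical, and consecutive in-table paragraphs are assumed to belong to the same table.

```python
from itertools import groupby


def iter_items(document):
    """Yield ("paragraph", text) or ("table", [texts]) in document order."""

    def in_table(paragraph):
        # Range.Tables is non-empty when the paragraph sits inside a table.
        return paragraph.Range.Tables.Count > 0

    # Group runs of consecutive in-table paragraphs into a single table item.
    # Assumption: two tables are always separated by a body paragraph.
    for is_table, group in groupby(document.Paragraphs, key=in_table):
        texts = [paragraph.Range.Text for paragraph in group]
        if is_table:
            yield ("table", texts)
        else:
            for text in texts:
                yield ("paragraph", text)
```

    Each table item carries the text of every paragraph found inside that table, in the order the loop reaches them. If per-cell detail is needed, the "Cells" property of each paragraph's Range could be read inside the same loop.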

    Monday, May 9, 2016 1:57 PM


  • Because tables exist in the same 'layer' as the text of whatever Storyrange they're in, looping through all paragraphs is probably the simplest way of returning their content in terms of their relative position in the document. If you're looping through all paragraphs, when the loop hits a table, it will process all paragraphs within each cell in turn, with cells processed across then down. However, tables aren't the only elements you need to be concerned with in a given Storyrange. Your Storyrange may also include InlineShapes, Shapes and Frames, any of which may have text. InlineShapes may exist in any paragraph (including in a table) and Shapes and Frames can be anchored to any paragraph. The processing of Shapes and Frames is not so straightforward, as their physical location on the page is independent of what they're anchored to - they 'float' on the page and can be formatted so that:
    • they're behind the text;
    • they're in front of the text;
    • the text wraps all around them; or
    • the text wraps above & below them.
    In a multi-column page layout, they might even be positioned so as to span the column boundaries. Accordingly, to handle them, you'd probably need to get the paragraph data for the basic Storyrange, then loop through the Shapes and Frames to determine where in the basic Storyrange paragraph data you'd want to insert the Shape and Frame data.
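
    One possible way to do that merge, sketched in Python over plain data rather than live Word objects. It assumes each paragraph has already been reduced to the Start offset of its Range plus its text, and each floating item to the Start offset of its anchor range (for a Shape, its Anchor property) plus its text. The name `merge_by_anchor` and the tuple layouts are hypothetical. Placing a floating item directly after its anchor paragraph is only one policy; ordering by physical position on the page would need different input.

```python
from bisect import bisect_right


def merge_by_anchor(paragraphs, floating):
    """Interleave floating items with paragraphs by anchor offset.

    paragraphs: list of (start_offset, text), sorted by start_offset.
    floating:   list of (anchor_offset, text) for Shapes and Frames.
    Returns a flat list of (kind, text) tuples.
    """
    if not paragraphs:
        return [("floating", text) for _, text in floating]

    starts = [start for start, _ in paragraphs]
    attached = [[] for _ in paragraphs]
    for anchor, text in floating:
        # Index of the last paragraph that starts at or before the anchor.
        index = max(bisect_right(starts, anchor) - 1, 0)
        attached[index].append(text)

    merged = []
    for (_, text), extras in zip(paragraphs, attached):
        merged.append(("paragraph", text))
        merged.extend(("floating", extra) for extra in extras)
    return merged
```

    The binary search keeps each lookup logarithmic in the number of paragraphs, so the paragraph data is scanned once and every Shape or Frame is slotted in without rescanning the document.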

    And then there are elements like Comments, which may be attached to some text but don't have any particular physical location in the document.

    Paul Edstein
    [MS MVP - Word]

    Monday, May 9, 2016 11:43 PM