Reading values of content controls via the SDK?

  • Question

  • We've got a Word 2007 document with a variety of Rich Text content controls, inserted from the Developer tab.

    Is there a way with the Open XML SDK to iterate through these and read their contents? Ideally I'd like to be able to determine their title and tag as well.

    Many thanks for both the API and the attention,
    -SP
    Monday, August 4, 2008 9:31 PM

Answers

  • I don't think the SDK has any classes for enumerating content controls, but you can easily get all the values with a simple XPath query.

    A content control is stored internally as a <w:sdt> element, and its content is held in the child <w:sdtContent> element. To read it:

    1. Open the document.

    2. Get the stream and load it into an XmlDocument.

    3. Retrieve an XPathNodeIterator by executing an XPath expression like:

    Code Snippet
    .//w:sdt/w:sdtContent//w:t

    This will give you a collection of XPathNavigators, one for each w:t text element inside the content controls.
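
    The three steps above can be sketched as follows. This is a minimal sketch in standard-library Python rather than the .NET XmlDocument and XPathNodeIterator classes mentioned in the answer, and it assumes the document body lives in the usual word/document.xml part. The function name content_control_texts is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix used in the XPath.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def content_control_texts(docx):
    """Return the text of each content control in a .docx (path or file object).

    Hypothetical helper, not part of the Open XML SDK.
    """
    # Step 1: a .docx is a ZIP package. Assumption: the body is in
    # word/document.xml, which is the conventional part name.
    with zipfile.ZipFile(docx) as package:
        xml_bytes = package.read("word/document.xml")

    # Step 2: load the part into an XML tree.
    root = ET.fromstring(xml_bytes)

    # Step 3: for every w:sdt, collect the w:t elements found at any depth
    # under its w:sdtContent and join their text.
    texts = []
    for sdt in root.iterfind(".//w:sdt", NS):
        runs = sdt.findall("w:sdtContent//w:t", NS)
        texts.append("".join(t.text or "" for t in runs))
    return texts
```

    Called with a path or an open file object, it returns one string per content control, in document order.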

     

    NOTE: Retrieving an image or any other embedded part is a different case, since those are generally stored as separate parts referenced through relationships rather than as plain text or Base64 streams.
    Tuesday, August 5, 2008 11:32 AM

All replies

  • Thanks niksac - I appreciate the fast response.

    It's too bad that I haven't found a good way to work with the document's content directly through the SDK, but such is life. Besides, an XML-only interface is still a zillion times better than the alternatives of the past...

    If anyone else is trying to do what I was and wants to pick out only the content controls with a given tag name, you'll need an XPath with nested predicates like this one:

    Code Snippet

    .//w:sdt[w:sdtPr/w:tag[@w:val='YOURTAGNAMEHERE']]/w:sdtContent


    and if you want just the text, you'll probably need

    Code Snippet

    .//w:sdt[w:sdtPr/w:tag[@w:val='YOURTAGNAMEHERE']]/w:sdtContent/w:p/w:r/w:t


    Note that you can match on the "Title" instead of the "Tag" by replacing "w:tag" with "w:alias" in these expressions.
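
    For the original question of reading each control's title and tag along with its text, here is a hedged sketch in standard-library Python. ElementTree supports only a subset of XPath and, as far as I know, not the nested predicates shown above, so the tag filter is applied in Python instead. The names read_content_controls and _prop are made up, and the part name word/document.xml is assumed.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}
# Attributes are namespace-qualified, so w:val is looked up as "{namespace}val".
VAL = "{%s}val" % W_NS


def _prop(sdt, name):
    """Return the w:val of w:sdtPr/w:<name> (e.g. alias or tag), or None."""
    el = sdt.find(f"w:sdtPr/w:{name}", NS)
    return el.get(VAL) if el is not None else None


def read_content_controls(docx, tag=None):
    """List title, tag and text of each content control (hypothetical helper).

    If tag is given, only controls whose w:tag value equals it are returned.
    """
    # Assumption: the body is in the conventional word/document.xml part.
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    controls = []
    for sdt in root.iterfind(".//w:sdt", NS):
        sdt_tag = _prop(sdt, "tag")
        if tag is not None and sdt_tag != tag:
            continue  # same effect as the [w:sdtPr/w:tag[@w:val='...']] predicate
        # Use the descendant axis so both block-level (w:p/w:r/w:t) and
        # inline (w:r/w:t) controls are covered.
        text = "".join(t.text or "" for t in sdt.findall("w:sdtContent//w:t", NS))
        controls.append({"title": _prop(sdt, "alias"), "tag": sdt_tag, "text": text})
    return controls
```

    For example, read_content_controls("report.docx", tag="YOURTAGNAMEHERE") would return only the controls carrying that tag; "report.docx" is a placeholder file name.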
    -SP
    Tuesday, August 5, 2008 5:04 PM