Need advice on best approach for automating Word 2010 in C#

  • Question

  • I need to automate the insertion and population of tables in an existing Word 2010 document, preferably using VS2010 C#.

    The existing Word "template" document will have placeholders to identify where the various tables are to be inserted. The output of the automation will be a customized Word document with all its tabular data embedded (i.e., not a link to a data source). The Word document will then likely be converted and distributed as a PDF document.

    I've found several good articles that describe the insertion of empty tables into a new or existing Word document using the PIAs.

    But what I don't understand are my options for automating the retrieval of the data displayed in those tables.  For example, a table may need to include data from an Access table by running a query to sort and  summarize the data, with the results then being embedded in the table (not linked).

    I found an article in the Office Dev Center titled Automating Word Tables for Data Insertion and Extraction, but it's really old (Office 2003) and it's not clear which automation technology it's using, or if whatever it's using is still the recommended technology.  Is this an alternative I should consider (and is there an up-to-date version of that article)?

    Thanks for any guidance!

    DadCat

     

    Tuesday, August 14, 2012 6:54 PM

Answers

  • Microsoft's been good at only changing the PIA contents when something changes in the version of Word. I think the commands for accessing table contents in Word 2003 are the same as 2007, 2010 and 2013.

    However, if you don't need the user's involvement and only need to work with documents in the .docx format, I recommend using the Open XML SDK. Both it and Interop can be used with C#, but Open XML doesn't work via Word. It works directly with the file, and as such is much quicker.

    Open XML forums (the sticky threads have good links to get started).
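
    A minimal sketch of the Open XML approach, assuming the Open XML SDK (the DocumentFormat.OpenXml assembly) is referenced and that each placeholder sits alone in its own paragraph. The names TableInserter, BuildTable and ReplacePlaceholder are hypothetical, and formatting such as borders is left out:

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class TableInserter
{
    // Builds an unformatted table with one TableRow per inner array.
    public static Table BuildTable(string[][] rows)
    {
        var table = new Table(new TableProperties(
            new TableWidth { Width = "0", Type = TableWidthUnitValues.Auto }));

        foreach (string[] row in rows)
        {
            var tableRow = new TableRow();
            foreach (string cellText in row)
            {
                // Every cell must contain at least one paragraph.
                tableRow.Append(new TableCell(
                    new Paragraph(new Run(new Text(cellText)))));
            }
            table.Append(tableRow);
        }
        return table;
    }

    // Swaps the first paragraph whose text equals the placeholder for the table.
    // Returns false if no such paragraph exists.
    public static bool ReplacePlaceholder(string path, string placeholder, string[][] rows)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            Paragraph target = body.Descendants<Paragraph>()
                .FirstOrDefault(p => p.InnerText.Trim() == placeholder);
            if (target == null)
            {
                return false;
            }

            target.InsertAfterSelf(BuildTable(rows));
            target.Remove();
            doc.MainDocumentPart.Document.Save();
            return true;
        }
    }
}
```

    Matching on the paragraph's InnerText sidesteps the case where Word has split the placeholder text across several runs. The rows could come from any data source; the table ends up embedded in the file, not linked.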

    Edit: I had a glance at "Automating Word Tables for Data Insertion and Extraction". I think it will probably work in other versions of Word, but it's not very readable, is it! If I was going to use Interop, I would use the table.Cell method, and go in a row/column loop extracting each cell in turn.
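
    A rough sketch of that row/column loop, assuming a reference to the Word PIA (Microsoft.Office.Interop.Word) and a uniform table with no merged or split cells. TableReader, ReadCells and StripCellMarker are hypothetical names:

```csharp
using Word = Microsoft.Office.Interop.Word;

static class TableReader
{
    // A cell's Range.Text ends with an end-of-cell marker (CR + BEL); remove it.
    public static string StripCellMarker(string cellText)
    {
        return cellText.TrimEnd('\r', '\a');
    }

    // Reads every cell into a 2-D array by walking rows and columns.
    public static string[,] ReadCells(Word.Table table)
    {
        int rowCount = table.Rows.Count;
        int columnCount = table.Columns.Count;
        var values = new string[rowCount, columnCount];

        // Word collections are 1-based; the array is 0-based.
        for (int row = 1; row <= rowCount; row++)
        {
            for (int column = 1; column <= columnCount; column++)
            {
                string text = table.Cell(row, column).Range.Text;
                values[row - 1, column - 1] = StripCellMarker(text);
            }
        }
        return values;
    }
}
```

    Each Cell call is a separate COM round trip to Word, so this is only comfortable for small tables.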
    • Edited by JosephFox Tuesday, August 14, 2012 7:12 PM
    • Marked as answer by DadCat Tuesday, August 14, 2012 9:28 PM
    Tuesday, August 14, 2012 7:09 PM

All replies

  • Thank you!

    That was exactly the type of advice I needed. I knew the Open XML SDK existed, but didn't know if it was a fit for what I was trying to do. I'll take a look at it.

    DC

    Tuesday, August 14, 2012 9:28 PM
  • <<Edit: I had a glance at "Automating Word Tables for Data Insertion and Extraction". I think it will probably work in other versions of Word, but it's not very readable, is it! If I was going to use Interop, I would use the table.Cell method, and go in a row/column loop extracting each cell in turn. >>

    <sigh> I wish I'd had you around when I was writing that, ten years ago... Writing one article to cover multiple programming languages/environments and various aspects of a topic is always a challenge!

    FWIW your Open XML SDK suggestion is the best way to go.

    If that's not feasible, then "walking" table cells is a very bad idea if the table is going to be larger than a few cells: it will be slow, slow enough that for a table of any appreciable size Word will appear to crash. So, while "simple and straightforward" may be tempting, it's not going to scale well in this scenario.

    The core of the section "Populating Word Tables with Data" is what's relevant in the article and, in a nutshell, what it says is: Put the data in a delimited string format. It doesn't matter what field delimiter you use, as long as it's not a character in the data. You must use ANSI 13 ("\r") as the record delimiter. Assign this string to a Range object where the table should appear (Range.Text = string). Now use the ConvertToTable method on the Range object. Voilà! Instant (unformatted) table.
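
    In C# 4 (VS2010) those steps might look something like the sketch below. It assumes a reference to the Word PIA and a Range already positioned at the placeholder; TableWriter, BuildDelimitedText and InsertTable are hypothetical names, and the tab delimiter is just a default:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Word = Microsoft.Office.Interop.Word;

static class TableWriter
{
    // Joins fields with the delimiter and records with "\r" (ANSI 13).
    // Throws if any field contains the delimiter or a carriage return.
    public static string BuildDelimitedText(IEnumerable<string[]> rows, char fieldDelimiter)
    {
        List<string[]> rowList = rows.ToList();
        bool clash = rowList.Any(fields => fields.Any(
            field => field.IndexOf(fieldDelimiter) >= 0 || field.IndexOf('\r') >= 0));
        if (clash)
        {
            throw new ArgumentException("A field contains the delimiter or a carriage return.");
        }

        string separator = fieldDelimiter.ToString();
        return string.Join("\r", rowList.Select(fields => string.Join(separator, fields)));
    }

    // Writes the delimited text into the range, then converts it to a table.
    public static Word.Table InsertTable(Word.Range range, IEnumerable<string[]> rows,
                                         char fieldDelimiter = '\t')
    {
        range.Text = BuildDelimitedText(rows, fieldDelimiter);

        // C# 4 lets the remaining optional COM parameters be omitted.
        object separator = fieldDelimiter.ToString();
        return range.ConvertToTable(Separator: separator);
    }
}
```

    The whole table is created with two calls into Word, however many rows there are, which is where the speed advantage over cell-by-cell population comes from.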


    Cindy Meister, VSTO/Word MVP

    Sunday, August 19, 2012 3:02 PM
    Moderator
  • *cough* Didn't notice you were the author, Cindy. I've just had a closer look; the other code examples are great, and so is the written text. Just on the Extracting Data bit, I said 'not very readable' partly because of the lack of blank lines, partly because my personal preference is for variable names that are longer and more descriptive, and partly because VB isn't my first language. So, as much my issue as yours. And like you say, it was ten years ago.

    I second your assertion that walking table cells is only appropriate for small-scale table extracting.

    Sunday, August 19, 2012 3:49 PM
  • I'm not sure whether I should emote a giggle or a guffaw or a grin <g>

    Now I understand. I was running over the max allowed size with that last bit, so understanding it depends on having read the "Populating" section, first. Just not enough available space (or energy) to repeat all the information again.

    The blank lines bit we can blame on the MSDN editors at the time :-)


    Cindy Meister, VSTO/Word MVP

    Sunday, August 19, 2012 4:49 PM
    Moderator