Word Interop SLOW tables... only option is stringtotable?

  • Question

  • I've been hacking away on my C# .NET 2.0 application that uses Word 2007. Just as I was feeling happy about finishing the report I'm generating in Word, I found an error in the code that kept some of the rows from being returned from the database and put into the Word document. Think I was happy once they all were...?

    What happened is the report took up to 5 minutes to generate, and I'm on a fairly fast computer. Creating new table rows through the API obviously doesn't scale. I need to rewrite it all to make it go faster. I found an article by Cindy Meister that I followed. It was written for 2003, and unfortunately none of the suggestions improved my performance significantly. I was still looking at upwards of 5 minutes, and Word just keeps eating more and more memory. I'd hoped for a way to just keep pushing in the objects and prevent it from redrawing. Hiding the window didn't seem to speed anything up. I went with the Print view, no pagination and screen updating set to false, and it's not making a difference.

    This leaves me with just one option ... redo everything to strings, and convert the string to table. I've done it for 1 table, and it did speed things up. But, God ... I've got so many tables I need to do this for. I need to ask: is this really the only option left for me or would I be wasting my time again?

    And if it is the last option, will it slow to a halt again when I'm applying formatting and styles to my now styleless table that was generated? Is it even possible to generate nested tables with this? How do I represent the tab character in a string that's passed as a field value in a tab-delimited "row"?

    And is this the only way...?

    It's not every day I get truly upset when programming, but this has done it to me.

    (A very sad) Tom

    Wednesday, December 8, 2010 11:05 PM

Answers

  • Hi Tom

    Mmm, the information in that article still holds true. I'm afraid you'll need to re-write your code, although not necessarily to create character-delimited strings. Some observations:

    1. The longer the table, the slower things will get. As mentioned in the article, this mainly has to do with Word having to recalculate layout and page breaks.

    2. Turning off the automatic adjustment of cell widths by creating a table the way Word 97 did is the only thing you can do to speed this up.

    3. Applying formatting after-the-fact will slow things down, but less so. One thing you can do to help with this is to create a Table Style in the document and set it as the default.

    4. Your best bet would probably be to generate the entire report outside the Word application UI. Word 2003 has its own XML vocabulary, WordProcessingML. With that, a Word document can be created from scratch, using standard XML tools. Since the advent of Office 2007 with the Open XML file format there's also an Open XML SDK and the Open XML SDK forum on MSDN. The Compatibility Pack for Office versions back to 2000 will allow these versions to open an Open XML file for editing. So you'd have the choice of generating your reports in WordProcessingML or the Open XML file format, but there's certainly more support for the latter. These file formats were created specifically for the kind of thing you're doing, and for that reason I doubt you'll ever see much improvement in this aspect of the object model.
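
    Observation 2 could look roughly like this through the interop assemblies. This is only a sketch, assuming a reference to Microsoft.Office.Interop.Word; the class and method names (FixedWidthTables, AddFixedTable) are made up for illustration and are not from the article.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class FixedWidthTables
{
    // Hypothetical helper: creates a table with the Word 97 ("Word 8")
    // table behavior and fixed column widths, so Word does not re-fit
    // the columns every time a cell is filled.
    public static Word.Table AddFixedTable(Word.Document doc, Word.Range where,
                                           int rows, int columns)
    {
        object behavior = Word.WdDefaultTableBehavior.wdWord8TableBehavior;
        object autoFit = Word.WdAutoFitBehavior.wdAutoFitFixed;

        Word.Table table = doc.Tables.Add(where, rows, columns,
                                          ref behavior, ref autoFit);
        table.AllowAutoFit = false; // keep automatic resizing switched off
        return table;
    }
}
```

    Creating the table with its final row count up front, rather than adding rows one at a time, is a design choice of this sketch and not something the thread measured.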

    If you want the efficiency improvements that processing your data into table format as XML can bring, but still want to use the Word automation interface, this is possible. You can generate the table as valid WordProcessingML and "stream" the data into the document using the Range.InsertXML method.
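
    As a rough illustration of that approach, the sketch below builds a bare Word 2003 XML document holding one unformatted table. The class name WordMlTable and the use of XmlWriter are assumptions for the example, and Word may expect more elements (table properties, a grid) than shown here, so compare the output with a file saved from Word before relying on it.

```csharp
using System.Text;
using System.Xml;

public static class WordMlTable
{
    // Namespace of the Word 2003 XML vocabulary (WordProcessingML).
    private const string W = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Hypothetical helper: one w:tr per row, one w:tc per cell value.
    public static string Build(string[][] rows)
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter xw = XmlWriter.Create(sb, settings))
        {
            xw.WriteStartElement("w", "wordDocument", W);
            xw.WriteStartElement("w", "body", W);
            xw.WriteStartElement("w", "tbl", W);
            foreach (string[] row in rows)
            {
                xw.WriteStartElement("w", "tr", W);
                foreach (string cell in row)
                {
                    xw.WriteStartElement("w", "tc", W);
                    xw.WriteStartElement("w", "p", W);
                    xw.WriteStartElement("w", "r", W);
                    // The writer escapes <, > and & in the cell text.
                    xw.WriteElementString("w", "t", W, cell);
                    xw.WriteEndElement(); // w:r
                    xw.WriteEndElement(); // w:p
                    xw.WriteEndElement(); // w:tc
                }
                xw.WriteEndElement(); // w:tr
            }
            xw.WriteEndElement(); // w:tbl
            xw.WriteEndElement(); // w:body
            xw.WriteEndElement(); // w:wordDocument
        }
        return sb.ToString();
    }
}
```

    The returned string would then go to the automation interface with something like `object missing = System.Type.Missing; range.InsertXML(xml, ref missing);`, where `range` is whatever Word.Range marks the insertion point. Because each cell is its own element, there is no delimiter character to worry about inside the cell values.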


    Cindy Meister, VSTO/Word MVP
    • Marked as answer by TomABC Thursday, December 9, 2010 5:25 PM
    Thursday, December 9, 2010 10:18 AM
    Moderator

All replies

  • Just wow ... I'm on .NET 2.0 so I went with WordProcessingML as OpenXML SDK required .NET 3.5, and I haven't styled the tables at all yet, but I'm generating them in 6 seconds now. This is what took several minutes before. I simply write the ML to a StreamWriter and then read it using InsertXML. I'm going to have to learn how to style the tables, and this should bloat the files, but I'm certain it'll not ever take 5 minutes to generate one of these reports again. The ML was pretty easy to get used to having gotten familiar with the Word object model and knowing good old HTML tables.

    Thank you so much!

    Tom

    Thursday, December 9, 2010 5:25 PM
  • Hi Tom

    Glad that approach worked for you :-)

    Tip for the "styling": Create a small table in the Word application interface (as an end-user) and apply the formatting you want. Save the file as "Word XML" rather than "Word Document".

    Now open the file in an XML editor. (You can also look at it in Word, if you prefer. Just activate the "Confirm conversions on open" option in Tools/Options/General, then choose to view the file as "Text".)

    This should give you the WordProcessingML syntax you'll need for your code.
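
    To give an idea of what that saved markup tends to contain, here is a sketch that writes table properties with single-line borders. It assumes the table is being written with an XmlWriter that is positioned just inside the w:tbl element; the class and method names are hypothetical, and the element and attribute names should be checked against the file Word actually saves.

```csharp
using System.Xml;

public static class WordMlTableFormatting
{
    // Namespace of the Word 2003 XML vocabulary (WordProcessingML).
    private const string W = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Hypothetical helper: writes <w:tblPr><w:tblBorders>...</w:tblBorders></w:tblPr>
    // with a thin single line on the outer edges and between cells.
    // Intended to be called before the first w:tr of the table is written.
    public static void WriteGridBorders(XmlWriter xw)
    {
        string[] edges = { "top", "left", "bottom", "right", "insideH", "insideV" };

        xw.WriteStartElement("w", "tblPr", W);
        xw.WriteStartElement("w", "tblBorders", W);
        foreach (string edge in edges)
        {
            xw.WriteStartElement("w", edge, W);
            xw.WriteAttributeString("w", "val", W, "single");
            xw.WriteAttributeString("w", "sz", W, "4"); // width in eighths of a point
            xw.WriteAttributeString("w", "space", W, "0");
            xw.WriteAttributeString("w", "color", W, "auto");
            xw.WriteEndElement();
        }
        xw.WriteEndElement(); // w:tblBorders
        xw.WriteEndElement(); // w:tblPr
    }
}
```

    Writing the formatting into the XML means the table arrives in the document already formatted, instead of being formatted cell by cell through the object model afterwards.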

    <<I'm going to have to learn how to style the tables, and this should bloat the files>>

    Actually, they won't be any more "bloated" than if you'd formatted the tables as an end-user. When you insert the XML, Word will convert it to its binary file format automatically.


    Cindy Meister, VSTO/Word MVP
    Friday, December 10, 2010 9:13 AM
    Moderator