How much faster is it to do things through a Word AddIn?

  • Question

  • I'm wondering how much more efficient it would be to execute COM calls from within an AddIn, as opposed to from an external application.

    I'm writing transcription software. Typists may insert text by using hotkeys. Currently this is handled by the main executable. When the message is received the following takes place (each taking approximately one COM call):

    #  The Word Application's selection is retrieved

    #  The selection's duplicate range is retrieved

    #  The range's start property is retrieved

    #  The range's start property is set to be what it was before minus 10 characters

    #  The range's text property is retrieved

    #  The retrieved text is evaluated to determine whether a capital letter is necessary, and preceding spaces, etc. New text is inserted.

    So, about six inter-process calls. I'm wondering whether it would be worth having the main executable send one COM call to the AddIn, and then have the AddIn work through the above process. On my average computer (4GB RAM, dual core processor) there's only a very slight delay, almost imperceptible, between key press and text insertion, maybe a third of a second. But it all makes a difference when you're typing 100 words per minute, as some typists do. Roughly how much faster would it be? Could it be twice as fast? It would be unreasonable to ask for precise estimates, but I'm interested in your guesses.

    Both the main executable and AddIn are in unmanaged C++.
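
    The evaluation step in the list above (deciding on a capital letter and a preceding space from the text just before the insertion point) does not itself need any COM calls. A minimal sketch of that step, assuming plain narrow strings and a hypothetical helper name AdaptInsertion; the actual rules in the transcription software are not given in this thread:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): adapts `insertion` to the
// text that precedes the insertion point, e.g. the last 10 characters read
// from the range.
std::string AdaptInsertion(const std::string& preceding, std::string insertion) {
    if (insertion.empty()) return insertion;

    // Find the last non-whitespace character before the insertion point.
    const std::size_t i = preceding.find_last_not_of(" \t\r\n");
    const bool atStart = (i == std::string::npos);
    const char last = atStart ? '\0' : preceding[i];

    // Assumed rule: capitalise at the start of the text or after . ! ?
    if (atStart || last == '.' || last == '!' || last == '?') {
        insertion[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(insertion[0])));
    }

    // Assumed rule: add a separating space unless the preceding text is
    // empty or already ends in whitespace.
    if (!preceding.empty() &&
        !std::isspace(static_cast<unsigned char>(preceding.back()))) {
        insertion.insert(insertion.begin(), ' ');
    }
    return insertion;
}
```

    With these assumed rules, AdaptInsertion("the end. ", "patient") gives "Patient" and AdaptInsertion("the", "patient") gives " patient". Only the calls that read the preceding text and insert the result cross the process boundary.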

    Tuesday, June 26, 2012 1:58 PM

Answers

  • Alright. Both AddIn and executable (a console executable) were built with Visual Studio 2010, release mode, and run with no debugger attached. As mentioned previously, both AddIn and executable were programmed in unmanaged C++. I imported the type libraries, meaning vftables were used in calling methods/property gets/property puts; this is the way the PIAs usually do it, and the most efficient way of using the Word object model.

    I ran the tests on what I consider an average computer: a two-year-old laptop, Pentium T4300 (dual core, 64 bit), 4GB RAM, Windows 7, Word 2007.

    I left programs running in the background. I had about 6 web pages open, a filesharing program running, several Explorer windows open, and one instance of Visual Studio open (even though, as stated, I was not running the applications out of Visual Studio). Not good for a scientist, but I'm interested in how things do in the real world. I went back and forth between AddIn/executable, pretty much randomly, rather than doing all AddIn tests at once. There was just one document open in all cases.

    I measured time using the system clock, obtained via the timeGetTime function, which MSDN indicates is accurate to approximately 5ms. I obtained a start time after getting a pointer to the Word application. Between the start time and the finish time, I ran only the property get/put/method, not even saving the returned HRESULT. All variables were initialised to NULL before getting the start time. After getting the finish time I verified that the calls had been successful by checking that the objects were not NULL. So, the first and most basic test looked like this:

        wchar_t buffer[500];
        Word::Selection *pSelection = NULL;
        DWORD startTime = 0;
        DWORD finishTime = 0;
        startTime = timeGetTime();
        pApp->get_Selection(&pSelection);  // the only call inside the timed region
        finishTime = timeGetTime();
        _itow_s(finishTime - startTime, &buffer[0], 500, 10);
        MessageBox(0, buffer, 0, 0);  // Display time
        if (pSelection != NULL)  // verify the call succeeded
            MessageBox(0, L"selection not null!", 0, 0);

    *sigh* the bleedin' code block display has stolen my blank lines again.
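
    A single call can finish in less time than a timer with roughly 5ms accuracy can resolve. One common workaround is to time many repetitions and divide. The sketch below uses std::chrono::steady_clock rather than timeGetTime so that it is portable; that is a swapped-in technique, not what was used for the tests here, and MeanMicrosPerCall is a hypothetical helper name:

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `call` `iterations` times and returns the mean
// duration of one call in microseconds.
double MeanMicrosPerCall(const std::function<void()>& call, int iterations) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        call();  // the operation under test, e.g. one property get
    }
    const auto finish = Clock::now();
    const std::chrono::duration<double, std::micro> total = finish - start;
    return total.count() / iterations;
}
```

    In a real test the callable would wrap something like pApp->get_Selection(&pSelection) and release the returned interface pointer on each iteration. The release adds a little work to every iteration, so the result is an upper bound on the cost of the call itself.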

    Anyway, the results:

    Action | COM calls | In process | Out of process
    ------ | --------- | ---------- | --------------
    Get application selection (when 0 characters are selected) | 1 | 0ms, 0ms, 0ms, 0ms, 0ms, 0ms | 7ms, 13ms, 2ms, 3ms, 2ms, 2ms
    Get application selection, then get selection text (eight characters selected) | 2 | 0ms, 0ms, 0ms | 5ms, 4ms, 13ms
    Get application selection, then set selection’s text (eight characters selected, eight characters inserted) | 2 | 13ms, 11ms, 23ms | 5ms, 6ms, 6ms
    Get application selection (0 characters), get selection’s range, get range’s start, set range’s start (to be what it was minus 10), get range’s text, insert text (TypeText using the selection and eight characters) | 6 | 5ms, 4ms, 3ms, 5ms, 1ms, 4ms | 56ms, 12ms, 9ms, 29ms, 10ms, 9ms

    As you can see I ran each test at least 3 times, 6 times for the ones I was particularly interested in.
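
    With so few runs and a few outliers (56ms, 29ms), the median is a steadier summary than the mean. A small sketch that summarises the six-call test, using the timings reported above; the helper and constant names are only illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty sample.
double Mean(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Median of a non-empty sample (takes a copy so the caller's data is untouched).
double Median(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return n % 2 ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

// Timings in milliseconds for the six-call test, copied from the results above.
const std::vector<double> kInProcess = {5, 4, 3, 5, 1, 4};
const std::vector<double> kOutOfProcess = {56, 12, 9, 29, 10, 9};
```

    The medians come out at 4ms in process and 11ms out of process, a difference of about 7ms over six calls, or a little over 1ms per call. Given a timer that is only accurate to about 5ms, that is a rough indication rather than a precise figure.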

    I was very surprised how quick it was to execute commands in process, and downright astonished at how quick it was out of process. Does everyone know COM is that fast?

    It is curious that it was faster out of process for changing the text in the selection. I went back and did more tests, which I haven't reported, but the results broadly held. My theory is that, because I had to double-click an executable, focus was taken off the Word window, and Word didn't redraw until after returning from the call. I was very careful to make sure the selection was visible to me, even after the executable's console launched.

    I was going to do tests with screen refresh turned off, but the numbers are so low I don't think I'll bother. 1/20th of a second, worst-case scenario, for what I want to do.... I think if there are efficiency savings to be had with my program, I ought first to look elsewhere.

    • Marked as answer by JosephFox Friday, July 6, 2012 8:59 PM
    Friday, July 6, 2012 8:52 PM

All replies

  • One wonders why, given that you have a Selection, you're playing around with Duplicate and Range. I get the impression the Selection is just the insertion point. In that case, you could move the Selection's start back 10 characters, do all your processing on that basis, then collapse the Selection afterwards.

    Are you disabling screen updating during your processing? That can make a significant difference in processing speed.


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Tuesday, June 26, 2012 11:30 PM
  • And besides, you have:

    #  The range's start property is retrieved

    #  The range's start property is set to be what it was before minus 10 characters

    as TWO instructions, retrieving the Start, and THEN moving back 10 characters (two separate instructions), but it does not have to be.

    Selection.Start = Selection.Start - 10

    ONE.  You do not need to retrieve the existing Start as a separate instruction.

    But I have to agree with Paul, I think some reassessment of your use of Selection may be called for.  Plus disabling screen updates.


    Word MVP

    Thursday, June 28, 2012 4:25 AM
  • Unfortunately, neither post helps. Partly, though not wholly, because you're not addressing what I asked, which is what the time difference is between executing COM calls out of process and in process.

    Macropod, I said 'duplicate', but I just mean the selection's range property. It's one call to retrieve that. I could take that out, and replace it with a call to reset the range selection, but it's the same number of inter-process calls, and that number is the bottleneck. I would also then need to turn off screen updates. If you are not doing things that affect the document, which I am not until the final call, Word does not trigger a screen update. The only reason it would is if the user has done something in that time, in which case I would like Word to update the screen. Turning screen updates off and on would add needless COM calls.

    Fumei2, that is two COM calls. It just looks like one in C#. The PIA is wrapping one call for property get and another for property put.
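
    To illustrate the point, here is a sketch with a hypothetical stand-in object that only counts calls. FakeRange is not part of the Word object model, and the real get_Start/put_Start methods return an HRESULT and pass the value through a parameter; the signatures are simplified here:

```cpp
#include <cassert>

// Hypothetical stand-in for a COM range object; it only counts calls.
struct FakeRange {
    long start = 100;
    int calls = 0;
    long get_Start() { ++calls; return start; }          // property get
    void put_Start(long value) { ++calls; start = value; }  // property put
};

// Equivalent of `range.Start = range.Start - count` in C# or VBA.
void MoveStartBack(FakeRange& range, long count) {
    const long current = range.get_Start();  // call 1: read the current start
    range.put_Start(current - count);        // call 2: write the new start
}
```

    After MoveStartBack(range, 10) the counter reads 2: the single-line assignment still costs one get and one put.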

    Edit: I'm basing my assertion that screen updates are not triggered by automation, unless the document is changed, on my experience with doing hefty processing jobs in C++. It may be different using the PIAs, because I vaguely remember the cursor flickering when using them. So simply turning off screen updates doesn't help in my case.

    • Edited by JosephFox Thursday, June 28, 2012 10:02 AM
    Thursday, June 28, 2012 9:52 AM
  • Hi Joseph

    Given that more than a week has gone by, I'd say that no one frequenting the forum has any data to share and that you'd need to run your own tests in order to determine if there's any time-saving using the different approach.

    I was going to suggest looking at WordOpenXML, but it looks like the actual evaluation you're doing is already being handled "outside" the Word object model? So I'm not sure how much that would bring in this case...


    Cindy Meister, VSTO/Word MVP

    Wednesday, July 4, 2012 9:13 AM
    Moderator
  • Good to see you, Cindy. Yes, I ought to pull my finger out. Does anyone know any good way of measuring these things (in C++)? Is the system clock accurate enough that I could pull times? To compare I'd need to take the time after the last function returns, which is going to be in different processes (in the first instance, the routine will start and end in my main executable, in the second instance the routine will start in my main executable, but end in my AddIn).

    I once tried modifying an open document with WordOpenXML, but I kept getting access violations. Is it possible? At the very least, WordOpenXML is designed for processing documents that aren't in use.

    Edit: Never mind about 'how to measure time differences in C++', timeGetTime seems to be exactly what I need. It can be called in any process and is accurate to 5ms. Unlike C#/VB functions, I might gloat, which are often out by 50ms.
    • Edited by JosephFox Friday, July 6, 2012 9:49 AM
    Friday, July 6, 2012 9:38 AM
  • Hi Joseph

    Thanks for sharing your results :-)

    <<It is curious that it was faster out of process for changing the text in the selection.>>

    Yes, as I scanned your table of results that surprised me, as well.

    It seems you're OK, but the problem was going through the back of my mind, off and on yesterday and it dredged up a couple of things that may or may not help you (or someone else who may come across this thread):

    1. Fastest would still be VBA, running in-process. If you really need to speed things up, a DLL to provide the out-of-process processing that's called by VBA, and the code in VBA to manipulate the object model, would probably be the optimum.

    2. Even though we're all generally told that early-binding is faster than late-binding, this apparently doesn't always apply to C++. Somewhere on the Microsoft site there's a KB article that says using what I've learned to call PInvoke can be much faster for C++ code as it seems to work more directly with the object libraries.


    Cindy Meister, VSTO/Word MVP

    Friday, July 6, 2012 10:27 PM
    Moderator