none
C++ Excel Automation slower for 64 bit RRS feed

  • Question

  • On Windows 7 using Visual C++, my boss wants to know why Excel takes so much longer to load up the data on 64 bit as opposed to 32 bit.  I haven't a clue.  I set up code to time how long it takes for every function and section of code so I could see what's causing the slow down.  It times everything, but because there are some many functions and sections, I made it so that it would only report those places that took more than 1000 ms.  Here are some typical results:

    For 64 bits:
    CXLRange::AttachDispatch() took 2277 ms, nCol = 1
    CXLRange::SetValue() took 2340 ms, nCol = 1
    CXLApplication:Run0() takes 4882 ms
    CXLWorkbook:SetSaved() : 2247 ms
    Total time taken by Excel activities: 11996 ms

    For 32 bits:
    Preliminaries took 562 ms
    Start of do loop took 234 ms
    CXLRange::SetValue() took 187 ms, nCol = 1
    End of do loop took 172 ms
    CXLApplication:Run0() takes 2590 ms
    Total time taken by Excel activities: 4010 ms                                                                                                                                                                                 As you can see from this example, the 64 bit version takes 3 times as long as the 32 bit version.  Functions like Run0, SetSaved(), SetValue() and even AttachDispatch() take longer under 64 bit.  Any explanation for this?  Any way to speed things up?  (note that there is a for loop with nCol starting as 0.)  Thanks for any reply, Gary

    Thursday, November 15, 2018 8:23 PM

All replies

  • Are you using the 64 bit or the 32 bit version of Excel?
    Thursday, November 15, 2018 11:30 PM
  • Thanks very much for your reply.  I produced the numbers above with our software running under the debugger with 32 bit Excel.  However, I'm told that running under 64 bit Excel is no better.  That is, you have to sit and wait for appx 30 seconds for the numbers in the cells Excel to show with the 64 bit Windows whether Excel is 64 or 32 bits (we have a dual boot system.)  However, when running on a 32 bit system, the cells are all populated within a couple of seconds.  I could produce timing numbers in release mode with identical software and inputs if you think that would be helpful.  I wonder if others have the same problem or it there is something I'm doing wrong that can be fixed.  I'll try different situations, such as ordering the calls differently...

    Note:  there is an error in my email above:  It should read: "...it would only report those places that took more than 100 ms."

    Gary

    Monday, November 19, 2018 5:34 PM
  • I definitely think that you shouldn't measure performance while running code under the debugger or with a debug build.

    I can't speak to specific performance measurements but I would expect any mismatch in the bitness between the COM client and Excel to take longer.  The COM runtime would have to do some work to marshal data and interfaces between the 32 and 64 bit worlds that it wouldn't need to do if bitness was the same.

    Monday, November 19, 2018 5:51 PM
  • Now I've been able to run some tests on a dual-boot machine with 32-bit Excel 2016 on 32 bit Windows 7 and 64-bit Excel 2016 on 64 bit Windows 7.  Visual C++ is not installed on either system, but I made release software that records to a file the times spent on individual sections and the total time doing Excel activities through Automation.  The total time takes about 5-6 times longer with 64 bit than 32 bit.  It does take longer when launching Excel than if Excel is already running, but always longer on 64 bit.  For example, CXLApplication::Run0() takes a little more than half of a second on 32 bit, but slightly over 4 seconds on 64 bit.  CXLWorkbook::SetSaved() takes less than a tenth of a second on 32 bit, but nearly 2 seconds on 64 bit.  Since the time differences are due to calls to the Excel Automation functions, I'm not sure how to go about finding the cause and fixing it.  Any ideas very welcome!
    Tuesday, November 27, 2018 8:36 PM
  • I don't know the reason for the timing differences that you've noted.

    One unrelated suggestion -

    I'm guessing that you are using MFC classes created by class wizard from the Excel type library.  MFC implements the methods in these classes through IDispatch which is not as efficient as calling the methods of dual interfaces through the vtable.

    If you use the #import directive instead it will create headers that you can use to call methods on Excel's dual interfaces without using IDispatch.

    Tuesday, November 27, 2018 8:47 PM
  • Thanks for your replies.  I am using MFC classes created by class wizard.  It's pretty old code, written originally for Excel 97.  There was a change for Excel 2007, in order to use 16384 columns instead of 256, but other than that, the code is the same.  Is there some documentation on the automation functions, describing what each function does and includes sample code and what needs to change for later versions of Excel?  Maybe the source code needs to be updated.

    I can't understand why it works fine on some computers, however.  The computers on which Excel 2016 slows down are CompactPCI with 64 bits Windows 7.   Since we sell boards that fit into the CompactPCI chassis, we definitely want it too work reasonably fast on CompactPCI.  (The slowdown occurs even if the only board in the chassis is the CPU.)

    I'm not familiar with vtables, or how to control another program by using them.  Where would I find documentation on that?  I did create a MFC Excel Application class using Class Wizard on my Visual Studio 2015.  I found that it created an import directive:

    #import "C:\\Program Files (x86)\\Microsoft Office\\Root\\Office16\\EXCEL.EXE" no_namespace

    I'm not sure why that's necessary and wouldn't it limit the code to working with 32 bit Office16?  What if a customer was using a different version of Excel?

    Any answers to these questions greatly appreciated.

    Wednesday, December 5, 2018 6:00 PM
  • Microsoft Office applications expose dual interfaces for automation.  A brief explanation of dual interfaces and how they can be used to allow calling methods of an interface in two different ways is at Dual Interfaces

    Do an internet search on "vtable" and you will find an abundance of explanations. :)

    Compatibility issues faced by code that automates different versions of Excel are unrelated to the manner in which a COM object's methods are called.  They arise when Microsoft changes the Excel object model between versions.

    Thursday, December 6, 2018 10:38 AM
  • Thanks for your replies.  I noticed that the link you cited says that early binding offers a much higher performance, so I'm trying to implement it using the #import directive as you suggested.  Maybe by using early binding Excel will respond much faster.  So I made a new MFC dialog Automation project which compiles fine until I add the #import directive I mentioned above.  The good news is that a "excel.tlh" and "excel.tli" files are created.  I assume these will tell me the functions I can use to call Excel.  I will be able to use the same functions as I use in the current code with IDispatch because I can match the numbers.  But the bad news is that there are over 100 compiler errors (Visual Studio Professional 2015) regarding the excel.tlh file created.  The first three are all the same error on different lines of excel.tlh:

    error C2504: '_IMsoDispObj': base class undefined

    And I see some code in excel.tlh around that line:

    ****

    //
    // Type library items
    //

    struct __declspec(uuid("000c0310-0000-0000-c000-000000000046"))
    Adjustments : _IMsoDispObj
    {  ...

    ****

    Do I need some more include files?  (My stdafx.h has the files the Visual Studio C++ puts in for an MFC Automation new project.)  I did find some helpful links such as

    https://docs.microsoft.com/en-us/previous-versions/8etzzkb6(v=vs.140)

    but nothing that helps with all these error messages.  Any ideas greatly appreciated!

    Thursday, December 13, 2018 7:11 PM
  • You probably also need to #import MSO.dll  before the #import directive for excel.
    Thursday, December 13, 2018 8:21 PM
  • Thanks for your very quick reply.  I now have

    #import "C:\\Program Files (x86)\\Microsoft Office\\Root\\vfs\\ProgramFilesCommonX86\\Microsoft Shared\\OFFICE16\\mso.dll"
    #import "C:\\Program Files (x86)\\Microsoft Office\\Root\\Office16\\EXCEL.EXE" no_namespace

    in my code.  However, upon compilation I still get over 100 errors, including the

    "error C2504: '_IMsoDispObj': base class undefined"

    mentioned above.  You wouldn't happen to know of a source for sample code, would you?

    Thursday, December 13, 2018 9:04 PM
  • Did you try adding the no_namespace attribute to the #import for MSO.dll?
    Friday, December 14, 2018 1:52 AM
  • If you want documentation on Excel automation do a search in MSDN for Excel Object Model.  The examples are in Visual Basic for Applications, but all the details are there.

    I wish I could help you with the C++, but back then I used Visual Basic 6.0 to work with Office Automation.  I do remember back then that the original mechanism for office automation was very slow, but was dramatically improved with the next version which I believe was Office 8.0.  But, I don't know why the 64 bit version would be that much slower...just the extra marshaling wouldn't cause processing 2 or 3 times slower.  In fact, I would have assumed that the 64-bit version would run faster especially if there was a lot of floating point math.

    As for your comment about #import "C:\\Program Files (x86)\\Microsoft Office\\Root\\Office16\\EXCEL.EXE" you are correct that you shouldn't be using that.  In VB I would use a function called GetObject() or CreateObject() that would allow code to use different versions of Excel.  I believe the parameter should be "Excel.Application".

    I found the following C++ example by searching the web for "example of c++ createobject Excel.Application"

    / Initialize COM for this thread...
       CoInitialize(NULL);
    
       // Get CLSID for our server...
       CLSID clsid;
       HRESULT hr = CLSIDFromProgID(L"Excel.Application", &clsid);
    
       if(FAILED(hr)) {
    
          ::MessageBox(NULL, "CLSIDFromProgID() failed", "Error", 0x10010);
          return -1;
       }
    
       // Start server and get IDispatch...
       IDispatch *pXlApp;
       hr = CoCreateInstance(clsid, NULL, CLSCTX_LOCAL_SERVER, IID_IDispatch, (void **)&pXlApp);
       if(FAILED(hr)) {
          ::MessageBox(NULL, "Excel not registered properly", "Error", 0x10010);
          return -2;
       }
    I hope this helps.

    Friday, December 14, 2018 4:28 AM
  • I put together a quick and dirty demo for you.  The demo was created with Excel 2013 ( 64 bit ) and simply creates a new worksheet, writes "RLWA32" into cell A1 and then saves the worksheet in the user's document folder.

    There is absolutely no error checking performed.  You can get it from RLWA32Excel.zip

    Friday, December 14, 2018 2:18 PM
  • Thank you very much for your sample code. It works fine! I've been able to convert most of my "late binding" code to "early binding".  I have a remaining question about sending an array of numbers to Excel using early binding.  Here it is in late binding:

    range.SetValue(COleVariant(saRet)))

    where "saRet" is a COleSafeArray, containing the numbers to write to Excel.  SetValue takes a VARIANT.

    In my new early binding code I have the following to get a pointer to the range:

    CComPtr<_Worksheet> pSheet;
    CComPtr<Range> pRange;
    hr = pSheet->get_Range(v,vOptional, &pRange);

    But pRange doesn't have a SetValue(VARIANT &propVal) function.  I notice in your sample code you use
      
    CComDispatchDriver pd = pRange;
    hr = pd.PutPropertyByName(L"Value2", &vA1);

    to write the string "Value2" to Excel.  Should I use pd.PutProperty(DISPID dwDispID, VARIANT *pVar)?  If so, how do I get the DISPID?

    Thanks again,
    Gary

    Tuesday, March 19, 2019 6:56 PM
  • The code in the sample passed the property name and a pointer to the variant that contained the value to be placed in the property.  The sample used ATL's CComVariant class, which simplifies using variants.  In this case, the variant contained a BSTR (i.e., "RLWA32").  CComVariant handled allocation of the BSTR and freeing of the BSTR to avoid leaks when the CComVariant object's destructor was called..

    Many of the interfaces in the Excel Object Model are dual interfaces and so can be used with early binding through their vtable.  However, Excel's Range interface is a dispinterface, not a dual interface.  This means that it only contains the methods of the IDispatch interface and must be used with late binding.

    The CComDispatchDriver class simplifies calling the methods of IDispatch through the encapsulated IDispatch interface pointer.  CComDispatchDriver can be used with property DISPIDs or property names.  For example, when used with a property name to get or set the property's value, internally it uses IDispatch to get the DISPID for the property, creates the necessary parameters and passes them to IDispatch::Invoke.

    You can determine the Excel::Range dispinterface properties and methods (including their DISPIDs) by examining its type library with the oleview.exe sdk utility.

    For example, oleview.exe showed the following idl for the Value2 property of Excel's Range dispinterface -

            [id(0x0000056c), propget, helpcontext(0x00023359)]
            VARIANT Value2();
            [id(0x0000056c), propput, helpcontext(0x00023359)]
            void Value2([in] VARIANT rhs);
    


    • Edited by RLWA32 Tuesday, March 19, 2019 9:06 PM
    Tuesday, March 19, 2019 7:44 PM
  • For some reason, I didn't get any email notification that the reply above came.  It would have saved me a lot of trouble and floundering!  However, I finally figured out what the CComDispatchDriver was doing in your sample code and found that I could use PutPropertyByName or simply PutProperty(6 [ or 0x56c], (COleVariant)saRet) (where saRet is the SafeArray holding the data).  When I converted our late binding code to early binding however, I found that there was no increase in speed.  It's one of those cases where the surgery was successful but the patient died.  Or something like that.  Anyway, thanks for your help.  I now know something about early binding, which has been interesting.  Next up:  I will make an attempt to integrate some managed code that controls Excel into our managed code.  Surely there's a way out of this problem! Thank you very much for your help.
    Thursday, March 28, 2019 6:02 PM
  • I'm interested in this thread.  We supply an Excel OLE API interface in our runtime.  I suspect that our API layer was generated around the Office '97 timeframe as you are reporting.  What I am observing is that when I run customer code that uses our API things slowed down considerably when we were forced to move our API from Visual Studio 6.0 to Visual Studio 2013.  (Yes it's that old).   At any rate I found that by changing the target files from *.xls to *.xlsx the API speed up on the customers target application in our production environment.  The "Production" environment is Windows Server 2012, with Windows 7 embedded clients.  The "offline" environment however is where I see a *massive* slowdown.  Running Windows7, the code still works, but is maybe 10-100x slower.

    Our API environment for users is using MFC libraries (still) not COM (that I am aware of), and we build on the v120 x64 platform toolset.  I suspect that the underlying OLE Excel API layer we provide our customers is based on old stuff.. I can't change that as that is fixed... (I can lobby for change)... what I can do is explore why it's slower and see if I can change anything in the runtime environment to speed up the performance... and if I can identify a root cause, I can submit a bug against OUR API too improve/change how we run.  I don't have clean source access to how we implemented our Excel API as I'm field support... so I'm just trying to understand the issue from the 30,000 foot layer.  

    What I may need to do is a benchmark program bug test case where I time various cell I/O activity on in our production environment and our emulation environment and force our developers to figure it out.  (They should likely move to the latest API model for EXCEL... and likely have not revisited the old Excel 97 code if it still works.)

    Some of the variables include what specific version(s) of EXCEL are installed on the target machines.  That could be a factor.  The offline environment (my laptop) has multiple versions of EXCEL, and the production environments are fixed.  So I have to rule out different releases of Excel, and figure out even how the system picks one if there are multiple versions available.  Of course I don't want to do this... as this performance issue is just one of the things I'm working on.  May we live in interesting times.

    -R


    Wednesday, May 8, 2019 5:11 PM