I would like to draw your attention to this question asked on a LinkedIn forum. Please have a look.
I have a custom Excel 2007 add-in developed using VSTO 2010 on .NET 4.0 with C#.
It's a very complex solution that deals with large amounts of data, interacts with the Excel engine through the PIAs, and is required to deliver extreme performance.
Some of the techniques adopted to improve performance are as follows:
1. Work with Range objects rather than individual Excel cells.
2. Copy data into the Value2 property of the ListObject's column Range.
3. Set Calculation to Manual and ScreenUpdating to false while setting values on a Range or working with Excel objects.
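For readers who find this thread later, the three techniques above combine into a pattern roughly like the following sketch. It assumes the add-in already holds an `Excel.Application` reference and a `ListObject` whose DataBodyRange matches the array's shape; the method and parameter names are illustrative, not from the original post:

```csharp
using Excel = Microsoft.Office.Interop.Excel;

// Sketch only: `app` and `table` are assumed to come from the add-in's context.
void WriteProfile(Excel.Application app, Excel.ListObject table, object[,] data)
{
    var oldCalc = app.Calculation;
    var oldScreen = app.ScreenUpdating;
    try
    {
        app.Calculation = Excel.XlCalculation.xlCalculationManual;
        app.ScreenUpdating = false;

        // One interop call assigns the whole block, instead of one call per cell.
        table.DataBodyRange.Value2 = data;
    }
    finally
    {
        // Always restore, even if the write throws.
        app.Calculation = oldCalc;
        app.ScreenUpdating = oldScreen;
    }
}
```

The key point is that the single `Value2` assignment crosses the managed/COM boundary once for the whole table, while the try/finally guarantees calculation and screen updating come back on no matter what happens.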
Also, the cell editors for certain columns contain WPF-based UserControls.
At this time, I am not looking to replace this with a third-party product or to develop the add-in in C++.
I would like to hear from experts about other techniques to improve performance, under the assumption that a large amount of data needs to be assigned to Excel ranges and that there will be about 17 to 20 such different data profiles. About 90% of the data will be numeric and 5% string/text.
There are a few styles assigned to the Table objects present on the worksheets;
e.g., the table header has a different style, and alternate rows have different colors.
The data received from the source is in XML format.
Any tips or suggestions are greatly appreciated.
Thanks for posting in the MSDN Forum.
I think your issue has little relationship with Excel development itself, is that right? I think what you want can be approached by improving your algorithm or using a better Office suite. If you want to improve your algorithm, please show me the key code for further research.
Have a good day,
Tom Xu [MSFT]
MSDN Community Support | Feedback to us
I don't know if you are still reading this, but I have been working on something quite similar. Here are some things I found to speed up performance:
- Take everything out of Excel in bulk using either Value or Value2, as you described.
- Convert everything to rectangular arrays.
- Run through the arrays with the Parallel library. Use parallel aggregation if needed to aggregate data.
- Write everything back as an array using Value or Value2 once again.
So, the only real thing I am doing differently from you is the Parallel library. Your other suggestions were all spot on. If you do decide to use parallelism as I suggest, it is imperative that you pull all the data you need out of Excel beforehand: Excel will block multiple threads from pulling data at the same time, causing threaded code to actually run slower than non-threaded code.
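To make the order of operations concrete, here is a sketch of the middle step: the data has already been pulled into a rectangular `object[,]` via `Range.Value2` (which uses 1-based bounds, hence the `GetLowerBound`/`GetUpperBound` calls), and each thread works on its own rows, so no Excel object is touched inside the parallel loop. The method name and the per-cell operation are illustrative placeholders:

```csharp
using System.Threading.Tasks;

// Sketch: process an array obtained from Range.Value2 entirely in memory.
static void ProcessInPlace(object[,] values)
{
    int rowLo = values.GetLowerBound(0), rowHi = values.GetUpperBound(0);
    int colLo = values.GetLowerBound(1), colHi = values.GetUpperBound(1);

    // Each thread owns a disjoint set of rows, so no locking is needed
    // and, crucially, no interop call happens inside the loop.
    Parallel.For(rowLo, rowHi + 1, r =>
    {
        for (int c = colLo; c <= colHi; c++)
        {
            if (values[r, c] is double d)
                values[r, c] = d * 2.0;   // placeholder for the real per-cell work
        }
    });

    // Afterwards, write the result back in one call: someRange.Value2 = values;
}
```

The design choice here is exactly the caveat above: read in bulk, compute in parallel on plain arrays, write back in bulk, and never let worker threads call into the Excel COM object model.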
Without a more precise understanding of what you are trying to do, all suggestions will be very generic.
I am curious, though: how did you use WPF for cell editors? That piqued my curiosity.
//Will write code for food
As Tim mentioned below, reading and writing in bulk is definitely a huge performance booster. In a similar vein, try to make as few calls to any Interop object as possible (e.g., if you need to look up a cell's value in an if/else-if statement ten times, store the value in a local variable first).
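The caching advice looks like this in practice (a sketch; `sheet` is assumed to be an `Excel.Worksheet` obtained elsewhere, and the grading logic is just an example):

```csharp
// Read the cell once, then branch on the local copy.
object raw = ((Excel.Range)sheet.Cells[1, 1]).Value2;   // single interop call

string grade = "C";
if (raw is double score)
{
    if (score > 90) grade = "A";
    else if (score > 80) grade = "B";
}
```

Compare that with calling `sheet.Cells[1, 1].Value2` inside every comparison, which would cross the managed/COM boundary once per branch and cost orders of magnitude more than reading a local variable.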
One other thing: always turn off screen updating at the start of your operations. Just make sure to turn it back on at the end, and to surround your code with try-catch-finally, where the finally block is what turns screen updating back on -- otherwise you can leave the Excel application's UI unresponsive even after your application completes its process. You'd be surprised at how much quicker Excel automation works when it doesn't need to be updating its UI all the time.
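One tidy way to get that try/finally guarantee is a small disposable scope guard, so the restore cannot be forgotten at any return path. This is a sketch; the class name is illustrative:

```csharp
using System;
using Excel = Microsoft.Office.Interop.Excel;

// Restores ScreenUpdating even if the wrapped code throws.
sealed class ScreenUpdateScope : IDisposable
{
    private readonly Excel.Application _app;
    private readonly bool _previous;

    public ScreenUpdateScope(Excel.Application app)
    {
        _app = app;
        _previous = app.ScreenUpdating;
        app.ScreenUpdating = false;
    }

    public void Dispose() => _app.ScreenUpdating = _previous;  // the "finally"
}

// Usage -- the using-statement compiles down to try/finally around the body:
// using (new ScreenUpdateScope(app))
// {
//     // ... bulk reads and writes ...
// }
```

The same pattern extends naturally to saving and restoring `Calculation` and `EnableEvents` in one scope object.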
Using a combination of those pointers, I have often read and written tables that are several thousand rows long and several hundred columns wide, and still achieved reasonable performance.
Michael Zlatkovsky | Program Manager, Visual Studio Tools for Office & Apps for Office