Data virtualization in WPF and beyond
-
Tuesday, May 19, 2009 5:49 PM
Introduction
How do you show a 100,000-item list in WPF? Anyone who has tried to handle that volume of information in a WPF client knows that it takes careful development to make it work well.
Getting the data from where it is (a remote service, a database) to where it needs to be (your client) is one part of the problem. Getting WPF controls to display it efficiently is another part. This is especially true for controls deriving from ItemsControl like ListView and the newly released DataGrid, since these controls are likely to be served large data sets.
One can question the usefulness of displaying hundreds of thousands of rows in a ListView. There is, however, always one good reason: the customer requests it. And the customer is king, even if the reasoning behind the request is slightly flawed. So, faced with this challenge, what can we do as WPF developers to make both the coding and user experience as painless as possible?
As of .NET 3.5 SP1, this is what you can do today to improve performance in ItemsControl and its derivatives:
- Make the number of UI elements to be created proportional to what is visible on screen using VirtualizingStackPanel.IsVirtualizing="True".
- Have the framework recycle item containers instead of (re)creating them each time, by setting VirtualizingStackPanel.VirtualizationMode="Recycling".
- Defer scrolling while the scrollbar thumb is being dragged by using ScrollViewer.IsDeferredScrollingEnabled="True". On its own, this improves only perceived performance, because the content is not updated until the user releases the scrollbar thumb. However, we will see that it also improves actual performance in the scenarios described below.
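As a rough illustration, the three settings above can be combined on a single control. This is a minimal sketch, not taken from the article: the `Items` binding path is a hypothetical property on whatever object serves as the DataContext, and the same attached properties can be set on a ListBox or DataGrid instead.

```xml
<!-- Minimal sketch: a ListView with UI virtualization, container recycling
     and deferred scrolling enabled. "Items" is a hypothetical collection
     property on the DataContext. -->
<ListView xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
          ItemsSource="{Binding Items}"
          VirtualizingStackPanel.IsVirtualizing="True"
          VirtualizingStackPanel.VirtualizationMode="Recycling"
          ScrollViewer.IsDeferredScrollingEnabled="True" />
```

Note that these settings only limit how many item containers are created; the collection behind `Items` is still fully loaded unless the data side is virtualized as well.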
All these things take care of the user interface side of the equation. Sadly, nothing in WPF takes care of the data side. Data virtualization is on the roadmap for a future release of WPF, but will not be available in the upcoming .NET 4.0, according to Samantha MSFT (http://www.codeplex.com/wpf/Thread/View.aspx?ThreadId=40531).
All is not lost, however. I will show you various ways to have your favorite ItemsControl scroll through hundreds of thousands, even millions of items with little effort. Of course, every solution has a price tag, but for most situations it will be acceptable. I promise!
My “solutions” for data virtualization in WPF rely on two key insights and two usage assumptions. The two key insights are:
1. It is possible to automatically construct for an instance of any type T an equivalent lightweight object which, at least for WPF’s binding engine, is indistinguishable from T in most binding scenarios involving binding to properties of T.
2. ItemsControl’s access patterns for its item source are highly predictable and need at any time only a fraction of the entire data set. The size of this fraction is proportional to the number of visible rows, not to the total number of rows in the data set.
Two approaches are derived from these two key insights: the item virtualization approach, where individual objects are loaded on demand, and the collection virtualization approach, where the entire data set is virtualized. These two approaches virtually (pun intended) split this article into two parts.
The usage assumptions are:
1. In the presence of a large number of items, the users will not look at each and every one of them at the same time.
2. Scenarios involving a large number of items are predominantly read-only. If there’s any editing to be done, it will not take place in the ItemsControl holding the large data set.
If usage assumption 1 is valid, we only need to load what the user needs to see. This assumption is already exploited by VirtualizingStackPanel’s IsVirtualizing and VirtualizationMode settings, but it’s valid for the data side of the equation as well. Therefore, we can concentrate on techniques that load small amounts of data efficiently.
If usage assumption 2 is valid, we can ignore scenarios where users start editing large data sets in-place. In-place editing with all the bells and whistles (cancellable, transaction safe) has its own set of problems and solutions that is outside the scope of this article.
You can read the article at http://home.scarlet.be/thehive/DataVirtualization.pdf
Feedback is welcome.
Vincent
All Replies
-
Wednesday, June 03, 2009 10:08 PM
Very nice solution and well written document.
-
Sunday, July 26, 2009 7:55 AM
Hi
I'm working on a WPF project where I have to display more than 10,000 items in a DataGrid. After a long Google search for data virtualization I ended up at your solution, but unfortunately it seems that the PDF file is no longer available. Could you please send me the PDF file? I'm very interested in your solution.
Thanks
-
Sunday, July 26, 2009 12:54 PM
-
Wednesday, August 05, 2009 6:32 AM
There is a comparison between Paul McClean's solution and mine by Bea Stollnitz: http://bea.stollnitz.com/blog/?p=344
And to Francois: the PDF file has always been available.
-
Tuesday, September 01, 2009 8:09 AM
Very nice solution. That has helped me a lot! Thank you!
-
Monday, April 12, 2010 8:28 AM
-
Thursday, December 02, 2010 8:48 PM
Hello vvdb,
Very nice explanation. However, the PDF file is not available at the link provided. Can you please update the link?
Thanks!

