WPF D3DImage performs well on XP, poorly on Vista
-
Saturday, August 30, 2008 6:48 PM
Hi Group,
I inject normal unmanaged C++ Direct3D (dx9) on a WPF D3DImage through interop.
I followed the msdn d3dimage walkthrough on http://msdn.microsoft.com/en-us/library/cc656785.aspx .
I used the August 2008 DirectX SDK, .NET 3.5 SP1, with Visual Studio 2008 SP1 on Vista x64 SP1.
I made x86 and x64 binaries. I set it up to have transparency (I need that).
It runs fine, with about 3% CPU on WinXP (x86). It's SP2, so I have WindowsXP-KB937106-x86-ENU installed to get the transparency to work. On Vista however (which theoretically should run faster), I get 50% CPU, with both the x64 and x86 binaries. Bug in the platform? One difference is that the example uses Direct3DCreate9Ex on Vista and Direct3DCreate9 on XP (since Direct3DCreate9Ex should give better performance). Changing the code to always use Direct3DCreate9 didn't improve a thing.
Any help appreciated, Dave
The project (94kB zip, solution + code + binaries) is over here: http://rapidshare.de/files/40349833/D3DWpfSprite.zip.html
- Edited by Dave_FF Saturday, August 30, 2008 6:53 PM Clarified the contents of the zip
Answers
-
Wednesday, September 03, 2008 8:04 PM
Sorry, upon further investigation, this is just a side effect of how we do layered windows on Vista. If you make it a normal window, the CPU usage is very small. If you leave it a layered window and replace the D3DImage with a BitmapSource that you invalidate every frame, you'll see the big CPU usage again.
- Marked As Answer by Marco Zhou Friday, September 05, 2008 10:01 AM
-
Friday, September 05, 2008 11:05 PM
Jeremiah --
I don't think this is a D3DImage + layered specific issue. In my little test, if I used a bitmap and invalidated it I got essentially the same perf. The increased CPU is due to how we have to handle layered windows on Vista.
When using a layered window, we have to present with GDI instead of D3D. To present with GDI, we need a DC. On XP, we can just call GetDC on the hardware surface and hand that off. On the vast majority of drivers out there, that will be hardware accelerated. On Vista with WDDM, GDI is not hardware accelerated. If we give GDI that same DC it is painfully slow. The fastest thing to do is call GetRenderTargetData() to read it into software and give GDI a DC around those bits. Thus, on Vista, layered window performance is dependent upon things like how well your card can do GetRenderTargetData() (GRTD) and the card's bus bandwidth.
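To get a feel for why the readback path described above depends on bus bandwidth, here is a rough back-of-the-envelope sketch. It assumes a 32-bit surface (4 bytes per pixel) and that the whole window is read back every frame; the function names and the window size used below are hypothetical, not taken from the thread or from WPF itself.

```cpp
#include <cstdint>

// Bytes copied from video memory to system memory for one frame.
// Assumption: the full surface is read back, at 4 bytes per pixel
// (e.g. a 32-bit BGRA surface) unless another size is passed in.
constexpr std::uint64_t readbackBytesPerFrame(std::uint64_t width,
                                              std::uint64_t height,
                                              std::uint64_t bytesPerPixel = 4) {
    return width * height * bytesPerPixel;
}

// Sustained transfer rate needed to read back every frame at the given rate.
constexpr std::uint64_t readbackBytesPerSecond(std::uint64_t width,
                                               std::uint64_t height,
                                               std::uint64_t framesPerSecond,
                                               std::uint64_t bytesPerPixel = 4) {
    return readbackBytesPerFrame(width, height, bytesPerPixel) * framesPerSecond;
}
```

For a hypothetical 1920x1200 layered window updated at 60 fps, that is 9,216,000 bytes per frame and 552,960,000 bytes per second moving from the card to system memory, before GDI touches the bits at all.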
- Proposed As Answer by Brendan Clark - MSFT Friday, September 05, 2008 11:12 PM
- Marked As Answer by Dave_FF Sunday, September 07, 2008 1:04 PM
All Replies
-
Tuesday, September 02, 2008 9:35 PM
Could you show us your DXDiag log on Vista? Using 9Ex is the correct thing to do on Vista and good performance with 9Ex requires your card to support D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES and D3DCAPS2_CANSHARERESOURCE as well.
Also, make sure you have the latest drivers on Vista as the ones that shipped in-box were not very good.
-
Friday, September 05, 2008 1:42 AM
I also see performance problems with D3DImage and layered windows in Vista. Is there a workaround (besides disabling layered windows) or a fix in sight?
-Jer
-
Friday, September 05, 2008 12:23 PM
Thanks Jordan, for being clear on this issue.
I'm also very interested in workarounds or future fixes.
For now, I'll abandon the WPF/D3DImage route, and go for a managed form with some API calls for transparency color keying, and use GDI+ for drawing (my current project was only doing its drawing in Direct3D because of the hoped-for performance benefit).
Kind regards, Dave
-
Sunday, September 07, 2008 1:04 PM
Some more info:
The whole reason I needed a D3DImage layered window anyway, is because of unwanted AddDirtyRect optimization.
I now made a non-WPF overlay frame (see my previous post), which I made mouse-transparent and resizing with the parent WPF form, and toggle always-on-top on Activate/Deactivate (maybe I can do something neater with SetWindowPos, but it'll do for now). I use color keying on this "airspace" form, because I have to draw many small 2D animations (60fps) on top of the complex WPF window. I draw normal GDI+ on the airspace form, with 200 small invalidateRects all over the screen. This performs excellently: 3% CPU on Vista x64 SP1, 3% on XP SP2. (B.t.w., it's a dual boot system - so equal hardware - with the latest NVIDIA drivers. NVIDIA GeForce 7600 GS.)
I think I *could* have gotten it to perform well inside a single WPF window, if it wasn't for WPF's AddDirtyRect optimization. WPF optimizes the dirty rectangles, and tries to be smart about it by making fewer, but bigger, rectangles. This can be seen in Perforator. This results in *many* complex WPF objects being rerendered, which in turn pushes the CPU to the max. I hope WPF will make the AddDirtyRect optimization optional (by setting an "Exact" or "ManualOptimization" mode on the System.Windows.Media.Imaging.WriteableBitmap and the System.Windows.Interop.D3DImage classes).
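To illustrate why fewer, bigger rectangles can hurt here, the following is a small sketch that compares the exact dirty area with the area of a single bounding box around all the rectangles. WPF's real coalescing heuristic is not described in this thread, so the bounding box is only an assumed worst case, and the `Rect` type and function names are made up for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Axis-aligned rectangle in pixels; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
};

std::int64_t area(const Rect& r) {
    return static_cast<std::int64_t>(r.right - r.left) * (r.bottom - r.top);
}

// Total area if every rectangle is redrawn exactly as submitted.
// Assumption: the rectangles do not overlap, so areas can simply be summed.
std::int64_t exactDirtyArea(const std::vector<Rect>& rects) {
    std::int64_t total = 0;
    for (const Rect& r : rects) total += area(r);
    return total;
}

// Smallest single rectangle covering all inputs (worst-case merge).
// Precondition: rects is not empty.
Rect boundingBox(const std::vector<Rect>& rects) {
    Rect box = rects.front();
    for (const Rect& r : rects) {
        box.left = std::min(box.left, r.left);
        box.top = std::min(box.top, r.top);
        box.right = std::max(box.right, r.right);
        box.bottom = std::max(box.bottom, r.bottom);
    }
    return box;
}
```

With two 10x10 sprites in opposite corners of a 1000x1000 area, the exact dirty area is 200 pixels, while the merged box covers 1,000,000 pixels, so everything WPF draws underneath that box would need to be re-rendered.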
For now, I revert to a non-WPF airspace overlay. Too bad WPF isn't the "answer to everything" yet ;)
Cheers, Dave
P.S. Jordan, I can't find anything on D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES and D3DCAPS2_CANSHARERESOURCE in the dxdiag txt output. I will look further for this (maybe with another tool).
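If dxdiag does not show these two flags, one option is to test them in code. In Direct3D 9 they are bits in the `Caps2` and `DevCaps2` members of the `D3DCAPS9` structure that `IDirect3D9::GetDeviceCaps` fills in. The sketch below shows only the bit test: `hasD3DImageCaps` is a hypothetical helper name, and the fallback constants in the non-Windows branch are assumed to mirror the values in d3d9caps.h so the snippet also compiles without the DirectX SDK.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <d3d9.h>  // real definitions of the capability flags
#else
// Assumed copies of the d3d9caps.h values, only so this compiles off Windows.
constexpr std::uint32_t D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES = 0x00000010u;
constexpr std::uint32_t D3DCAPS2_CANSHARERESOURCE = 0x80000000u;
#endif

// caps2 and devCaps2 are expected to be D3DCAPS9::Caps2 and D3DCAPS9::DevCaps2
// as returned by IDirect3D9::GetDeviceCaps for the adapter in question.
bool hasD3DImageCaps(std::uint32_t caps2, std::uint32_t devCaps2) {
    const bool canShare = (caps2 & D3DCAPS2_CANSHARERESOURCE) != 0;
    const bool canStretch =
        (devCaps2 & D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES) != 0;
    return canShare && canStretch;
}
```

On Windows this would typically be called as `hasD3DImageCaps(caps.Caps2, caps.DevCaps2)` after a successful `GetDeviceCaps` call on the HAL device.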

