WPA is sooooo slow and uses 3 times more RAM compared to xperfview.

  • Question

  • When I open the same 60-second trace in WPA.exe and xperfview, WPA uses 3 times as much RAM as xperfview. The performance is now a nightmare. Every action causes a "not responding" message :'(

    In the BUILD videos the new tool was described as faster and better compared to xperfview. I see the complete opposite. It is slow, the font is blurry due to WPF, the graphs are blurry too, and everything is slow and causes high CPU usage:

     

    Process Stack Count: Sampled Profile Count Count: Thread: CSwitch Count: Thread: ReadyThread Count: Thread: End Rundown Count: FileIo: Create Count: FileIo: Close Count: FileIo: Read Count: FileIo: Write Count: FileIo: QueryInfo Count: Microsoft-Windows-RPC/RpcClientCall/win:Start Count: Microsoft-Windows-RPC/RpcServerCall/win:Start Count: Microsoft-Windows-RPC/RpcClientCall/win:Stop Count: Microsoft-Windows-RPC/RpcServerCall/win:Stop Count: Unknown
      |- microsoft.performance.dataengine.dll!Microsoft.Performance.DataEngine.GeneratedProjectedComparer`5[System.__Canon;Microsoft.Performance.DataEngine.ArrayIndexer`1[System.__Canon];System.__Canon;Microsoft.Performance.DataEngine.IdentityFunction`1[System.__Canon];Microsoft.Performance.DataEngine.Internal.ReferenceComparer`1[System.__Canon]]::Compare 17103 18055 932 20 0 0 0 0 0 0 0 0 0 0 0
      |- ?!? 11376 12630 623 631 0 0 0 0 0 0 0 0 0 0 0
      |- microsoft.performance.dataengine.dll!Microsoft.Performance.DataEngine.Algorithms+_ComparerUncheckedContext`1[Microsoft.Performance.DataEngine.GeneratedProjectedComparer`5[System.__Canon;Microsoft.Performance.DataEngine.ArrayIndexer`1[System.__Canon];System.__Canon;Microsoft.Performance.DataEngine.IdentityFunction`1[System.__Canon];Microsoft.Performance.DataEngine.Internal.ReferenceComparer`1[System.__Canon]]]::_Pred 8574 9042 462 6 0 0 0 0 0 0 0 0 0 0 0

    "A programmer is just a tool which converts caffeine into code"

    Monday, September 26, 2011 8:06 PM

All replies

  • Hi Andre,

    Did you use any "special" filter, or is it "just" a runtime trace over 60 sec?

    Thanks and Best regards

    Rainer


    RPA
    Tuesday, September 27, 2011 4:57 PM
  • Yeah, I'm not too happy with WPA either. Thankfully, xperf.exe is still available. I find myself using both now.
    Tuesday, September 27, 2011 7:10 PM
  • @Rainer

    This is a trace with higher ISR CPU usage.

    When I open it with xperfview, 1.1GB is used. When I open the same trace with WPA and drag the ISR graph into view, WPA uses 4GB! WPA then becomes extremely slow.

    Every action I run now causes "not responding" messages and very high CPU usage :'(

    @Albert

    yeah I also hope that the old tools will still be part of v5.


    "A programmer is just a tool which converts caffeine into code"

    Tuesday, September 27, 2011 8:27 PM
  • Hi Andre,

    How much memory do you have on the system you are using WPA on? Unfortunately the version of WPA that made it to the Developer Preview bits does have a significantly larger working set than xperfview. We've been addressing this over the past few months, have made some very significant improvements, and are continuing to strive towards matching or in some cases even improving on the xperfview working set. Some of these changes were fundamental enough that they were deemed too risky to be put into the external release so close to their creation, so you'll be getting them in Beta.

    Note that the system I've used at BUILD to do live WPA demos had 8GB of RAM which is why everything was very responsive. In fact, when memory pressure isn't an issue, WPA blocks a lot less on the UI thread (xperfview is notorious for ghosting).

    Please keep in mind - this is pre-alpha quality - these bits are a developer preview for you to see what's coming up, not yet for production use. There is much more work for us to do before we can effectively substitute xperfview with WPA in terms of its performance, and the team is very hard at work on achieving that very goal. Things to expect come Beta are a fully responsive UI (all work done on background threads), multi-threaded graph instantiation (if you have multiple cores, WPA will use each core as you instantiate graphs - xperf is always single threaded), asynchronous symbol loading that lets you work while symbols load, and much more :)

    Can you share out that trace so we could try and repro locally what you are seeing and then compare against latest internal pre-Beta bits to see if this is already covered or we need to make additional fixes?

    Regarding the fuzziness of the text you are seeing: can you tell me what graphics adapter, driver, and monitor you are using? Also, what DPI settings do you have, and have you enabled or disabled anti-aliasing?

    We really appreciate that you are giving these preview bits a whirl and are very interested in all of your feedback to help improve WPA to the point that by Beta you'll not want to use xperfview and will opt for WPA in your daily work :-)

    Thanks,  Michael


    • Edited by Michael_Milirud Tuesday, September 27, 2011 9:11 PM Added a request for trace
    Tuesday, September 27, 2011 9:03 PM
  • I have an AMD Phenom II X4, 8 GB DDR3 RAM, an AMD/ATI Radeon HD 5770, and an LG monitor with 1920x1080 pixels. I still use IE8 because of the ugly font rendering with Direct2D (which WPF4 uses, too).

    I made a Heap/VAlloc-Trace with WPRUI for you. Check it please.


    "A programmer is just a tool which converts caffeine into code"

    Tuesday, September 27, 2011 10:44 PM
  • I took a look at your trace using both xperfview and WPA. The working set of WPA with ISR graph instantiated is about the same as xperfview with ISR summary table opened. Recall that WPA instantiates the table as soon as you copy the graph over. So WPA experience is similar to XPerfView experience with a single summary table opened for that graph.

    Now, the exciting part - try opening two ISR summary tables on your trace with XPerfView - you'll need another 2GB for that summary table. With WPA you can copy as many graphs as you want and your working set won't be growing much.

    So today WPA is equivalent to XPerfView with a Summary Table open, and unlike XPerfView it can actually let you work with multiple instances of that table without exploding in working set. For Beta we are looking into making WPA even better than XPerfView in terms of working set requirements, so that it always takes less memory than XPerfView. =)

    I'd say that for traces of this size (~1GB and larger) you'll need to wait for Beta to truly be effective with WPA (unless you have a lot of RAM), but you can use WPA very effectively on smaller traces (300-500MB) with these Developer Preview bits.

    Thanks,  Michael


    Wednesday, September 28, 2011 11:29 PM
  • OK, please also reduce the CPU usage. Every click causes "not responding" messages. I will test it again with the Beta.
    "A programmer is just a tool which converts caffeine into code"

    Thursday, September 29, 2011 12:16 PM
  • You'll definitely see more optimized bits come Beta. The demos we've done at BUILD were all live on real traces and didn't suffer from lags. Please try the tool on smaller traces (under 500MB) until Beta comes out. We are looking for usability feedback. What features do you like functionally and what is missing? Assume perf will be as good as xperfview and try to think of how we could improve your daily analysis workflow in WPA.

    Thanks,  Michael

    Tuesday, October 4, 2011 10:01 PM
  • "Please try the tool on smaller traces (under 500MB) until Beta comes out."

    Here is the problem. WPR includes too much data ;)

    "We are looking for usability feedback. What features do you like functionally and what is missing? Assume perf will be as good as xperfview and try to think of how we could improve your daily analysis workflow in WPA."

    Like:

    • PDB generation for .NET NGen images
    • taskbar progress icon for ETL loading
    • no hanging GUI when loading PDBs
    • Sessions

    Hate:

    • blurry graphs
    • slow (drawing) speed caused by the ugly WPF crap. Using WinRT would be better, but no, this is reserved for the ugly Metro App crap :'( I'm on a desktop and not on a phone or tablet *angry*
    • no control over WPR(UI) and the Assessment Tool
    • still no way to create own profiles. You added a sample, but how do I create my own?
    • waste of vertical space in the "Analysis view". I already had difficulties viewing the summary graphs in xperfview; now I have half of the space and it is much more difficult to look at traces.
    • no real new feature/progress since v4.8 that helps me solve my issues :(

    Missing:

    • VS templates to create own ETW providers/ETW apps
    • a way to create own "resolver" apps that look at traces and check for issues I can define. This is also interesting for users like Rainer and Albert. If customers of their tools have perf issues, it would be nice to create traces and define a way for the traces to be analysed automatically for issues based on my configuration.
    • help with tracing hibernation/sleep resume issues. This tracing is the most complicated thing because WPA/xperfview shows nothing useful and the summary.xml only shows you the data but not why things are slow.
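
To illustrate the "resolver" idea in the second bullet, here is a minimal sketch. It assumes the trace has already been exported to simple per-process metrics; the export step, the metric names (`cpu_ms`, `disk_reads`), and the rule format are all hypothetical and not part of WPT or the ADK.

```python
def find_issues(metrics, rules):
    """Check per-process metrics against user-defined threshold rules.

    metrics: {process_name: {metric_name: value}}  (hypothetical export format)
    rules:   [{"metric": name, "max": threshold}]  (hypothetical config format)
    Returns a list of human-readable issue strings.
    """
    issues = []
    for rule in rules:
        for process, values in metrics.items():
            value = values.get(rule["metric"])
            # Skip processes that do not report this metric at all.
            if value is not None and value > rule["max"]:
                issues.append(
                    f'{process}: {rule["metric"]}={value} exceeds {rule["max"]}'
                )
    return issues


# Example configuration and data, both made up for illustration.
example_metrics = {
    "app.exe": {"cpu_ms": 900, "disk_reads": 10},
    "svc.exe": {"cpu_ms": 50},
}
example_rules = [{"metric": "cpu_ms", "max": 500}]
print(find_issues(example_metrics, example_rules))
```

The point is only that the rules live in configuration, so the same check could run unattended on every trace a customer sends in.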

     


    "A programmer is just a tool which converts caffeine into code"

    Wednesday, October 5, 2011 4:58 PM
    1. Do you mind sharing out screenshots of your blurry graphs?
    2. Also, if you are seeing significant rendering performance issues, it'd be awesome if you could capture a small video of it. There are a number of tools available that let you do it, including Expression Blend Screen Recorder, Camtasia Screen Recorder, etc.
    3. What type of control are you interested in with WPR?
    4. What do you mean by the Assessment Tool? Are you referring to the Windows Assessment Console (WAC)?
    5. A WPR profile is just an XML file with a WPRP file extension. Make a copy of the sample one, then update it as you need. The schema is publicly documented at http://msdn.microsoft.com/en-us/library/hh448221.aspx
    6. Waste of vertical space is interesting feedback, since we've actually saved vertical space by removing the duplicate timelines present in xperfview. Can you provide an xperfview-to-WPA comparison screenshot to show what you mean?
    7. As far as new features, managed and JScript symbol decoding are new. So are the ability to graph anything you want and the ability to synchronize selection between graphical and tabular views across multiple graphs, as well as the diagnostic console and integration with ADK assessments. There is of course lots more - these are the top things I think one doing perf analysis would get excited about :) Do you have any specific features in mind you were hoping to see in this release?
    8. As far as help with authoring ETW providers, that's out of scope for WPT. You'll need to refer to ETW documentation and sample code for that. This might be a good place to start http://msdn.microsoft.com/en-us/magazine/cc163437.aspx.
    9. Creating your own ADK assessments is supported. This may be a good place to start learning about ADK: http://www.microsoft.com/downloads/details.aspx?FamilyID=0E670ADC-FB1D-4D8B-B311-65C75F9ED63E&displaylang=e&displaylang=en
    10. ADK includes an entire assessment called FAS that helps you assess every on/off transition including hibernate/sleep. It also raises issues for you that you can then view in WPA right inside the traces that assessment captures. Please try that and let me know how well it worked for you.
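
The copy-and-edit approach from point 5 could be scripted roughly as follows. This is a sketch only: the element and attribute names in the inline sample (`Profile`, `Description`, `BufferSize`, `Value`) are assumptions about the profile layout and should be checked against the documented schema linked in point 5, and `customize_profile` is a hypothetical helper, not part of WPR.

```python
import xml.etree.ElementTree as ET

# Stand-in for the shipped sample profile. The structure is assumed,
# not copied from a real .wprp file.
SAMPLE_WPRP = (
    '<WindowsPerformanceRecorder Version="1.0">'
    "<Profiles>"
    '<SystemCollector Id="SystemCollector" Name="NT Kernel Logger">'
    '<BufferSize Value="1024"/>'
    '<Buffers Value="100"/>'
    "</SystemCollector>"
    '<Profile Id="Sample.Verbose.File" Name="Sample" DetailLevel="Verbose"'
    ' LoggingMode="File" Description="Sample profile"/>'
    "</Profiles>"
    "</WindowsPerformanceRecorder>"
)


def customize_profile(wprp_text, description, buffer_size_kb):
    """Return a modified copy of a profile given as XML text.

    Sets the Description of every Profile element and the Value of every
    BufferSize element; everything else is left untouched.
    """
    root = ET.fromstring(wprp_text)
    for profile in root.iter("Profile"):
        profile.set("Description", description)
    for buffer_size in root.iter("BufferSize"):
        buffer_size.set("Value", str(buffer_size_kb))
    return ET.tostring(root, encoding="unicode")


customized = customize_profile(SAMPLE_WPRP, "My ISR profile", 512)
print(customized)
```

In practice the input would be read from a copy of the sample file and the result written back with a `.wprp` extension.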

    Thanks,  Michael

    Thursday, October 27, 2011 11:03 PM
  • 1. The blurriness only exists under Windows 8; it is OK in Win7. (It only occurs if you resize the window and the graphs have to be redrawn. This is a WPF issue.)

    3. Capturing user mode providers is not really possible, and WPR doesn't allow me to control which providers I want to capture. Why do I need so much data if I only want to capture 2 things? This results in larger ETL files and perf issues. Also, I can't control the file size and buffer size.

    4. Yes, I refer to the Windows Assessment Console.

    5. This is too user-unfriendly. Include a GUI builder to generate profiles!

    6. Look at Rainer's picture and see the disaster:

    http://social.msdn.microsoft.com/Forums/de-DE/wptkv5/thread/d869d1da-dc0f-496d-bceb-79d23ad87473#3951c861-f453-4062-a9f8-4c6f5be692fb

    Scrollbars everywhere. You have to scroll all the time to see some data. What a terrible UX. And next to "Analysis" there is a lot of wasted space which is missing in the graph/summary data view.

    8. This is something you also should provide. The best viewer is useless if it is such a hassle to use ETW tracing in programs. Have you never wondered why nobody uses it?

    9. This small PDF explains nothing.

    10. Those assessments don't really work (they stop with error messages even if they finished correctly) and require so many reboots that they are unacceptable to use.

    Also, the perf issues seem to be caused by loading the complete ETL and generating the summary data for all of it. In xperfview I only select small intervals, and only in 1% of cases the complete interval. I think that's why WPA uses soooo much more memory compared to xperfview.
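
The difference I mean can be sketched like this. It is only an illustration of my guess, not how xperfview or WPA actually work internally; the event format (timestamp in seconds, module name) and the `summarize` helper are made up.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def summarize(events, start=None, end=None):
    """Count samples per module, optionally restricted to [start, end].

    events: list of (timestamp_s, module) tuples sorted by timestamp.
    With no interval given, every event is aggregated; with an interval,
    only the matching slice is touched.
    """
    times = [t for t, _ in events]
    lo = 0 if start is None else bisect_left(times, start)
    hi = len(events) if end is None else bisect_right(times, end)
    return Counter(module for _, module in events[lo:hi])


# Made-up sample data: four samples across a 10 second trace.
example_events = [(0.5, "a.sys"), (1.0, "b.sys"), (1.5, "a.sys"), (10.0, "c.sys")]

full_summary = summarize(example_events)              # aggregates all rows
small_summary = summarize(example_events, 1.0, 2.0)   # aggregates two rows
print(sum(full_summary.values()), sum(small_summary.values()))
```

A summary built for a short selection only has to hold the rows inside that selection, which is the workflow I use most of the time in xperfview.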


    "A programmer is just a tool which converts caffeine into code"

    Friday, November 25, 2011 1:51 PM