Any stack walking limitation in xperf?

  • Question

  • Hi,

    I'm using xperf to profile a triangle-ray intersection prototype. Everything works, except that with the VC9 /O2 compiler option I always get a one-level call stack for the functions I wrote in the prototype. Compiler-generated and CRT functions don't have this issue, and I get a full call stack for them.

    Here is some output copied from xperfview:

    With the VC9 /O1 option:

    [Root]
       ntdll.dll!_RtlUserThreadStart
       ntdll.dll!__RtlUserThreadStart
       kernel32.dll!BaseThreadInitThunk
       intersect.exe!__tmainCRTStartup
       intersect.exe!main
       intersect.exe!ComputeIntersections
       intersect.exe!IntersectBVH1_
    
    
    [Root]
       ntdll.dll!_RtlUserThreadStart
       ntdll.dll!__RtlUserThreadStart
       kernel32.dll!BaseThreadInitThunk
       intersect.exe!__tmainCRTStartup
       intersect.exe!main
       intersect.exe!fscanf
       intersect.exe!vfscanf
       intersect.exe!_input_l
       intersect.exe!_fassign_l
       intersect.exe!_atoflt_l
       intersect.exe!__strgtold12_l
    
    
    

    With the VC9 /O2 option:

    [Root]
       intersect.exe!IntersectBVH1_ (the weird one) 
    
    [Root]
       ntdll.dll!_RtlUserThreadStart
       ntdll.dll!__RtlUserThreadStart
       kernel32.dll!BaseThreadInitThunk
       intersect.exe!__tmainCRTStartup
       intersect.exe!main
       intersect.exe!fscanf
       intersect.exe!vfscanf
       intersect.exe!_input_l
       intersect.exe!_fassign_l
       intersect.exe!_atoflt_l
       intersect.exe!__strgtold12_l

    Is this a known limitation of stack walking? How can I get a full call stack from a release build (/O2)?

     

    Thank you very much!

    -MLJack

    Monday, April 5, 2010 1:01 PM

Answers

  • Hi MLJack,

    Compile your code with /O2 /Oy-.

    If you are using the Visual Studio 2008 IDE, go to the project's C/C++ > Optimization properties and set Omit Frame Pointers to No.

    This disables frame pointer omission (FPO, which frees the EBP register for use as another general-purpose register), so every function sets up a standard stack frame that the stack-walking code can follow.

    You are welcome.

    Osiris

    • Proposed as answer by osirispedroso Thursday, April 8, 2010 5:27 PM
    • Marked as answer by mljack Tuesday, April 13, 2010 5:35 PM
    Thursday, April 8, 2010 5:26 PM
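
To illustrate why the frame pointer matters, here is a minimal sketch of an EBP-chain walk. It is a simplified model, not xperf's actual stack walker: the `Frame` struct, the `walk_stack` function, and the three sample frames are hypothetical names made up for this illustration, and function names stand in for real return addresses. The assumption is the usual x86 layout, where a function that keeps its frame pointer stores the caller's EBP at the base of its own frame, next to the return address.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of an x86 stack frame when frame pointers are kept
   (/Oy-): the frame base holds the caller's saved EBP, and next to it
   sits the return address. Names stand in for addresses here. */
typedef struct Frame {
    const struct Frame *saved_ebp;  /* caller's frame, NULL at the thread root */
    const char *return_into;        /* function the return address points into */
} Frame;

/* Record the sampled function, then follow the saved-EBP chain.
   Writes at most max_frames names into out and returns the count. */
size_t walk_stack(const char *sampled_function, const Frame *ebp,
                  const char **out, size_t max_frames)
{
    size_t n = 0;
    assert(out != NULL || max_frames == 0);
    if (max_frames == 0)
        return 0;
    out[n++] = sampled_function;        /* known from the sampled EIP */
    while (ebp != NULL && n < max_frames) {
        out[n++] = ebp->return_into;    /* one caller per frame */
        ebp = ebp->saved_ebp;           /* step to the caller's frame */
    }
    return n;
}

/* Hypothetical frames modelled on the call chain in the question:
   main -> ComputeIntersections -> IntersectBVH1_. */
static const Frame main_frame    = { NULL,           "__tmainCRTStartup" };
static const Frame compute_frame = { &main_frame,    "main" };
static const Frame bvh_frame     = { &compute_frame, "ComputeIntersections" };
```

In this model, `walk_stack("IntersectBVH1_", &bvh_frame, out, 8)` fills `out` with four entries, from `IntersectBVH1_` back to `__tmainCRTStartup`. Passing `NULL` for `ebp` stands in for the FPO case, where EBP holds ordinary data rather than a frame address: the walk stops after the sampled function, which matches the one-level stack seen with /O2.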

All replies

  • Note: The binaries to be used for the data collection must be compiled with Frame Pointer Omission optimization (FPO) disabled. Disabling FPO allows Windows Performance Analyzer to collect complete sets of call stack data. Windows binaries from Vista onward are compiled with FPO disabled. The Windows Client Performance Team recommends that all binaries, including release images, be compiled with FPO disabled. By compiling with FPO disabled, developers will have complete access to call stacks and events generated by a process.
    Friday, August 3, 2012 2:10 AM