Trouble with VectoredExceptionHandler and ICorDebugProcess->Stop method

  • Wednesday, June 17, 2009 8:44 AM, Igor Panaev
     
    Hi, I'm having trouble with ICorDebugProcess. My application installs a hook to catch SEH exceptions by calling AddVectoredExceptionHandler in unmanaged code. I also attach to my own process with ICorDebug. When an SEH exception occurs, I try to save some information about the state of the .NET threads. To do that I call ICorDebugProcess->Stop, and that call freezes my process. The problem happens only if my application is a console application. Are there any constraints on using ICorDebug in console applications? Where can I find additional information about problems like this? Thanks, Igor

All Replies

  • Wednesday, June 17, 2009 2:31 PM, Karel Zikmund (MSFT, Moderator)
     
    I don't believe there is any restriction on using ICorDebug in console applications. The CLR in general doesn't care what type of application you are running (as long as it is a pure managed or mixed-mode application).
    The freeze is most probably caused by a deadlock (likely in the CLR). Did you try attaching a debugger to your application to find out what it is doing when it is 'frozen'?

    Are you by any chance running managed code in your exception handler? That is apparently discouraged (http://msdn.microsoft.com/en-us/library/ms679274(VS.85).aspx - see community content).

    -Karel
  • Wednesday, June 17, 2009 5:42 PM, Rick Byers (MSFT, Owner)
     

    Hi Igor,
    Karel is right that there shouldn't be any restrictions with console processes (in fact, most of our ICorDebug testing takes place on console processes). I suspect something else is making the difference (maybe single- vs. multi-threaded?). When you say your process freezes, you mean the target (debuggee) process, not the one from which you're calling ICorDebugProcess::Stop (the debugger), right? That's the expected behavior of Stop: it "stops" the target process (actually just the managed threads) so that you can safely inspect it. You should be able to call ICorDebugProcess::Continue to get it running again once you're done inspecting it.

    Here are a couple of experiments I'd suggest to help us narrow down the problem:
    - Does your code that uses ICorDebug work properly on a simple managed app (i.e. one that doesn't set up a VEH)?
    - Can you use an existing managed debugger like Visual Studio or MDbg to successfully attach to, inspect and detach from the console program you're having trouble with?

    That should help us decide if the VEH stuff is relevant or not, and how closely to look at your usage of ICorDebug.

    Rick

  • Thursday, June 18, 2009 12:08 PM, Igor Panaev
     
    >>>  Did you try to attach debugger to your application to find out what it does when it is 'frozen'?
    Yes, of course. It's a deadlock, but I cannot understand why it happens.

    This is the stack of the thread where the exception occurred:

      ntdll.dll!_KiFastSystemCallRet@0()  
      ntdll.dll!_ZwWaitForSingleObject@12()  + 0xc bytes 
      kernel32.dll!_WaitForSingleObjectEx@12()  + 0x8b bytes 
      kernel32.dll!_WaitForSingleObject@8()  + 0x12 bytes 
      mscordbi.dll!CordbProcess::StopInternal()  + 0x12fb2 bytes 
      mscordbi.dll!CordbProcess::Stop()  + 0x25 bytes 
      myDll.dll!DebugManagedCallback::SaveInformationInternal(unsigned long threadId=2916, bool specifiedThreadOnly=true, BaseQA::MemoryStream2 & memStream={...})  Line 160 + 0x2c bytes C++
      myDll.dll!myDll::ModuleIntf::FaultFilter(_EXCEPTION_POINTERS * Exception=0x0012f0d8)  Line 743 C++
      myDll.dll!VectoredHandler(_EXCEPTION_POINTERS * ExceptionInfo=0x0012f0d8)  Line 165 + 0x9 bytes C++


    This is the stack of the debugger thread:

      ntdll.dll!_KiFastSystemCallRet@0()  
      ntdll.dll!_ZwWaitForMultipleObjects@20()  + 0xc bytes 
      kernel32.dll!_WaitForMultipleObjectsEx@20()  - 0x48 bytes 
      kernel32.dll!_WaitForMultipleObjects@16()  + 0x18 bytes 
      mscorwks.dll!DebuggerRCThread::MainLoop()  + 0x89 bytes 
      mscorwks.dll!DebuggerRCThread::ThreadProc()  + 0xc0 bytes 
      mscorwks.dll!DebuggerRCThread::ThreadProcStatic()  + 0x46 bytes 
      kernel32.dll!_BaseThreadStart@8()  + 0x37 bytes 


    >>> Don't you try by any chance to run managed code in your exception handler?
    No. I use debug interfaces only.   
  • Thursday, June 18, 2009 12:22 PM, Igor Panaev
     
    >>>  When you say your processes freezes, you mean the target (debugee) process, not the one from which you're calling ICorDebugProcess::Stop in (debugger), right? 
    No. I've written a native DLL. Inside this DLL I call AddVectoredExceptionHandler, create a CLR debugger, and attach to my own process.

    >>> should be able to call ICorDebugProcess::Continue to get it running again once you're done inspecting it.
    :) Yes. I call ICorDebugProcess::Continue in every method of ICorDebugManagedCallback.

  • Thursday, June 18, 2009 8:30 PM, Rick Byers (MSFT, Owner)
     
    Ah, you're trying to use ICorDebug to debug your own process! In CLR 2.0 and above you can't reliably do that (we tried to enable a restricted form of it in 1.x, but it was a bug farm and still didn't let you control execution). Debugging is always invasive to the target process, so you may interfere with the code you need to run to do the debugging. To really control a process, you need to control it from another process. For example, the process may stop while holding a critical OS resource like the OS loader lock, and ICorDebug might then need that resource itself. Or, since you're hooking the VEH, you may throw an exception from a thread that's currently implementing ICorDebug and get into trouble with re-entrancy (e.g. StopInternal waits on another thread to perform some work; if you've hijacked that thread via the VEH, then it may be deadlocked there).

    Also, ICorDebug relies on being able to run code in the target process and synchronize with it. In your case I suspect that an exception was raised on a managed thread, triggering your VEH, and you're then blocking that thread by waiting for Stop to complete. But Stop tries to synchronize all the managed threads and get them stopped at a safe place (e.g. so you can do a function evaluation or safely inspect CLR data structures). That synchronization is going to block until you return from your VEH (since the CLR doesn't know the thread has been hijacked away into native code).
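
    The blocking described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the thread states, is_synchronized and toy_stop are hypothetical names invented for illustration and are not CLR or ICorDebug APIs, and the round limit stands in for a wait that would otherwise never end.

```cpp
#include <cassert>
#include <vector>

// Toy model only: these names are illustrative and are not CLR or ICorDebug APIs.
enum class ToyThreadState {
    RunningManaged,    // free to run; reaches a safe point on its next step
    AtSafePoint,       // already synchronized
    InNativeCode,      // known to be in native code; treated as synchronized
    BlockedInHandler   // was in managed code, now waiting inside a VEH for Stop
};

// True when the model treats the thread as synchronized.
bool is_synchronized(ToyThreadState state) {
    return state == ToyThreadState::AtSafePoint ||
           state == ToyThreadState::InNativeCode;
}

// Models the wait inside Stop. Each round lets every runnable managed thread
// advance to a safe point, then checks whether all threads are synchronized.
// The round limit is an artifact of the model: it stands in for "waits forever".
bool toy_stop(std::vector<ToyThreadState>& threads, int max_rounds) {
    assert(max_rounds > 0);
    for (int round = 0; round < max_rounds; ++round) {
        bool all_synchronized = true;
        for (ToyThreadState& state : threads) {
            if (state == ToyThreadState::RunningManaged) {
                state = ToyThreadState::AtSafePoint;
            }
            if (!is_synchronized(state)) {
                all_synchronized = false;  // a blocked thread never advances
            }
        }
        if (all_synchronized) {
            return true;
        }
    }
    return false;
}
```

    In this model, a thread list containing only RunningManaged and InNativeCode entries synchronizes in the first round, while any list containing a BlockedInHandler entry never does, however many rounds are allowed.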

    In CLR 4.0 (since we now use the OS debugging APIs) it'll fail outright trying to attach to your own process.  I'm surprised we don't have a check to prevent that in CLR 2.0 (assuming that's what you're using).

    Anyway, if you want to build a tool that monitors the process, stops it when something interesting happens and inspects its state, then that tool should live in a separate process. I'm still not sure you want to use the VEH: if you just use ICorDebug, you'll get notifications whenever a managed exception is thrown (via the Exception2 callback). If you want to stop on native events as well, then you may want to implement a mixed-mode debugger (so you get native DEBUG_EVENTs as well).

    Alternately, if you really want to run your diagnostics/inspection code in process then the profiling API (ICorProfile) may be a better tool for you.  It's a lower-level higher-performance form of diagnostics API.  It doesn't have the same ability as the debugger API to alter program execution (stop at breakpoints etc.) but in exchange you can instrument IL code before it's compiled and get your code invoked directly at all sorts of interesting locations.

    I hope this helps.  If neither of these options seem to fit your scenario, then perhaps if you explain what you're trying to achieve a little bit I may be able to help more.
       Rick
  • Friday, June 19, 2009 8:54 AM, Igor Panaev
     

    My task is to capture all managed stacks when either type of exception (SEH or CLR) occurs. Currently I catch exceptions in the VEH handler and check ExceptionCode; if ExceptionCode equals 0xe0434f4d (CLR exception), I skip processing the exception. Immediately after that I get the ICorDebugManagedCallback::Exception notification and save all the information I need. If ExceptionCode differs from the CLR exception code, I try to stop the CLR debugger and collect all the information at that moment.

    The difference is that if an SEH exception (divide by zero, for example) is thrown in native code, this scenario works every time, but if the exception is raised in managed code, it works only if my application isn't a console application.

    Is there any other way to get managed stacks without using the managed debugger?

  • Friday, June 19, 2009 5:32 PM, Rick Byers (MSFT, Owner)
     Answer
    Yeah, you really can't rely on the exception code as a reliable differentiator between managed and unmanaged exceptions. E.g., a NullReferenceException will show up first as an SEH AV exception in JITted code. See this blog entry for some details: http://blogs.msdn.com/clrteam/archive/2009/05/14/how-clr-maps-seh-exceptions-to-managed-exception-types.aspx. Are you really sure your application works well for a variety of complex non-console apps? I'd expect that calling Stop from within a VEH on a thread that was running managed code would result in deadlocks at other times as well (we're trying to synchronize the process and expect that thread to be able to run, so we'll just keep waiting for it). It's a little non-deterministic, though, because if the thread is already at a "GC safe" location or out in native code, then we'll consider it to have already been synchronized (and may just hard-suspend the thread).
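
    To make the ambiguity concrete, here is a minimal sketch of why a filter that looks only at the exception code misclassifies some exceptions. The function names are hypothetical; the three native status values are the standard Windows ones, 0xE0434F4D is the CLR code quoted earlier in the thread, and the managed type names reflect my understanding of the mapping the linked blog entry describes, so treat them as an assumption to check against that entry.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Native SEH status codes as defined in the Windows SDK headers.
constexpr std::uint32_t kStatusAccessViolation     = 0xC0000005u;
constexpr std::uint32_t kStatusIntegerDivideByZero = 0xC0000094u;
constexpr std::uint32_t kStatusStackOverflow       = 0xC00000FDu;
// CLR exception code quoted earlier in this thread (CLR v2).
constexpr std::uint32_t kClrV2ExceptionCode        = 0xE0434F4Du;

// Hypothetical helper: which managed exception type a code may surface as when
// it is raised in managed code. An empty string means "no mapping in this sketch".
std::string possible_managed_type(std::uint32_t code) {
    switch (code) {
        case kStatusAccessViolation:
            return "NullReferenceException or AccessViolationException";
        case kStatusIntegerDivideByZero:
            return "DivideByZeroException";
        case kStatusStackOverflow:
            return "StackOverflowException";
        case kClrV2ExceptionCode:
            return "any exception thrown by managed code";
        default:
            return "";
    }
}

// The code-only filter described earlier in the thread: treat an exception as
// managed only when it carries the CLR exception code.
bool code_only_filter_says_managed(std::uint32_t code) {
    return code == kClrV2ExceptionCode;
}
```

    With these definitions, an access violation is rejected by the code-only filter even though it may surface as a NullReferenceException, which is the case the reply above warns about.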

    One other way to get managed callstacks is the System.Diagnostics.StackTrace API, but again that won't work well while you have a thread stopped at arbitrary native locations (in your VEH). In fact, in the limit it's impossible to reliably get a managed stack trace if a thread is stopped at arbitrary native locations (for example, if it's currently inside mscorwks.dll while mutating some critical CLR data structure needed by the stack trace).

    The ICorProfile API is the most powerful way to get managed stack traces from within a single process.  It's designed to let a sampling profiler stop the process at near arbitrary points and get managed stack traces.  I suspect (but don't know for sure) that you could use this from within a VEH, but I'm sure it would take some work. 

    The simplest way to just record all managed exceptions is to use ICorDebug from a second process (it could be one you launch that turns around and attaches back to its parent). Mike has a sample here: http://blogs.msdn.com/jmstall/archive/2005/07/28/print_exceptions.aspx. Most native exceptions will turn into managed exceptions when they hit managed code, so this might be sufficient. If you also want native stack-trace info for SEH exceptions occurring in unmanaged code, then just using DbgHelp to get a native-only stack trace from a VEH might be good enough.

    If you really want to stitch together managed and native stacks at arbitrary (native SEH) locations, then I think your only reliable options are to write an out-of-process mixed-mode debugger (like VS) or to use the ICorProfile APIs in-process (but the latter may have some limitations I'm not familiar with). Mike describes this in some detail here: http://blogs.msdn.com/jmstall/archive/2005/08/10/inprocess-interop-callstacks.aspx. But I think he's not thinking about the ICorProfile possibility.

    I hope this helps,
       Rick
  • Monday, June 22, 2009 5:16 PM, David Broman (MSFT, Owner)
     Answer
    Hi, Igor.  Just FYI, if you want to investigate using a profiler to do the stack walk in-process, take a look at this to get an idea of what's involved:
    http://msdn.microsoft.com/en-us/library/bb264782.aspx

    The profiling API will walk the managed frames for you, but you're responsible for walking the native frames yourself.  The above article shows how to stitch everything together.
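
    As a rough picture of what "stitching" means, the sketch below merges a list of managed frames and a list of native frames into one stack ordered by stack pointer. It is a simplification under stated assumptions: ToyFrame and stitch_frames are hypothetical names, ordering by stack pointer assumes a downward-growing stack (as on x86), and the linked article may organize the walk differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical frame record: one entry per frame, from either walker.
struct ToyFrame {
    std::uintptr_t sp;   // stack pointer value at the frame
    std::string name;    // display name of the function
    bool managed;        // true if reported by the managed walker
};

// Merges managed and native frames into a single list, innermost frame first.
// Assumes a downward-growing stack, so the innermost frame has the lowest sp.
std::vector<ToyFrame> stitch_frames(const std::vector<ToyFrame>& managed_frames,
                                    const std::vector<ToyFrame>& native_frames) {
    std::vector<ToyFrame> all;
    all.reserve(managed_frames.size() + native_frames.size());
    all.insert(all.end(), managed_frames.begin(), managed_frames.end());
    all.insert(all.end(), native_frames.begin(), native_frames.end());
    std::stable_sort(all.begin(), all.end(),
                     [](const ToyFrame& a, const ToyFrame& b) {
                         return a.sp < b.sp;
                     });
    return all;
}
```

    The inputs can arrive in any order; only the stack pointer values decide where each frame lands in the combined stack.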

    There are indeed restrictions on when the stack walking API (ICorProfilerInfo2::DoStackSnapshot) will succeed. If it detects that the stack is unwalkable or the runtime is not in a safe spot to run the stack walking code, it will return an error. Relevant to you, there are windows of time during exception handling where the stack is unwalkable. I'm not sure whether VEHs fall within those windows or not. But if you do opt for the profiling API, you get to set up callbacks at interesting times, like "ICorProfilerCallback2::ExceptionThrown", where a stack walk is indeed safe.

    Thanks,
    Dave
  • Tuesday, June 23, 2009 8:14 AM, Igor Panaev
     
    I think I'll try to use the ICorProfile API.

    Thanks very much,
      Igor.
  • Tuesday, June 23, 2009 8:35 AM, Igor Panaev
     
    One more question. Is there a way to determine in the VEH handler that an exception was raised in CLR code?

    Thanks,
       Igor

  • Tuesday, June 23, 2009 3:13 PM, Rick Byers (MSFT, Owner)
     
    Probably not a completely reliable one: the VEH runs first, before any CLR filters have run, so the CLR hasn't even had a chance yet to decide if the exception belongs to it. However, you can approximate the logic the CLR will use with some heuristics that will probably be pretty good in practice. E.g., if the exception was raised from managed code (e.g. ICorProfilerInfo::GetFunctionFromIP succeeds) or it has a CLR exception code (CLR v2 and CLR v4 have different codes, by the way), then that should be pretty close. It might still miss some obscure cases like divide-by-zero being raised from a lightweight codegen method (not really a "function" as far as ICorProfiler is concerned).
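
    The heuristic could be sketched roughly as follows. This is a minimal illustration, not the CLR's actual logic: probably_managed is a hypothetical name, the ip_in_managed_code flag stands for the result of a lookup such as ICorProfilerInfo::GetFunctionFromIP succeeding for the faulting address, and the CLR v4 code value is not given in this thread, so it is an assumption that should be verified.

```cpp
#include <cassert>
#include <cstdint>

// CLR v2 exception code, as quoted earlier in this thread.
constexpr std::uint32_t kClrV2ExceptionCode = 0xE0434F4Du;
// Assumed CLR v4 exception code; not stated in the thread, verify before use.
constexpr std::uint32_t kClrV4ExceptionCode = 0xE0434352u;

// Hypothetical heuristic: guess whether the CLR will claim this exception.
// ip_in_managed_code should be true when the faulting instruction pointer
// resolved to a managed function (the caller is responsible for that lookup).
bool probably_managed(std::uint32_t exception_code, bool ip_in_managed_code) {
    const bool has_clr_code = exception_code == kClrV2ExceptionCode ||
                              exception_code == kClrV4ExceptionCode;
    return ip_in_managed_code || has_clr_code;
}
```

    As noted above, this can still guess wrong, for example for a fault in a lightweight codegen method where the instruction pointer lookup fails.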

    The good news is that you will soon find out if you guessed wrong (so you may just want to capture some information but delay processing it until you know for sure). The ICorProfiler exception callbacks Dave mentions will be raised only for managed exceptions.
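
    One way to "capture now, process later" is to have the handler copy a few fields into a preallocated buffer and return immediately, without locks, allocation or calls into the runtime. The sketch below is an assumption-laden illustration: PendingSlot, record_pending, count_ready, latest_code_for_thread and RecordingHandler are hypothetical names, the buffer size is arbitrary, and the Windows-only part is compiled only when _WIN32 is defined.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#define NOMINMAX
#include <windows.h>
#endif

// Hypothetical record of what the handler saw; processed later, elsewhere.
struct PendingSlot {
    std::uint32_t code = 0;
    std::uintptr_t address = 0;
    std::uint32_t thread_id = 0;
    std::atomic<bool> ready{false};  // set last, so readers never see half a record
};

constexpr std::size_t kMaxPending = 64;  // arbitrary size for this sketch
std::array<PendingSlot, kMaxPending> g_slots;
std::atomic<std::size_t> g_next_slot{0};

// Called from the handler: claims a slot, fills it, publishes it.
// Returns false when the buffer is full and the record was dropped.
bool record_pending(std::uint32_t code, std::uintptr_t address,
                    std::uint32_t thread_id) {
    const std::size_t slot = g_next_slot.fetch_add(1, std::memory_order_relaxed);
    if (slot >= kMaxPending) {
        return false;
    }
    g_slots[slot].code = code;
    g_slots[slot].address = address;
    g_slots[slot].thread_id = thread_id;
    g_slots[slot].ready.store(true, std::memory_order_release);
    return true;
}

// Number of fully published records.
std::size_t count_ready() {
    std::size_t n = 0;
    for (const PendingSlot& s : g_slots) {
        if (s.ready.load(std::memory_order_acquire)) {
            ++n;
        }
    }
    return n;
}

// Called later (for example when a managed-exception notification arrives for
// a thread) to look up the last code recorded for that thread; 0 if none.
std::uint32_t latest_code_for_thread(std::uint32_t thread_id) {
    std::uint32_t code = 0;
    for (const PendingSlot& s : g_slots) {
        if (s.ready.load(std::memory_order_acquire) && s.thread_id == thread_id) {
            code = s.code;
        }
    }
    return code;
}

#ifdef _WIN32
// Vectored handler that only records and then lets normal handling continue.
// Install with: AddVectoredExceptionHandler(1, RecordingHandler);
LONG CALLBACK RecordingHandler(PEXCEPTION_POINTERS info) {
    const EXCEPTION_RECORD* record = info->ExceptionRecord;
    record_pending(record->ExceptionCode,
                   reinterpret_cast<std::uintptr_t>(record->ExceptionAddress),
                   GetCurrentThreadId());
    return EXCEPTION_CONTINUE_SEARCH;
}
#endif
```

    The handler never waits on anything, so it cannot hold up a thread the runtime is trying to synchronize; deciding whether a record was a managed exception is left to whatever runs later.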

    Rick