What would cause an access violation in mscorwks.dll Object::GetTrueMethodTable?

  • Question

  • I have a .NET application that infrequently throws an access violation (0xC0000005) from mscorwks.dll. I have a mini-dump showing the details, and the method throwing the exception is Object::GetTrueMethodTable(). What types of situations would cause this behavior? I would like to identify the problem and fix it, or at least narrow down the causes to speed up my troubleshooting.

    I am running .NET 3.5 SP1 on WES (Windows Embedded Standard) 2009. The majority of my application is managed C# code, with some underlying C++ components. The exceptions pass directly through my try-catch blocks and are caught by the default Windows handler (Dr. Watson?). Two events are posted to the event log. The first was logged at 8:52:04 PM on 10/12/2011 with a source of ".NET Runtime" and event ID 1023. It cites: ".NET Runtime version 2.0.50727.3053 - Fatal Execution Engine Error (7A097706) (80131506)".

    The second event was logged directly afterward with a source of ".NET Runtime 2.0 Error Reporting" and event ID 1000. It cites: "Faulting application [MyApp].exe, version 3.0.0.15, stamp 4e9628c0, faulting module mscorwks.dll, version 2.0.50727.3053, stamp 4889dc18, debug? 0, fault address 0x0000f290."

    I used ntbackup.exe to capture a copy of the resulting mini-dump and ran it in Visual Studio to determine that the faulting method is Object::GetTrueMethodTable. The stack trace from the faulting thread is as follows:

    > mscorwks.dll!Object::GetTrueMethodTable() + 0x6 bytes
      mscorwks.dll!VirtualCallStubManager::ResolveWorkerStatic() + 0x52 bytes
      mscorwks.dll!_ResolveWorkerAsmStub@0() + 0x33 bytes
      mscorwks.dll!ObjectNative::WaitTimeout() + 0x196 bytes
      mscorwks.dll!MetaSig::HasRetBuffArg() + 0x5 bytes
      mscorwks.dll!MethodDesc::CallDescr() + 0x15a bytes
      mscorwks.dll!MethodDesc::CallTargetWorker() + 0x1f bytes
      mscorwks.dll!MethodDescCallSite::CallWithValueTypes() + 0x1a bytes
      mscorwks.dll!ThreadNative::KickOffThread_Worker() + 0x11a bytes
      mscorwks.dll!Thread::DoADCallBack() - 0x1411c3 bytes
      mscorwks.dll!Thread::ShouldChangeAbortToUnload() - 0x14033b bytes
      mscorwks.dll!Thread::ShouldChangeAbortToUnload() - 0x140415 bytes
      mscorwks.dll!Thread::ShouldChangeAbortToUnload() - 0x140289 bytes
      mscorwks.dll!ManagedThreadBase::KickOff() + 0x13 bytes
      mscorwks.dll!ThreadNative::KickOffThread() + 0xd6 bytes
      mscorwks.dll!Thread::intermediateThreadProc() + 0x46 bytes
      kernel32.dll!_BaseThreadStart@8() + 0x37 bytes

    The disassembly from the point of origin is as follows:

    Object::GetTrueMethodTable:
    79E7F28A push ebp
    79E7F28B mov ebp,esp
    79E7F28D push ecx
    79E7F28E mov eax,dword ptr [ecx]
    79E7F290 test dword ptr [eax],1000000h
    79E7F296 jne Object::GetTrueMethodTable+0Eh (79EB4204h)
    79E7F29C leave
    79E7F29D ret
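
    The fault address in the event log (0x0000f290) and the LAST_CONTROL_TRANSFER target reported later in the thread (79e7f290) both point at the `test` instruction, so the first load through ecx succeeded and the second read, through eax, is the one that faulted. A minimal C++ sketch of those two reads follows. The type and function names (`MethodTableModel`, `ObjectModel`, `inspectHeader`) are hypothetical stand-ins and not CLR definitions, and the null check is added only so the sketch can report the bad state instead of crashing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for CLR internals; the real layouts are not shown in this thread.
struct MethodTableModel {
    std::uint32_t flags;  // first dword: read by `test dword ptr [eax],1000000h`
};

struct ObjectModel {
    const MethodTableModel* methodTable;  // first field: loaded by `mov eax,dword ptr [ecx]`
};

constexpr std::uint32_t kTestedBit = 0x01000000u;  // mask taken from the disassembly

enum class HeaderState { Plain, BitSet, NullMethodTable };

// Mirrors the two memory reads, with a null check that the disassembled routine lacks.
HeaderState inspectHeader(const ObjectModel* obj) {
    assert(obj != nullptr);                         // the object pointer itself must be readable
    const MethodTableModel* mt = obj->methodTable;  // mov eax,dword ptr [ecx]
    if (mt == nullptr) {
        return HeaderState::NullMethodTable;        // the real routine would fault on the next read
    }
    return (mt->flags & kTestedBit) != 0            // test dword ptr [eax],1000000h
               ? HeaderState::BitSet                // jne taken
               : HeaderState::Plain;                // falls through to leave/ret
}
```

    Calling `inspectHeader` on an `ObjectModel` whose `methodTable` is null returns `HeaderState::NullMethodTable`, which corresponds to the state in which the disassembled code raises the access violation.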

    To give an idea of the rate of occurrence: the application has been installed on 4 machines that have been running 24 hours a day for about a month, but the problem has occurred only 3 times (twice on one machine within 2 days). The error signature is the same in each case, but I only have a dump from the last occurrence. I have searched these forums and have not found much that relates specifically to this situation.

    Monday, October 17, 2011 8:57 PM

Answers

  • If native code dereferences null, the native code would be on the call stack. The stack you provided above says that the CLR tries to make a virtual call on a managed object, but the managed object has a corrupted header (the pointer to its MethodTable is NULL).

    -Karel

    • Marked as answer by nano-mitton Friday, October 21, 2011 6:35 PM
    Thursday, October 20, 2011 3:45 PM
    Moderator

All replies

  • It could be a hardware error, something corrupting managed heap memory (e.g. an incorrect PInvoke or some native code), or a bug in the CLR.

    What you could do:
       * Collect more dumps and look for similar patterns (call stack, memory address, etc.)
       * Verify GC heap in your dump (sos extension to windbg - command !VerifyHeap)
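
    To illustrate the heap-corruption case, here is a toy C++ model of how native code that is handed a wrong buffer length can wipe the header word of whatever sits next to the buffer. Everything in it (`ToyHeap`, `nativeFill`, `headerLooksValid`, the 16-byte buffer) is an assumption made for illustration. It is not how the CLR lays out its heap, and the check is far simpler than what !VerifyHeap does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model: a 16-byte native buffer followed directly by one object's header word.
constexpr std::size_t kBufferSize = 16;
constexpr std::size_t kHeaderOffset = kBufferSize;

struct ToyHeap {
    unsigned char bytes[kBufferSize + sizeof(std::uintptr_t)];
};

// Writes the header word (a stand-in for the method table pointer) after the buffer.
void storeHeader(ToyHeap& heap, std::uintptr_t methodTable) {
    assert(methodTable != 0);  // toy precondition: a live object has a nonzero header
    std::memcpy(heap.bytes + kHeaderOffset, &methodTable, sizeof methodTable);
}

std::uintptr_t loadHeader(const ToyHeap& heap) {
    std::uintptr_t value = 0;
    std::memcpy(&value, heap.bytes + kHeaderOffset, sizeof value);
    return value;
}

// Stand-in for native code that zero-fills `length` bytes of the buffer it was given.
// A length above kBufferSize models a wrong size passed across the interop boundary.
// The write is clamped to the toy heap so the sketch itself stays well-defined.
void nativeFill(ToyHeap& heap, std::size_t length) {
    if (length > sizeof heap.bytes) {
        length = sizeof heap.bytes;
    }
    std::memset(heap.bytes, 0, length);
}

// Minimal consistency check: a live object should never have a zero header word.
bool headerLooksValid(const ToyHeap& heap) {
    return loadHeader(heap) != 0;
}
```

    In this model, `nativeFill(heap, kBufferSize)` leaves the header intact, while a fill of `kBufferSize + sizeof(std::uintptr_t)` bytes zeroes it. The damage goes unnoticed until something later reads the header, which is why the faulting stack may show no trace of the code that did the overwrite.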

    -Karel

    Thursday, October 20, 2011 1:38 AM
    Moderator
  • Karel, thank you for the reply. I am in the process of gathering more (full) dump files (this time via windbg). I tried to check the integrity of the heap as you described and eventually concluded I could not since my dump is more of a "mini dump" which was auto created by the default Windows debugger. After loading sos.dll into windbg I got the following:

    0:028> .load C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos
    ------------------------------------------------------------
    sos.dll needs a full memory dump for complete functionality.
    You can create one with .dump /ma <filename>
    ------------------------------------------------------------
    0:028> !verifyheap
    -verify will only produce output if there are errors in the heap
    Error requesting details
    Unable to build snapshot of the garbage collector state

    I assume this output does not indicate a heap error, only that the heap cannot be checked from this dump. While getting this sos command to work, I was able to get more helpful information from "!Analyze -v", "~* e !CLRStack", and "~* !DumpStack". First I had to download symbols and run windbg on the production machine that the dump was created on. I'll need some time to digest this output, but I found parts of the output from "!Analyze -v" interesting enough to share first:

    MANAGED_STACK: !dumpstack -EE
    OS Thread Id: 0xb88 (28)
    TEB information is not available so a stack size of 0xFFFF is assumed
    Current frame:
    ChildEBP RetAddr  Caller,Callee
    0910e64c 792d6c74 (MethodDesc 0x790fbde4 +0x44 System.Threading.ThreadHelper.ThreadStart())
    0910e734 792d6c74 (MethodDesc 0x790fbde4 +0x44 System.Threading.ThreadHelper.ThreadStart())
    0910e770 792d6c74 (MethodDesc 0x790fbde4 +0x44 System.Threading.ThreadHelper.ThreadStart())
    0910f138 792d6cf6 (MethodDesc 0x791939dc +0x66 System.Threading.ThreadHelper.ThreadStart_Context(System.Object))
    0910f144 792e019f (MethodDesc 0x7910276c +0x6f System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object))
    0910f158 792d6c74 (MethodDesc 0x790fbde4 +0x44 System.Threading.ThreadHelper.ThreadStart())

    MANAGED_OBJECT_NAME:  System.ExecutionEngineException
    FAULTING_THREAD:  00000b88
    BUGCHECK_STR:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE_INVALID_POINTER_READ_WRONG_SYMBOLS
    PRIMARY_PROBLEM_CLASS:  NULL_CLASS_PTR_DEREFERENCE
    DEFAULT_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE
    LAST_CONTROL_TRANSFER:  from 79e95153 to 79e7f290


    Nano-Mitton
    Thursday, October 20, 2011 3:35 AM
  • PRIMARY_PROBLEM_CLASS:  NULL_CLASS_PTR_DEREFERENCE
    DEFAULT_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE

    How about a native method trying to use a "callback handle" that was given from managed code as "new IntPtr(null)"? Would that cause an access violation in Object::GetTrueMethodTable? That would match the null class pointer dereference tags that I found while extracting info from the dump (see the quote above). It could also explain why the exception thread had nothing other than thread start calls.

    I found a spot in my code that provides a null wrapped by an IntPtr, but recreating the original situation will be difficult since the native code is provided by another company. If the theory matches the observations, I can be creative about my testing and consider options like creating a small test jig in which native code tries to execute a null reference provided by some managed code.
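
    The native half of such a jig could start from a sketch like the one below. The callback signature and the names `Callback`, `invokeCallback`, and `countCall` are hypothetical, since the vendor's real API is not known here. The guard makes the null case observable as a return value. With the guard removed, the call through the null pointer would be expected to fault at that native call site.

```cpp
#include <cassert>

// Hypothetical callback signature; the vendor's real API is unknown.
using Callback = void (*)(void* context);

enum class InvokeResult { Invoked, NullCallback };

// Stand-in for the native side: refuses a null callback instead of calling through it.
InvokeResult invokeCallback(Callback callback, void* context) {
    if (callback == nullptr) {
        return InvokeResult::NullCallback;
    }
    callback(context);
    return InvokeResult::Invoked;
}

// Example callback that counts its invocations through the context pointer.
void countCall(void* context) {
    assert(context != nullptr);  // the jig always passes a counter
    ++*static_cast<int*>(context);
}
```

    Passing `nullptr` as the callback returns `InvokeResult::NullCallback` without touching the counter, and passing `&countCall` with a pointer to an `int` increments it once.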


    Nano-Mitton
    Thursday, October 20, 2011 2:07 PM
  • If native code dereferences null, the native code would be on call stack.

    Karel, your point about the call stack made me reexamine what was going on. In windbg, I ran "!dumpstack" on the exception thread and got a large stack of 230 lines. The topmost 130 lines come after the exception was first thrown and look to be all cleanup and launching drwatson (the default debugger of our WinOS configuration).

    This fuller stack includes the contents of the smaller managed stack I provided earlier, which came from running "!dumpstack -EE". This intrigued me, so I ran "!help dumpstack" in windbg and found that the "-EE" option shows just the managed functions on the current thread. That is the stack that shows almost nothing other than ThreadStart calls. Three of those ThreadStart calls came after the exception was thrown, so I'm not all that interested in those.

    The remaining 200 lines of the larger stack dump (before the exception) include the contents of the very first call stack I provided (via MSVS) and a lot of other low-level calls. One of the first few "low-level" lines includes a method from a HW component I use (specifically one of their files under the Windows system32 folder). I have a strong feeling that this thread was spawned under the hood by a top-level HW method I called on a separate managed thread. That top-level method was still in a wait state, waiting for the call to complete. This is evidence enough for me to contact the HW supplier and see if they can further diagnose this problem. Thank you for your help!


    Nano-Mitton
    Friday, October 21, 2011 6:30 PM