none
Finalizer Thread being blocked - Probably related to COM objects RRS feed

  • Question

  • Hello,

    The application I'm working on seems to be leaking objects, at some point in time. (undeterministic)

    Investigating further, and after having made many memory dumps, and having analyzed them with WinDbg, I have narrowed down the problem to the finalizer being blocked.

    More specifically, analyzing the finalizer thread call stack, I can see it contains a call on "GetToSTA".

    Here is the unmanaged call stack:

       2  Id: 26ac.8d8 Suspend: 1 Teb: 7ffdd000 Unfrozen
    ChildEBP RetAddr  Args to Child             
    0096f250 7c822124 77e6bad8 00000e60 00000000 ntdll!KiFastSystemCallRet
    0096f254 77e6bad8 00000e60 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0096f2c4 77e6ba42 00000e60 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
    0096f2d8 776c54a7 00000e60 ffffffff 0096f3dc kernel32!WaitForSingleObject+0x12
    0096f2f4 77789905 00170dc8 03e7e5e8 00000000 ole32!GetToSTA+0x6f
    0096f314 77787ed7 0096f3dc 0096f4dc 00190320 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb
    0096f3f4 77695349 00190320 0096f4ec 0096f4dc ole32!CRpcChannelBuffer::SendReceive2+0xc1
    0096f460 776c4ea6 00190320 0096f4ec 0096f4dc ole32!CAptRpcChnl::SendReceive+0xab
    0096f4b4 77ce127e 00000001 0096f4ec 0096f4dc ole32!CCtxComChnl::SendReceive+0x91
    0096f4d0 77ce13ca 068155bc 0096f518 0600016e rpcrt4!NdrProxySendReceive+0x43
    0096f8b8 77ce11bd 77676298 77679608 0096f8f0 rpcrt4!NdrClientCall2+0x206
    0096f8d8 77c53a12 0000000c 00000008 0096f970 rpcrt4!ObjectStublessClient+0x8b
    0096f8e8 776c4bdb 068155bc 0096f924 0096fa24 rpcrt4!ObjectStubless+0xf
    0096f970 7771288b 00164728 7a001314 0096fa18 ole32!CObjectContext::InternalContextCallback+0x126
    0096f9c0 79f541d3 00164738 7a001314 0096fa18 ole32!CObjectContext::ContextCallback+0x85
    0096fa0c 79f54127 00164738 00000000 00000000 mscorwks!CtxEntry::EnterContextOle32BugAware+0x2b
    0096fb2c 79f53f99 79f53ede 0096fb98 b760d545 mscorwks!CtxEntry::EnterContext+0x2db
    0096fb60 7a03e362 00164168 b760d595 00000001 mscorwks!RCWCleanupList::ReleaseRCWListInCorrectCtx+0xc4
    0096fbb0 79f53df7 b760d5d1 0014b738 0014b738 mscorwks!RCWCleanupList::CleanupAllWrappers+0x77
    0096fbf4 79f41014 b760d39d 0014b738 00000000 mscorwks!SyncBlockCache::CleanupSyncBlocks+0xec
    0096fdb8 79ee8dd4 0096fe6c 0096feb4 00000000 mscorwks!Thread::DoExtraWorkForFinalizer+0x40
    0096fdc8 79ed8a2c 0096feb4 00000000 00000001 mscorwks!WKS::GCHeap::FinalizerThreadWorker+0xc6
    0096fdd8 79ed89ca 0096feb4 0096fe60 79f55262 mscorwks!Thread::UserResumeThread+0xfb
    0096fe6c 79ed88f1 0096feb4 b760d08d 00000000 mscorwks!Thread::DoADCallBack+0x355
    0096fea8 79ed988c 0096feb4 00000000 00148628 mscorwks!Thread::DoADCallBack+0x541
    0096fed0 79ed9857 79ee8d2c 00000008 79ed96b5 mscorwks!ManagedThreadBase_NoADTransition+0x32
    0096fedc 79ed96b5 79ee8d2c b760d131 00000000 mscorwks!ManagedThreadBase::FinalizerBase+0xb
    0096ff14 79ed8e36 00000000 8efea224 00000000 mscorwks!WKS::GCHeap::FinalizerThreadStart+0xbb
    0096ffb8 77e6608b 001573b8 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49
    0096ffec 00000000 79ed8df0 001573b8 00000000 kernel32!BaseThreadStart+0x34


    Then, looking at the parameters and after issuing a dc command to peer at memory, I can see the thread number the finalizer is trying to reach (thread 20f0).

    This 20f0 thread is a STA thread. According to what I read here and there, this is probably because the finalizer is trying to finalize a COM object.

    I don't know if this is relevant to the discussion, but the thread the finalizer is trying to reach has been created using Jeffrey Richter's Power Threading library, and more specifically his CallBackThreadPool class. This class is using IO Completion Ports behind the scenes. When looking at this thread at different points in times, using dumps and the !threads command, I can see that it is sometimes MTA and sometimes STA.

    Last thing: I have made two different memory dumps, at 2 minutes intervals. On both these memory dumps, the finalizer was blocked on "GetToSTA", on the same thread (finalizer thread trying to reach the thread 20f0). And yet, between these dumps, and looking at the thread 20f0 call stack, i can see that this thread was not especially blocked, and had the opportunity to reply several requests. (I can see that thanks to a log file I am using).

    So, I am a bit puzzled, and don't really know where to look at in order to solve the problem I am facing.

    Is it possible that the 20f0 thread is tagged as STA, and yet, isn't able to pump messages? (I am not familiar with the COM world)
    How comes this thread is sometimes STA and somtimes MTA?
    Is there a possibility to know exactly which COM object the finalizer thread is trying to finalize? So that I can investigate and perhaps discover that this particuliar COM object hasn't properly been "disposed" using Marshal.ReleaseComObject()?

    Any help would be greatly appreciated.

    Thanks.
    Saturday, October 25, 2008 2:07 AM

All replies

  • This is all like it ought to work.  The finalizer must switch context to an STA thread to call the IUnknown::Release() method for an apartment threaded COM object.  Using RPC like the stack shows is the normal way of doing this.  What happens next is probably your real problem.  If the COM object takes its sweet time cleaning up, the finalizer timeout may trip.  2 seconds isn't much.  You'll then have a bunch of objects that aren't getting finalized, clogging up the garbage disposal.

    Not sure how you'd find the COM interface that causes this problem.  It would be the one that does a lot of stuff when its reference count reaches zero.

    Hans Passant.
    Saturday, October 25, 2008 2:44 AM
    Moderator
  • Hello,

    Thanks for your reply.
    Of course, if the COM object had to take 2 seconds to clean up, and this would then release the memory, this would be fine.
    But in my situation, that's far more dramatic:
    The faulty thread (20f0) the finalizer thread is tryng to reach never "respond" to the finalizer thread and yet still goes on responding to other requests. (I have seen that by doing two memory dumps at two minutes intervals).
    And even worse, when the faulty thread (20f0) is not used anymore, it just sits idle, while the finalizer is still blocked. Here is below the second memory dump for this faulty thread:

    0:019> !threads -special
    ThreadCount: 26
    UnstartedThread: 0
    BackgroundThread: 25
    PendingThread: 0
    DeadThread: 0
    Hosted Runtime: no
                                          PreEmptive   GC Alloc           Lock
           ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
       0    1 2500 00153fd0      6020 Enabled  00000000:00000000 00148628     0 STA
       2    2  8d8 0014b738      b220 Enabled  00000000:00000000 00148628     0 MTA (Finalizer)
       4    4 1ea8 001994d8    80a220 Enabled  00000000:00000000 00148628     0 MTA (Threadpool Completion Port)
       5    6 1e3c 001a4b48   180b220 Enabled  00000000:00000000 00148628     0 MTA (Threadpool Worker)
       8    7 1e34 03e917b0   200b220 Enabled  00000000:00000000 00148628     0 MTA
       9    8 1b40 00208f68   200b220 Enabled  00000000:00000000 00148628     0 MTA
      10    9  bb8 0022bd70      b220 Enabled  1245d2d4:1245ea40 00148628     0 Ukn
      11    a 27f0 00227570      b220 Enabled  00000000:00000000 00148628     0 STA
      12    c 18c0 03f2d350   200b220 Enabled  00000000:00000000 00148628     0 MTA
      13    d 1f58 067abf48   180b220 Enabled  126ec4e4:126eccc0 00148628     0 MTA (Threadpool Worker)
      15    e 155c 067c05b8      b220 Enabled  00000000:00000000 00148628     0 STA
      16    f 17ec 067c3628      b220 Enabled  00000000:00000000 00148628     0 STA
      17   10 1cc8 00207050      b220 Enabled  124804a8:12480a40 00148628     0 STA
      18   11 202c 00208240      b220 Enabled  00000000:00000000 00148628     0 STA
      19   12 20f0 067e8980      b220 Enabled  12504404:12504a40 00148628     0 STA
      20   13 2034 067ea218      b220 Enabled  1244ded8:1244ea40 00148628     0 STA
      21   14 1b94 067eb488      b220 Enabled  00000000:00000000 00148628     0 STA
      22   15 1aec 067ec6f8      b220 Enabled  00000000:00000000 00148628     0 MTA
      23   16 1f28 067ee490      b220 Enabled  1246a754:1246aa40 00148628     0 STA
      24   17 21c4 067efe50      b220 Enabled  124f074c:124f0a40 00148628     0 STA
      25   18 2780 067f1950      b220 Enabled  12515aa0:12516a40 00148628     0 Ukn
      26   19 1424 067f2d50      b220 Enabled  00000000:00000000 00148628     0 STA
      27   1a 1fc0 067f5028      b220 Enabled  00000000:00000000 00148628     0 MTA
      29    5 17d4 001a42a0   180b220 Enabled  00000000:00000000 00148628     0 MTA (Threadpool Worker)
      30    b 1ca0 03eda0d0   880b220 Enabled  12504de4:12506a40 00148628     0 MTA (Threadpool Completion Port)
      33    3  664 00198f08   880b220 Enabled  12622c18:12624a40 00148628     0 MTA (Threadpool Completion Port)

           OSID     Special thread type
        1   1a4c    DbgHelper
        2    8d8    Finalizer
        3   1cb0    Gate
        4   1ea8    Timer
        5   1e3c    ThreadpoolWorker
       13   1f58    ThreadpoolWorker
       29   17d4    ThreadpoolWorker
       30   1ca0    IOCompletion
       33    664    IOCompletion


    0:019> kb 250
    ChildEBP RetAddr  Args to Child             
    0971f0c0 7c821bf4 77e66142 00000480 0971f174 ntdll!KiFastSystemCallRet
    0971f0c4 77e66142 00000480 0971f174 0971f108 ntdll!NtRemoveIoCompletion+0xc
    0971f0f0 0342069b 00000480 0971f178 0971f174 kernel32!GetQueuedCompletionStatus+0x29
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    0971f128 0419e6cd 0971f17c 011c8aec 00000000 0x342069b
    0971f22c 793d70fb 05985f54 793608fd 0971f260 0x419e6cd
    0971f234 793608fd 0971f260 0971f28c 05985f54 mscorlib_ni+0x3170fb
    0971f248 793d71dc 05985f20 00000000 067e8980 mscorlib_ni+0x2a08fd
    0971f260 79e7be1b 00000000 00000000 0971f2f0 mscorlib_ni+0x3171dc
    0971f270 79e7bd9b 0971f340 00000000 0971f310 mscorwks!CallDescrWorker+0x33
    0971f2f0 79e7bce8 0971f340 00000000 0971f310 mscorwks!CallDescrWorkerWithHandler+0xa3
    0971f42c 79e7bbd0 79692e88 0971f5a8 0971f4c0 mscorwks!MethodDesc::CallDescr+0x19c
    0971f444 79e802f4 79692e88 0971f5a8 0971f4c0 mscorwks!MethodDesc::CallTargetWorker+0x20
    0971f458 7a0786cb 0971f4c0 be87dacd 067e8980 mscorwks!MethodDescCallSite::CallWithValueTypes_RetArgSlot+0x18
    0971f65c 79ed8a2c 0971f7d0 00000001 067e8980 mscorwks!ThreadNative::KickOffThread_Worker+0x15d
    0971f66c 79ed89ca 0971f748 0971f6f4 79f55262 mscorwks!Thread::UserResumeThread+0xfb
    0971f700 79ed88f1 0971f748 be87d919 00000001 mscorwks!Thread::DoADCallBack+0x355
    0971f73c 7a0e4d96 0971f748 00000001 00000000 mscorwks!Thread::DoADCallBack+0x541
    0971f764 7a0e4dad 00000001 7a078575 0971f7d0 mscorwks!Thread::DoADCallBack+0x575
    0971f778 7a07a99e 00000001 7a078575 0971f7d0 mscorwks!ManagedThreadBase::KickOff+0x13
    0971f814 79ed8e36 03e82958 00000000 00000000 mscorwks!ThreadNative::KickOffThread+0x230
    0971ffb8 77e6608b 067c12b0 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49
    0971ffec 00000000 79ed8df0 067c12b0 00000000 kernel32!BaseThreadStart+0x34

     and:

    0:019> kb 250
    ChildEBP RetAddr  Args to Child             
    0971f0c0 7c821bf4 77e66142 00000480 0971f174 ntdll!KiFastSystemCallRet
    0971f0c4 77e66142 00000480 0971f174 0971f108 ntdll!NtRemoveIoCompletion+0xc
    0971f0f0 0342069b 00000480 0971f178 0971f174 kernel32!GetQueuedCompletionStatus+0x29
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    0971f128 0419e6cd 0971f17c 011c8aec 00000000 0x342069b
    0971f22c 793d70fb 05985f54 793608fd 0971f260 0x419e6cd
    0971f234 793608fd 0971f260 0971f28c 05985f54 mscorlib_ni+0x3170fb
    0971f248 793d71dc 05985f20 00000000 067e8980 mscorlib_ni+0x2a08fd
    0971f260 79e7be1b 00000000 00000000 0971f2f0 mscorlib_ni+0x3171dc
    0971f270 79e7bd9b 0971f340 00000000 0971f310 mscorwks!CallDescrWorker+0x33
    0971f2f0 79e7bce8 0971f340 00000000 0971f310 mscorwks!CallDescrWorkerWithHandler+0xa3
    0971f42c 79e7bbd0 79692e88 0971f5a8 0971f4c0 mscorwks!MethodDesc::CallDescr+0x19c
    0971f444 79e802f4 79692e88 0971f5a8 0971f4c0 mscorwks!MethodDesc::CallTargetWorker+0x20
    0971f458 7a0786cb 0971f4c0 be87dacd 067e8980 mscorwks!MethodDescCallSite::CallWithValueTypes_RetArgSlot+0x18
    0971f65c 79ed8a2c 0971f7d0 00000001 067e8980 mscorwks!ThreadNative::KickOffThread_Worker+0x15d
    0971f66c 79ed89ca 0971f748 0971f6f4 79f55262 mscorwks!Thread::UserResumeThread+0xfb
    0971f700 79ed88f1 0971f748 be87d919 00000001 mscorwks!Thread::DoADCallBack+0x355
    0971f73c 7a0e4d96 0971f748 00000001 00000000 mscorwks!Thread::DoADCallBack+0x541
    0971f764 7a0e4dad 00000001 7a078575 0971f7d0 mscorwks!Thread::DoADCallBack+0x575
    0971f778 7a07a99e 00000001 7a078575 0971f7d0 mscorwks!ManagedThreadBase::KickOff+0x13
    0971f814 79ed8e36 03e82958 00000000 00000000 mscorwks!ThreadNative::KickOffThread+0x230
    0971ffb8 77e6608b 067c12b0 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49
    0971ffec 00000000 79ed8df0 067c12b0 00000000 kernel32!BaseThreadStart+0x34


    0:019> !clrstack
    OS Thread Id: 0x20f0 (19)
    ESP       EIP     
    0971f140 7c82ed54 [NDirectMethodFrameStandaloneCleanup: 0971f140] LMProject.Win32.UnsafeNativeMethods.GetQueuedCompletionStatus(SafeIOCompletionPortHandle, UInt32 ByRef, IntPtr ByRef, IntPtr ByRef, UInt32)
    0971f15c 0419e6cd LMProject.Threading.IOCompletionPort.GetStatus(Int32, Boolean ByRef, System.Threading.WaitCallback ByRef, System.Object ByRef)
    0971f1a0 0419e082 LMProject.Threading.ThreadPool.ThreadPoolFunction()
    0971f234 793d70fb System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
    0971f23c 793608fd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
    0971f254 793d71dc System.Threading.ThreadHelper.ThreadStart()
    0971f478 79e7be1b [GCFrame: 0971f478]

    That's only when the 20f0 thread dies (after some time it is unused, as defined by a timeout in the Power threading library) that the finalizer gets unblocked. In the meanwhile, the memory keeps growing, and if the server goes on doing his tasks before the thread is released (which is always the case in our production, as the server is preloading lots of data, using DB connections and building objects, etc), the memory reachs the 1.6 Gigs and dies with an OOM exception.

    So, that's why I would like find a solution, and perhaps have a grip on the COM object that seems to cause problems. (or find any other roundabout solution)

    Thank you.
    Saturday, October 25, 2008 10:17 AM
  • Hello again.

    I made further tests, in order to try to understand the philosophy behind these GetToSta calls.

    More specifically, I have created a COM component (using ATL + C++) that's supposed to be using the STA threading model.

    Basically, all the tries I made are based on the following pattern:

    I spawn a new thread from the main thread.
    In this new thread, I create a new instance of the COM component.
    Then, in this thread, I sleep.
    And then, at last, from the main thread, I call GC.Collect() followed by a GC.WaitForPendingFinalizers()

    Here are different configurations and what happens:

    I tag the new thread with STA.
    In this situation, the call to WaitForPendingFinalizers() from the main thread doesn't do anything special, as the COM object isn't candidate to finalization.

    I tag the new thread with STA. And I set MyComObject = null before sleeping in the spawned thread.
    Here, the call to WaitForPendingFinalizers() calls a GetToSta() (trying to reach the spawned STA thread) and blocks.

    I tag the new thread with STA, call Marshal.ReleaseComObject and set MyComObject = null. Here, the GetToSta isn't called at all and the WaitForPendingFinalizers() doesn't block.

    I can notice as well that when the STA thread is spawned, two other threads are created by the CLR.

    I tag the new thread as MTA. Here, I can notice that upon COM object creation, a new thread (STA) is created behind the scenes.
    Then, when setting MyComObject = null, I can see that the finalizer issues a GetToSTA() to the new underlying STA thread. But nothing is blocking.

    I tag the new thread as STA, issue a Marshal.ReleaseComObject() and set MyComObject = null. Then, I can see that the finalizer thread doesn't issue any GetToSta() at all, and that it doesn't block.

    All that seems logical to me.

    And so, coming back to my problem, I can roughly understand why the finalizer is blocking:

    My threadpool (based on completion ports) issues MTA threads. Then, during their lifetime, these MTA threads are turned into STA (I would be happy if I could understand and / or prevent that switch from happening). For some reason, I think that some of the COM objects were not properly released (no Marshal.ReleaseComObject() has been called). As a consequence, when the finalizer is run, it tries to reach the containing thread, but without success => deadlock!

    Does all this make sense?
    How could I go further in order to solve my problem? What else can I investigate?

    How could I prevent my Thread Pool threads from being turned into STA threads during their lifetime? Or how could I detectwhich of my COM objects are not properly released? (if they had been released after use, I think that the finalizer thread wouldn't need to call GetToSTA). Or how could I detect which object the finalizer is trying to reach.

    Thank you.




    Sunday, October 26, 2008 2:05 AM
  • Hello,

    I'm still trying to find the reason of this "deadlock".

    For now, what I can conclude is that some STA threads haven't freed some COM resources, and when these resources go out of scope, after the next GC, the finalizer tries to free them, trying to reach the STA thread. However, the STA thread is doing other tasks (including waiting on IO completion port), and so can't reply to the GetToSTA call from the finalizer thread. (in order to force the STA thread to reply to the finalizer, I suppose I could induce a Wait)

    However, in order to solve my problem, I need to find which COM resources are not well freed (so that if I free them OK before unreferencing them, the finalizer will never try to reach the containing thread).
    Is there any way for me to intepret the arguments leading to a call to GetToSTA. Or perhaps know which object the finalizer thread is trying to "finalize"?

    Otherwise, I don't know where to investigate in order to understand my problem better.

    Other question: is it possible for a thread to be switched from MTA to STA during its life? Because that's what seems to happen to my threads over their lifetime.

    Thanks for any help.


    Sunday, October 26, 2008 11:59 PM