none
Troubleshooting 8 to 12 seconds hang in GC try_allocate_more_space RRS feed

  • Question

  • Hi all,

    I have a long-running, heavily multithreaded .NET application which hangs sporadically (i.e. about once a day) for about 8 to 12 seconds, and that pause is not acceptable because of required cyclic TCP communication.

    OS: Windows 7 x64
    .NET version: 4.7.2 with update KB4344146
    Process in question is running as 32-bit

    I managed to capture an ETL trace of the pause using PerfView and observed the following:

    • Process memory information before pause:
      Working set mem: 172 MiB, Private mem: 147 MiB, GC.GetTotalMemory(false): 56 MiB, GC count: 33921/3585/326
    • Process memory information after pause:
      Working set mem: 172 MiB, Private mem: 147 MiB, GC.GetTotalMemory(false): 52 MiB, GC count: 33922/3585/326   
    • The GCStats view does not show any long GC pause - longest pause is 13.5 msec.
    • Around the critical time there are only some Gen 0 collections happening.
    • All my threads that should do something useful seem to be spinning in a function called clr!WKS::gc_heap::try_allocate_more_space.
    • There is one thread which has a different call stack after try_allocate_more_space - see below.
    • Also during the pause, the idle process seems to consume CPU time in a function called ntoskrnl!RtlLookupFunctionTable all the time (but I am not sure if I interpreted that correctly) - full stack trace also below.

    Can anyone help me to resolve this issue?

    I could provide the whole ETL trace (~ 300 MB) to individuals on request.

    Now here is the full stack trace of the interesting thread which somehow seems to be blocked inside GC:

    Name                                                                              	 Inc %	     Inc	Inc Ct	Exc %	Exc	Exc Ct	Fold	Fold Ct	                             When	          First	           Last
     ROOT                                                                             	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
    + Process32 LLMCalibrationLine (3976) Args:                                       	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
     + Thread (4552) CPU=73ms                                                         	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
      + ntdll!_RtlUserThreadStart                                                     	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
       + ntdll!__RtlUserThreadStart                                                   	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
        + kernel32!BaseThreadInitThunk                                                	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
         + clr!Thread::intermediateThreadProc                                         	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
          + clr!ThreadNative::KickOffThread                                           	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
           + clr!ManagedThreadBase_FullTransitionWithAD                               	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
            + clr!ManagedThreadBase_DispatchOuter                                     	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
             + clr!ManagedThreadBase_DispatchMiddle                                   	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
              + clr!ManagedThreadBase_DispatchInner                                   	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
               + clr!ThreadNative::KickOffThread_Worker                               	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                + clr!MethodDescCallSite::CallTargetWorker                            	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                 + clr!CallDescrWorkerWithHandler                                     	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                  + clr!CallDescrWorkerInternal                                       	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                   + mscorlib.ni!ThreadHelper.ThreadStart                             	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                    + mscorlib.ni!ExecutionContext.Run                                	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                     + mscorlib.ni!ExecutionContext.Run                               	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                      + mscorlib.ni!ExecutionContext.RunInternal                      	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                       + mscorlib.ni!ThreadHelper.ThreadStart_Context                 	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                        + ugs.automotive.datalink.lin2base!LinTracer.TracingThreadProc	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                         + mscorlib.ni!WaitHandle.WaitAny                             	 100,0	8.710,797	     3	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.709,228
                          + clr!JIT_NewArr1                                           	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |+ clr!AllocateArrayEx                                      	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          | + clr!Alloc                                               	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |  + clr!WKS::GCHeap::Alloc                                 	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |   + clr!WKS::gc_heap::allocate_more_space                 	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |    + clr!WKS::gc_heap::try_allocate_more_space            	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |     + clr!WKS::gc_heap::trigger_gc_for_alloc              	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |      + clr!CLREventBase::Set                              	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |       + kernelbase!SetEvent                               	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |        + ntdll!NtSetEvent                                 	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |         + ntdll!LdrInitializeThunk                        	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |          + ntdll! ?? ::FNODOBFM::`string'                 	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |           + wow64!Wow64LdrpInitialize                     	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |            + wow64!RunCpuSimulation                       	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |             + wow64cpu!Thunk0Arg                          	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |              + wow64cpu!CpupSyscallStub                   	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |               + ntoskrnl!NtSetEvent                       	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |                + ntoskrnl!KeSetEvent                      	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |                 + ntoskrnl!KiSwapContext                  	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |                  + ntoskrnl!SwapContext_PatchXRstor       	 100,0	8.707,631	     2	  0,0	  0	     0	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,061
                          |                   + BLOCKED_TIME                          	 100,0	8.707,606	     1	100,0	8.707,606	     1	   0	      0	 99999999999999999999999999999999	 93.946.998,430	 93.955.706,037
                          |                   + CPU_TIME                              	   0,0	   0,024	     1	  0,0	0,024	     1	   0	      0	 _______________________________.	 93.955.706,037	 93.955.706,061
                          + clr!WaitHandleNative::CorWaitMultipleNative               	   0,0	   3,166	     1	  0,0	  0	     0	   0	      0	 _______________________________0	 93.955.706,061	 93.955.709,228

    And the stack trace of the idle process:

    Name                                                                   	Inc %	     Inc	Inc Ct	Exc %	   Exc	Exc Ct	Fold	Fold Ct	                             When	          First	           Last
    + Process64 Idle (0) Args:                                             	  4,3	9.818,798	    36	  0,0	     0	     0	   0	      0	 9AA9999999999999999999999999999A	 93.945.990,430	 93.955.809,228
    |+ Thread (0) CPU=6956ms                                               	  4,3	9.818,798	    36	  0,0	     0	     0	   0	      0	 9AA9999999999999999999999999999A	 93.945.990,430	 93.955.809,228
    | + ntoskrnl!KiIdleLoop                                                	  4,3	9.818,798	    36	  0,0	     0	     0	   0	      0	 9AA9999999999999999999999999999A	 93.945.990,430	 93.955.809,228
    |  + ntoskrnl!KiRetireDpcList                                          	  4,3	9.774,802	    33	  0,0	     0	     0	   0	      0	 8AA9999999999999999999999999999A	 93.945.990,430	 93.955.809,228
    |  |+ ndis!ndisInterruptDpc                                            	  4,3	9.725,809	    31	  0,0	     0	     0	   0	      0	 8A98999999999999999999999999999A	 93.945.990,430	 93.955.809,228
    |  ||+ module e1d62x64.sys <<e1d62x64!?>>                              	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||+ ndis!NdisMIndicateReceiveNetBufferLists                        	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  ||| + ndis!ndisMDispatchReceiveNetBufferLists                       	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||  + ndis!ndisMIndicateNetBufferListsToOpen                       	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||   + tcpip!FlReceiveNetBufferListChain                           	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||    + ntoskrnl!KeExpandKernelStackAndCalloutEx                   	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||     + tcpip!FlReceiveNetBufferListChainCalloutRoutine           	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||      + tcpip!IpFlcReceivePreValidatedPackets                    	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||       + tcpip!TcpPreValidatedReceive                            	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||        + tcpip!TcpMatchReceive                                  	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||         + tcpip!TcpTcbReceive                                   	  4,3	9.659,444	    27	  0,0	     0	     0	   0	      0	 8A989999999999999999999999999998	 93.945.990,430	 93.955.776,860
    |  |||          + tcpip!TcpTcbCarefulDatagram                          	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |+ tcpip!Template_pqq                                  	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          | + ntoskrnl!EtwWrite                                  	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |  + ntoskrnl!EtwWriteEx                               	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |   + ntoskrnl!EtwpEventWriteFull                      	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |    + ntoskrnl!EtwpGetStackExtendedHeaderItem         	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |     + ntoskrnl!RtlWalkFrameChain                     	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |      + ntoskrnl!RtlpWalkFrameChain                   	  3,9	8.851,089	     6	  0,0	     0	     0	   0	      0	 0_289999999999999999999999999997	 93.946.113,855	 93.955.726,865
    |  |||          |       + ntoskrnl!RtlpLookupFunctionEntryForStackWalks	  3,9	8.831,091	     5	  0,0	     0	     0	   0	      0	 __289999999999999999999999999997	 93.946.713,793	 93.955.726,865
    |  |||          |       |+ ntoskrnl!RtlLookupFunctionTable             	  3,9	8.753,099	     3	  0,0	     0	     0	   0	      0	 __079999999999999999999999999997	 93.946.713,793	 93.955.726,865
    |  |||          |       ||+ CPU_TIME                                   	  3,9	8.753,099	     3	  3,9	8.753,099	     3	   0	      0	 __079999999999999999999999999997	 93.946.713,793	 93.955.726,865
    |  |||          |       |+ CPU_TIME                                    	  0,0	  77,992	     2	  0,0	77,992	     2	   0	      0	 __20____________________________	 93.946.723,792	 93.946.961,768
    |  |||          |       + CPU_TIME                                     	  0,0	  19,998	     1	  0,0	19,998	     1	   0	      0	 0_______________________________	 93.946.113,855	 93.946.133,853
    |  |||          + tcpip!TcpTcbFastDatagram                             	  0,3	 771,922	    19	  0,0	     0	     0	   0	      0	 6A7____________________________1	 93.946.022,864	 93.955.776,860
    |  |||          + tcpip!TcpCompleteTcbSend                             	  0,0	  36,434	     2	  0,0	     0	     0	   0	      0	 1_______________________________	 93.945.990,430	 93.946.223,844
    |  ||+ module e1q62x64.sys <<e1q62x64!?>>                              	  0,0	  66,364	     4	  0,0	     0	     0	   0	      0	 0_0____________________________1	 93.946.003,867	 93.955.809,228
    |  |+ module iastora.sys <<iastora!?>>                                 	  0,0	  36,996	     1	  0,0	     0	     0	   0	      0	 __o1____________________________	 93.946.908,773	 93.946.945,770
    |  |+ module acpi.sys <<acpi!?>>                                       	  0,0	  11,997	     1	  0,0	     0	     0	   0	      0	 ___0____________________________	 93.946.973,769	 93.946.985,765
    |  + ntoskrnl!PoIdle                                                   	  0,0	  43,996	     3	  0,0	     0	     0	   0	      0	 1__0____________________________	 93.946.057,862	 93.946.973,769

    /Edit: Note that the numbers in the stack traces copied from PerfView use a comma as decimal separator because of my local settings.

    Regards,

    Christian Hornung


    • Edited by Christian Hornung Friday, September 14, 2018 4:24 PM Added note about decimal separator in stack traces.
    Friday, September 14, 2018 4:19 PM

All replies

  • Hi Christian Hornung,

    Thank you for posting here.

    According to your description, I do not find the special GC change in stack trace. I could not make sure what cause the error. I suggest to use Windbg to debug, it would provide more details for this Garbage Collection and Performance.

    Please check the MSDN document below to debug.

    https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/performance

    Best Regards,

    Wendy


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, September 17, 2018 7:05 AM
    Moderator
  • Hi Wendy,

    Thank you for your reply.

    I already checked the MSDN document you provided, but I failed to apply it to this issue. There is no out-of-memory problem and the ETL trace does not indicate too frequent or too long garbage collection pauses (outside the 8 second pause, Gen 1 is collected about every second with pause times from 2 to 10 msec, which is no problem at all).

    Could you please elaborate on how I can use Windbg to debug this issue?

    I see the following problems:

    • Currently my application can only detect the pause after it has happened, since all code execution seems to be stalled during the pause. How can I set a breakpoint that triggers during the pause?
    • Assuming I managed to break the application in the "stalled" state, how can I proceed from there? Since the ETL trace suggests that it stalls somewhere inside GC (but not during an actual garbage collection run, since I do not see corresponding GC ETW events in the trace), I cannot imagine how to debug that without having access to the source code.

    Thanks for your assistance.

    Regards,

    Christian

    Monday, September 17, 2018 12:03 PM
  • Hi Christian Hornung,

    You could download the Windbg from the link below.

    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools

    For the Tutorial of Windbg, you could refer to the link below.

    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/getting-started-with-windbg

    Best Regards,

    Wendy


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Tuesday, October 9, 2018 7:48 AM
    Moderator