locked
BSOD upon FwpsCompleteOperation0 on Vista x64 SP1 RRS feed

  • Question

  • There were a bunch of posts related to a bug that was fixed post Vista RTM, but it seems that this is a little bit different and it happened on Vista SP1:

    vertarget
    Windows Server 2008/Windows Vista SP1 Kernel Version 6001 (Service Pack 1) MP (2 procs) Free x64
    Product: WinNt, suite: TerminalServer SingleUserTS
    Built by: 6001.18000.amd64fre.longhorn_rtm.080118-1840
    Machine Name:
    Kernel base = 0xfffff800`01849000 PsLoadedModuleList = 0xfffff800`01a0edb0
    Debug session time: Fri May 15 17:56:15.819 2009 (GMT-7)
    System Uptime: 0 days 0:25:27.882

    IRQL_NOT_LESS_OR_EQUAL (a)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If a kernel debugger is available get the stack backtrace.
    Arguments:
    Arg1: 0000000000000078, memory referenced
    Arg2: 000000000000000c, IRQL
    Arg3: 0000000000000001, bitfield :
        bit 0 : value 0 = read operation, 1 = write operation
        bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
    Arg4: fffff800018d329b, address which referenced memory

    Debugging Details:
    ------------------


    WRITE_ADDRESS:  0000000000000078

    CURRENT_IRQL:  c

    FAULTING_IP:
    nt!KeInsertQueueApc+4b
    fffff800`018d329b 498712          xchg    rdx,qword ptr [r10]

    Child-SP          RetAddr           Call Site
    fffffa60`019c0238 fffff800`0189e12e nt!KeBugCheckEx
    fffffa60`019c0240 fffff800`0189d00b nt!KiBugCheckDispatch+0x6e
    fffffa60`019c0380 fffff800`018d329b nt!KiPageFault+0x20b
    fffffa60`019c0510 fffffa60`03b029e2 nt!KeInsertQueueApc+0x4b
    fffffa60`019c0570 fffff800`018a0705 afd!AfdRestartConnect+0x212
    fffffa60`019c05f0 fffffa60`023e9c0f nt!IopfCompleteRequest+0x315
    fffffa60`019c06b0 fffffa60`00ec4cae tdx!TdxConnectConnectionTlRequestComplete+0x1f3
    fffffa60`019c0730 fffffa60`00ec2937 tcpip!TcpCreateAndConnectTcbComplete+0x1de
    fffffa60`019c07f0 fffffa60`00f07f26 tcpip!TcpShutdownTcb+0x217
    fffffa60`019c08b0 fffffa60`00ec735d tcpip! ?? ::FNODOBFM::`string'+0x38433
    fffffa60`019c0a00 fffffa60`00f623e7 tcpip!TcpCreateAndConnectTcbInspectComplete+0x2bd
    fffffa60`019c0aa0 fffffa60`00f64a8c tcpip!WfpAlepAuthorizeCompletionCommon+0x27
    fffffa60`019c0ad0 fffffa60`00f6f0a9 tcpip!WfpAlepAuthorizeConnectCompletion+0xec
    fffffa60`019c0bb0 fffffa60`00dd2172 tcpip!WfpAleCompleteOperation+0x139
    fffffa60`019c0c10 fffffa60`03a14665 fwpkclnt!FwpsCompleteOperation0+0x1e
    fffffa60`019c0c40 fffff800`01b072cf mfewfpk!+0xxxxxx
    fffffa60`019c0cc0 fffff800`018ab066 nt!IopProcessWorkItem+0x27
    fffffa60`019c0cf0 fffff800`01ac1de3 nt!ExpWorkerThread+0x11a
    fffffa60`019c0d50 fffff800`018d8536 nt!PspSystemThreadStartup+0x57
    fffffa60`019c0d80 00000000`00000000 nt!KiStartSystemThread+0x16

    FOLLOWUP_IP:
    tdx!TdxConnectConnectionTlRequestComplete+1f3
    fffffa60`023e9c0f 413bff          cmp     edi,r15d

    SYMBOL_STACK_INDEX:  6

    SYMBOL_NAME:  tdx!TdxConnectConnectionTlRequestComplete+1f3

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: tdx

    IMAGE_NAME:  tdx.sys

    DEBUG_FLR_IMAGE_TIMESTAMP:  47919a85

    FAILURE_BUCKET_ID:  X64_0xA_tdx!TdxConnectConnectionTlRequestComplete+1f3

    BUCKET_ID:  X64_0xA_tdx!TdxConnectConnectionTlRequestComplete+1f3

    Followup: MachineOwner
    ---------

    According to my findings the sequence of events was the following:

    1) My callout got invoked at auth_connect_v4 and pended the operation
    2) The calling thread terminated (do not know why)
      0: kd> !thread 0xfffffa80`00ca5bb0
      THREAD fffffa8000ca5bb0  Cid 0414.0748  Teb: 0000000000000000 Win32Thread: 0000000000000000 TERMINATED
      Not impersonating
      DeviceMap                 fffff880058d3870
      Owning Process            0       Image:         <Unknown>
      Attached Process          fffffa80040e9c10       Image:         svchost.exe
      Wait Start TickCount      93138          Ticks: 4646 (0:00:01:12.593)
      Context Switch Count      31049            
      UserTime                  00:00:00.000
      KernelTime                00:00:00.546
      Win32 Start Address 0x000007fefaf1c660
      Stack Init 0 Current fffffa6006412870
      Base fffffa6006413000 Limit fffffa600640d000 Call 0
      Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
    3) My FWPM_LAYER_ALE_AUTH_CONNECT_V4_DISCARD classify function was invoked
    4) After some time (do not know how long) my callout called FwpsCompleteOperation0 from a work item
    5) BSOD in afd.

    Needless to say this is not reproducible, as a matter of face I have seen this problem only once, but this is enough to trigger the alarm.

    So ... I have the following questions:
    1) Is this a known problem and has it been fixed?
    2) What should a callout do in cases like the one above - my current implementation is to always call FwpsCompleteOperation0 in order to free the resources acquired with FwpsPendOperaiton0.
    3) Do you need a kernel memory dump or since this is a problem with an already released OS I should contact the support team?

    Thanks in advance!
    Monday, May 18, 2009 10:05 PM

Answers

  • I do not know what the call stack was when ALE_AUTH_CONNECT_V4_DISCARD was invoked. I am speculating that the discard call happened based on per connection information my driver stores.

    Where should I upload the dump?


    Further analysis showed that the _KAPC object (passed to KeInsertQueueApc) has
    its Thread member set to NULL.

    0: kd> kv
    *** Stack trace for last set context - .thread/.cxr resets it
    Child-SP RetAddr : Args to : Call Site
    fffffa60`019c0510 fffffa60`03b029e2 : 00000000`ffffffff fffffa80`015347d0
    fffffa80`014db010 fffff880`49646641 : nt!KeInsertQueueApc+0x4b
    fffffa60`019c0570 fffff800`018a0705 : fffffa80`015347d0 fffffa80`01f4f300
    fffffa80`0398f000 fffffa80`01534933 : afd!AfdRestartConnect+0x212
    fffffa60`019c05f0 fffffa60`023e9c0f : fffffa80`01518820 fffffa80`0398f002
    fffffa80`0242ca80 fffffa60`00f4a295 : nt!IopfCompleteRequest+0x315
    fffffa60`019c06b0 fffffa60`00ec4cae : 00000000`c0000120 fffffa80`0398f002
    fffffa80`01505580 00000000`00000000 : tdx!TdxConnectConnectionTlRequestComplete+0x1f3
    fffffa60`019c0730 fffffa60`00ec2937 : 00000000`00000000 00000000`00000000
    00000000`c0000120 00000000`00000000 : tcpip!TcpCreateAndConnectTcbComplete+0x1de
    fffffa60`019c07f0 fffffa60`00f07f26 : fffffa80`01505580 fffffa80`01505580
    fffffa80`0414a520 00000000`00000000 : tcpip!TcpShutdownTcb+0x217
    fffffa60`019c08b0 fffffa60`00ec735d : fffffa60`00001cf1 00000000`000009c0
    fffffa80`048b2130 00000000`00000000 : tcpip! ?? ::FNODOBFM::`string'+0x38433
    fffffa60`019c0a00 fffffa60`00f623e7 : 00000000`00000001 fffffa80`01ef5300
    fffffa80`01587d30 fffffa80`01ef5300 : tcpip!TcpCreateAndConnectTcbInspectComplete+0x2bd
    fffffa60`019c0aa0 fffffa60`00f64a8c : fffffa80`0444c840 00000000`00000000
    00000000`00000006 00000000`00000000 : tcpip!WfpAlepAuthorizeCompletionCommon+0x27
    fffffa60`019c0ad0 fffffa60`00f6f0a9 : fffffa80`048b2130 fffffa80`014237b0
    fffffa80`01587d30 fffffa60`03a0b65f : tcpip!WfpAlepAuthorizeConnectCompletion+0xec
    fffffa60`019c0bb0 fffffa60`00dd2172 : fffffa80`01ef72c0 fffffa80`014237b0
    00050333`000f0038 00000000`00000000 : tcpip!WfpAleCompleteOperation+0x139
    fffffa60`019c0c10 fffffa60`03a14665 : fffffa80`01ef72c0 fffffa60`03a261e0
    fffffa80`014237b0 00000000`00000038 : fwpkclnt!FwpsCompleteOperation0+0x1e
    fffffa60`019c0c40 fffff800`01b072cf : fffffa80`01ef72c0 fffffa80`038c5260
    fffffa80`038c5260 fffff800`019db8f8 : mfewfpk!CompleteOutboundConnectWorker+0x105
    fffffa60`019c0cc0 fffff800`018ab066 : fffff800`01b072a8 fffff800`019db801
    fffffa80`00c80700 00000000`00000001 : nt!IopProcessWorkItem+0x27
    fffffa60`019c0cf0 fffff800`01ac1de3 : fffffa80`01587d30 00000000`00000000
    fffffa80`00c80720 00000000`00000080 : nt!ExpWorkerThread+0x11a
    fffffa60`019c0d50 fffff800`018d8536 : fffffa60`005ec180 fffffa80`00c80720
    fffffa60`005f5d40 00000000`00000001 : nt!PspSystemThreadStartup+0x57
    fffffa60`019c0d80 00000000`00000000 : 00000000`00000000 00000000`00000000
    00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16
    0: kd> dq fffffa60`019c0510
    fffffa60`019c0510 fffffa80`015347d0 00000000`00000000
    fffffa60`019c0520 fffffa80`014db010 fffffa60`03ae87e2
    fffffa60`019c0530 00000000`00000000 00000000`00000078
    fffffa60`019c0540 fffffa80`014cec02 fffffa80`01ccfd70
    fffffa60`019c0550 fffffa80`014cec50 00000000`00000000
    fffffa60`019c0560 fffffa80`01ccfd70 fffffa60`03b029e2
    fffffa60`019c0570 00000000`ffffffff fffffa80`015347d0
    fffffa60`019c0580 fffffa80`014db010 fffff880`49646641
    0: kd> dq fffffa80`01ccfd70+40
    fffffa80`01ccfdb0 00000000`00580012 00000000`00000000
    fffffa80`01ccfdc0 00000000`00000000 00000000`00000000
    fffffa80`01ccfdd0 fffffa60`03acfbc0 fffffa60`03ad9900
    fffffa80`01ccfde0 00000000`00000000 00000000`00000000
    fffffa80`01ccfdf0 00000000`00000000 00000000`00000000
    fffffa80`01ccfe00 00000000`00000000 00000000`00100090
    fffffa80`01ccfe10 00000000`00000000 00000000`00000000
    fffffa80`01ccfe20 00000000`00000000 00000000`00000000
    0: kd> dt nt!_KAPC (fffffa80`01ccfd70+40)
    +0x000 Type : 0x12 ''
    +0x001 SpareByte0 : 0 ''
    +0x002 Size : 0x58 'X'
    +0x003 SpareByte1 : 0 ''
    +0x004 SpareLong0 : 0
    +0x008 Thread : (null)
    +0x010 ApcListEntry : _LIST_ENTRY [ 0x00000000`00000000 - 0x0 ]
    +0x020 KernelRoutine : 0xfffffa60`03acfbc0 void
    afd!AfdConnectApcKernelRoutine+0
    +0x028 RundownRoutine : 0xfffffa60`03ad9900 void
    afd!AfdConnectApcRundownRoutine+0
    +0x030 NormalRoutine : (null)
    +0x038 NormalContext : (null)
    +0x040 SystemArgument1 : (null)
    +0x048 SystemArgument2 : (null)
    +0x050 ApcStateIndex : 0 ''
    +0x051 ApcMode : 0 ''
    +0x052 Inserted : 0 ''

    If we look at the _KTHREAD object:

    kd> dt nt!_KTHREAD
    +0x000 Header : _DISPATCHER_HEADER
    +0x018 CycleTime : Uint8B
    +0x020 QuantumTarget : Uint8B
    +0x028 InitialStack : Ptr64 Void
    +0x030 StackLimit : Ptr64 Void
    +0x038 KernelStack : Ptr64 Void
    +0x040 ThreadLock : Uint8B
    +0x048 ApcState : _KAPC_STATE
    +0x048 ApcStateFill : [43] UChar
    +0x073 Priority : Char
    +0x074 NextProcessor : Uint2B
    +0x076 DeferredProcessor : Uint2B
    +0x078 ApcQueueLock : Uint8B
    .....

    we will find out that ApcQueueLock is located at offset 0x78, which explains the
    crash:

    0: kd> uf nt!KeInsertQueueApc
    nt!KeInsertQueueApc:
    fffff800`018d3250 4c8bdc mov r11,rsp
    fffff800`018d3253 49895b08 mov qword ptr [r11+8],rbx
    fffff800`018d3257 49896b10 mov qword ptr [r11+10h],rbp
    fffff800`018d325b 49897318 mov qword ptr [r11+18h],rsi
    fffff800`018d325f 57 push rdi
    fffff800`018d3260 4154 push r12
    fffff800`018d3262 4155 push r13
    fffff800`018d3264 4883ec40 sub rsp,40h
    fffff800`018d3268 488b7908 mov rdi,qword ptr [rcx+8]
    fffff800`018d326c 4533ed xor r13d,r13d
    fffff800`018d326f 418bf1 mov esi,r9d
    fffff800`018d3272 4c8d5778 lea r10,[rdi+78h]

    At the last instruction above rdi was supposed to point to _KTHREAD object, but
    was 0, because _KAPC was not populated correctly. Subsequently KeInsertQueueApc crashed as it was trying to read from memory
    0+0x78.


    Wednesday, May 27, 2009 9:22 PM

All replies

  • Do you happen to know the callstack when your ALE_AUTH_CONNECT_V4_DISCARD callout is invoked? That actually suggested that for some reason the connection is blocked instead of pended.

    We haven't seen bugchecks like yours, please submit a memory dump.

    Thanks,
    Biao.W.
    Tuesday, May 26, 2009 9:38 PM
  • I do not know what the call stack was when ALE_AUTH_CONNECT_V4_DISCARD was invoked. I am speculating that the discard call happened based on per connection information my driver stores.

    Where should I upload the dump?


    Further analysis showed that the _KAPC object (passed to KeInsertQueueApc) has
    its Thread member set to NULL.

    0: kd> kv
    *** Stack trace for last set context - .thread/.cxr resets it
    Child-SP RetAddr : Args to : Call Site
    fffffa60`019c0510 fffffa60`03b029e2 : 00000000`ffffffff fffffa80`015347d0
    fffffa80`014db010 fffff880`49646641 : nt!KeInsertQueueApc+0x4b
    fffffa60`019c0570 fffff800`018a0705 : fffffa80`015347d0 fffffa80`01f4f300
    fffffa80`0398f000 fffffa80`01534933 : afd!AfdRestartConnect+0x212
    fffffa60`019c05f0 fffffa60`023e9c0f : fffffa80`01518820 fffffa80`0398f002
    fffffa80`0242ca80 fffffa60`00f4a295 : nt!IopfCompleteRequest+0x315
    fffffa60`019c06b0 fffffa60`00ec4cae : 00000000`c0000120 fffffa80`0398f002
    fffffa80`01505580 00000000`00000000 : tdx!TdxConnectConnectionTlRequestComplete+0x1f3
    fffffa60`019c0730 fffffa60`00ec2937 : 00000000`00000000 00000000`00000000
    00000000`c0000120 00000000`00000000 : tcpip!TcpCreateAndConnectTcbComplete+0x1de
    fffffa60`019c07f0 fffffa60`00f07f26 : fffffa80`01505580 fffffa80`01505580
    fffffa80`0414a520 00000000`00000000 : tcpip!TcpShutdownTcb+0x217
    fffffa60`019c08b0 fffffa60`00ec735d : fffffa60`00001cf1 00000000`000009c0
    fffffa80`048b2130 00000000`00000000 : tcpip! ?? ::FNODOBFM::`string'+0x38433
    fffffa60`019c0a00 fffffa60`00f623e7 : 00000000`00000001 fffffa80`01ef5300
    fffffa80`01587d30 fffffa80`01ef5300 : tcpip!TcpCreateAndConnectTcbInspectComplete+0x2bd
    fffffa60`019c0aa0 fffffa60`00f64a8c : fffffa80`0444c840 00000000`00000000
    00000000`00000006 00000000`00000000 : tcpip!WfpAlepAuthorizeCompletionCommon+0x27
    fffffa60`019c0ad0 fffffa60`00f6f0a9 : fffffa80`048b2130 fffffa80`014237b0
    fffffa80`01587d30 fffffa60`03a0b65f : tcpip!WfpAlepAuthorizeConnectCompletion+0xec
    fffffa60`019c0bb0 fffffa60`00dd2172 : fffffa80`01ef72c0 fffffa80`014237b0
    00050333`000f0038 00000000`00000000 : tcpip!WfpAleCompleteOperation+0x139
    fffffa60`019c0c10 fffffa60`03a14665 : fffffa80`01ef72c0 fffffa60`03a261e0
    fffffa80`014237b0 00000000`00000038 : fwpkclnt!FwpsCompleteOperation0+0x1e
    fffffa60`019c0c40 fffff800`01b072cf : fffffa80`01ef72c0 fffffa80`038c5260
    fffffa80`038c5260 fffff800`019db8f8 : mfewfpk!CompleteOutboundConnectWorker+0x105
    fffffa60`019c0cc0 fffff800`018ab066 : fffff800`01b072a8 fffff800`019db801
    fffffa80`00c80700 00000000`00000001 : nt!IopProcessWorkItem+0x27
    fffffa60`019c0cf0 fffff800`01ac1de3 : fffffa80`01587d30 00000000`00000000
    fffffa80`00c80720 00000000`00000080 : nt!ExpWorkerThread+0x11a
    fffffa60`019c0d50 fffff800`018d8536 : fffffa60`005ec180 fffffa80`00c80720
    fffffa60`005f5d40 00000000`00000001 : nt!PspSystemThreadStartup+0x57
    fffffa60`019c0d80 00000000`00000000 : 00000000`00000000 00000000`00000000
    00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16
    0: kd> dq fffffa60`019c0510
    fffffa60`019c0510 fffffa80`015347d0 00000000`00000000
    fffffa60`019c0520 fffffa80`014db010 fffffa60`03ae87e2
    fffffa60`019c0530 00000000`00000000 00000000`00000078
    fffffa60`019c0540 fffffa80`014cec02 fffffa80`01ccfd70
    fffffa60`019c0550 fffffa80`014cec50 00000000`00000000
    fffffa60`019c0560 fffffa80`01ccfd70 fffffa60`03b029e2
    fffffa60`019c0570 00000000`ffffffff fffffa80`015347d0
    fffffa60`019c0580 fffffa80`014db010 fffff880`49646641
    0: kd> dq fffffa80`01ccfd70+40
    fffffa80`01ccfdb0 00000000`00580012 00000000`00000000
    fffffa80`01ccfdc0 00000000`00000000 00000000`00000000
    fffffa80`01ccfdd0 fffffa60`03acfbc0 fffffa60`03ad9900
    fffffa80`01ccfde0 00000000`00000000 00000000`00000000
    fffffa80`01ccfdf0 00000000`00000000 00000000`00000000
    fffffa80`01ccfe00 00000000`00000000 00000000`00100090
    fffffa80`01ccfe10 00000000`00000000 00000000`00000000
    fffffa80`01ccfe20 00000000`00000000 00000000`00000000
    0: kd> dt nt!_KAPC (fffffa80`01ccfd70+40)
    +0x000 Type : 0x12 ''
    +0x001 SpareByte0 : 0 ''
    +0x002 Size : 0x58 'X'
    +0x003 SpareByte1 : 0 ''
    +0x004 SpareLong0 : 0
    +0x008 Thread : (null)
    +0x010 ApcListEntry : _LIST_ENTRY [ 0x00000000`00000000 - 0x0 ]
    +0x020 KernelRoutine : 0xfffffa60`03acfbc0 void
    afd!AfdConnectApcKernelRoutine+0
    +0x028 RundownRoutine : 0xfffffa60`03ad9900 void
    afd!AfdConnectApcRundownRoutine+0
    +0x030 NormalRoutine : (null)
    +0x038 NormalContext : (null)
    +0x040 SystemArgument1 : (null)
    +0x048 SystemArgument2 : (null)
    +0x050 ApcStateIndex : 0 ''
    +0x051 ApcMode : 0 ''
    +0x052 Inserted : 0 ''

    If we look at the _KTHREAD object:

    kd> dt nt!_KTHREAD
    +0x000 Header : _DISPATCHER_HEADER
    +0x018 CycleTime : Uint8B
    +0x020 QuantumTarget : Uint8B
    +0x028 InitialStack : Ptr64 Void
    +0x030 StackLimit : Ptr64 Void
    +0x038 KernelStack : Ptr64 Void
    +0x040 ThreadLock : Uint8B
    +0x048 ApcState : _KAPC_STATE
    +0x048 ApcStateFill : [43] UChar
    +0x073 Priority : Char
    +0x074 NextProcessor : Uint2B
    +0x076 DeferredProcessor : Uint2B
    +0x078 ApcQueueLock : Uint8B
    .....

    we will find out that ApcQueueLock is located at offset 0x78, which explains the
    crash:

    0: kd> uf nt!KeInsertQueueApc
    nt!KeInsertQueueApc:
    fffff800`018d3250 4c8bdc mov r11,rsp
    fffff800`018d3253 49895b08 mov qword ptr [r11+8],rbx
    fffff800`018d3257 49896b10 mov qword ptr [r11+10h],rbp
    fffff800`018d325b 49897318 mov qword ptr [r11+18h],rsi
    fffff800`018d325f 57 push rdi
    fffff800`018d3260 4154 push r12
    fffff800`018d3262 4155 push r13
    fffff800`018d3264 4883ec40 sub rsp,40h
    fffff800`018d3268 488b7908 mov rdi,qword ptr [rcx+8]
    fffff800`018d326c 4533ed xor r13d,r13d
    fffff800`018d326f 418bf1 mov esi,r9d
    fffff800`018d3272 4c8d5778 lea r10,[rdi+78h]

    At the last instruction above rdi was supposed to point to _KTHREAD object, but
    was 0, because _KAPC was not populated correctly. Subsequently KeInsertQueueApc crashed as it was trying to read from memory
    0+0x78.


    Wednesday, May 27, 2009 9:22 PM