Crash with inspect sample

  • Question

  • Hi. I am seeing a crash in NETIO.SYS on Windows 7 (RTM) with the inspect sample. My question is: what went wrong, and how can I avoid this crash?

    Environment:

    - Windows 7 RTM
    - Inspect sample from newest WDK (7600.16385.0)

    Full dump is available. Crash analysis:


    Loading Dump File [C:\Users\Frank\Desktop\MEMORY.DMP]
    Kernel Complete Dump File: Full address space is available

    Symbol search path is: SRV*p:\websymbols*http://msdl.microsoft.com/download/symbols;SRV*p:\websymbols*\\pegasus\download\symbols;M:\Symbols
    Executable search path is:
    Windows 7 Kernel Version 7600 MP (4 procs) Free x86 compatible
    Product: WinNt, suite: TerminalServer SingleUserTS
    Built by: 7600.16385.x86fre.win7_rtm.090713-1255
    Machine Name:
    Kernel base = 0x8280d000 PsLoadedModuleList = 0x82955810
    Debug session time: Mon Aug 10 20:55:57.934 2009 (GMT+2)
    System Uptime: 0 days 5:25:53.323
    Loading Kernel Symbols
    ...............................................................
    ................................................................
    ..........
    Loading User Symbols

    Loading unloaded module list
    ..................................................
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    Use !analyze -v to get detailed debugging information.

    BugCheck D1, {0, 2, 0, 88c95dad}

    Probably caused by : NETIO.SYS ( NETIO!NetioDereferenceNetBufferList+a2 )

    Followup: MachineOwner
    ---------

    2: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If kernel debugger is available get stack backtrace.
    Arguments:
    Arg1: 00000000, memory referenced
    Arg2: 00000002, IRQL
    Arg3: 00000000, value 0 = read operation, 1 = write operation
    Arg4: 88c95dad, address which referenced memory

    Debugging Details:
    ------------------


    READ_ADDRESS:  00000000

    CURRENT_IRQL:  2

    FAULTING_IP:
    tcpip!FlpReturnNetBufferListChain+35
    88c95dad 8b08            mov     ecx,dword ptr [eax]

    DEFAULT_BUCKET_ID:  INTEL_CPU_MICROCODE_ZERO

    BUGCHECK_STR:  0xD1

    PROCESS_NAME:  System

    TRAP_FRAME:  942d1ab0 -- (.trap 0xffffffff942d1ab0)
    ErrCode = 00000000
    eax=00000000 ebx=861fee30 ecx=856ad918 edx=829429c0 esi=861feed0 edi=ffffffac
    eip=88c95dad esp=942d1b24 ebp=942d1b38 iopl=0         nv up ei ng nz na pe nc
    cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010286
    tcpip!FlpReturnNetBufferListChain+0x35:
    88c95dad 8b08            mov     ecx,dword ptr [eax]  ds:0023:00000000=????????
    Resetting default scope

    LAST_CONTROL_TRANSFER:  from 88c95dad to 828537eb

    STACK_TEXT: 
    942d1ab0 88c95dad badb0d00 829429c0 829428c0 nt!KiTrap0E+0x2cf
    942d1b38 88abfb48 861fee30 00000001 00000000 tcpip!FlpReturnNetBufferListChain+0x35
    942d1b58 88ac121c 84a83d70 00000000 00000000 NETIO!NetioDereferenceNetBufferList+0xa2
    942d1b88 88c97b40 00000000 00000000 00000000 NETIO!NetioDereferenceNetBufferListChain+0x3a
    942d1ba8 88c98fc0 8567e000 00000000 85647670 tcpip!IppCompleteAndFreePacketList+0xd7
    942d1bec 88c96b64 88cf8d98 00000011 84a83d70 tcpip!IppReceiveHeaderBatch+0x28c
    942d1c80 88cd3fad 861250f8 00000000 00000001 tcpip!IpFlcReceivePackets+0xbe5
    942d1ca0 88d60197 02000000 00000001 0000000b tcpip!IppInspectInjectReceive+0xca
    942d1cd8 931151b0 860be740 00000000 00000000 fwpkclnt!FwpsInjectTransportReceiveAsync0+0x1bc
    942d1d1c 931152fa 00000000 00000000 84cd8d48 inspect!TLInspectCloneReinjectInbound+0xc2 [p:\winddk\7600.16385.0\src\network\trans\inspect\sys\inspect.c @ 1033]
    942d1d50 82a1b66d 00000000 b388620e 00000000 inspect!TLInspectWorker+0xd2 [p:\winddk\7600.16385.0\src\network\trans\inspect\sys\inspect.c @ 1216]
    942d1d90 828cd0d9 93115228 00000000 00000000 nt!PspSystemThreadStartup+0x9e
    00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x19


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    NETIO!NetioDereferenceNetBufferList+a2
    88abfb48 85ff            test    edi,edi

    SYMBOL_STACK_INDEX:  2

    SYMBOL_NAME:  NETIO!NetioDereferenceNetBufferList+a2

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: NETIO

    IMAGE_NAME:  NETIO.SYS

    DEBUG_FLR_IMAGE_TIMESTAMP:  4a5bbf63

    FAILURE_BUCKET_ID:  0xD1_NETIO!NetioDereferenceNetBufferList+a2

    BUCKET_ID:  0xD1_NETIO!NetioDereferenceNetBufferList+a2

    Followup: MachineOwner
    ---------




    Tuesday, August 11, 2009 7:47 AM

All replies

  • Had the sample been modified?

    thanks,
    Biao.W.
    Friday, August 14, 2009 5:40 PM
  • No, the sample is unmodified. It's just the sample on a clean Windows 7 install, with no third-party software installed.

    I can make the dump available for you to download.

    Regards
    Frank
    Saturday, August 15, 2009 12:19 AM
  • Yes, please send an email to wfp@microsoft.com with instructions on how to download the memory dump.

    Thanks!
    Biao.W.

    • Proposed as answer by shuishangxin Thursday, November 25, 2010 10:34 AM
    Saturday, August 15, 2009 1:12 AM
  • Any progress? My driver works fine on Vista, but the same problem happens on Windows 7. The call stack is the same as above. I can provide the kernel memory dump.

    Thanks
    xiaolin

    • Proposed as answer by shuishangxin Thursday, November 25, 2010 10:34 AM
    Friday, August 28, 2009 6:39 AM
  • Is your driver reusing the same logic as below? Does the ASSERT fire prior to the bugcheck?

       //
       // The TCP/IP stack could have retreated the net buffer list by the
       // transportHeaderSize amount; detect the condition here to avoid
       // retreating twice.
       //
       if (nblOffset != packet->nblOffset)
       {
          ASSERT(packet->nblOffset - nblOffset == packet->transportHeaderSize);
          packet->transportHeaderSize = 0;
       }
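
    The check quoted above is plain arithmetic on data offsets, and can be tried outside the kernel. The following is a minimal user-mode sketch, assuming a hypothetical `packet_model` struct in place of the sample's pending-packet record; `packet_model` and `remaining_transport_retreat` are illustrative names, not WFP or NDIS APIs. It returns how many transport-header bytes the worker would still need to retreat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sample's per-packet record (not a WFP type). */
typedef struct packet_model {
    uint32_t nbl_offset;            /* data offset recorded in classifyFn      */
    uint32_t transport_header_size; /* bytes the worker plans to retreat later */
} packet_model;

/*
 * Given the data offset observed by the worker, return the number of
 * transport-header bytes that still have to be retreated. If the offset
 * already moved, the retreat is assumed to have been done by the stack,
 * so the planned amount is cleared to avoid retreating twice.
 */
static uint32_t remaining_transport_retreat(packet_model *packet,
                                            uint32_t current_offset)
{
    if (current_offset != packet->nbl_offset) {
        /* The only expected difference is exactly one transport header. */
        assert(packet->nbl_offset - current_offset ==
               packet->transport_header_size);
        packet->transport_header_size = 0;
    }
    return packet->transport_header_size;
}
```

    With a recorded offset of 40 and a 20-byte header, an observed offset of 40 leaves 20 bytes to retreat, while an observed offset of 20 leaves none. Any other observed offset trips the assertion, which is the condition the question above asks about.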

    Wednesday, September 2, 2009 4:38 AM
  • Hi, this is an update:

    We believe we have root-caused the issue behind the Inspect sample crashing on Win7 machines. In a nutshell, it is a Tcpip/WFP interaction bug exposed by some raw socket listener changes made in Win7.

    As a workaround, you can modify the sample so that the packet is cloned from within classifyFn, instead of being referenced in classifyFn and cloned outside of it.


    The sample's approach (calling FwpsReferenceNetBufferList0 from classifyFn and then cloning the packet from the worker thread) makes it susceptible to the subtle changes introduced in Win7.

    If the workaround is not desirable, you can also contact MSFT PSS for a hotfix.
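
    The change in ordering that the workaround asks for can be pictured with a user-mode toy model. This is only a sketch: every name here (`buffer_model`, `model_clone`, `classify_clone_now`, and so on) is hypothetical, none of it is WFP API, and it does not reproduce the Win7 bug itself. It also assumes, for modeling purposes only, that a clone holds one reference on its parent until the clone is freed. The point is that the classify step produces the clone right away and the worker only ever sees the clone.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy buffer; not a NET_BUFFER_LIST. */
typedef struct buffer_model {
    int refs;                    /* outstanding holds on this buffer */
    struct buffer_model *parent; /* non-NULL only for a clone        */
} buffer_model;

/* Create a clone; modeling assumption: the clone pins its parent. */
static buffer_model *model_clone(buffer_model *original)
{
    buffer_model *clone = calloc(1, sizeof *clone);
    if (clone == NULL) {
        return NULL;
    }
    clone->refs = 1;
    clone->parent = original;
    original->refs++;
    return clone;
}

/* Free a clone and drop the hold it had on its parent. */
static void model_free_clone(buffer_model *clone)
{
    assert(clone != NULL && clone->parent != NULL);
    clone->parent->refs--;
    free(clone);
}

/* Workaround shape: clone inside the classify step, queue only the clone. */
static buffer_model *classify_clone_now(buffer_model *original)
{
    return model_clone(original);
}

/*
 * Worker step: it receives the clone, "injects" it, and the completion
 * path frees the clone. Returns the parent's reference count afterwards.
 */
static int worker_inject_and_complete(buffer_model *queued)
{
    buffer_model *parent = queued->parent;
    model_free_clone(queued);
    return parent->refs;
}
```

    In this model the parent's count goes from 1 to 2 when the classify step clones it and returns to 1 once the worker's completion path frees the clone. In a real callout the cloning call would be FwpsAllocateCloneNetBufferList0, with error handling that this toy omits.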

    Thanks,
    Biao.W.

    • Marked as answer by Biao Wang [MSFT] Thursday, February 25, 2010 8:02 PM
    • Unmarked as answer by smilish Wednesday, February 9, 2011 2:24 PM
    Thursday, February 25, 2010 8:02 PM
  • Hello,

    I want to know what the final solution to the problem is.

    Thanks!

    Thursday, November 25, 2010 10:33 AM
  • Yes, please send an email to wfp@microsoft.com with instructions on how to download the memory dump.

    Thanks!
    Biao.W.


    hello:

          I want to know how you solve the problem finally.

                         thanks!

    Thursday, November 25, 2010 10:35 AM
  • Hi.

    I implemented your suggestion and now clone the NetBufferList within my classifyFn. The problem still persists, although it is much harder to reproduce after the modification.

    Is this problem fixed in the upcoming Win7 SP1? I could live with that.

    Thanks

    Frank

    Wednesday, February 9, 2011 2:32 PM
  • Hi All,

    The fix for this issue is complex and was therefore not included in the Win7 SP1 release. If you need a hotfix, please contact Microsoft customer service with a hotfix request, and we will consider releasing the fix as a hotfix.

    Thanks,

    Charlie
    Tuesday, May 3, 2011 11:15 PM
  • Has there ever been a resolution to this issue? The inspect sample still crashes sporadically on Windows 7. Thank you.
    Friday, February 24, 2012 1:21 AM
  • As previously stated, you will need to contact Microsoft Product Support to request a fix.

    Thanks,


    Dusty Harper [MSFT]
    Microsoft Corporation
    ------------------------------------------------------------
    This posting is provided "AS IS", with NO warranties and confers NO rights
    ------------------------------------------------------------

    Friday, February 24, 2012 3:44 AM
    Moderator
  • For the life of me, I cannot find what you mean by "Microsoft Product Support". I found a promising tech support contact process, but it stopped pretty early because I am outside of the US. So instead I've posted the request in the most technical support place I can find for Microsoft:

    http://answers.microsoft.com/en-us/windows/forum/windows_7-networking/request-for-windows-7-hotfix-re-crash-with-inspect/3600c4e1-fab0-4dec-bb95-8d354794f7ab

    Hopefully someone there will tell me exactly where I can submit a hotfix development request. If you would like to see this patch implemented too, then you can click "Me Too" under "X people had this question"

    Friday, March 2, 2012 12:42 PM
  • Microsoft Product Support:

    http://support.microsoft.com/common/international.aspx?RDPATH=dm;en-us;select&target=assistance

    If your company has a Microsoft Technical Account Manager (TAM), then you can contact them as well.

    Unfortunately forums are not able to push for fixes (although they do help to bring up the issues and give us justifications for the fixes).  The requests need to come in through the official Microsoft Product support channels.

    Hope this helps,


    Dusty Harper [MSFT]
    Microsoft Corporation
    ------------------------------------------------------------
    This posting is provided "AS IS", with NO warranties and confers NO rights
    ------------------------------------------------------------

    Friday, March 2, 2012 5:15 PM
    Moderator
  • I'm running into this exact same issue, and before I spend more hours trying to get through the customer support gauntlet, let's see if we can solve this here.

    Two years on, there is still no fix for Windows 7. Where is this supposedly non-public hotfix? As of November 20th, 2014, with all the latest service packs and updates installed, there are still issues when deferring the cloning of a packet for inspection/modification/reinjection to a lower-priority system thread.

    My current target of interest HAS to be Windows 7 Professional x64. The kernel driver was developed using Visual Studio 2013 Professional, with DDK 8.1, and borrows heavily from the inspect example to get this working.

    Is the below method supported, or not?

    1. FwpsReferenceNetBufferList is called on the NET_BUFFER_LIST (NBL) in the inbound classify callout.
    2. That same NBL is then handed off to a previously created system thread.
    3. The alerted system thread then processes the NBL it was handed.
    4. Packet is then re-injected via FwpsInjectNetworkSendAsync OR FwpsInjectNetworkReceiveAsync (have tried both)
    5. Upon injection completion, the registered callback frees and dereferences the NBL via calls to FwpsFreeCloneNetBufferList, then FwpsDereferenceNetBufferList

    So what's the problem here? Is FwpsReferenceNetBufferList failing to increment all the reference counts in the NBL chain? We can see from all the supplied stack traces that the bug check results from a call to NETIO!NetioDereferenceNetBufferList on a NULL NBL.

    Is FwpsReferenceNetBufferList incompatible with the kernel's injection implementation calling NetioDereferenceNetBufferList?

    Is there a NetioReferenceNetBufferList the inbound callout classify function should have called instead of the Fwps version?

    __________________________________

    STACK_TEXT:  
    fffff880`02f2ee18 fffff800`028d2169 : 00000000`0000000a 00000000`00000000 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
    fffff880`02f2ee20 fffff800`028d0de0 : fffff880`02f2f290 fffff880`02f2f2b0 00000000`00001002 fffffa80`103af3b0 : nt!KiBugCheckDispatch+0x69
    fffff880`02f2ef60 fffff880`01a9556b : fffffa80`103af3b0 fffff880`018346b8 00000000`206c644d 00000000`00000000 : nt!KiPageFault+0x260
    fffff880`02f2f0f0 fffff880`01987316 : fffffa80`103af3b0 00000000`01985d4b 00000000`00000000 00000000`00000000 : tcpip! ?? ::FNODOBFM::`string'+0x57b4
    fffff880`02f2f140 fffff880`01986a72 : 00000000`00000000 00000000`00000000 00000000`00000002 00000000`00001000 : NETIO!NetioDereferenceNetBufferList+0x86
    fffff880`02f2f170 fffff880`01a5b232 : 00000000`00000000 fffffa80`10fd3100 fffff880`02f2f2c0 fffffa80`00000000 : NETIO!NetioDereferenceNetBufferListChain+0x332
    fffff880`02f2f240 fffff880`01a3e28f : fffff880`01b6e9a0 00000000`00000000 00000000`00000000 fffff880`02f2f3d8 : tcpip!IppReceiveHeaderBatch+0x3c3
    fffff880`02f2f320 fffff800`028de878 : fffff880`01b6e9a0 00000000`00000000 00000000`00000000 00000000`00000000 : tcpip!IppLoopbackTransmit+0x38f
    fffff880`02f2f3d0 fffff880`01a3e92f : fffff880`01a916fc fffffa80`10662660 fffff880`02f2f502 00000000`00000000 : nt!KeExpandKernelStackAndCalloutEx+0xd8
    fffff880`02f2f4b0 fffff880`01a5d4ca : fffffa80`10fd31c0 00000000`00000030 fffffa80`10662600 fffffa80`10752db0 : tcpip!IppLoopbackEnqueue+0x22f
    fffff880`02f2f560 fffff880`01a5ebf5 : 00000000`00000000 fffffa80`00000000 fffffa80`676e7000 00000000`00000000 : tcpip!IppDispatchSendPacketHelper+0x38a
    fffff880`02f2f620 fffff880`01a5de7e : fffffa80`0d216811 fffff880`02f2f900 00000000`00000014 fffffa80`00000000 : tcpip!IppPacketizeDatagrams+0x2d5
    fffff880`02f2f740 fffff880`01b3785f : fffffa80`10752db0 fffff880`01b6e900 00000000`00000000 fffffa80`10752db0 : tcpip!IppSendDatagramsCommon+0x87e
    fffff880`02f2f8e0 fffff880`0183512d : 00000000`00000001 00000000`00000000 00000000`00000000 fffffa80`127644b0 : tcpip!IppInspectInjectRawSend+0x15f
    fffff880`02f2fa80 fffff880`04e314d4 : fffffa80`127645e0 00000000`00000008 00000000`00000000 fffff800`00000001 : fwpkclnt!FwpsInjectNetworkSendAsync0+0x1c1
    fffff880`02f2fb30 fffff880`04e3274e : fffff800`02a5bc00 00000000`00000080 00000000`00000000 00000000`00000000 : xxx!_xxxCloneModifyReinjectInbound+0x254 [xxx.c @ xxx]
    fffff880`02f2fba0 fffff800`02b6e73a : fffffa80`0d1a03a0 fffffa80`0cd8db30 fffff880`02f2fc70 fffffa80`0d1a03a0 : xxx!xxxWorker+0x9e [xxx.c @ xxx]
    fffff880`02f2fc00 fffff800`028c38e6 : fffff800`02a4de80 fffffa80`0d1a03a0 fffff800`02a5bcc0 fffff880`02ebdde0 : nt!PspSystemThreadStartup+0x5a
    fffff880`02f2fc40 00000000`00000000 : fffff880`02f30000 fffff880`02f2a000 fffff880`02f2f8f0 00000000`00000000 : nt!KxStartSystemThread+0x16


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    NETIO!NetioDereferenceNetBufferList+86
    fffff880`01987316 4885ff                  test    rdi,rdi

    SYMBOL_STACK_INDEX:  4

    SYMBOL_NAME:  NETIO!NetioDereferenceNetBufferList+86

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: NETIO

    IMAGE_NAME:  NETIO.SYS

    DEBUG_FLR_IMAGE_TIMESTAMP:  5294760d

    FAILURE_BUCKET_ID:  X64_0xD1_NETIO!NetioDereferenceNetBufferList+86

    BUCKET_ID:  X64_0xD1_NETIO!NetioDereferenceNetBufferList+86

    ANALYSIS_SOURCE:  KM

    FAILURE_ID_HASH_STRING:  km:x64_0xd1_netio!netiodereferencenetbufferlist+86

    FAILURE_ID_HASH:  {b03a5328-38ca-1e4a-8e26-dcd45efb9256}

    Followup: MachineOwner

    Thursday, November 20, 2014 11:18 PM
  • I had the same problem on Windows 7.

    For me it happened when I tried to reinject a cloned or copied OUTBOUND EOF NBL.

    So my Windows 7 workaround is to reference all NBLs that carry data, clone the INBOUND EOF NBL, and skip the OUTBOUND EOF NBL. Later, when I need to inject the OUTBOUND EOF back into the stream, I create a new zero-length NBL using FwpsAllocateNetBufferAndNetBufferList and inject it into the stream.

    On Windows 8 and later, referencing all packets works fine.
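
    The Windows 7 workaround described in this post boils down to a small decision table per stream segment. A minimal user-mode sketch of that table follows, assuming hypothetical names throughout (`direction_model`, `action_model`, `choose_action`); none of these are WFP types or flags, and a segment that is not EOF is assumed to be a data-bearing one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enums for illustration; not WFP stream flags. */
typedef enum { DIR_INBOUND, DIR_OUTBOUND } direction_model;

typedef enum {
    ACTION_REFERENCE,        /* data segment: reference the NBL            */
    ACTION_CLONE,            /* inbound EOF: clone the NBL                 */
    ACTION_SKIP_AND_RECREATE /* outbound EOF: keep nothing; later inject a */
                             /* freshly allocated zero-length NBL instead  */
} action_model;

/* Pick how to hold on to one stream segment under the Windows 7 workaround. */
static action_model choose_action(direction_model dir, bool is_eof)
{
    assert(dir == DIR_INBOUND || dir == DIR_OUTBOUND);

    if (!is_eof) {
        /* Assumption: every non-EOF segment carries data. */
        return ACTION_REFERENCE;
    }
    return (dir == DIR_INBOUND) ? ACTION_CLONE : ACTION_SKIP_AND_RECREATE;
}
```

    Only the outbound EOF case avoids touching the original NBL at all, which matches the observation that reinjecting a cloned or copied outbound EOF NBL was what triggered the crash for this poster.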


    Thursday, December 18, 2014 12:47 PM