Issue injecting ARP packets

  • Question

  • Hello.

     

    We have an LWF driver, based on the WDK NDIS LWF driver, that intercepts NDIS packets.

     

    The associated INF file is configured as follows:

     

    - FilterRunType -> 0x00010001, 2 (Optional).

    - FilterMediaTypes -> "ethernet, tokenring, fddi, wan"

    - FilterClass -> "compression"

     

    Our LWF driver attaches to Ethernet and WiFi devices. The filter drops or reinjects intercepted network packets (by means of NdisFSendNetBufferLists, in the outbound case); this works well on Windows Vista, Windows 7, Windows 8 and Windows 8.1. Sometimes, however, we need to inject an outbound crafted ARP packet generated by the filter itself (not an intercepted one). That injection appears to fail: when NDIS calls our FilterReturnNetBufferLists function, the NBL status field contains the following error code:

     

    0xC023001F

    STATUS_NDIS_MEDIA_DISCONNECTED

    The I/O operation failed because the network media is disconnected or the wireless access point is out of range.
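
    For reference, the status value can be split into the standard NTSTATUS bit fields (severity, facility, code). The sketch below is only an illustration in portable C; the names status_fields and decode_status are made up for this example and are not part of NDIS.

```c
#include <stdint.h>

/* Fields of a 32-bit NTSTATUS / NDIS_STATUS value (hypothetical helper type). */
typedef struct {
    unsigned severity;  /* bits 31..30: 0 success, 1 informational, 2 warning, 3 error */
    unsigned facility;  /* bits 27..16 */
    unsigned code;      /* bits 15..0 */
} status_fields;

/* Split a status value into its bit fields. */
static status_fields decode_status(uint32_t status)
{
    status_fields f;
    f.severity = (status >> 30) & 0x3u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}
```

    Applied to 0xC023001F this gives severity 3 (error), facility 0x23 and code 0x1F, so the value is a genuine failure rather than an informational status.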

     

    This happens with WiFi adapters on Windows 8 and Windows 8.1. As far as we can tell, it may have nothing to do with the ARP packet crafting itself, because unloading and reloading the LWF filter fixes the problem.
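
    The thread does not show how the filter builds its ARP frame. As a point of comparison, here is a minimal sketch of a broadcast ARP request laid out as an Ethernet frame, assuming IPv4 over an 802.3 media type. build_arp_request and put_be16 are hypothetical names, and the code is plain C rather than driver code (a real filter would place these bytes in the MDL of the NET_BUFFER it sends).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ETH_HDR_LEN = 14, ARP_BODY_LEN = 28, ARP_FRAME_LEN = ETH_HDR_LEN + ARP_BODY_LEN };

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

/* Write a broadcast ARP request ("who has target_ip? tell sender_ip") into out.
   Returns the number of bytes written, or 0 if the buffer is missing or too small. */
static size_t build_arp_request(uint8_t *out, size_t out_len,
                                const uint8_t sender_mac[6],
                                const uint8_t sender_ip[4],
                                const uint8_t target_ip[4])
{
    if (out == NULL || out_len < ARP_FRAME_LEN)
        return 0;

    /* Ethernet header */
    memset(out, 0xFF, 6);             /* destination: broadcast */
    memcpy(out + 6, sender_mac, 6);   /* source MAC */
    put_be16(out + 12, 0x0806);       /* EtherType: ARP */

    /* ARP body */
    put_be16(out + 14, 1);            /* hardware type: Ethernet */
    put_be16(out + 16, 0x0800);       /* protocol type: IPv4 */
    out[18] = 6;                      /* hardware address length */
    out[19] = 4;                      /* protocol address length */
    put_be16(out + 20, 1);            /* opcode: request */
    memcpy(out + 22, sender_mac, 6);  /* sender hardware address */
    memcpy(out + 28, sender_ip, 4);   /* sender protocol address */
    memset(out + 32, 0, 6);           /* target hardware address: unknown */
    memcpy(out + 38, target_ip, 4);   /* target protocol address */

    return ARP_FRAME_LEN;
}
```

    The frame is 42 bytes long; some drivers pad short frames to the 60-byte Ethernet minimum, which this sketch does not do.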

     

    We suspect it could be a problem with the adapter bindings, so we traced the process and observed the following sequence:

     

    (ISSUE REPRODUCED)

     

    FilterAttach

                    ASUS Set to paused

    FilterRestart

                    ASUS Set to running

                   

    FilterAttach

                    WiFi Direct set to paused

    FilterRestart

                    WiFi Direct Set to running

    FilterPause

                    WiFi Direct Set to pausing

                    WiFi Direct Set to paused

    FilterRestart

                    WiFi Direct Set to running

     

    (ISSUE FIXED)

     

    FilterAttach

                    WiFi Direct set to paused

    FilterRestart

                    WiFi Direct set to running

     

    FilterAttach

                    ASUS Set to paused

    FilterRestart

                    ASUS Set to running

     

    Could the binding process be related to the issue? What might be happening?

     

    We would appreciate any help with this issue. Thank you in advance.

    Thursday, June 12, 2014 4:55 PM

All replies

  • What port number are you sending the ARP frames on? NDIS_DEFAULT_PORT_NUMBER?

    When are you sending the ARP frames?  Based on the status code, I think the send operation might be occurring before the NIC has finished associating with an AP.

    One more thing to check: is your filter binding above or below the NWIFI filter driver?  Use !ndiskd.miniport and share out the BINDINGS section.  The media connect state is implemented by the NWIFI filter driver, so any filter below it will see "wrong" values.

    Monday, June 16, 2014 7:20 PM
  • Hi Jeffrey, thanks for your reply.

    - Yes, we are using NDIS_DEFAULT_PORT_NUMBER. Is this a problem?

    - We are sending ARP frames after another (user mode) module detects the "wlan_notification_acm_connection_complete" network event. We think the NIC has already associated with the AP by the time this event is fired. Is this correct?

    This is the output for !ndiskd.miniport command:

    0: kd> !ndiskd.miniport
        MiniDriver         Miniport            Name                                 
        8f13d948           8f13e0e8            Adaptador virtual directo Wi-Fi de Microsoft
        85ca5d90           85cc20e8            Realtek RTL8188CU Wireless LAN 802.11n USB 2.0 Network Adapter
        85a57d90           85ba30e8            Minipuerto WAN (PPPOE)
        85b9f618           85b9d0e8            Minipuerto WAN (L2TP)
        85aff998           85b9e0e8            Minipuerto WAN (Monitor de red)
        85ad8010           85b9b0e8            Adaptador ISATAP de Microsoft #2
        85ad8010           85b970e8            Adaptador ISATAP de Microsoft
        85b99c88           85b980e8            Minipuerto WAN (PPTP)
        85aff998           85ae90e8            Minipuerto WAN (IPv6)
        85ad8010           85ade0e8            Teredo Tunneling Pseudo-Interface
        85ac7010           85ad60e8            Minipuerto WAN (IKEv2)
        85ac9588           85ad11d8            Minipuerto WAN (SSTP)
        85aff998           85afe0e8            Minipuerto WAN (IP)

    And this is the BINDINGS section for the Wifi adapter (our filter is bold-marked):

    0: kd> !ndiskd.miniport 85cc20e8


    MINIPORT

        Realtek RTL8188CU Wireless LAN 802.11n USB 2.0 Network Adapter

        Ndis handle        85cc20e8
        Ndis API version   v6.30
    *** ERROR: Module load completed but symbols could not be loaded for rtwlanu.sys
        Adapter context    860f7000
        Miniport driver    85ca5d90 - RtlWlanu  v1.0
        Network interface  85877008

        Media type         802.3
        Physical medium    Native802.11
        Device instance    USB\VID_0BDA&PID_8176\00e04c000001
        Device object      85cc2030            More information
        MAC address        a0-f3-c1-27-dc-69


    STATE

        Miniport           Running
        Device PnP         Started             Show state history
        Datapath           Normal
        Interface          Up
        Media              Connected
        Power              D0
        References         0n152               Show detail
        Automatic resets   2
        Resets requested   0
        Pending OID        None
        Flags              NOT_BUS_MASTER, DEFAULT_PORT_ACTIVATED,
                           SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,
                           MEDIA_CONNECTED
        PnP flags          PM_SUPPORTED, DEVICE_POWER_ENABLED, RECEIVED_START,
                           HARDWARE_DEVICE, NDIS_WDM_DRIVER


    BINDINGS

        Open List          Open                Protocol           Context           
        RSPNDR             8f036608            8f0335a0           8f036828
        NDISUIO            8f0209b0            8f02de10           8f02d7b0
        LLTDIO             8f00f7f8            9bff3860           8f00e008
        TCPIP6             85b69008            85869598           8f00d840
        TCPIP              8f008468            85819710           8f00bb40

        Filter List        Filter              Filter Driver      Context           
        WFP 802.3 MAC Layer LightWeight Filter-0000
                           86048008            85819008           860568f8
        QoS Packet Scheduler-0000
                           86048888            85966e78           86048728
        Network Activity Hook Server - LightWeight Filter-0000
                           8f008688            85a63950           8f00c008
        Native WiFi Filter Driver-0000
                           9bfffdf0            86f5fd00           8f002508
        Virtual WiFi Filter Driver-0000
                           86f77938            85a589b8           9acb3528
        WFP Native MAC Layer LightWeight Filter-0000
                           86ffedf0            85870908           86f7ea40


    MORE INFORMATION

        Driver handlers                        Task offloads
        Power management                       PM protocol offloads
        Pending OIDs                           Timers
        Pending NBLs
        Wake-on-LAN (WoL)                      Packet filter
        Receive queues                         Receive filtering
        RSS                                    NIC switch
        Hardware resources                     Selective suspend
        NDIS ports                             WMI guids

    Maybe this is noteworthy: the "Automatic resets" line is colored in red. Does that mean something is wrong?

    We also have some NDIS traces, but we really don't know what to look for. Any pointers?

    Thanks again for your help!





    Tuesday, June 17, 2014 8:33 AM
  • I'm afraid I don't have any concrete solution for you.  Here are some thoughts and answers to your incidental questions.

    No, DEFAULT_PORT is correct; I just wanted to verify that you were using that.

    Your filter binds above NWIFI, so it shouldn't get into trouble with the media state.

    I think wlan_notification_acm_connection_complete should be a fine trigger.

    The "automatic resets" is indeed not good, but it actually is typical of 802.11 NICs. These NICs encounter issues fairly often. Unless the reset counter always increases when you see the problem, I don't think it's related to the issue here.

    NDIS traces won't help much, unfortunately, since NDIS doesn't trace datapath events. (The tracing infrastructure can't scale nearly as much as NDIS needs to scale up to.)

    I wonder if this is a race of some sort?  What if you insert an artificial delay into the send path?  For example, wait 5000 milliseconds.  Or more interestingly, send an ARP every 16 milliseconds and see which of them comes back success/failure?  If it's a race, then you might need to look for a better event to trigger this off of.
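
    One way to run the "send an ARP every 16 milliseconds" experiment is to record the completion status of each probe and look for the point where failures turn into successes. The following is only a sketch of that bookkeeping in portable C; probe_log and its functions are hypothetical names, and in a real filter the record call would be made from the send-complete path with whatever timestamp the driver already keeps.

```c
#include <stddef.h>
#include <stdint.h>

#define PROBE_LOG_CAPACITY 64

typedef struct {
    uint32_t elapsed_ms;  /* time since the trigger event when the probe was sent */
    uint32_t status;      /* completion status reported for the probe */
} probe_result;

typedef struct {
    probe_result entries[PROBE_LOG_CAPACITY];
    size_t count;
} probe_log;

/* Append one probe result. Returns 1 on success, 0 if the log is full. */
static int probe_log_record(probe_log *log, uint32_t elapsed_ms, uint32_t status)
{
    if (log->count >= PROBE_LOG_CAPACITY)
        return 0;
    log->entries[log->count].elapsed_ms = elapsed_ms;
    log->entries[log->count].status = status;
    log->count++;
    return 1;
}

/* Elapsed time of the first probe that completed with status 0 (success),
   or -1 if no probe succeeded. */
static long probe_log_first_success_ms(const probe_log *log)
{
    for (size_t i = 0; i < log->count; i++) {
        if (log->entries[i].status == 0)
            return (long)log->entries[i].elapsed_ms;
    }
    return -1;
}
```

    If the first success always shows up a fixed interval after the trigger, that points to a race with the trigger event; if no probe ever succeeds, timing is probably not the cause.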

    Tuesday, June 17, 2014 11:43 PM
  • Hi Jeffrey,

    Thanks, we will try your suggestions.

    However, the fact that the issue goes away when we unload and reload our filter makes us think that it might be related to binding. Any thoughts on that?



    Thanks again!
    Wednesday, June 18, 2014 8:31 AM
  • There are a couple reasons I'm not sure it's directly related to binding. First, you said this repros in both Windows 8 and Windows 8.1.  We rewrote the binding logic in NDIS between those two releases, so it's not super likely that there would be the same OS bug in both.

    Secondly, binding actually doesn't interact with the datapath very much.  Although it's possible that a bug in NDIS's binding logic could break the datapath, I've only seen that once, and that was in Windows 7. 

    It seems more likely to me that unloading and re-loading your filter driver affects the timing of some sort of race condition.

    Wednesday, June 18, 2014 6:58 PM
  • Still trying to figure out this issue...

    Inserting some delays didn't help, but we have seen a couple of interesting things:

    Using !ndiskd.netreport to get a diagram of both cases (working and not working), we saw a different order in the "filter driver" subsection of the "Components" section:

    WORKING CASE
    ============
    <Our filter>
    Native WiFi Filter Driver
    Virtual WiFi Filter Driver
    QoS Packet Scheduler
    WFP native MAC Layer LWF
    WFP 802.3 MAC Layer LWF

    NOT-WORKING CASE
    ================
    Native WiFi Filter Driver
    <Our filter>
    Virtual WiFi Filter Driver
    QoS Packet Scheduler
    WFP native MAC Layer LWF
    WFP 802.3 MAC Layer LWF



    When the packet injection is working fine, our filter is "above" (at least in this list) NWIFI.
    When the packet injection doesn't work, our filter is "below" NWIFI.

    Could this be the problem? Could we change our place in the filtering stack?

    On the other hand, we are seeing (in both scenarios, good and bad) that our driver's FilterStatus callback is being called twice, with these values:

    NDIS_STATUS_DOT11_LINK_QUALITY (0x4003000C)
    NDIS_STATUS_LINK_STATE (0x40010017)

    Is this normal, or could it be relevant to our issue?

    Thanks!





    Monday, June 23, 2014 3:07 PM
  • Now that is very interesting & relevant.

    The media connect state will be wrong if your filter binds below NWIFI.SYS. (The reason: NWIFI actually synthesizes the media connect state on behalf of the miniport. Below NWIFI, the miniport just says "I'm connected" 100% of the time.)

    So you do not want your filter to bind below NWIFI.
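
    To compare the working and non-working cases mechanically, the filter names can be checked by position. This is a rough sketch that assumes the list is given top to bottom, the way the thread reads the !ndiskd.netreport output; whether that listing really reflects binding order is itself an assumption. find_filter and is_listed_above are hypothetical helpers.

```c
#include <stddef.h>
#include <string.h>

/* Index of name in stack, or -1 if it is not present. */
static int find_filter(const char *const stack[], size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(stack[i], name) == 0)
            return (int)i;
    }
    return -1;
}

/* Returns 1 if filter is listed before reference, 0 if it is listed after it
   (or is the same entry), and -1 if either name is missing from the list. */
static int is_listed_above(const char *const stack[], size_t n,
                           const char *filter, const char *reference)
{
    int f = find_filter(stack, n, filter);
    int r = find_filter(stack, n, reference);
    if (f < 0 || r < 0)
        return -1;
    return f < r ? 1 : 0;
}
```

    With the two lists from the previous post, the working case returns 1 for "<Our filter>" against "Native WiFi Filter Driver" and the non-working case returns 0.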

    Normally, a LWF won't bind below NWIFI unless the LWF is a monitoring LWF.  So my guess is that you have a monitoring LWF.  Check this by looking at FilterType in your INF:

    HKR, Ndi,FilterType,0x00010001, ????

    If the last field in the FilterType line is 1, then your filter is a monitoring filter.  If it's 2, then you have a modifying filter.
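
    The same mapping, together with the FilterRunType values mentioned in the original post (2 = optional; 1 = mandatory), can be written as a small lookup. This is just an illustrative sketch in plain C with hypothetical function names; NDIS itself reads these values from the INF.

```c
#include <stdint.h>

/* Human-readable name for the last field of the FilterType line in the INF. */
static const char *filter_type_name(uint32_t filter_type)
{
    switch (filter_type) {
    case 1:  return "monitoring";
    case 2:  return "modifying";
    default: return "unknown";
    }
}

/* Human-readable name for the last field of the FilterRunType line in the INF. */
static const char *filter_run_type_name(uint32_t run_type)
{
    switch (run_type) {
    case 1:  return "mandatory";
    case 2:  return "optional";
    default: return "unknown";
    }
}
```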

    I suspect you have a monitoring filter (FilterType  == 1).  If so, then NDIS will actually try to bind your filter EVERYWHERE.  That means the intended binding stack looks like this:

        Your Filter-0004
        Native Wifi (NWIFI)
        Your Filter-0003
        Virtual WiFi (VWIFI)
        Your Filter-0002
        QoS (PSCHED)
        Your Filter-0001
        WFP Native (WFP_Lower)
        Your Filter-0000
        The wireless NIC

    Since you don't see 5 copies of your filter, my guess is that your filter permits the first instance to attach, then fails the attach of the remaining instances.  (Check if your FilterAttach function has some code to check if it's the "first" instance, perhaps by parsing the name for a "-0000" string, or perhaps by consulting a global list.)

    Monday, June 23, 2014 7:39 PM
  • Hi Jeffrey, thanks for your response.

    Our driver is a modifying LWF, according to !ndiskd:

    Filter type        MODIFYING_FILTER
    Run type           OPTIONAL_FILTER
    Class              compression
    References         6

    And no, we don't have any code for dismissing instances based on that scheme.
    However, we do refuse to attach if MiniportMediaType is neither "NdisMedium802_3" nor "NdisMediumWan", or if MiniportPhysicalMediaType is neither NdisPhysicalMediumWirelessLan nor NdisPhysicalMediumNative802_11.

    Could this be the cause?
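
    Read literally, the attach check described above amounts to the predicate below. This is a sketch, not the poster's actual code: the enums are local stand-ins so it compiles outside the WDK (a real FilterAttach would compare the NDIS_MEDIUM and NDIS_PHYSICAL_MEDIUM values from its attach parameters), and should_attach is a hypothetical name.

```c
#include <stdbool.h>

/* Local stand-ins for the NDIS media enums; the values are illustrative only. */
typedef enum { MEDIUM_802_3, MEDIUM_WAN, MEDIUM_OTHER } medium_t;
typedef enum { PHYS_WIRELESS_LAN, PHYS_NATIVE_802_11, PHYS_802_3, PHYS_OTHER } phys_medium_t;

/* Attach only when both the media type and the physical medium are accepted. */
static bool should_attach(medium_t media, phys_medium_t phys)
{
    bool media_ok = (media == MEDIUM_802_3) || (media == MEDIUM_WAN);
    bool phys_ok  = (phys == PHYS_WIRELESS_LAN) || (phys == PHYS_NATIVE_802_11);
    return media_ok && phys_ok;
}
```

    Under this reading the Realtek adapter shown earlier (media type 802.3, physical medium Native802.11) is accepted, while a wired adapter whose physical medium is 802.3 would be rejected. The real driver may combine the two conditions differently.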

    OTOH, running !ndiskd.netreport on another machine with Windows 7 (where injection works fine) shows our filter under Native WiFi, so it's pretty clear that we are doing something that Windows 8 doesn't like... but what is it?

    Thanks again!

    Tuesday, June 24, 2014 2:10 PM