none
TDI filter driver interferes WFP filter driver!? RRS feed

  • Question

  • Hi,

     

    I wrote a WFP callout driver for out-of-band stream data inspection/modification (layer FWPS_LAYER_STREAM_V4/6). The driver blocks all data at the stream layer (for a specific port, e.g. 80 for http), routes the data through a user app and injects all data back to the network stack. This works fine so far.

     

    But there seems to be some kind of problem when there is also an TDI filter driver installed on the system: Internet connections get very slow. There sometimes is a really strange 250ms delay between receives and send request on one connection.

     

    Are there any known problems between TDI and WFP filter drivers when they run simultaneously? How is it possible that these types of drivers interfere each other?

     

    Thank you for your answer!

     

     

    Tuesday, August 28, 2007 1:47 PM

Answers

  • Unfortunately I don't see a workaround on the WFP side -- the way push bit is encoded in the NET_BUFFER_LIST is private implementation detail and can subject to change w/o notice. We would need to expose a public mean for callouts to check for presence of the Push bit as part of the fix.

     

    There may be ways to modify the data receving model of your TDI driver to workaround this delay, however. The filter driver with which we were able to reproduce the problem first rejects a receive indication and then re-posts buffer to receive it. This is not the way IE performs a receive. And that's why the delay is not present w/o the TDI driver.

     

    I am not familar with TDI enough to suggest what could be changed to mitigate this issue.

     

    Biao.W.

    Friday, September 14, 2007 11:15 PM

All replies

  • Do you own this TDI filter driver? Do you know what kind of filtering it performs? If you are not familiar with the driver, can you google the driver name to see if there is known perf issue by itself?

     

    Can you confirm the perf issue is not there if only the TDI driver is present? Is the 250ms delay inbound or outbound?

     

    Another area to investigate is whether this has anyting to do with the injection rate of the WFP driver -- for example, if timing conditions cause you to inject large amount of data back to the stack in a short period of time, it may overwhelm the buffering of the TDI driver and causing it to discard data. If that's the case you may want to rate-limit your injection code.

     

    Biao.W.  

    Tuesday, August 28, 2007 6:58 PM
  • Hi Biao,

     

    We have an own TDI driver in our product and this problem occurs when the TDI driver and the WFP driver is loaded at the same time. In my scenario the TDI driver does "nothing", just passing through. I don't believe that the TDI driver is buggy:

    I installed a product from a different company which also includes a TDI driver (just to test if it is our own TDI driver). If this product is installed, I can observe exactly the same problem. So it seems to be a more general problem.

     

    I can confirm that this does not happen if only the TDI driver is present. Also the data amount is not very large so I think this is not the problem.

     

    I have some debug output for you, so you can see what happens (I was wrong with the 250ms, it is exactly 500ms as you can see in the debug output), my comments are written in red:

     

    App (e.g. IExplorer) sends a request:

    08\29\2007-09:57:00:451 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:451 0x84409CA8 flags = 0x00010000

    08\29\2007-09:57:00:451 0x84409CA8 >>> SND: Bytes: 390: GET /images/I/21Ef9rIvR%2BL.jpg HTTP/1.1

    08\29\2007-09:57:00:451 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 390)

    The data is blocked and reinjected using a DPC. The data is correctly indicated at an inspection filter (stream layer):

    08\29\2007-09:57:00:451 0x84409CA8 -> DpcStreamInject(V4, 202, nBytes = 390)

    08\29\2007-09:57:00:451 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:451 0x84409CA8 >>> SND: Bytes: 390: GET /images/I/21Ef9rIvR%2BL.jpg HTTP/1.1

    08\29\2007-09:57:00:451 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:00:451 0x84409CA8 <- DpcStreamInject(V4, 202)

    Data is received 20ms later:

    08\29\2007-09:57:00:473 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:473 0x84409CA8 flags = 0x00000001

    08\29\2007-09:57:00:473 0x84409CA8 <<< RCV: Bytes: 614: HTTP/1.1 304 Not Modified

    08\29\2007-09:57:00:473 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 614)

    The received data is blocked and reinjected:

    08\29\2007-09:57:00:473 0x84409CA8 -> DpcStreamInject(V4, 203, nBytes = 614)

    08\29\2007-09:57:00:473 0x84409CA8 -> StreamInjectCompletion(V4, 202): nBytes = 390

    08\29\2007-09:57:00:473 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:473 0x84409CA8 <<< RCV: Bytes: 614: HTTP/1.1 304 Not Modified

    08\29\2007-09:57:00:473 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:00:473 0x84409CA8 <- DpcStreamInject(V4, 203)

    08\29\2007-09:57:00:473 0x84409CA8 -> StreamInjectCompletion(V4, 203): nBytes = 614

    All data is injected now and was indicated to the next stream filter, so everything should be OK. Now it takes exactly 500ms till the app sends the next request. Again, data is send and received:

    08\29\2007-09:57:00:973 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:973 0x84409CA8 flags = 0x00010000

    08\29\2007-09:57:00:973 0x84409CA8 >>> SND: Bytes: 338: GET /images/I/11W4MYC3E1L.jpg HTTP/1.1

    08\29\2007-09:57:00:973 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 338)

    08\29\2007-09:57:00:973 0x84409CA8 -> DpcStreamInject(V4, 209, nBytes = 338)

    08\29\2007-09:57:00:973 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:00:973 0x84409CA8 >>> SND: Bytes: 338: GET /images/I/11W4MYC3E1L.jpg HTTP/1.1

    08\29\2007-09:57:00:973 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:00:973 0x84409CA8 <- DpcStreamInject(V4, 209)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 flags = 0x00000001

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 1460: HTTP/1.1 200 OK

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 1460)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamInjectCompletion(V4, 209): nBytes = 338

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 flags = 0x00000001

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 1460: ªq̪HÈ¡K¹I»UÍÀŽ5îXⳞé®uX¬£9ÛóÊG ÿ

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 1460)

    08\29\2007-09:57:01:09 0x84409CA8 -> DpcStreamInject(V4, 210, nBytes = 1460)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 1460: HTTP/1.1 200 OK

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 flags = 0x00000001

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 702: ¬é#s,`7ûEzöß„.½öœ)ÁÙ·Ô Çð¯Ô•

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 702)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamInjectCompletion(V4, 210): nBytes = 1460

    08\29\2007-09:57:01:09 0x84409CA8 <- DpcStreamInject(V4, 210)

    08\29\2007-09:57:01:09 0x84409CA8 -> DpcStreamInject(V4, 211, nBytes = 1460)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 1460: ªq̪HÈ¡K¹I»UÍÀŽ5îXⳞé®uX¬£9ÛóÊG ÿ

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:01:09 0x84409CA8 <- DpcStreamInject(V4, 211)

    08\29\2007-09:57:01:09 0x84409CA8 -> DpcStreamInject(V4, 212, nBytes = 702)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamV4Callout_dbg(): flowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:09 0x84409CA8 <<< RCV: Bytes: 702: ¬é#s,`7ûEzöß„.½öœ)ÁÙ·Ô Çð¯Ô

    08\29\2007-09:57:01:09 0x84409CA8 <- StreamV4Callout_dbg()

    08\29\2007-09:57:01:09 0x84409CA8 <- DpcStreamInject(V4, 212)

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamInjectCompletion(V4, 211): nBytes = 1460

    08\29\2007-09:57:01:09 0x84409CA8 -> StreamInjectCompletion(V4, 212): nBytes = 702

    All data is received and reinjected, and again the next request takes more or less exactly 500ms to be send out:

    08\29\2007-09:57:01:517 0x84409CA8 -> StreamCallout(V4): uFlowHandle = 0x00000000000000a7

    08\29\2007-09:57:01:517 0x84409CA8 flags = 0x00010000

    08\29\2007-09:57:01:517 0x84409CA8 >>> SND: Bytes: 340: GET /images/I/11rW%2Biv1AmL.jpg HTTP/1.1

    08\29\2007-09:57:01:517 0x84409CA8 <- StreamCallout(V4, BLOCKING nBytes = 340)

     

    This all happens on one connection (always the same flow handle). If I just download a file for example, everything is fast and OK. Only those "ping-pong-connections" cause this strange 500ms delay.

     

    I think this should be easy for you to reproduce. Install a product which installs a TDI driver (I took a trial version of Kaspersky Internet Security 7.0). Then you need a callout driver which clones every stream packet and just injects it back to the network stack. Then you should be able to see the problem (for example www.amazon.de is good for testing).

     

    So, any idea what goes wrong here?

     

     

     

     

     

    Wednesday, August 29, 2007 9:04 AM
  • Hi Biao,

     

    Anything new on this topic? Did you already try to reproduce this problem? At least I would like to know if this problem is caused by my WFP driver or if this is a Windows Vista problem...

     

    Thanks for your answer.

     

    Boris

     

    Tuesday, September 4, 2007 11:55 AM
  • We will try to repro this issue. However, TDI is going to be obsoleted by WFP. Could you consider migrating all your drivers to WFP? Are there specific reasons or functionalities you prefer TDI over WFP?

     

    Charlie

    Tuesday, September 4, 2007 10:14 PM
  • Hi Charlie,

     

    We will migrate all of our drivers to WFP, that's not the point. But, what if our customers have some kind of TDI driver installed on their systems at home? This will (right now) mean, that my WFP driver will not work properly. This will definitely cause some support.

     

    I encountered another strange thing. As mentioned in another thread in this forum, I had the problem described in KB929547. This problem does not (!) occur, if our TDI driver is present! So I can decide between KB929547 (without TDI driver) and the problem described in this thread (with TDI driver). I know that KB929547 will be fixed with SP1, but if it is somehow possible, I would like to know if the problem described in this thread will also be fixed with SP1. I just want to make sure that my driver works fine in all situations...

     

    Best regards,

     

    Boris

     

     

    Wednesday, September 5, 2007 5:33 AM
  • We can repro the issue, will debug and come back to you soon later. Thanks for reporting.

     

    Charlie

    Friday, September 7, 2007 7:19 AM
  • Boris,

     

    Can you share the spec of the machine where you observed the delay? Is it a dual-core, or dual-proc machine?

     

    thanks,

    Biao.W.

    Friday, September 7, 2007 8:21 PM
  • My test machine is dual-core, but this problem is also present in a VirtualPC-Vista with 1 proc. Do you already know what the problem is?

     

    Boris

     

     

    Monday, September 10, 2007 3:45 PM
  • One more question -- What api you are using to create the Net buffer list for re-injection? FwpsCloneStreamData? FwpsAllocateCloneNetBufferList0? Or FwpsAllocateNetBufferAndNetBufferList0?

     

    I think we are making progress here in understanding the issue.

     

    Thanks,

    Biao.W.

    Tuesday, September 11, 2007 1:37 AM
  • I tried both: just cloning with FwpsCloneStreamData and building new NBLs with FwpsAllocateNetBufferAndNetBufferList. There is no difference: as soon as I start to block and reinject something, this delay occurs.

     

    Boris

     

     

    Tuesday, September 11, 2007 8:32 AM
  •  

    Hi Boris,

     

    We've identified the root cause for this delay -- the TCP Push bit encoded in the NET_BUFFER_LIST structure is lost during FwpsCloneStreamData or FwpsAllocateNetBufferAndNetBufferList; and coupled with the way TDI drivers receive network traffic the 500mS delay then manifests itself.

     

    We are investigating a fix for possible SP1 inclusion and would like to thank you for reporting this issue to us.

     

    If a hotfix is required or you consider this a must-fix for SP1, please contact your Microsoft representitive stating such.

     

    Thanks,

    Biao.W.

    Thursday, September 13, 2007 3:48 AM
  • So... is there any workaround? Is it possible to set the TCP push bit in the NBL manually?

     

    Thanks,

    Boris

     

     

    Thursday, September 13, 2007 8:20 AM
  • Unfortunately I don't see a workaround on the WFP side -- the way push bit is encoded in the NET_BUFFER_LIST is private implementation detail and can subject to change w/o notice. We would need to expose a public mean for callouts to check for presence of the Push bit as part of the fix.

     

    There may be ways to modify the data receving model of your TDI driver to workaround this delay, however. The filter driver with which we were able to reproduce the problem first rejects a receive indication and then re-posts buffer to receive it. This is not the way IE performs a receive. And that's why the delay is not present w/o the TDI driver.

     

    I am not familar with TDI enough to suggest what could be changed to mitigate this issue.

     

    Biao.W.

    Friday, September 14, 2007 11:15 PM
  •  

    Just a note that we are fixing this in Vista SP1 (and WS08).

     

    When a stream data w/ Push bit on is indicated, a new stream flag FWPS_STREAM_FLAG_RECEIVE_PUSH will be set.

     

    If the re-injected data is a clone of the original indication, the Push bit will be carried forward automatically. If injection data is allocated using FwpsAllocateNetBufferAndNetBufferList0 function, a callout can manually set the push bit flag when invoking FwpsStreamInjectAsync0 function.

     

    Biao.W.
    Friday, September 21, 2007 9:33 PM
  • Sounds good. Do you already know when there will be a new WS08 or SP1 beta with this bug fixed? Can you give me the #define for FWPS_STREAM_FLAG_RECEIVE_PUSH, so I can adjust my driver accordingly. Thanks!

     

    Boris

     

     

    Monday, September 24, 2007 8:15 AM
  • The fix should be included in WS08 RC1 (and the corresponding SP1 release).

     

    This is what you will see in fwpsk.h

     

    // Flags set on receives (inbound stream layer).
    #define FWPS_STREAM_FLAG_RECEIVE                (0x00000001)
    // Received TCP urgent data.
    #define FWPS_STREAM_FLAG_RECEIVE_EXPEDITED      (0x00000002)
    #define FWPS_STREAM_FLAG_RECEIVE_DISCONNECT     (0x00000004)
    #define FWPS_STREAM_FLAG_RECEIVE_ABORT          (0x00000008)
    #if (NTDDI_VERSION >= NTDDI_WIN6SP1)
    #define FWPS_STREAM_FLAG_RECEIVE_PUSH           (0x00000010)
    #endif // (NTDDI_VERSION >= NTDDI_WIN6SP1)

    // Flags set on sends (outbound stream layer).
    #define FWPS_STREAM_FLAG_SEND                   (0x00010000)
    #define FWPS_STREAM_FLAG_SEND_EXPEDITED         (0x00020000)
    #define FWPS_STREAM_FLAG_SEND_NODELAY           (0x00040000)
    #define FWPS_STREAM_FLAG_SEND_DISCONNECT        (0x00080000)
    #define FWPS_STREAM_FLAG_SEND_ABORT             (0x00100000)


     

    Biao.W.
    Monday, September 24, 2007 7:27 PM
  •  

    I see that RC0 was just released (which is a silly name IMHO).  Did you literally mean RC1, or did you really mean RC0 (the first RC)?
    Wednesday, September 26, 2007 5:08 PM
  • Yes I meant RC1 (the next RC release). This fix is not present in the just released RC0.

     

    We didn't get to pick those names.

     

    Biao.W.

    Wednesday, September 26, 2007 8:39 PM