Bandwidth problem on Windows 7

  • Question

  • Hello,

    We are developing a VPN product using a WFP callout driver. It works at the transport layer, but injects packets at the network layer.
    The driver has been working fine since the early versions of Vista, but on Windows 7 we see bandwidth problems when re-injecting packets using FwpsInjectNetworkSendAsync0.
    The driver blocks packets at FWPM_LAYER_OUTBOUND_TRANSPORT_V4, clones the NBL, and re-injects the clone. The IP addresses are rewritten to localhost for our proxy client, which receives the packets and sends them over a secure connection.
    Today, on Windows 7, the upstream bandwidth is approx. 90 KB/sec (measured with iperf), but on Vista it runs at full speed.

    Our driver filters:
    FWPM_LAYER_OUTBOUND_TRANSPORT_V4
    FWPM_LAYER_INBOUND_TRANSPORT_V4
    FWPM_LAYER_ALE_AUTH_CONNECT_V4
    FWPM_LAYER_ALE_AUTH_RECV_ACCEPT_V4

    We have tried splitting each NBL into multiple NBLs with one NB in each, as described in this post:
    http://social.msdn.microsoft.com/Forums/en-US/wfp/thread/78c0b998-a739-4f41-aa56-d776748195eb
    After that rewrite of the driver, the downstream bandwidth improved greatly, but upstream improved only by a factor of 3 (to approx. 300 KB/sec).

    /Andy
    Wednesday, February 3, 2010 1:23 PM

Answers

  • We use FwpsInjectNetworkReceiveAsync0 to inject received packets.

    We have also seen that we could increase speed by providing a unique completionContext to the injection routine. Previously, when splitting one NBL into several, we handed the same context to all of the resulting injections and freed the memory when the last completion routine was called. Instead of holding references to the same completion context, we now make an actual copy of the context for each injection.
    This increased the speed by a factor of 10.

    /Andy

    After some more performance testing, we found that the fixes above solved our performance problem. So our fix for Windows 7 was in fact three fixes:
    1. When injecting, only inject one NBL at a time. Create a new NBL for each NB.
    2. Provide a unique context for each SendAsync0 call. Do not try to be clever by sharing one context with a reference counter.
    3. Only do (1) for NBLs containing more than one NB. Otherwise clone the packet as in the old solution that works on Vista.

    /Andy
    Wednesday, February 10, 2010 7:19 AM
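
    The three fixes listed in the answer can be sketched in plain user-mode C. This is only an illustration of the ownership pattern, not the driver code from this thread: MockNb, Stats, InjectCtx, mock_inject_async, inject_one and inject_nbl are all hypothetical names, and mock_inject_async merely stands in for a call such as FwpsInjectNetworkSendAsync0. It runs the completion routine immediately, whereas a real injection completes later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one NET_BUFFER in an NBL's chain. */
typedef struct MockNb {
    size_t len;
    const struct MockNb *next;
} MockNb;

/* Counters so the behaviour can be checked from the outside. */
typedef struct Stats {
    int injections;
    int ctx_allocated;
    int ctx_freed;
    int split_nbls;   /* NBLs that went through fix 1 */
    int cloned_nbls;  /* NBLs that took the single-NB path (fix 3) */
    size_t bytes;
} Stats;

/* Fix 2: every injection owns its own context; no shared refcount. */
typedef struct InjectCtx {
    Stats *stats;
    size_t len;
} InjectCtx;

typedef void (*CompleteFn)(void *ctx);

/* Mock only: completes synchronously. A real injection is asynchronous. */
static void mock_inject_async(CompleteFn done, void *ctx)
{
    done(ctx);
}

/* The completion routine frees exactly the context it was given. */
static void on_inject_complete(void *p)
{
    InjectCtx *ctx = p;
    ctx->stats->ctx_freed++;
    free(ctx);
}

/* Inject one packet with a freshly allocated, private context. */
static int inject_one(Stats *stats, size_t len)
{
    InjectCtx *ctx = malloc(sizeof *ctx);
    if (ctx == NULL)
        return -1;
    ctx->stats = stats;
    ctx->len = len;
    stats->ctx_allocated++;
    stats->injections++;
    stats->bytes += len;
    mock_inject_async(on_inject_complete, ctx);
    return 0;
}

/* 'first' is the head of the NB chain of one (mock) NBL. */
static int inject_nbl(Stats *stats, const MockNb *first)
{
    if (first == NULL)
        return 0;
    if (first->next == NULL) {
        /* Fix 3: a single NB is injected as one cloned packet. */
        stats->cloned_nbls++;
        return inject_one(stats, first->len);
    }
    /* Fix 1: several NBs, so one injection per NB. */
    stats->split_nbls++;
    for (const MockNb *nb = first; nb != NULL; nb = nb->next) {
        if (inject_one(stats, nb->len) != 0)
            return -1;
    }
    return 0;
}
```

    Calling inject_nbl on a chain of three MockNb entries performs three injections, each allocating and later freeing its own InjectCtx. A chain with a single entry takes the clone path with one injection. Because each completion routine frees only its own context, no reference counter is shared between completions.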

All replies

  • Can you elaborate your upstream processing logic? Are you using WFP recv injection?

    Thanks,
    Biao.W.

    Friday, February 5, 2010 8:47 AM
  • We use FwpsInjectNetworkReceiveAsync0 to inject received packets.

    We have also seen that we could increase speed by providing a unique completionContext to the injection routine. Previously, when splitting one NBL into several, we handed the same context to all of the resulting injections and freed the memory when the last completion routine was called. Instead of holding references to the same completion context, we now make an actual copy of the context for each injection.
    This increased the speed by a factor of 10.

    /Andy
    Tuesday, February 9, 2010 10:01 AM
  • Hi Andy,

    We ran into the same problem as you. Could you please explain in detail what fixes you made? We have already split the NBL into multiple NBs. But how should we provide a unique context for each SendAsync0 call? Do we need to make a copy of the original NBL each time we call SendAsync0? And how can we control the lifetime of the MDLs owned by the original NBL?

    Thanks for your help in advance!

    Thanks,

    Michael

    Thursday, November 11, 2010 1:49 AM
  • Thanks for your instruction! It's quite useful.
    Friday, March 4, 2011 11:56 PM
  • Hi Petula,

    How did you do this? Could you please help me out?

    Thanks,

    Michael

    Wednesday, March 16, 2011 1:14 AM