L4 checksum offload with IPv4 fragments

  • Question

  • Hi,

    I've observed somewhat unexpected behavior from the Windows network stack when transmitting UDP traffic that gets fragmented into IPv4 fragments. The network stack sets the UDP checksum offload flag (in the OOB data) on each separate IPv4 fragment that it passes to the miniport driver. For the NIC to calculate the UDP checksum, it has to buffer the separate IP fragments, which inevitably hurts latency (besides the increased complexity of packet processing in the miniport/NIC). It seems that in such a scenario the OS should calculate the UDP checksum itself instead of asking to offload the calculation (since the natural purpose of offloads is to increase performance).

    Can anyone please comment on this?

    Thanks



    • Edited by -IgorC- Tuesday, January 30, 2018 9:52 AM
    Monday, January 29, 2018 11:49 PM

Answers

  • Oh, that's interesting.  It's an OS bug that netsh allows you to set an illegal MTU via the subinterface command.  If you use the more conventional command "netsh interface ipv4 set interface Ethernet mtu=128", it correctly complains about the illegal value.

    We'll fix netsh to check the bounds on subinterface too.

    Meanwhile, you don't have to worry about this in your NIC driver.

    Thursday, February 1, 2018 2:22 AM

All replies

  • That shouldn't happen.  The IP/UDP/TCP checksum offloads are stateless, and the NIC does not have to buffer up IP fragments.  Windows' TCPIP stack does not request a layer-4 checksum on a packet that's fragmented at layer-3.
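For reference, a packet is "fragmented at layer-3" when its IPv4 header has the MF (more fragments) bit set or a non-zero fragment offset. The following is a minimal Python sketch of that test, not Windows or NDIS code; `is_ipv4_fragment` and `make_header` are hypothetical names used only for illustration.

```python
import struct

IP_FLAG_MF = 0x2000      # "more fragments" bit in the 16-bit flags/offset field
IP_OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, counted in 8-byte units


def is_ipv4_fragment(ip_header: bytes) -> bool:
    """Return True if this IPv4 header belongs to a fragment of a larger datagram."""
    if len(ip_header) < 20 or ip_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # The flags/offset field sits at byte offset 6 of the IPv4 header.
    (flags_and_offset,) = struct.unpack_from("!H", ip_header, 6)
    more_fragments = bool(flags_and_offset & IP_FLAG_MF)
    offset = flags_and_offset & IP_OFFSET_MASK
    return more_fragments or offset != 0


def make_header(flags_and_offset: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (protocol 17 = UDP) for illustration.

    The addresses are arbitrary and the header checksum is left at zero.
    """
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 128, 1, flags_and_offset, 64, 17, 0,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
```

Only the first fragment carries the UDP header, and a middle or last fragment is recognizable solely by its non-zero offset, which is why the check looks at both the MF bit and the offset.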

    If you have a repro, we can investigate it as a bug report.  We'd like to know the OS version, which protocol sent the datagram (look in NBL->SourceHandle, or NBL->PoolHandle), and a characterization of the packet contents.  Perhaps the packet was injected by some driver other than TCPIP.SYS?

    Tuesday, January 30, 2018 5:45 PM
  • Thanks, Jeffrey. I'll assemble the data and send it.
    Tuesday, January 30, 2018 6:03 PM
  • So here is the repro scenario:

    • Start packet capture on the transmitter side (e.g., with Wireshark or Netmon)
    • Set MTU to 128 (yes, I know it's out of the valid range, nonetheless the OS allows it)
    • Wait for BROWSER packet transmission (or alternatively restart the Computer Browser service)
    • Observe the packet being fragmented to IP fragments
    • Check the UDP checksum - it's invalid, i.e. the OS didn't calculate it or calculated it incorrectly
    • The underlying miniport receives UDP checksum offload flag set for each IP fragment (I see it from our proprietary logs). I suppose NDIS traces should provide this data as well but I'm not sure what flags to enable and for which WPP provider
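The "check the UDP checksum" step can also be done offline on the captured fragments. The sketch below is a rough illustration in Python, assuming the fragment payloads and their byte offsets have already been extracted from the capture; `reassemble`, `udp_checksum_ok`, and `build_udp` are hypothetical helper names, not part of any capture tool. It shows that the UDP checksum is defined over the pseudo-header plus the complete datagram, so it can only be verified after reassembly.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def reassemble(fragments):
    """Join (byte_offset, payload) pairs into the original IP payload.

    Assumes every fragment is present and none overlap.
    """
    datagram = bytearray()
    for offset, payload in sorted(fragments):
        if offset != len(datagram):
            raise ValueError("missing or overlapping fragment")
        datagram += payload
    return bytes(datagram)


def udp_checksum_ok(src_ip: bytes, dst_ip: bytes, udp_datagram: bytes) -> bool:
    """Verify the UDP checksum over the pseudo-header plus the whole datagram."""
    if udp_datagram[6:8] == b"\x00\x00":
        return True  # in IPv4, zero means the sender did not compute a checksum
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_datagram))
    return internet_checksum(pseudo + udp_datagram) == 0


def build_udp(src_ip, dst_ip, sport, dport, payload):
    """Build a UDP datagram with a valid checksum (test data only)."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    # A computed checksum of zero is transmitted as 0xFFFF.
    csum = internet_checksum(pseudo + header + payload) or 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload
```

With an MTU of 128 and a 20-byte IP header, each non-final fragment carries 104 bytes of IP payload (the largest multiple of 8 that fits in 108), so a datagram could be split as `[(0, d[:104]), (104, d[104:])]` before being passed to `reassemble`.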

    Some additional comments:

    • Setting the MTU to 128 is done to force IP fragmentation. Maybe it can be forced by other means as well, but I'm not familiar with them. I don't see any fundamental reason why the same behavior couldn't occur with normal MTU values whenever IP fragmentation is performed
    • Tried to reproduce with DHCP traffic - got IP fragments, but in this case the UDP checksum was calculated correctly by the OS (didn't check the miniport logs to see the checksum offload flag, but can check that as well if needed). So it does seem to be somehow related to the higher-level protocol. BTW, don't know if it's related, but the Browser protocol had a zero-day vulnerability - http://www.zdnet.com/article/microsoft-confirms-windows-browser-protocol-zero-day/
    • Observed this behavior on Windows Server 2012 and Windows Server 2016 (didn't try on other Windows versions)

    • Edited by -IgorC- Wednesday, January 31, 2018 10:02 PM
    Wednesday, January 31, 2018 10:00 PM
  • Thanks for the details.  Are you referring to the layer-2 MTU (what the NIC indicates in MiniportInitializeEx) or the IPv4 MTU (settable from Set-NetIPInterface or netsh.exe)?
    Wednesday, January 31, 2018 10:08 PM
  • The netsh MTU.
    Wednesday, January 31, 2018 10:23 PM
  • I'm not able to set an MTU of 128 via netsh.exe.  What's the exact command line you've used?  Are you doing anything else to make it work?  As far as I can see, Windows enforces a minimum MTU of 576 bytes, per RFC.

    In any case, if there is a situation where the OS produces a fragmented IPv4 datagram with checksum enabled, you're free to make your NIC driver do any of these:

    • Silently discard the packet (ideally also incrementing a discard counter)
    • Transmit the packet, without modifying the checksum
    • Transmit the packet, inserting a useless checksum calculated over only the portion of the packet that is in the first fragment
    • Log an error somewhere (make sure to throttle if it's possible that 1000's of packets could get sent)

    You should not add complexity to your driver/hardware to try and buffer these up and reassemble them to compute a correct checksum.  Checksum offload should be simple and stateless.
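To make the options above concrete, here is a small Python model of a stateless transmit-path decision with throttled logging. It is only a sketch of the logic, not NDIS or miniport code; `FragmentChecksumPolicy`, its fields, and the returned action strings are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FragmentChecksumPolicy:
    """Per-packet decision that keeps no state about other fragments."""
    drop_fragments: bool = False      # True: discard; False: send unmodified
    min_log_interval_s: float = 1.0   # at most one log entry per interval
    discard_count: int = 0
    suppressed_logs: int = 0
    messages: List[str] = field(default_factory=list)
    _last_log_time: float = float("-inf")

    def decide(self, is_fragment: bool, l4_checksum_requested: bool,
               now: Optional[float] = None) -> str:
        if not l4_checksum_requested:
            return "send_unmodified"
        if not is_fragment:
            return "insert_checksum"   # the normal offload path
        # Unexpected case: L4 checksum requested on an IPv4 fragment.
        self._log_throttled(time.monotonic() if now is None else now)
        if self.drop_fragments:
            self.discard_count += 1
            return "discard"
        return "send_unmodified"

    def _log_throttled(self, now: float) -> None:
        if now - self._last_log_time >= self.min_log_interval_s:
            suffix = (f" ({self.suppressed_logs} similar events suppressed)"
                      if self.suppressed_logs else "")
            self.messages.append(
                "L4 checksum requested on an IPv4 fragment" + suffix)
            self._last_log_time = now
            self.suppressed_logs = 0
        else:
            self.suppressed_logs += 1
```

The decision depends only on the current packet, so nothing is buffered; the only state kept is a discard counter and the timestamp used to throttle the log.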

    We'd like to know if the OS generates these packets in a normal scenario.  But combining an illegal MTU with the deprecated SMBv1 protocol is not an important scenario.

    Wednesday, January 31, 2018 11:57 PM
  • Thanks, I'll see if there is a less esoteric repro scenario.

    Below is a screenshot of setting MTU to 128 on Windows Server 2012R2:

    [Screenshot: Setting MTU to 128]

    Thursday, February 1, 2018 12:10 AM
  • Oh, that's interesting.  It's an OS bug that netsh allows you to set an illegal MTU via the subinterface command.  If you use the more conventional command "netsh interface ipv4 set interface Ethernet mtu=128", it correctly complains about the illegal value.

    We'll fix netsh to check the bounds on subinterface too.

    Meanwhile, you don't have to worry about this in your NIC driver.

    Thursday, February 1, 2018 2:22 AM
  • Just adding a side note:

    Indeed, we were unable to reproduce the issue with a normal MTU and with generated UDP traffic that is fragmented by the IP layer. However, it seems that the data payload affects the decisions made by the transport layer. That doesn't sound valid to me, regardless of the specific repro scenario; it could indicate a more serious bug.

    Saturday, February 3, 2018 5:54 AM