NBL Copy Techniques

  • Question

  • Hi,

    We have developed an NDIS lightweight filter driver that offers two different copy techniques. As a general rule, we replicate the received NBLs, complete the original ones, and process the replicas. At the end, we inject or discard each replica depending on the traffic analysis.

    The two techniques are described below. The explanation applies to both the receive and the send paths.

    Technique 1 is currently used in production. Technique 2 is under development and is the one that will be used in the future, because it brings several improvements.
                                                                                                                                                                                                                                                                                                                      
    Technique 1: NBL Clone

     1. An NBL list is received from NDIS (OriginalNBLList). We clone each NBL in the list with the NdisAllocateCloneNetBufferList function and the NDIS_CLONE_FLAGS_USE_ORIGINAL_MDLS flag. In addition:
      - Properties copy:
           - Copy NblFlags.
           - NDIS functions NdisCopyReceiveNetBufferListInfo and NdisCopySendNetBufferListInfo.
      Note that we neither modify ChildRefCount in the parent NBL nor set ParentHandle in the cloned one.
     2. From this point on, NBL copied list (ClonedNBLList) is the one being used, so we complete the OriginalNBLList because it's no longer used. The NdisFReturnNetBufferLists and NdisFSendNetBufferListsComplete functions are used.
     3. From each NetBuffer in an NBL we generate a NetworkPacket, an entity that describes the packet and holds a data buffer which is a copy of the data buffers of the NetBuffer's MDLs.
     4. Each NetworkPacket is processed separately. When the processing of one finishes, there are two cases:
      - Denied: The NetworkPacket is destroyed.
      - Allowed: Its buffer data is used to generate new NBL-NetBuffer-MDL-DataBuffer structures. The properties copied from the cloned NBL are the same ones that were previously copied to the clone (see above).
     5. When all the NetBuffers in the cloned NBL have been processed, the cloned NBL is always destroyed with NdisFreeCloneNetBufferList, since the data of the allowed NetworkPackets has already been injected.
                                                                                                                                                                                                                                                                                                                      
    Technique 2: NBL Deep Copy

     1. An NBL list is received from NDIS (OriginalNBLList). We make a deep copy of the list, involving NBLs, NetBuffers, MDLs and DataBuffers. Functions that take part in the copy process:
      - NBL allocation and copy: NdisAllocateNetBufferAndNetBufferList
      - NetBuffer allocation and copy: NdisAllocateNetBuffer, NdisCopyFromNetBufferToNetBuffer
      - MDL allocation and copy: NdisAllocateMdl
      - Data buffer allocation and copy: NdisAllocateMemoryWithTagPriority
      - Properties copy:
           - Copy NblFlags.
           - NDIS functions NdisCopyReceiveNetBufferListInfo and NdisCopySendNetBufferListInfo.
     2. From this point on, NBL copied list (DeepCopyNBLList) is the one being processed, so we complete the OriginalNBLList because it's no longer used. The NdisFReturnNetBufferLists and NdisFSendNetBufferListsComplete functions are used.
     3. DeepCopyNBLList is enqueued in a thread pool for later processing.
     4. Two NBL lists are generated, one with the allowed NBLs and another with the denied ones.
     5. When the whole DeepCopyNBLList processing is finished, NBLs are:
      - Accepted: NdisFIndicateReceiveNetBufferLists or NdisFSendNetBufferLists.
      - Discarded: NdisFReturnNetBufferLists or NdisFSendNetBufferListsComplete.
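
    To make the data copy in step 1 of Technique 2 concrete, here is a minimal user-mode sketch of walking a chain of buffer fragments and copying a range of it into one contiguous buffer. It is only a model: model_fragment and model_copy_from_chain are hypothetical names that stand in for an MDL chain and for the copy that NdisCopyFromNetBufferToNetBuffer performs in the driver, and no NDIS call is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one MDL in a chain: a fragment of packet data. */
typedef struct model_fragment {
    const unsigned char *data;
    size_t length;
    const struct model_fragment *next;
} model_fragment;

/*
 * Copy `length` bytes that start `offset` bytes into the chain into `dst`,
 * which the caller owns. Returns the number of bytes actually copied, which
 * is less than `length` if the chain is too short.
 */
size_t model_copy_from_chain(const model_fragment *chain, size_t offset,
                             size_t length, unsigned char *dst)
{
    size_t copied = 0;
    assert(dst != NULL || length == 0);
    for (const model_fragment *f = chain; f != NULL && copied < length; f = f->next) {
        if (offset >= f->length) {  /* fragment lies entirely before the data */
            offset -= f->length;
            continue;
        }
        size_t available = f->length - offset;
        size_t wanted = length - copied;
        size_t n = wanted < available ? wanted : available;
        memcpy(dst + copied, f->data + offset, n);
        copied += n;
        offset = 0;  /* later fragments are read from their start */
    }
    return copied;
}
```

    Because the destination is a buffer the copier owns, nothing in the copy refers to the original fragments once the function returns.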
                                                                                                                                                                                                                                                                                                                      
    Questions

     - Are both techniques valid? Do you see something incorrect?
     - With respect to technique 1:
            - Do we have to manage ChildRefCount and ParentHandle? How?
            - This technique provokes some ndu.sys BSODs. Several considerations:
                         - Completing the OriginalNBLList before finishing ClonedNBLList processing is correct?
                         - Could the double properties copy be related to this? We have observed that if we don't use the properties copy functions and instead copy some properties selectively, the BSODs don't occur (the properties in this case are TcpIpChecksumNetBufferListInfo, NetBufferListFrameType and NetBufferListProtocolId).
                         - Any other idea of what the problem could be?

    Thanks in advance!


    Monday, June 25, 2018 12:58 PM

Answers

  • > From  this point on, NBL copied list (ClonedNBLList) is the one being used, so we complete the OriginalNBLList because it's no longer used.

    This is potentially a problem. You must keep the parent NBL (OriginalNBLList) as long as it has any child NBLs (ClonedNBLList).  The reason is that the clones share the same MDL and payload buffers.  So if you return the parent NBL, its owner is free to discard the MDL and payload buffers.

    (Note that removing the NDIS_CLONE_FLAGS_USE_ORIGINAL_MDLS flag does not completely solve the problem, since the payload buffer is still shared.)


    It sounds like you are eventually making a copy of the payload buffer, but you must make that copy *before* returning the parent NBL.  Otherwise, there's a window of time where the payload buffer could get destroyed before you have finished making a copy.

    > Are both techniques valid? Do you see something incorrect?

    In general, both are valid.  Although, see my above note about exactly when to call NdisFReturnNetBufferLists/NdisFSendNetBufferListsComplete.


    > Do we have to manage ChildRefCount and ParentHandle? How?

    You are not obligated to use these fields.  They are available for you to use however you want.  The convention, though, is to use them to keep track of when you can free the parent NBL.
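
    As an illustration of that convention, here is a minimal user-mode sketch. It is a model only: model_nbl and the model_* functions are hypothetical names, the fields merely play the role of the parent pointer and ChildRefCount, and no NDIS call is used. A real driver would also need interlocked operations, since clones can be freed on different processors.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a NET_BUFFER_LIST; models only the two fields discussed. */
typedef struct model_nbl {
    struct model_nbl *parent;  /* NULL for an original; set on a clone */
    long child_ref_count;      /* outstanding clones of this NBL */
    int returned;              /* 1 once the original went back to its owner */
} model_nbl;

/* Stands in for NdisFReturnNetBufferLists / NdisFSendNetBufferListsComplete. */
static void model_return_parent(model_nbl *parent)
{
    assert(parent->child_ref_count == 0);
    parent->returned = 1;
}

/* Create a clone and record it on the parent. Returns NULL on allocation failure. */
model_nbl *model_clone(model_nbl *parent)
{
    model_nbl *child = calloc(1, sizeof *child);
    if (child == NULL)
        return NULL;
    child->parent = parent;
    parent->child_ref_count++;  /* a real driver would use an interlocked increment */
    return child;
}

/* Free a clone; the parent is returned only when its last clone is gone. */
void model_free_clone(model_nbl *child)
{
    model_nbl *parent = child->parent;
    free(child);
    if (--parent->child_ref_count == 0)
        model_return_parent(parent);
}
```

    If clones can be freed while others are still being created, the count could reach zero too early. One common way to avoid that is to hold an extra reference on the parent until cloning has finished.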


    > Completing the OriginalNBLList before finishing ClonedNBLList processing is correct?

    This is the problem.

    Tuesday, June 26, 2018 7:08 PM

All replies

  • Hello Jeffrey.

    First of all thanks a lot for your quick answer. It has been very helpful in several ways.

    Maybe the NBL Clone mode works well in some cases because we access neither the MDLs nor the payload buffers after returning the original NBL, since we copy each payload buffer beforehand.

    Apart from this, we would like to emphasize some aspects of our NBL fields and properties processing. In both “NBL Cloning” and “NBL Deep Copy” modes we copy some NBL fields and List Info properties in the replica:

    NBL Fields Copy

    • In both modes we set the replica's NblFlags to the NblFlags of the original NET_BUFFER_LIST structure. Is this correct? Do we have to copy any other NBL fields? If so, which ones?

    NBL List Info Copy

    • What is the best way to copy NBL list info?
      • In the NBL Deep Copy case we use the NdisCopyReceiveNetBufferListInfo and NdisCopySendNetBufferListInfo functions. This seems to work well in every case.

      • In the NBL Clone case we manually copy only the TcpIpChecksumNetBufferListInfo, NetBufferListFrameType and NetBufferListProtocolId properties. We have noticed that in this case copying the properties with the NdisCopyReceiveNetBufferListInfo and NdisCopySendNetBufferListInfo functions sometimes led to a BSOD (see below).

    NBL Clone case BSOD

    DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL (d5)

    Memory was referenced after it was freed. This cannot be protected by try-except. When possible, the guilty driver's name (Unicode string) is printed on the bugcheck screen and saved in KiBugCheckDriver.

    Arguments:

    Arg1: ffffcf810feb6db8, memory referenced

    Arg2: 0000000000000000, value 0 = read operation, 1 = write operation

    Arg3: fffff8020103117e, if non-zero, the address which referenced memory.

    Arg4: 0000000000000000, (reserved)

    nt!KeBugCheckEx+0

    nt!MiSystemFault+12fae0 (perf)

    nt!MmAccessFault+5f1 (perf)

    nt!KiPageFault+13c

    fwpkclnt!FwppNetBufferListEventNotify+4e

    tcpip!WfppTaggedContextClone+5b

    NETIO!WfpNblInfoCloneEx+b5

    ndis!NetioCopyOpaqueNetBufferListInformation+47

    ndis!NdisCopySendNetBufferListInfo+105

    DRIVER!DRIVER_NBL_CopyProperties+88

    A detailed explanation of the BSOD context is required. In the NBL Clone case we completed the original NBL and could not use its payload buffers from that point on. So, before completing it, for each NetBuffer in the cloned NBL we generated a buffer (NetworkPacket) into which we copied the NetBuffer's payload buffer.

    After processing each NetworkPacket, if we had to reinject it, a new final NBL was generated. The clone's NBL properties were copied to the new NBL in the same way that we copied the properties from the original to the clone (NdisCopyReceiveNetBufferListInfo and NdisCopySendNetBufferListInfo in the case of the BSOD). Could this double copy of the NBL list info (original, clone, final NBL) be the cause of the BSOD?

    Thanks a lot in advance!



    Thursday, June 28, 2018 6:42 AM
  • Aside from calling NdisCopyReceiveNetBufferListInfo or NdisCopySendNetBufferListInfo, you shouldn't need to manually copy over flags.  These routines already copy over relevant flags.

    These two routines are sufficient for both the deep copy and for the clone cases.  Indeed, many built-in drivers do exactly this.

    The bugcheck is, I think, caused because the original NBL was completed before the copy & the call to NdisCopySendNetBufferListInfo.  Make sure that the original sticks around long enough for you to copy everything out of it.
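
    The ordering can be sketched in user mode as follows. This is a model under stated assumptions, not driver code: model_original, model_replica and the functions below are hypothetical, the info array stands in for the NBL list info, and model_complete stands in for NdisFReturnNetBufferLists / NdisFSendNetBufferListsComplete by releasing everything the original owns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an original NBL whose owner frees it on completion. */
typedef struct {
    unsigned char *payload;
    size_t payload_len;
    unsigned int info[4];  /* stands in for the out-of-band list info */
    int completed;
} model_original;

/* Hypothetical stand-in for the replica that the filter owns. */
typedef struct {
    unsigned char *payload;
    size_t payload_len;
    unsigned int info[4];
} model_replica;

/* After this call nothing in the original may be read again. */
static void model_complete(model_original *o)
{
    free(o->payload);
    o->payload = NULL;
    o->payload_len = 0;
    memset(o->info, 0, sizeof o->info);
    o->completed = 1;
}

/*
 * Copy payload and info out of the original first, and complete it last.
 * Returns 0 on success, or -1 if the original was already completed or the
 * allocation failed. On failure the original is left untouched.
 */
int model_replicate_then_complete(model_original *o, model_replica *r)
{
    assert(o != NULL && r != NULL);
    if (o->completed)
        return -1;
    unsigned char *copy = malloc(o->payload_len > 0 ? o->payload_len : 1);
    if (copy == NULL)
        return -1;
    if (o->payload_len > 0)
        memcpy(copy, o->payload, o->payload_len);
    r->payload = copy;
    r->payload_len = o->payload_len;
    memcpy(r->info, o->info, sizeof r->info);
    model_complete(o);  /* only now is it safe to let the original go */
    return 0;
}
```

    Moving the call to model_complete above either copy reproduces the kind of use-after-free seen in the bugcheck.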

    Thursday, June 28, 2018 5:36 PM
  • Hello Jeffrey,

    Thanks a lot for the information, it's very helpful. Surely these insights will help us fix the issue.

    Thanks again!

    Monday, July 2, 2018 7:35 AM