NDIS Miniport and Protocol Interaction

  • Question

  • Hello, I am developing a driver for audio support over Ethernet on Windows. The driver stack consists of just the NIC vendor's miniport plus a protocol driver I developed to acquire the audio. I have it all working; my problem is performance.
    The hardware we work with sends packets at a rate of one every 62 µs. My protocol driver receives packets from the miniport through NDIS either about every 180 µs, delivering 3 packets at once, or sometimes about every 1800 µs, delivering around 30 packets at once (these values are averages from the data we collected). That is a problem for us: we need a more stable rate for lower latency and better efficiency, and we seem to lose data somewhere in the process, because the audio received in the final application is distorted.
    My protocol driver is based on the connectionless NDIS 6.0 sample for VS2012. My question is: are there attributes or priorities that configure the communication between the miniport and protocol drivers? How could I make the miniport give higher priority to serving my protocol driver? Or how would you advise me to deal with this situation?
    Thanks in advance for any help.
    Tuesday, November 18, 2014 6:16 PM

Answers

  • Generally, most miniports are going to indicate NBLs at DISPATCH_LEVEL, so the variance in the rate of NBL indication is not due to preemption (which is good news for you). Since you are receiving a chain of NBLs with each indication, the first thing I would look at is the interrupt moderation setting for the miniport. Now the bad news is that Microsoft has not prescribed exact interrupt moderation behaviors, so IHVs do have liberty in how they choose to implement the functionality.

    Another optimization, albeit a less important one, that you can consider is to avoid (if at all possible) defining FrameTypeArray values that overlap with those declared by other protocol drivers, e.g., 0x0800 (IPv4).
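
    For illustration, the frame type is declared in the FrameTypeArray passed to NdisOpenAdapterEx at bind time. This is only a sketch modeled on the connectionless sample's bind code: the helper name and the 0x88B5 EtherType (an IEEE "local experimental" value that cannot collide with 0x0800, 0x0806, etc.) are assumptions, and you should check the byte order expected for FrameTypeArray against the sample you started from.

        #include <ndis.h>

        // Sketch only: a hypothetical bind-time helper showing where the
        // unique frame type goes.
        NDIS_STATUS
        OpenWithUniqueFrameType(
            _In_ NDIS_HANDLE ProtocolHandle,
            _In_ NDIS_HANDLE BindContext,
            _In_ PNDIS_BIND_PARAMETERS BindParameters,
            _Out_ PNDIS_HANDLE BindingHandle
            )
        {
            // In a real driver these live in the per-binding context, because
            // NdisOpenAdapterEx can return NDIS_STATUS_PENDING.
            static NDIS_MEDIUM mediumArray[] = { NdisMedium802_3 };
            static UINT selectedMediumIndex = 0;
            static NET_FRAME_TYPE frameTypes[] = { 0x88B5 };  // assumed EtherType

            NDIS_OPEN_PARAMETERS openParams;
            NdisZeroMemory(&openParams, sizeof(openParams));
            openParams.Header.Type         = NDIS_OBJECT_TYPE_OPEN_PARAMETERS;
            openParams.Header.Revision     = NDIS_OPEN_PARAMETERS_REVISION_1;
            openParams.Header.Size         = sizeof(openParams);
            openParams.AdapterName         = BindParameters->AdapterName;
            openParams.MediumArray         = mediumArray;
            openParams.MediumArraySize     = ARRAYSIZE(mediumArray);
            openParams.SelectedMediumIndex = &selectedMediumIndex;
            openParams.FrameTypeArray      = frameTypes;   // no overlap with IPv4 etc.
            openParams.FrameTypeArraySize  = ARRAYSIZE(frameTypes);

            return NdisOpenAdapterEx(ProtocolHandle, BindContext,
                                     &openParams, BindContext, BindingHandle);
        }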

    Tuesday, November 18, 2014 11:09 PM

All replies

  • It's not clear to me: did you create both the miniport and the protocol? Or just the protocol?

    NDIS itself doesn't have much to do with threading, scheduling, or priority. That comes from the miniport (for the receive path) and the protocol (for the send path). So your question isn't really about NDIS; it's about how you can make the miniport indicate data with less latency and less jitter.

    If you own the miniport, this should be straightforward, since you have a lot of degrees of freedom. I suggest creating a dedicated thread that does nothing but pump I/O up. Set the thread's priority carefully: not so high that it stomps on the audio decoder thread, but not so low that every usermode app starves your thread.
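
    A minimal sketch of that pump-thread idea, assuming you own the miniport. The event, the shutdown flag, and the indicate step are hypothetical; a real driver drains a list that the receive DPC fills.

        #include <ntddk.h>

        KEVENT  g_PacketsReady;   // hypothetical: signaled by the receive DPC
        BOOLEAN g_Shutdown;

        VOID RxPumpThread(_In_ PVOID Context)
        {
            UNREFERENCED_PARAMETER(Context);

            // Priority 15: above normal usermode work, below real-time audio.
            KeSetPriorityThread(KeGetCurrentThread(), LOW_REALTIME_PRIORITY - 1);

            while (!g_Shutdown) {
                KeWaitForSingleObject(&g_PacketsReady, Executive,
                                      KernelMode, FALSE, NULL);
                // Indicate the queued NBLs here, e.g. via
                // NdisMIndicateReceiveNetBufferLists in a miniport you own.
            }
            PsTerminateSystemThread(STATUS_SUCCESS);
        }

        NTSTATUS StartRxPump(VOID)
        {
            HANDLE threadHandle;
            NTSTATUS status;

            KeInitializeEvent(&g_PacketsReady, SynchronizationEvent, FALSE);
            status = PsCreateSystemThread(&threadHandle, THREAD_ALL_ACCESS,
                                          NULL, NULL, NULL, RxPumpThread, NULL);
            if (NT_SUCCESS(status)) {
                ZwClose(threadHandle);  // the thread keeps running without the handle
            }
            return status;
        }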

    If you don't own the miniport, then you don't have many options available. The main knobs at your disposal are interrupt moderation (the *InterruptModeration standardized keyword, or OID_GEN_INTERRUPT_MODERATION) and possibly RSS. (RSS only works for TCP-encapsulated traffic, and only if TCPIP is not also bound to the NIC.)

    Tuesday, November 18, 2014 11:18 PM
  • I thank you guys for the quick replies.

    I only worked on the protocol driver.

    Our FrameType is unique, so that is not a problem.

    I did check the interrupt moderation setting on the miniports, and it was enabled. Does that mean there is nothing I can do there?
    We are working with raw Ethernet communication rather than TCP, so RSS will not help either.

    So I am looking at either interpolating the packets or writing my own miniport driver, right?
    Any last advice on that? Material I could study?

    Thanks

    Wednesday, November 19, 2014 5:10 PM
  • Disable interrupt moderation with OID_GEN_INTERRUPT_MODERATION.
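
    From a protocol driver, that set request looks roughly like this. It is a sketch only: the statics keep it short, but a real driver allocates the request per call and handles NDIS_STATUS_PENDING in ProtocolOidRequestComplete, and the miniport may refuse the change.

        #include <ndis.h>

        NDIS_STATUS DisableInterruptModeration(_In_ NDIS_HANDLE BindingHandle)
        {
            static NDIS_INTERRUPT_MODERATION_PARAMETERS imp;
            static NDIS_OID_REQUEST oidRequest;

            NdisZeroMemory(&imp, sizeof(imp));
            imp.Header.Type         = NDIS_OBJECT_TYPE_DEFAULT;
            imp.Header.Revision     = NDIS_INTERRUPT_MODERATION_PARAMETERS_REVISION_1;
            imp.Header.Size         = sizeof(imp);
            imp.InterruptModeration = NdisInterruptModerationDisabled;

            NdisZeroMemory(&oidRequest, sizeof(oidRequest));
            oidRequest.Header.Type     = NDIS_OBJECT_TYPE_OID_REQUEST;
            oidRequest.Header.Revision = NDIS_OID_REQUEST_REVISION_1;
            oidRequest.Header.Size     = sizeof(oidRequest);
            oidRequest.RequestType     = NdisRequestSetInformation;
            oidRequest.DATA.SET_INFORMATION.Oid = OID_GEN_INTERRUPT_MODERATION;
            oidRequest.DATA.SET_INFORMATION.InformationBuffer       = &imp;
            oidRequest.DATA.SET_INFORMATION.InformationBufferLength = sizeof(imp);

            return NdisOidRequest(BindingHandle, &oidRequest);
        }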

    In general, you're going to be at the mercy of the miniport implementation.  Different implementations will have different levels of quality.  If this is a closed system, shop around until you find a miniport that offers good latency and low jitter.  If you are making a high-volume product, you might be able to get a custom driver from the NIC OEM.  If this is an open system, you'll just have to do lots of buffering and be prepared to be installed over a poor miniport.

    The OS actually doesn't have much control over the miniport's scheduling decisions.  The NIC interrupts when it wants to, based on interrupt moderation, and the OS generally indicates packets as soon as possible after the interrupt.

    Wednesday, November 19, 2014 7:10 PM
    I disabled interrupt moderation; it did seem to help, but the problem persists. I will have to implement a different buffering and interpolation method. Thanks for the help, guys.

    Just another question, related to the same problem but from another perspective. Please correct me if I am wrong on any of my concepts.

    I have a callback for sending and receiving packets (declared as PROTOCOL_RECEIVE_NET_BUFFER_LISTS, for example) and others for IOCTL, read, and write requests (declared as EVT_WDF_IO_QUEUE_IO_READ, for example).

    These WDF queues are configured by the driver at creation (along the lines of the sketch below). I assume the I/O Manager then creates and manages the threads that service these queues; in general they start operation at thread priority 15. The protocol NDIS callback functions run at priority 8.
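
    For reference, the queue setup being described is typically along these lines. This is a sketch; the handler names are hypothetical, and note that no field here pins the callbacks to a particular thread or priority.

        #include <ntddk.h>
        #include <wdf.h>

        EVT_WDF_IO_QUEUE_IO_READ           MyEvtIoRead;
        EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL MyEvtIoDeviceControl;

        NTSTATUS CreateDefaultQueue(_In_ WDFDEVICE Device)
        {
            WDF_IO_QUEUE_CONFIG queueConfig;

            // Parallel dispatch: requests can be delivered on the caller's thread.
            WDF_IO_QUEUE_CONFIG_INIT_DEFAULT_QUEUE(&queueConfig,
                                                   WdfIoQueueDispatchParallel);
            queueConfig.EvtIoRead          = MyEvtIoRead;
            queueConfig.EvtIoDeviceControl = MyEvtIoDeviceControl;

            return WdfIoQueueCreate(Device, &queueConfig,
                                    WDF_NO_OBJECT_ATTRIBUTES, WDF_NO_HANDLE);
        }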

    These priorities are variable, and as I analyzed them I noticed that the queue callback threads slowly lower their priority to 8. On the protocol side, new threads with higher priority are created when needed, and they are deleted once their work is done.

    I am not sure whether these priorities influence the results I have been getting, but I have reason to believe they can generate latency and jitter, since the threads contend for the same locked resources.

    Is there a way to predefine the priority of the EVT_WDF_IO_QUEUE_IO_READ callback thread? I know it cannot be set when configuring the queue. Does only the I/O Manager have control over that?

    I believe that if the receiving/sending and reading/writing callbacks ran at the same priority, latency would drop, precisely because NDIS creates new, higher-priority threads when delivering resources becomes more urgent. I noticed that at the start this is not enough: the reading thread may still be at, let's say, priority 13 (on its way down from 15 to 8) while the newly created NDIS thread is at 12. NDIS wants to get rid of the resources, but my reading thread's priority is still higher, so NDIS might have to wait.

    I tried lowering the I/O Manager thread priorities using KeSetPriorityThread, but subsequent requests readjust the thread priority back to higher values.

    Any advice, comments, or corrections to my assumptions?

    Thanks for the support!

    Thursday, November 20, 2014 5:07 PM
    I'm not an expert in WDF, so take this with a grain of salt. I believe WDF doesn't touch the thread priority when issuing queue callbacks; it just uses whichever thread it's on. In some cases, that means you'll get called back on the usermode process's own thread (which is generally good, since that's the most efficient way to get the data from usermode). This also means your callback runs at whatever priority usermode asked for. (A thread doesn't change priority just because it makes a user-to-kernel transition.)
    If you have a serialized queue, WDF might be forced to queue some requests, and therefore can't always use the calling usermode thread to dispatch requests. Then you're probably just at the mercy of the kernel threadpool.

    In the ideal case, you do not have a serialized queue, so requests are passed through directly from usermode on the same thread.  Then your usermode app can set its thread priority appropriately.
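
    Concretely, with a non-serialized queue the app can do something like this (a sketch; the device name is hypothetical):

        #include <windows.h>

        int main(void)
        {
            // The app owns the thread the read is issued on, so it owns the
            // priority end to end (when the driver's queue is not serialized).
            SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

            // Hypothetical device name exposed by the protocol driver.
            HANDLE dev = CreateFileW(L"\\\\.\\MyAudioProtocol", GENERIC_READ, 0,
                                     NULL, OPEN_EXISTING, 0, NULL);
            if (dev == INVALID_HANDLE_VALUE)
                return 1;

            BYTE  buffer[2048];
            DWORD bytesRead;
            // Dispatched to EvtIoRead on this same (boosted) thread.
            ReadFile(dev, buffer, sizeof(buffer), &bytesRead, NULL);

            CloseHandle(dev);
            return 0;
        }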

    I'm not aware of any OS facility that "slowly lowers" a thread's priority. Threads can get temporary boosts in some cases (e.g., the foreground window's thread, or during I/O completion), and the balance set manager can boost a starved thread. But slow lowering I do not know about. Is it possible that you're just seeing a sequence of unrelated threads?

    If you want to get into the details of scheduling, the latest edition of the Windows Internals book is the best way to learn.  Maybe it's overkill for this project, but it never hurts to be an expert.

    Thursday, November 20, 2014 9:31 PM