Synchronizing access to DPC data from a queue callback

  • Question

  • I just implemented interrupt handling in my driver. The ISR looks at the hardware registers, decides which DPCs to queue, and then acknowledges to the hardware that the interrupts were handled.

    The DPC looks at some "work in progress" data, which is simply the active WDFREQUEST and some pointers, if any work is pending. It does what it can, and updates that data (for example, advancing the data pointer). When it finishes the work, it calls WdfRequestComplete() and clears the work data.

    The I/O queue generates the work-in-progress data in EvtIoRead (for example), and then simply schedules the associated DPC to try to complete it.

    Since the IRQ might already have caused the DPC to start running, access to the work-in-progress part must be synchronized between the queue callback and the DPC routine.

    The obvious solution is a spin lock for each work-in-progress struct. But since the DPC is the complex part that consumes the most cycles, I'd rather use something that puts the I/O queue thread to sleep instead of keeping it spinning for the duration of the DPC.

    From the documentation I understand that I could call WdfObjectAcquireLock() on the DPC object, and use that for synchronizing access to the struct. However, my experiments with that only resulted in BSODs so far.

    Monday, April 25, 2016 1:59 PM

All replies

  • I assume you are not using any execution level or sync scope settings. I would just use spin locks (either KSPIN_LOCK or WDFSPINLOCK) at each level of granularity you want. Keep it simple.

    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Monday, April 25, 2016 5:22 PM
  • Just keep in mind that the next instance of your DPC may run on another processor. If the DPC does a lot of work and you hold a spinlock around the whole DPC, the other processors will be stalled. Either steer all DPCs to the same processor so they are serialized with no spinlock at all, or think about finer-grained locking.

    -- pa

    • Edited by Pavel A Monday, April 25, 2016 6:23 PM
    Monday, April 25, 2016 6:23 PM
  • That's why I'm asking. The DPC constantly accesses the work data and will run for a "long time" (milliseconds). So my assumption is correct, then: having the DPC hold the lock for its whole duration is going to cause performance issues?

    So I'm looking for alternatives.

    One thing that comes to mind is to have a (very short) queue of incoming work items, and have the DPC grab one and move it into the "currently being worked on" set which only the DPC will ever access. That way, only the queue needs locking. I suspect Windows has something similar to Linux's kfifo?
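    WDF has no prefab kfifo equivalent that I know of, but the kfifo idea itself is easy to model. Below is a user-mode C11 sketch of a single-producer/single-consumer ring, with illustrative names; because EvtIoRead is the only producer and the (serialized) DPC the only consumer, each index has a single writer and no lock is needed:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical SPSC ring modeled on Linux's kfifo; capacity must be a
 * power of two so the indices can wrap with a simple mask. */
#define WORKQ_CAP 8u

struct work_item { void *request; size_t offset; };

struct workq {
    struct work_item slots[WORKQ_CAP];
    _Atomic unsigned head;          /* advanced only by the consumer */
    _Atomic unsigned tail;          /* advanced only by the producer */
};

/* Producer side (the EvtIoRead analogue). Returns false when full. */
static bool workq_push(struct workq *q, struct work_item it)
{
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == WORKQ_CAP)
        return false;               /* full */
    q->slots[tail & (WORKQ_CAP - 1)] = it;
    /* release: publish the slot contents before the new tail */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side (the DPC analogue). Returns false when empty. */
static bool workq_pop(struct workq *q, struct work_item *out)
{
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;               /* empty */
    *out = q->slots[head & (WORKQ_CAP - 1)];
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

    In a real driver the same layout would live in the stream context, with the C11 atomics replaced by the appropriate interlocked/barrier primitives for kernel mode.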

    Tuesday, April 26, 2016 6:35 AM
  • I cannot see how a kfifo would help here. Even if Windows had a kfifo, how would you ensure that parallel DPCs pick up the items in sequence without waiting on each other?

    Suppose you have a queue/list/whatever of "work items". They must be processed in sequence, but not necessarily on the same processor. Windows or Linux, it is the same problem.

    To keep things simple, maybe you need your own kernel thread to do the work, and DPCs to add the work to the list. Use a spinlock to protect insertion & removal from the list. The DPCs will block on each other, but they will be short.

    IMHO this can be done with DPCs alone,  in a more complicated way.


    • Edited by Pavel A Tuesday, April 26, 2016 2:24 PM
    Tuesday, April 26, 2016 2:20 PM
  • Each DPC handles events from one stream and one stream only. Each stream has its own work queue. The ISR reads the hardware registers and enqueues those DPCs that now have work to do. Multiple streams can be handled in parallel in hardware.

    A prefab queue like kfifo would be handy here, and it's a common enough thing that I'd expect WDF to have an equivalent. For now I just rolled my own.

    So now I just use a spinlock to access the "next work to do" data (in the stream context), the EvtIoRead fills that with new work, and the DPC moves that into a private location (in the DPC context) for further processing and progress tracking.

    Tuesday, April 26, 2016 3:10 PM
  • OK, but the point is not the queue (especially since you have already implemented it). The point is multiple concurrent readers. Looking at just one of the queues/streams: the queue itself is protected by a spinlock, so insertions and removals are atomic. But what if two DPCs run on different CPUs? DPC1 gets a piece of data from the queue, then DPC2 gets the next piece. DPC1 is then delayed, so DPC2 completes first and passes its data onward ahead of DPC1. Is that acceptable?

    -- pa

    Wednesday, April 27, 2016 11:30 AM
  • I don't understand the context of your remark.

    What I understood is that a single DPC object (as created by WdfDpcCreate) will not run concurrently in more than one thread. Calling WdfDpcEnqueue while the DPC is already running just queues another invocation of the DPC routine; the invocations will not run in parallel.

    One DPC in my driver will do work for one stream.

    It's perfectly okay for multiple DPCs to be running in parallel. The hardware handles this, none of its registers are shared between streams.

    If the same DPC were to run twice in parallel, that would cause problems, as both instances would be acting on the same data (e.g. the amount of data in the hardware buffer).

    Wednesday, April 27, 2016 2:17 PM