DPC latency measurement?

  • Question

  • Hello:

    During the CHAOS test, my debugger breaks in and complains that the DPC latency is exceeding the default limit.

    I tried the Windows Performance Toolkit (part of the Windows ADK) and ran xperf with "latency" turned on, but my driver does not appear in the reported data. The modules that do get reported are: ACPI.sys, HECI.sys, USBPORT.sys, ataport.sys, dxgkrnl.sys, ndis.sys, pcmcia.sys

    So:

    1/ How do I tell it to capture DPC latency for a specific driver, "abc.sys"?

    2/ Say I have the above info. How do I get the tool to display DPC latency per function within the driver if I have symbols for it?

    Other than xperf, are there any other suggestions for tracking down this specific DPC latency issue?

    When does the system start timing a specific DPC routine? Below is the DPC registration for my write path:

    WDF_DPC_CONFIG_INIT(&dpc_cfg, acm_request_write_complete_dpc);

    dpc_cfg.AutomaticSerialization = TRUE;

    status = WdfDpcCreate(&dpc_cfg, &attributes, &fdx->dpc_write);

    ................................

    Later, the write DPC object is inserted into the system DPC queue from the write completion routine.

    When the write DPC object is serviced, it calls the registered DPC routine, acm_request_write_complete_dpc().

    So this DPC latency would be the execution time of that routine, would it not?

    If so, can I instrument the code to measure just the time spent in the DPC's registered handler?


    KAL

    Tuesday, September 25, 2012 6:10 PM

Answers

  • When you say latency, I think you mean overall DPC execution time exceeding the limit.

    Using automatic serialization with DPCs and a high number of I/Os can easily get you into a situation where you exceed the amount of time a DPC is allowed to run, even if your own callback does very little. Why? Because to run your DPC callback, KMDF acquires a WDFDEVICE-wide lock. That same WDFDEVICE-wide lock is also acquired when calling your I/O processing routines. Those I/O processing routines can then starve the DPC from running, and that duration of starvation is counted by the system as DPC time.

    The fix is to not use any automatic synchronization and to use your own locks, which give you finer-grained locking semantics and better control over the time spent in your driver.
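
    One way to see whether the time is going to waiting or to the callback body is to timestamp the DPC yourself. The sketch below is plain C that only does the arithmetic. It assumes you capture three tick counts (when the DPC is enqueued, when the callback is entered, when it returns) plus the tick frequency, for example with KeQueryPerformanceCounter in the driver. The names ticks_to_us, dpc_timing and split_dpc_time are hypothetical helpers, not WDK or KMDF APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a tick interval to microseconds.
 * Whole seconds and the remainder are scaled separately so that
 * multiplying by 1,000,000 cannot overflow on long intervals. */
static uint64_t ticks_to_us(uint64_t start, uint64_t end, uint64_t ticks_per_sec)
{
    assert(ticks_per_sec > 0 && end >= start);
    uint64_t delta = end - start;
    return (delta / ticks_per_sec) * 1000000u
         + ((delta % ticks_per_sec) * 1000000u) / ticks_per_sec;
}

/* Hypothetical breakdown of one DPC invocation. */
typedef struct {
    uint64_t wait_us;  /* enqueue -> callback entry (queueing plus any lock wait) */
    uint64_t body_us;  /* callback entry -> callback return */
} dpc_timing;

/* Assumption: the three timestamps were read from the same counter,
 * e.g. KeQueryPerformanceCounter, at enqueue, callback entry and return. */
static dpc_timing split_dpc_time(uint64_t t_enqueue, uint64_t t_entry,
                                 uint64_t t_exit, uint64_t ticks_per_sec)
{
    dpc_timing t;
    t.wait_us = ticks_to_us(t_enqueue, t_entry, ticks_per_sec);
    t.body_us = ticks_to_us(t_entry, t_exit, ticks_per_sec);
    return t;
}
```

    A large wait_us next to a small body_us would point at queueing or lock starvation rather than at the work done inside the callback.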


    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Tuesday, September 25, 2012 6:33 PM

All replies

  • Thanks Doron, this is good information. Here are my questions about the behavior that comes with "automatic serialization":

    1/ When the driver's DPC callback gets invoked, KMDF acquires a WDFDEVICE-wide lock. Is this the mechanism for executing the DPC at DISPATCH_LEVEL?

    2/ Why does KMDF use the same spinlock, which interferes with the rest of I/O processing? I understand that you suggest using a manual lock around the DPC routine. How would using a different lock help? Wouldn't the DPC thread and the I/O thread be serialized anyway if they both run at DISPATCH_LEVEL, regardless of whether they hold the same or different spinlocks?

    Could you comment on recommended techniques for measuring DPC time and zeroing in on the culprit? How do I use the Windows Performance Toolkit to measure my driver's DPC latency?

    Also, if I run the test without the debugger attached, it passes. Is there any relationship between the DPC watchdog and kernel debugging being enabled?

    In other words, it seems that when kernel debugging is enabled and/or a debugger is attached, the DPC watchdog is enabled, hence my debugger breaks in with the complaint.


    KAL

    Tuesday, September 25, 2012 7:08 PM
  • 1) DPCs run at DISPATCH_LEVEL regardless of the synchronization scope or execution level you configure your WDFDEVICE or WDFDPC with. That is an underlying implementation constant of the OS.

    2) It uses the same lock because that is what you asked for. A device-wide synchronization scope means device-wide: across I/O processing, timers, DPCs, etc.

    There is some confusion here about IRQL. Remember that DISPATCH_LEVEL is not a synchronization mechanism. It is a per-CPU state that pertains to execution level (and what is allowed to be called). On a multiprocessor machine you can have one CPU at DISPATCH_LEVEL (your DPC) and another at PASSIVE_LEVEL (your I/O processing routine). This is why you need a lock, not just IRQL.

    With that said, using your own lock gives you finer-grained control over when the lock is acquired and for how long it is held. When KMDF does the acquire/release, the lock is held over the entire DPC and the entire I/O processing routine. You can easily trim down the amount of code that runs under the lock by controlling this yourself.
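
    As a rough user-mode illustration of that difference (not KMDF code), the sketch below contrasts a routine that holds a lock across all of its work with one that holds it only long enough to claim the queued work. The spinlock, device_ctx, complete_coarse and complete_fine names are all hypothetical, and the units_under_lock counter exists only to make the lock hold visible.

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal spinlock standing in for a driver-owned lock (assumption:
 * in a driver this would be a WDFSPINLOCK or KSPIN_LOCK instead). */
typedef struct { atomic_flag flag; } spinlock;

static void spin_acquire(spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) { }
}

static void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

typedef struct {
    spinlock lock;
    int  pending;           /* queued completions, protected by lock */
    long units_under_lock;  /* instrumentation: work steps done with lock held */
} device_ctx;

/* Coarse: the lock covers the whole routine, much like a lock that the
 * framework takes before the callback and drops after it returns. */
static int complete_coarse(device_ctx *d)
{
    int done = 0;
    spin_acquire(&d->lock);
    assert(d->pending >= 0);
    while (d->pending > 0) {
        d->pending--;
        d->units_under_lock++;  /* per-item work runs with the lock held */
        done++;
    }
    spin_release(&d->lock);
    return done;
}

/* Fine: hold the lock only to claim the queued work, then process it
 * with the lock released so other paths are not kept waiting. */
static int complete_fine(device_ctx *d)
{
    int claimed, done = 0;
    spin_acquire(&d->lock);
    assert(d->pending >= 0);
    claimed = d->pending;
    d->pending = 0;
    d->units_under_lock++;      /* only the hand-off happens under the lock */
    spin_release(&d->lock);
    for (int i = 0; i < claimed; i++) {
        done++;                 /* per-item work, lock not held */
    }
    return done;
}
```

    Both variants finish the same number of completions. The difference is how much of that work another CPU would have to wait behind if it wanted the same lock.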

    An active kernel debugger can easily affect timing. I don't remember whether a kernel debugger changes the DPC watchdog period or behavior.



    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Tuesday, September 25, 2012 7:43 PM
  • Okay, thanks. Can you comment on the Windows Performance Toolkit question? That is, how do I tell the tool to collect latency data for a specific driver, and how do I get it to use my symbols?


    KAL

    Tuesday, September 25, 2012 8:26 PM
  • Pretty sure WPT uses the standard symbol path mechanisms (symbol store, symbol cache, local path, etc.). You just need to point it at your PDB.

    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Tuesday, September 25, 2012 10:38 PM