Slow PCI read on Win7 vs XP

  • Question

  • Hello all,

    I have a PLX9030-based PCI data acquisition card and a 32-bit XP kernel driver that reads the data FIFO through BAR4 and places the data in the application's ring buffer. The original driver was written using the Numega DriverWorks framework. I rewrote the driver using WDF so that it works on 32-bit and 64-bit Win7. The new driver works fine except that it reads data much more slowly than the XP one. The read is a simple PIO operation and is exactly the same in both the XP and Win7 drivers.
    The 32-bit and 64-bit builds are equally slow: it takes 30 ms to copy 64 KB of data from the FIFO register to the ring buffer, and this consumes ~70% of a CPU core. I cannot benchmark the old driver since I cannot rebuild it, but it consumes at least two times less CPU.
    I am wondering why I see such a difference.
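
    To put the reported figures in perspective, the sketch below works out the effective throughput and the average time per FIFO access. The function names are illustrative, and the access width (2 or 4 bytes) is an assumption, since the post does not say how wide each read is.

```c
#include <stddef.h>

/* Effective throughput in bytes per second for `bytes` copied in `ms` milliseconds. */
double throughput_bytes_per_sec(size_t bytes, double ms)
{
    return (double)bytes / (ms / 1000.0);
}

/* Average time per register access in nanoseconds, assuming every access
 * transfers `width_bytes` (hypothetical: the post does not state the width). */
double ns_per_access(size_t bytes, double ms, size_t width_bytes)
{
    size_t accesses = bytes / width_bytes;
    return (ms * 1.0e6) / (double)accesses;
}
```

    For 64 KB in 30 ms this works out to about 2.18 MB/s, which is roughly 1.83 microseconds per access with 32-bit reads or 0.92 microseconds with 16-bit reads.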


    Tuesday, November 17, 2015 4:23 PM

All replies

  • I would recommend you use XPERF to see where the time is consumed in your driver.  You say you are reading data from a FIFO; is this being read in small chunks?  If so, is there a possibility of doing things in larger blocks?

    KMDF does have overhead. Once you know where the time is being spent, there are approaches you can use to speed things up, but without an idea of where the current driver is taking time, you would potentially be adding complexity without any increase in performance.

    Don Burn Windows Driver Consulting Website:

    Tuesday, November 17, 2015 4:32 PM
  • Thank you for the advice. I will try XPERF. Yes, the driver reads the data in small chunks, but the old driver does exactly the same.

    I guess I can read the whole FIFO content into a pre-allocated buffer or use DMA (if it supports a non-incrementing source), but I want to understand why I see such a speed difference.
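
    The pre-allocated buffer idea could look roughly like the sketch below: the FIFO is first read into a staging buffer, and the block is then copied into the ring with at most two memcpy calls to handle the wrap point. This is a user-space illustration only; ring_write_block and its parameters are hypothetical names, and whether it helps at all depends on where the time is actually being spent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `count` words from a staging buffer into a ring of `ring_words`
 * slots, starting at index `head`, and return the new head index.
 * Preconditions (assumed, not checked): head < ring_words and
 * count <= ring_words. */
size_t ring_write_block(uint32_t *ring, size_t ring_words, size_t head,
                        const uint32_t *staging, size_t count)
{
    /* Number of slots available before the ring wraps around. */
    size_t first = ring_words - head;
    if (first > count) {
        first = count;
    }

    /* First segment: from head up to the end of the ring. */
    memcpy(&ring[head], staging, first * sizeof *staging);

    /* Second segment: the remainder, written from the start of the ring. */
    memcpy(&ring[0], staging + first, (count - first) * sizeof *staging);

    return (head + count) % ring_words;
}
```

    With an 8-slot ring, head at 6 and a block of 5 words, the first two words land in slots 6 and 7, the remaining three in slots 0 to 2, and the new head is 3.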

    Tuesday, November 17, 2015 4:50 PM
  • I have observed a doubling of the overhead in a KMDF driver handling an IOCTL versus an older WDM driver.  You are accepting an overhead with KMDF in exchange for a lot more reliability, safety and ease of development.  As I said, there are ways to tune things once you know where the problem is.  For one client I had a driver that did 100,000 requests per second with a stock KMDF model; we accelerated this to over 750,000 requests per second once we understood where the delays were.

    Don Burn Windows Driver Consulting Website:

    Tuesday, November 17, 2015 5:02 PM
  • There are only two IOCTL requests, issued to start and stop the data transfer, so that should not be an issue.

    The simplified scenario is something like this:

    1. User initiates data acquisition by sending IOCTL start_data_acquisition

    2. Driver reads data from the PCI FIFO to the application's ring buffer as long as the FIFO is not empty and the user has not sent IOCTL stop_data_acquisition

    There are no IOCTLs during the data transfer.
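
    The scenario above can be sketched as a user-space model of the drain loop. Everything here is hypothetical: mock_fifo_t stands in for the card's FIFO status and data registers, ring_t for the application's ring buffer, and a 32-bit access width is assumed. In a real driver each fifo_read would be a register read on the mapped BAR (for example READ_REGISTER_ULONG), which is where the per-access bus cost would show up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_WORDS 1024u  /* must be a power of two for the index mask */

typedef struct {
    uint32_t data[RING_WORDS];
    size_t head;  /* free-running count of words written by the driver */
    size_t tail;  /* free-running count of words consumed by the application */
} ring_t;

/* Hypothetical device model: a source array plays the role of the FIFO. */
typedef struct {
    const uint32_t *src;
    size_t remaining;
} mock_fifo_t;

static bool fifo_empty(const mock_fifo_t *f)
{
    return f->remaining == 0;
}

static uint32_t fifo_read(mock_fifo_t *f)
{
    f->remaining--;
    return *f->src++;
}

static size_t ring_free(const ring_t *r)
{
    return RING_WORDS - (r->head - r->tail);
}

/* Move words from the FIFO to the ring until the FIFO is empty, the ring
 * is full, or the stop flag is set. Returns the number of words moved. */
size_t drain_fifo(mock_fifo_t *f, ring_t *r, const volatile bool *stop)
{
    size_t moved = 0;
    while (!*stop && !fifo_empty(f) && ring_free(r) > 0) {
        r->data[r->head & (RING_WORDS - 1)] = fifo_read(f);
        r->head++;
        moved++;
    }
    return moved;
}
```

    The stop flag models the stop_data_acquisition IOCTL; it is checked once per word here, which keeps the loop simple at the cost of one extra memory read per iteration.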

    Tuesday, November 17, 2015 6:09 PM
  • Ah. Then you need to check the PCI device configuration (compare the relevant config space registers) and maybe use a bus sniffer. 

    I will check it when I get an XP machine.
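
    Once config-space dumps from both machines are available, the comparison itself is simple. The sketch below is a minimal illustration: pci_cfg_diff is a hypothetical helper, and how the dumps are captured is left out. The offsets in the enum are those of the standard PCI type 0 configuration header.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets in the standard PCI type 0 configuration header. */
enum {
    PCI_CFG_COMMAND       = 0x04,
    PCI_CFG_CACHE_LINE    = 0x0C,
    PCI_CFG_LATENCY_TIMER = 0x0D,
    PCI_CFG_BAR0          = 0x10,
    PCI_CFG_HEADER_BYTES  = 0x40
};

/* Compare two config-space dumps byte by byte. Stores up to `max_out`
 * differing offsets in `diff_offsets` and returns the total number of
 * bytes that differ (which may exceed `max_out`). */
size_t pci_cfg_diff(const uint8_t *a, const uint8_t *b, size_t len,
                    size_t *diff_offsets, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            if (found < max_out) {
                diff_offsets[found] = i;
            }
            found++;
        }
    }
    return found;
}
```

    Differences in the Command register, Cache Line Size or Latency Timer would be the first things to look at; differing BAR values are expected between machines and are usually not significant on their own.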
    Tuesday, November 17, 2015 9:51 PM
  • Certainly it can explain some differences, but not a huge performance degradation on a Win7 Core i7 3 GHz machine vs an XP Core2 Duo 2 GHz one, unless I missed something.
    Wednesday, November 18, 2015 1:11 PM
  • I agree with you, but I tested quite a few Win7 machines and they all performed badly compared to the XP one with the same PCI card, so I guess my driver is the problem.
    Wednesday, November 18, 2015 3:23 PM