C++ AMP performance boost / window manipulation / desktop composition

  • Question

  • Hi,

    On a Windows 7 laptop with an NVIDIA GTX 460M, while running a console application that uses C++ AMP (VS 11 beta), I notice a *very big* performance boost when some windows are minimized and then maximized (for example, the VS IDE window while my app is running). The boost is temporary (3 to 5 seconds).

    I notice the same behavior (but for a longer period of time) when Desktop Composition is toggled on and off.

    Do other people observe the same behavior?

    All the best, Arnaud.

    Friday, April 6, 2012 7:50 PM

Answers

  • OK, got it.

    In the NVIDIA Control Panel ('Manage 3D Settings' tab), the 'Power management mode' setting must be set to 'Prefer Maximum Performance' (globally or just for the C++ AMP application) before the application is launched.

    With the default setting ('Adaptive'), the driver steps down the GPU clocks when it observes a total GPU load below a certain percentage (~10%) for a certain amount of time (~10 s), which was my case (a lot of the computation is left to the CPU, so the GPU load stays low).

    Saturday, April 7, 2012 3:20 AM

All replies

  • Hi Arnaud,

    It sounds like you have contention around the single GPU resource on your system.

    The other apps you mentioned use the GPU and may be competing for resources with your app. The Visual Studio IDE uses WPF, which leverages hardware acceleration when possible. The Desktop Composition feature also uses video memory. See http://msdn.microsoft.com/en-us/library/windows/desktop/aa969540(v=vs.85).aspx

    If you are curious, you can find out which apps are using the GPU with Process Explorer (http://technet.microsoft.com/en-us/sysinternals/bb896653).

    Thanks,
    Pooja



    Friday, April 6, 2012 11:54 PM
  • Hi Pooja,

    Thanks for your reply. I agree that I have contention around my single GPU resource, which is used for both compute and display.

    But even when the VS IDE is not running, the behavior is the same.

    After some investigation, it seems that I have to keep the GPU graphically busy (!) in order for my application to get the performance boost.

    For example, when I keep scrolling pages up and down in Firefox (or IE9, or Notepad when Desktop Composition is enabled), the C++ AMP kernels of my application run faster...

    Some more details:
    - this occurs independently of the accelerator_view queuing mode, immediate or automatic (see the sketch below)
    - this occurs for both staging and non-staging arrays (http://blogs.msdn.com/b/nativeconcurrency/archive/2011/11/10/staging-arrays-in-c-amp.aspx)
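
    For reference, here is how I select the queuing mode (a minimal sketch; the accelerator choice, kernel body and sizes are made up for illustration):

        #include <amp.h>
        #include <vector>
        using namespace concurrency;

        int main() {
            accelerator acc;  // default accelerator

            // Automatic (the default): commands may be batched before submission.
            accelerator_view av_auto = acc.create_view(queuing_mode_automatic);
            // Immediate: every command is sent to the device as soon as it is issued.
            accelerator_view av_imm = acc.create_view(queuing_mode_immediate);

            std::vector<float> data(1024, 1.0f);
            array_view<float, 1> v(1024, data);

            // The same kernel can be launched on either view; I see the slowdown
            // with both queuing modes.
            parallel_for_each(av_imm, v.extent, [=](index<1> i) restrict(amp) {
                v[i] *= 2.0f;
            });
            v.synchronize();
        }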

    Here are the Process Explorer details for my process:

    When my process is running 'normally', it consumes around 37% GPU Usage (the high GPU Usage peaks in the Process Explorer graph). At that point my C++ AMP kernel takes 20 ms to complete (including memory operations and kernel execution).

    When I play with other windows (e.g. Firefox, IE9 or Notepad), GPU Usage for my process drops to 9% (the low GPU Usage level in procexp), and my C++ AMP kernel execution time drops by a factor of 4, to only 5 ms.
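
    For what it's worth, this is roughly how I measure the 20 ms figure (a simplified sketch of my timing code; the kernel body and sizes are placeholders):

        #include <amp.h>
        #include <chrono>
        #include <iostream>
        #include <vector>
        using namespace concurrency;

        int main() {
            const int n = 1 << 20;
            std::vector<float> host(n, 1.0f);

            accelerator acc;
            accelerator_view av = acc.default_view;
            array<float, 1> dev(n, av);

            auto t0 = std::chrono::high_resolution_clock::now();

            copy(host.begin(), host.end(), dev);   // copy-in
            parallel_for_each(av, dev.extent, [&dev](index<1> i) restrict(amp) {
                dev[i] = dev[i] * 2.0f + 1.0f;     // placeholder kernel
            });
            copy(dev, host.begin());               // copy-out; blocks until the kernel is done

            auto t1 = std::chrono::high_resolution_clock::now();
            std::wcout << L"memory ops + kernel: "
                       << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
                       << L" ms" << std::endl;
        }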

    All of this makes me feel that some throttling mechanism could be in the way (driver?)...

    Best regards, Arnaud.

    Saturday, April 7, 2012 1:39 AM
  • ... some more details: using HWiNFO, I notice that my GPU clock speed and voltage rise when I interact with other windows:

    - 'idle' situation (with the C++ AMP app running...)

    - after interacting with other windows (with the C++ AMP app still running), there is a 3x to 4x step-up in GPU clock speed, similar to the timing difference I measure in the AMP application.

    Some layer in the OS/driver/D3D/AMP stack definitely doesn't consider C++ AMP workloads (at least in my case) as valid for clock and voltage step-up.

    Saturday, April 7, 2012 2:25 AM
  • For NVIDIA (and AMD) cards, the GPU's control panel is not available if no display is attached to the device. In that case, how does one set the default for a GPU in the machine?  --Ken
    Monday, April 9, 2012 12:55 PM
  • Ken,

    Can you tell us more about your setup? Are any of the devices connected to the display?

    I found that we can launch the NVIDIA Control Panel even if only one device is attached to the display (screenshot omitted). This is on a machine with two GPUs, only one of which is connected to the display. We can tune the settings for each card even when it is not connected to the display.

    You should also contact NVIDIA / AMD. They will be able to advise you on how to tune these settings without using the Control Panel. The vendor's advice would apply to any programming model, since this is not C++ AMP specific, but rather GPU specific.

    Thanks,
    Pooja



    Tuesday, April 10, 2012 2:32 AM
    My machine is a GIGABYTE GA-A75-UD4H with a Llano processor (AMD A8-3850), running Windows 7 Service Pack 1. In PCIe x16 slot #1 there is an ATI/AMD Radeon HD 7870 (PITCAIRN XT); in the second PCIe x16 slot, an NVIDIA GeForce GTX 470 (GF100). (I think in this scenario both devices run at half rate, x8.) The display is attached to the Llano's on-board GPU, an ATI/AMD Radeon HD 6550D (SUMO Desktop).

    All the GPUs are accessible from C++ AMP and OpenCL programs.
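
    For example, a quick enumeration like this lists all of them, including the devices without a display attached (a minimal sketch):

        #include <amp.h>
        #include <iostream>
        using namespace concurrency;

        int main() {
            // Enumerate every accelerator C++ AMP can see on this machine.
            for (const accelerator& acc : accelerator::get_all()) {
                std::wcout << acc.description
                           << L" | path: "        << acc.device_path
                           << L" | has_display: " << (acc.has_display ? L"yes" : L"no")
                           << L" | emulated: "    << (acc.is_emulated ? L"yes" : L"no")
                           << std::endl;
            }
        }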

    If I right-click on the desktop, the pop-up menu has entries for the AMD and NVIDIA video controls. AMD's control panel does display the two GPUs ("Information/Hardware"), but the second GPU is "disabled".

    The NVIDIA control panel does not come up at all. I suspect that your setup has two NVIDIA cards, not a mix of AMD and NVIDIA. I didn't try it, but I could temporarily set that card as the display, set up the defaults, and then hope they stick when I reset the display back to the on-board GPU.

    I remember it was very difficult to get the AMD and NVIDIA software installed when the card is not set up as the display adapter. I think I had to muck around with the BIOS and/or rearrange the cards to get everything to install.

    I'm wondering, though, whether it is necessary to set the default at all, because a "warm-up" (a short, inane kernel call that does basically nothing except raise the clock rate on the GPU) should work. Several NVIDIA CUDA samples do a warm-up before timing, e.g., "NVIDIA GPU Computing SDK 4.0\c\src\reduction\reduction.cpp", line 365.
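
    A C++ AMP equivalent of such a warm-up could be as simple as this (a hypothetical helper, not from any SDK; the dummy kernel just forces JIT compilation and gives the driver a reason to raise the clocks):

        #include <amp.h>
        using namespace concurrency;

        // Hypothetical warm-up helper: launch a trivial kernel and wait for it,
        // so JIT compilation and GPU clock ramp-up happen before any timed run.
        void warm_up(accelerator_view av) {
            array<float, 1> dummy(1024, av);
            parallel_for_each(av, dummy.extent, [&dummy](index<1> i) restrict(amp) {
                dummy[i] = 0.0f;
            });
            av.wait();  // block until the dummy kernel has actually executed
        }

        int main() {
            accelerator acc;
            warm_up(acc.default_view);
            // ... timed kernels would go here ...
        }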

    Ken

    Tuesday, April 10, 2012 6:45 PM
  • You are right, Ken. The control panel does not come up when the NVIDIA GPU is not connected to a display. Please contact NVIDIA for more information on how to work around this; we will do the same.

    A warm-up is always required for performance measurements, even if your power settings are at maximum. A warm-up also removes outliers due to other factors: JIT compilation time on the first execution, first-time memory zeroing, and so on.

    Hope this helps,
    Pooja



    Friday, April 13, 2012 2:01 AM