AMP and Zero Copy?

  • Question

  • Does C++ AMP support zero copy operations for GPUs integrated into the CPU (i.e. works directly against system memory) such as AMD Fusion and Intel HD Graphics?

    Basically what I'd like to be able to do is to copy my data to a staging buffer and then read/write directly to this staging buffer without doing an unnecessary copy to a "device buffer", when that is not required.

    This should be possible, at least hardware- and driver-wise, since it is already done in OpenGL using the following extensions:



    • Edited by Dragon89 Thursday, January 17, 2013 6:20 AM
    Thursday, January 17, 2013 6:19 AM


All replies

  • To my knowledge, restrictions imposed by DirectX's buffer management prevent this from working, and this will require a new version of DirectX. If somebody implements the AMP library on top of something else (for example OpenCL or even OpenGL now that it has Compute Shaders...or even lower level, by compiling directly to some IL like HSAIL), it should work just fine assuming the implementer exposes it. See Intel's work in Shevlin Park for an example.
    • Edited by Alex Voicu Thursday, January 17, 2013 11:17 PM
    • Proposed as answer by Amit K Agarwal Friday, January 18, 2013 4:33 PM
    Thursday, January 17, 2013 11:14 PM
  • Suppose DX adds support for this in the future. Are the semantics of the current C++ AMP API good enough that existing code could take advantage of zero copy?
    Friday, January 18, 2013 6:33 AM
  • Yes, the C++ AMP array_view abstraction was designed exactly for this purpose. When you program against the array_view abstraction, your code does not have to deal with data transfers between the CPU and the accelerator. The C++ AMP runtime takes care of them: when running on a discrete accelerator the runtime will perform copies, and when running on an integrated GPU the GPU can access the CPU memory directly. This frees the programmer from coding the data transfer management, which varies across different types of accelerator hardware. Please refer to the concurrency::array_view - Introduction blog post for more details.

    Having said that, as Alex mentioned in an earlier response, the Microsoft implementation of C++ AMP in VS2012 builds on DirectCompute, which does not support this (zero-copy) capability yet (we realize this is very important and are working on it). The point is that when it does, your application will reap the benefits without requiring any code changes if you are using array_views.
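
To make the array_view point concrete, here is a minimal sketch (not taken from the thread) of code written only against `concurrency::array_view`. It contains no explicit copy calls, so whether the runtime copies the data to a device buffer or lets the GPU work on system memory directly is left entirely to the runtime. The function name `scale_in_place` and the macro `SKETCH_HAS_AMP` are hypothetical, and the plain-loop fallback is an assumption added so the sketch also builds with compilers that do not ship `<amp.h>`.

```cpp
#include <vector>

// C++ AMP ships only with MSVC, so use it when <amp.h> is present and
// fall back to a plain CPU loop elsewhere. SKETCH_HAS_AMP is a
// hypothetical macro used only by this sketch.
#if defined(_MSC_VER) && defined(__has_include)
#  if __has_include(<amp.h>)
     // Assumption: newer MSVC versions flag the AMP headers as deprecated
     // unless this macro is defined before including them.
#    define _SILENCE_AMP_DEPRECATION_WARNINGS
#    include <amp.h>
#    define SKETCH_HAS_AMP 1
#  endif
#endif

// Multiplies every element of `data` by `factor`.
void scale_in_place(std::vector<float>& data, float factor) {
#ifdef SKETCH_HAS_AMP
    if (data.empty()) return;  // avoid constructing an empty extent

    // Wrap the CPU-side vector. No copy is requested here; the runtime
    // decides if and when data has to move to the accelerator.
    concurrency::array_view<float, 1> view(static_cast<int>(data.size()), data);

    concurrency::parallel_for_each(
        view.extent,
        [=](concurrency::index<1> i) restrict(amp) {
            view[i] *= factor;
        });

    // Make the results visible in `data` again. With copies this triggers
    // a transfer back to the CPU.
    view.synchronize();
#else
    // CPU fallback for toolchains without C++ AMP.
    for (float& x : data) x *= factor;
#endif
}
```

Calling `scale_in_place(v, 2.0f)` on a vector holding 1, 2, 3 leaves it holding 2, 4, 6. The calling code looks the same whether the accelerator is discrete or integrated, which is the property the reply describes.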

    Amit K Agarwal

    Friday, January 18, 2013 4:32 PM
  • C++ AMP supports zero copy now. Please refer to the blog post for more details.
    Tuesday, August 13, 2013 8:46 PM