Answered array_view memory restrictions

  • Thursday, August 16, 2012 8:25 PM
     
     

    Hello,

    I'm trying to use C++ AMP in my native C++ convolutional neural network library, but I don't quite understand how to copy something like vector<vector<vector<int_2>>> connections; to an array_view (in GPU memory) when the number of elements in connections is not fixed in every dimension. All the C++ AMP samples I've explored so far use rather simple data structures. I was also wondering whether C++ AMP lets you create your data structures directly on the GPU, without copying an array over from the CPU side.

    (hope this makes some sense)

    thanks,

    Filip  


    • Edited by Zamirra Thursday, August 16, 2012 8:28 PM

All Replies

  • Monday, August 20, 2012 7:34 PM
    Owner
     
     Answered

    Hi Zamirra,

    C++ AMP requires the data source underlying an array_view to be contiguous in memory. However, you can build a higher level abstraction of a multidimensional container with varying sizes of sub-arrays, by using an underlying storage (array_view) that is contiguously laid out in memory and an auxiliary array_view that contains the offsets of the respective sub-arrays within the actual underlying contiguous storage.

    For example, a two-dimensional array of arrays would look like the following. Note that this does not give you a dynamically growing container like std::vector.

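    // Members of a wrapper class templated on the element type T.
    array_view<T, 1> dataStorage;   // all sub-arrays stored back to back
    array_view<int, 1> rowOffsets;  // rowOffsets(row) = index in dataStorage of that row's first element

    // restrict(amp, cpu) makes the accessor callable from a parallel_for_each kernel as well as from CPU code.
    T& operator()(int row, int col) const restrict(amp, cpu)
    {
       return dataStorage(rowOffsets(row) + col);
    }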
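
    To make the layout concrete, here is a minimal CPU-only sketch in standard C++ (no amp.h) of how the two buffers could be built from a vector of vectors. The names FlatRows and flatten are hypothetical; the two vectors are what would later be wrapped by dataStorage and rowOffsets. The extra trailing offset, which lets a row's length be recovered, is an assumption of this sketch and is not required by the accessor above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: contiguous storage plus per-row start offsets.
template <typename T>
struct FlatRows
{
    std::vector<T> data;       // all rows stored back to back
    std::vector<int> offsets;  // offsets[r] = index in data of row r's first element;
                               // one extra final entry equals data.size() (assumption)

    // Same indexing rule as the array_view accessor: start of row + column.
    const T& at(int row, int col) const { return data[offsets[row] + col]; }

    // Row length, recovered from two neighbouring offsets.
    int row_size(int row) const { return offsets[row + 1] - offsets[row]; }
};

// Copies a jagged vector of vectors into one contiguous buffer.
template <typename T>
FlatRows<T> flatten(const std::vector<std::vector<T>>& rows)
{
    FlatRows<T> flat;
    flat.offsets.reserve(rows.size() + 1);
    for (const auto& row : rows)
    {
        // The row starts wherever the previous rows ended.
        flat.offsets.push_back(static_cast<int>(flat.data.size()));
        flat.data.insert(flat.data.end(), row.begin(), row.end());
    }
    // Closing entry so that row_size() also works for the last row.
    flat.offsets.push_back(static_cast<int>(flat.data.size()));
    return flat;
}
```

    Empty rows are handled naturally: two neighbouring offsets are simply equal.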

    As for copying a vector of vectors to the GPU memory, the most performant way would be to copy the content from the vector of vectors to a C++ AMP staging array and thereafter copy this content to the GPU either explicitly or by creating an array_view over the staging array.
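
    For the three-level container from the original question, the CPU-side packing step might look like the following sketch in standard C++. It counts the elements first so the destination can be allocated once, then writes every element exactly once into a caller-supplied pointer. The assumption is that in C++ AMP code this pointer would refer to the CPU-visible memory of the staging array; that part is not shown. Nested3, Offsets3, count_elements and pack3 are hypothetical names, and int stands in for the real element type.

```cpp
#include <cstddef>
#include <vector>

using Nested3 = std::vector<std::vector<std::vector<int>>>;

// Offset tables describing the packed layout (hypothetical).
struct Offsets3
{
    std::vector<int> outer;  // outer[i] = index in inner of block i's first row, plus one end entry
    std::vector<int> inner;  // inner[r] = index in the data buffer of row r's first element, plus one end entry
};

// Total number of leaf elements, so the destination can be sized up front.
std::size_t count_elements(const Nested3& src)
{
    std::size_t n = 0;
    for (const auto& block : src)
        for (const auto& row : block)
            n += row.size();
    return n;
}

// Copies every leaf element into dst, which must have room for
// count_elements(src) ints, and returns the two offset tables.
Offsets3 pack3(const Nested3& src, int* dst)
{
    Offsets3 off;
    int written = 0;
    for (const auto& block : src)
    {
        off.outer.push_back(static_cast<int>(off.inner.size()));
        for (const auto& row : block)
        {
            off.inner.push_back(written);
            for (int v : row)
                dst[written++] = v;
        }
    }
    // End entries: total number of rows and total number of elements.
    off.outer.push_back(static_cast<int>(off.inner.size()));
    off.inner.push_back(written);
    return off;
}
```

    With this layout, element (i, j, k) of the nested container is found at dst[inner[outer[i] + j] + k], so a kernel would need the data buffer and both offset tables.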

    C++ AMP offers a choice between the concurrency::array and concurrency::array_view types. A concurrency::array is a data container bound to, and accessible at, a specific memory location (such as GPU memory); the programmer is responsible for explicitly transferring data between this container and the CPU through C++ AMP copy operations. A concurrency::array_view offers the abstraction of a data container that can be accessed transparently on both the CPU and the GPU, with the C++ AMP runtime automatically taking care of any required data transfers. When using array_views, programmers can call “discard_data” to indicate that the existing contents of the array_view do not need to be copied from the CPU to the GPU when the array_view is accessed inside a parallel_for_each call.

    Hope this helps.

    -Amit


    Amit K Agarwal

    • Proposed As Answer by Zhu, Weirong Monday, August 20, 2012 7:41 PM
    • Marked As Answer by Zamirra Thursday, August 23, 2012 10:17 PM
  • Thursday, August 23, 2012 10:20 PM
     
     

    Thanks for your helpful insights,

    Filip