Swapping array_views while maintaining data residency?

  • Question

  • Hello all. I'm implementing an algorithm that iteratively reads from one array and writes the processed result to another. The number of iterations is only known at runtime. I would also like to keep the data on the GPU rather than copy the results back and forth between host and GPU on every iteration. The procedure goes as follows: in one iteration I'll read from the first array_view and write to the second array_view. In the next iteration I'd like to swap the order, i.e. read the previous iteration's results from the second array_view, process them, and store them in the first array_view. The alternating pattern continues for a given number of iterations. I'm considering the following pattern:

    array_view<const float, 2> SourceData(dataExtent, &dataSource[0]);
    array_view<float, 2> Buffer1(dataExtent, &tempBuffer1[0]);
    array_view<float, 2> Buffer2(dataExtent, &tempBuffer2[0]);

    // Process SourceData on GPU, store results in Buffer1. (not shown).
    // Then, iterate and alternate Buffer1 with Buffer2:

    for(unsigned int x = 0; x < LoopCount; x++)
    {
        if(x % 2 == 0)
        {
            array_view<float, 2> Source(Buffer1);
            array_view<float, 2> Target(Buffer2);

            // Invoke GPU kernel to process "Source" and store in "Target":
            parallel_for_each(...etc...
        }
        else
        {
            array_view<float, 2> Source(Buffer2);
            array_view<float, 2> Target(Buffer1);

            // Invoke GPU kernel to process "Source" and store in "Target":
            parallel_for_each(...etc...
        }
    }

    Will the above pattern work and, if so, is my assumption correct that the data will remain on the GPU until I exit the loop and "synchronize()" one of the array_views? Lastly, since the "parallel_for_each" is identical in both cases, I'd like to take it out of the if-else scope and have only one invocation (after the source and target buffers have been chosen in the if-else block). However, I couldn't see how to declare an empty array_view reference outside the if-else blocks and then "point" it at an existing array_view inside the blocks. How is this done?

    Thank you in advance everyone.

    -L    

    Saturday, December 1, 2012 12:27 AM

Answers

  • In theory it should stay on the GPU AFAICS, and you can safely use std::swap for this. As for creating a naked array_view, a quick, dirty, cholesterol-laden way that is applicable in your case would be:

    array_view<float, 2> Source(extent<2>(1, 1), vector<float>(1));
    array_view<float, 2> Target(extent<2>(1, 1), vector<float>(1));
    

    Or default-construct a float and take its address. Note that this is out-of-spec behaviour, since you cannot bind a non-const ref to an rvalue / temporary, and the array_view constructor is declared with a non-const ref. However, this works in VS2012 since MS allows it (backwards compatibility perhaps?). At any rate, note that array_view copies are shallow, not deep, so swapping AVs does not mess with the underlying data (IIRC). In this context, you could just bind the AVs to the buffers from the get-go, and then do swap(Source, Target) in the loop. This also prevents any potentially rogue syncs happening behind your back, which might occur with the current arrangement as Source and Target get killed when going out of scope (albeit, with Buffer1 and Buffer2 still live, the runtime should see that no sync is needed).
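To illustrate the "bind once, then swap(Source, Target) in the loop" idea, here is a minimal sketch in portable C++ rather than C++ AMP, so it can be compiled anywhere. `FloatView`, `process` and `pingPong` are hypothetical names invented for this example. `FloatView` is only a stand-in for a shallow handle like array_view, and the plain loop in `process` stands in for the parallel_for_each kernel. It shows that swapping the handles exchanges which buffer is read and which is written without touching the elements. It says nothing about when the AMP runtime actually copies data.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for array_view<float, N>: a non-owning handle
// (pointer + length). Copying or swapping it never touches the elements.
struct FloatView {
    float*      data;
    std::size_t size;
};

// Hypothetical per-iteration step. In C++ AMP this would be the
// parallel_for_each kernel that reads `source` and writes `target`.
inline void process(const FloatView& source, const FloatView& target) {
    assert(source.size == target.size);
    for (std::size_t i = 0; i < source.size; ++i)
        target.data[i] = source.data[i] + 1.0f;
}

// Runs `loopCount` iterations, alternating between the two buffers.
// Returns the handle that refers to the most recent results.
inline FloatView pingPong(std::vector<float>& buffer1,
                          std::vector<float>& buffer2,
                          unsigned loopCount) {
    // Bind the handles to the buffers once, before the loop.
    FloatView source{buffer1.data(), buffer1.size()};
    FloatView target{buffer2.data(), buffer2.size()};

    for (unsigned x = 0; x < loopCount; ++x) {
        process(source, target);    // single invocation, no if-else needed
        std::swap(source, target);  // exchanges the handles only
    }
    // After the final swap, `source` refers to the last buffer written
    // (or to buffer1 if no iteration ran).
    return source;
}
```

With this arrangement the loop body contains one kernel invocation, and the returned handle tells the caller which buffer holds the final result (in the AMP version, that would be the array_view to synchronize() after the loop).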

    Saturday, December 1, 2012 11:52 PM