Answered Handling pointers using C++ AMP

  • Monday, May 28, 2012 10:50 AM
     

    Hi there,

    I recently discovered this new API and I'm still exploring the features/limits as we speak. 

    One of those limits is the restricted use of pointers. Although this makes sense, since pointers aren't allowed in HLSL, I wonder whether it will ever change. As far as I know, CUDA does allow pointer arithmetic, though I'm not saying that CUDA is better or worse. I'm trying to find a way around this limitation because, to be honest, I already like C++ AMP better than CUDA.

    I've analysed some samples, such as the matrix multiplication, the N-body sample and the ocean sample. All three are great reference guides, but they were built from the perspective of AMP. I went the other way around and looked at some algorithms that first needed a conversion step before they could benefit from the massive acceleration.

    My approach was to first implement the algorithm in C++11 using ordinary constructs such as pointers. Converting the computationally intensive loops to AMP syntax is straightforward, but within such a loop I have pointers to functions. This was, and still is, the bottleneck in the conversion. Since the functions called inside the loop need a restrict(amp) specifier as well, using pointers becomes rather 'impossible'.

    I searched these forums for answers but didn't find what I was looking for. Then I came across a blog post (http://blogs.msdn.com/b/nativeconcurrency/archive/2011/12/09/passing-pointers-through-c-amp.aspx) which gave me a new possibility. It's a good way to handle it, but at the same time it feels like it makes things more complex than they should be, for example by requiring operator overloading at a later stage.

    Assume I have the following method call in the parallel_for_each loop:

    // byte is a typedef for unsigned char
    void SomeMethod(const byte* data) restrict(amp)
    {

    }

    What would be the most appropriate way of converting this in terms of AMP?

    With kind regards,

    Thomas Kinet

All Replies

  • Monday, May 28, 2012 4:55 PM
     
     Answered

    Hi Kinetomatics,

    Thank you for your kind words regarding C++ AMP, I'm glad you like it.

    Before answering your questions, I would like to clarify that we separate the concepts of function pointers and data pointers. You are talking about both, so I will address them separately.

    To begin with, C++ AMP fully supports single indirection to data as long as the pointer/reference object is not passed between threads. This means that you can declare and use a pointer/reference as a local object, pass it to functions or return it from them. However, you cannot load/store pointers or references from/to global memory (through array, array_view, etc.) or tile_static memory, nor can you pass them as parameters to the launched kernel from the host (e.g. in a lambda closure). Also, in version 1, multiple indirection (pointer-to-pointer, reference-to-pointer) is not supported in any scenario, and neither are pointers/references to functions. That being said, I'd like to emphasize that we are looking forward to relaxing these limitations in the future. Please refer to section 13.2 (Projected Evolution of amp-Restricted Code) of the C++ AMP open spec (http://blogs.msdn.com/b/nativeconcurrency/archive/2012/02/03/c-amp-open-spec-published.aspx) for our roadmap. Most pointer-related scenarios are expected to be enabled in amp:1.2.

    Until that happens, the only way is to slightly rework your algorithm. Instead of using function pointers, you could use a construct resembling static tag-based jump tables (e.g. a switch statement whose cases call different functions). And instead of using data pointers, you could use relative index-based addressing. However, both approaches may incur a performance hit: high divergence for the first one and uncoalesced memory accesses for the second. This cost is actually not specific to these workarounds; it applies to function and data pointers in general. Therefore it may be worthwhile to re-architect some parts of your algorithm to reduce the need for such constructs altogether.
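
    The two workarounds above can be sketched roughly as follows. This is a minimal illustration in plain standard C++ so that it compiles without amp.h, not code from the thread: the names Op, apply, Span and sum are hypothetical. In C++ AMP the functions would additionally be marked restrict(amp), and the buffer would typically be an array_view rather than a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operations that would otherwise be chosen through a
// function pointer. In C++ AMP each would also carry restrict(amp).
enum Op { OpScale = 0, OpOffset = 1, OpSquare = 2 };

inline float scale(float x)  { return x * 2.0f; }
inline float offset(float x) { return x + 1.0f; }
inline float square(float x) { return x * x; }

// Static tag-based jump table: an integer tag replaces the function pointer.
inline float apply(int op, float x) {
    switch (op) {
        case OpScale:  return scale(x);
        case OpOffset: return offset(x);
        case OpSquare: return square(x);
        default:       return x;  // unknown tag: leave the value unchanged
    }
}

// Relative index-based addressing: instead of storing a data pointer,
// store where the range starts and how long it is within one shared buffer.
struct Span {
    int begin;
    int length;
};

inline float sum(const std::vector<float>& buffer, Span s) {
    assert(s.begin >= 0 && s.length >= 0 &&
           static_cast<std::size_t>(s.begin + s.length) <= buffer.size());
    float total = 0.0f;
    for (int i = 0; i < s.length; ++i) {
        total += buffer[s.begin + i];
    }
    return total;
}
```

    The tag and the Span are plain integers, so unlike pointers they can be stored in global memory and captured by the kernel lambda. The divergence and memory-access costs mentioned above still apply.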

    The blog post that you found describes how to pass data pointers through the C++ AMP runtime without exceeding the imposed limitations. It merely covers the scenario where you marshal a data structure that contains pointers into amp-restricted functions and use only its other data members. Actually, since the pointer is treated there as an integral data type, you could extend the approach to perform some arithmetic operations on it, but you could not use it for addressing. I believe this is not the scenario you are interested in.

    Lastly, please keep in mind that your example uses a 1-byte element type, which is not allowed in amp-restricted functions.
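
    One common way around the 1-byte restriction, offered here as an assumption rather than something stated in this thread, is to pack four bytes into each 32-bit word on the host and extract individual bytes with shifts and masks. The sketch below is plain standard C++; pack_bytes and byte_at are hypothetical names, and in C++ AMP the extraction function would be marked restrict(amp) and would read from an array_view of unsigned int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host side: pack bytes into 32-bit words, least significant byte first.
// The last word is zero-padded if the byte count is not a multiple of 4.
inline std::vector<std::uint32_t> pack_bytes(const std::vector<unsigned char>& bytes) {
    std::vector<std::uint32_t> words((bytes.size() + 3) / 4, 0u);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        words[i / 4] |= static_cast<std::uint32_t>(bytes[i]) << ((i % 4) * 8);
    }
    return words;
}

// Kernel side (conceptually): read byte i back out of the packed words.
inline std::uint32_t byte_at(const std::vector<std::uint32_t>& words, std::size_t i) {
    assert(i / 4 < words.size());
    return (words[i / 4] >> ((i % 4) * 8)) & 0xFFu;
}
```

    The kernel then works only with unsigned int values, at the cost of a shift and a mask per byte access.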

    Regards,
    Łukasz



  • Tuesday, May 29, 2012 9:11 AM
     
     

    Hi Lukasz,

    Thank you for the feedback, it's interesting to see the roadmap.

    I'm glad to hear that the evolution of AMP also covers the use of pointers; I'll be keeping an eye on that. As you said, I'll need to remodel my code to make it properly compatible, since I cannot afford to have these workarounds affecting the performance.

    With kind regards,

    Thomas Kinet