C++ AMP: tile_static

    Question

  • Hi Folks, In CUDA, programmers can create a dynamically-sized shared memory object, e.g.:

    __global__ void foo()
    {
       extern __shared__ int arr[];
       arr[...] = ...; ...
    }
    
    void bar()
    {  ...
       int number_of_bytes_of_shared_memory = 1024;
       foo<<<grid_dim, block_dim, number_of_bytes_of_shared_memory>>>();
       ...
    }
    

    In C++ AMP, is it possible to dynamically allocate shared memory whose size is determined at run time? (I don't think so, but I thought I'd ask. It comes in handy for all sorts of problems, like matrix/vector multiplication.)

    Ken

    April 13, 2012, 2:05 PM

Answer

  • Hi Ken,

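    With C++ AMP you cannot select the tile_static memory size dynamically: the size of a tile_static array has to be a compile-time constant. In some scenarios, though, you can get part of the way there by making the size a template parameter and choosing between instantiations at run time.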

    For example, in the matrix-vector multiplication case, you might want to select a different tile size depending on the input size, and the tile size in turn determines the tile_static memory size:

    #include <amp.h>
    using namespace concurrency;

    class context
    {
    public:
        static const int mv_size_s = 1024;    // inputs up to this size count as small
        static const int mv_tile_size_s = 32; // tile size for small matrix-vector multiply
        static const int mv_tile_size_l = 64; // tile size for large matrix-vector multiply
    };
     
    // tile_size is a compile-time constant, so it can size the tile_static array.
    template<int tile_size, typename T>
    void matrix_vector_multiply_kernel(int n, array_view<T, 2>& matrix, array_view<const T>& vector)
    {
        // "template" is needed because vector has a dependent type.
        auto t_ext = vector.extent.template tile<tile_size>();
     
        parallel_for_each(t_ext, [=](tiled_index<tile_size> tidx) restrict(amp)
        {
            tile_static T shared_buf[tile_size];
            ...
        });
    }
     
    // Picks one of the pre-instantiated kernels based on the run-time size n.
    template<typename T>
    void matrix_vector_multiply(int n, array_view<T, 2>& matrix, array_view<const T>& vector)
    {
        if (n <= context::mv_size_s)
        {
            matrix_vector_multiply_kernel<context::mv_tile_size_s>(n, matrix, vector);
        }
        else
        {
            matrix_vector_multiply_kernel<context::mv_tile_size_l>(n, matrix, vector);
        }
    }
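
    If you want to try the dispatch pattern without a GPU, here is a minimal host-only sketch in standard C++ (not C++ AMP). It makes several assumptions: std::array stands in for the tile_static buffer, the matrix is square and stored row-major, and the names mv_tiled, mv_dispatch, and tile_used are made up for illustration. The thresholds mirror the context class above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical thresholds, mirroring the context class in the answer.
constexpr int mv_size_s = 1024;     // inputs up to this size count as small
constexpr int mv_tile_size_s = 32;  // tile size for small inputs
constexpr int mv_tile_size_l = 64;  // tile size for large inputs

// Host-side stand-in for the kernel: computes y = A * x for a row-major
// n x n matrix, staging the vector through a buffer of compile-time size.
// In a C++ AMP kernel this buffer would be the tile_static array.
template <int tile_size, typename T>
std::vector<T> mv_tiled(int n, const std::vector<T>& a, const std::vector<T>& x)
{
    assert(n >= 0);
    assert(a.size() == static_cast<std::size_t>(n) * n);
    assert(x.size() == static_cast<std::size_t>(n));

    std::vector<T> y(static_cast<std::size_t>(n), T{});
    std::array<T, tile_size> buf{};  // size fixed at compile time

    for (int base = 0; base < n; base += tile_size) {
        // The last tile may be only partially filled.
        const int count = (n - base < tile_size) ? (n - base) : tile_size;

        // Stage one tile of the vector.
        for (int j = 0; j < count; ++j) buf[j] = x[base + j];

        // Accumulate this tile's contribution to every row.
        for (int i = 0; i < n; ++i) {
            T sum{};
            for (int j = 0; j < count; ++j)
                sum += a[static_cast<std::size_t>(i) * n + base + j] * buf[j];
            y[i] += sum;
        }
    }
    return y;
}

// Run-time dispatch: maps n onto one of the pre-instantiated tile sizes
// and reports which one was chosen through tile_used.
template <typename T>
std::vector<T> mv_dispatch(int n, const std::vector<T>& a,
                           const std::vector<T>& x, int& tile_used)
{
    if (n <= mv_size_s) {
        tile_used = mv_tile_size_s;
        return mv_tiled<mv_tile_size_s>(n, a, x);
    }
    tile_used = mv_tile_size_l;
    return mv_tiled<mv_tile_size_l>(n, a, x);
}
```

    Note that every tile size you might pick has to be instantiated ahead of time, so the set of candidate sizes is fixed at compile time. Only the choice among them happens at run time.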

    Thanks for your question.

    Charles


    April 13, 2012, 7:57 PM