How do I trigger a breakpoint inside amp-restricted code?

  • Question

  • I have a parallel_for_each function I would like to debug, so I followed the Debugging walkthrough (http://msdn.microsoft.com/en-us/library/vstudio/hh368280.aspx) and confirmed that Debugger to Launch was set to Local Windows Debugger. Then I set Debugger Type to GPU Only. My code looks like this:

    extent<1> e2(12);
    
    parallel_for_each(e2, [](index<1> idx) restrict(amp)
    {
    	int i = idx[0];
    	int b = 2; // Breakpoint here
    });
    
    system("pause");


    When I run in GPU only mode, the code simply stops at system("pause"), ignoring my breakpoints.

    How can I get GPU debugging to work?

    I am running Windows 8.1 64-bit and Visual Studio 2013, and I am able to trigger breakpoints in the example code on the page linked above by simply copy-pasting the code into my project... Something is wrong with parallel_for_each but I can't debug it!

    • Edited by arman_sch Tuesday, November 12, 2013 7:48 AM
    Tuesday, November 12, 2013 5:12 AM

Answers

  • Hi arman_sch,

    In your code above, the lambda does not capture anything that is used back on the CPU, so C++ AMP optimizes this parallel_for_each away and it is never actually dispatched to the GPU. Also, even if you do capture something in the parallel_for_each kernel, the C++ AMP runtime dispatches the kernel lazily. In other words, unless you try to use the data back on the CPU (e.g. by accessing the data or waiting on the accelerator_view), the parallel_for_each kernel is not dispatched to the GPU.

    The following code snippet will not hit the breakpoint:

    extent<1> e2(12);
    array_view<int, 1> arr_v(e2);
    
    parallel_for_each(e2, [=](index<1> idx) restrict(amp)
    {
    	int i = idx[0];
    	int b = 2; // Breakpoint here will not be hit
    
    	arr_v[idx] = 10;
    });
    
    system("pause");

    The following code snippet will hit the breakpoint, because synchronizing the array_view to the CPU forces the kernel to be dispatched:

    extent<1> e2(12);
    array_view<int, 1> arr_v(e2);
    
    parallel_for_each(e2, [=](index<1> idx) restrict(amp)
    {
    	int i = idx[0];
    	int b = 2; // Breakpoint here will be hit
    
    	arr_v[idx] = 10;
    });
    
    arr_v.synchronize_to(accelerator(accelerator::cpu_accelerator).default_view);
    
    system("pause");

    The following code snippet will also hit the breakpoint, because it waits on the accelerator_view:

    extent<1> e2(12);
    array_view<int, 1> arr_v(e2);
    
    parallel_for_each(e2, [=](index<1> idx) restrict(amp)
    {
    	int i = idx[0];
    	int b = 2; // Breakpoint here will be hit
    
    	arr_v[idx] = 10;
    });
    
    accelerator(accelerator::default_accelerator).default_view.wait();
    
    system("pause");
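
    If it helps, here is a rough mental model of the lazy dispatch described above, written as a toy sketch in plain C++. It is not C++ AMP and makes no claim about how the runtime is actually implemented. LazyQueue and kernel_ran are hypothetical names made up for this illustration. The point is only that submitted work does not run (so a breakpoint inside it cannot be hit) until something forces it to run.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy stand-in for an accelerator_view:
// submitted work is only recorded, and runs when wait() is called.
class LazyQueue {
public:
    // Record the kernel; nothing executes yet.
    void submit(std::function<void()> kernel) {
        pending_.push_back(std::move(kernel));
    }

    // Run everything queued so far, in submission order.
    void wait() {
        for (auto& kernel : pending_) kernel();
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// Returns true if the "kernel" body actually ran. A breakpoint inside
// the body could only be hit in that case.
bool kernel_ran(bool synchronize) {
    LazyQueue queue;
    std::vector<int> data(12, 0);
    bool ran = false;

    queue.submit([&] {
        ran = true;                  // imagine the breakpoint on this line
        for (int& x : data) x = 10;  // mirrors arr_v[idx] = 10
    });

    if (synchronize) queue.wait();   // plays the role of default_view.wait()
    return ran;
}
```

    In this model, kernel_ran(false) corresponds to the first snippet (work queued but never forced to run), and kernel_ran(true) corresponds to the snippets that synchronize or wait.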
    Wednesday, November 20, 2013 1:00 AM

All replies

  • Hi Arman_sch,

    In addition to GPU Only debugging, Visual Studio 2013 also introduced C++ AMP mixed mode debugging (both CPU and GPU code). Mixed mode debugging is currently supported only on the WARP accelerator, but it may provide a more familiar debugging experience.

    Please see the following blog post for more information on mixed mode debugging.

    --Daniel


    Wednesday, November 20, 2013 6:03 PM