C++ AMP bug with for loop
-
Wednesday, February 1, 2012 9:57 PM
This code, which would seem to be fine, crashes with an unresponsive-driver exception on an NVIDIA card when compiled and run in Release mode. The problem is the innocuous-looking for-loop within the kernel ("for (int k = i + 1; k < N; ++k) a[j * N + k] = j;"). The value of "j * N + k" should be fine. If I unroll the loop manually, it works fine. If I compile and run the program in Debug mode, it works fine. If I add a guard before the "a[...] = j" assignment (i.e., "if (j * N + k < N*N) ..."), then it works fine. My guess is that the generated code is wrong. (Run on Windows 7, Developer Preview of MSVC++ 11.)
BTW, are there any compiler options to look at the generated HLSL?
#include <stdio.h>
#include <stddef.h>
#include <malloc.h>
#include <amp.h>
#include <sys/timeb.h>
#include <iostream>
#include <stdlib.h>
#include <string.h>

using namespace concurrency;
using namespace std;

template <typename Kernel>
void parallel_for_each(accelerator_view & acc, int ext_size, Kernel kernel)
{
    auto krn = [=] (index<1> idx) restrict(direct3d)
    {
        kernel(idx[0]);
    };
    concurrency::parallel_for_each(acc, grid<1>(concurrency::extent<1>(ext_size)), krn);
}

void xxx(accelerator_view & acc)
{
    int w[] = { -1, -1, -1, -1, -1, -1, -1, -1, -1 };
    int N = 3;
    int * z = (int*)malloc(N*N * sizeof(int));
    array<int, 1> a(N * N, acc);
    copy(w, a);
    for (int i = 0; i < N; i++)
    {
        parallel_for_each(acc, N, [=, &a](int j) restrict(direct3d)
        {
#ifndef INLINE
            for (int k = i + 1; k < N; ++k)
                a[j * N + k] = j;
#else
            a[j * N + 1] = j;
            a[j * N + 2] = j;
#endif
        });
        acc.flush();
        copy(a, z);
        for (int p = 0; p < N*N; ++p)
            std::cout << " " << z[p];
        std::cout << "\n";
        break;
    }
}

#define MYSIZE 2000
char buffer[MYSIZE];

CHAR* wtoc(const WCHAR* Source)
{
    for (int j = 0; j < MYSIZE; ++j)
        buffer[j] = 0;
    int i = 0;
    while(Source[i] != '\0')
    {
        buffer[i] = (CHAR)Source[i];
        ++i;
        if (i > 2000)
            break;
    }
    return buffer;
}

int main()
{
    int n = 3;
#define TYPE int
    std::vector<concurrency::accelerator> accelerators = concurrency::get_accelerators();
    std::vector<concurrency::accelerator>::iterator it;
    for (it = accelerators.begin(); it != accelerators.end(); ++it)
    {
        std::cout << "has disp " << (*it).get_has_display() << "\n";
        std::cout << "mem " << (*it).get_dedicated_memory() << "\n";
        std::cout << "dev " << wtoc((*it).get_device_path().c_str()) << "\n";
        std::cout << "dev " << wtoc((*it).get_description().c_str()) << "\n";
        if (strcmp(wtoc((*it).get_description().c_str()), "Software Adapter") == 0)
            continue;
        if (strcmp(wtoc((*it).get_description().c_str()), "CPU accelerator") == 0)
            continue;
        accelerator_view acc = it->create_view();
        xxx(acc);
    }
    return 0;
}
UPDATE: The code runs fine on another machine (Windows 8, AMD Llano). Might be a driver problem?
Edited by Ken Domino, Wednesday, February 1, 2012 10:54 PM: See update.
All replies
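To back up the claim that "j * N + k" stays in range, here is a minimal CPU-only sketch of what a single kernel launch should compute. It is not C++ AMP code: the function name `reference_pass` is hypothetical, and the loop over `j` stands in for the N GPU threads. Each write is checked against the array bounds.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU stand-in for one launch of the kernel in the post.
// The outer loop over j plays the role of the N GPU threads; the inner
// loop is the for-loop from the kernel, with a bounds check on each index.
std::vector<int> reference_pass(int N, int i)
{
    std::vector<int> a(N * N, -1);            // same initial contents as w[]
    for (int j = 0; j < N; ++j)
    {
        for (int k = i + 1; k < N; ++k)
        {
            const int idx = j * N + k;
            assert(idx >= 0 && idx < N * N);  // the index never leaves the array
            a[idx] = j;
        }
    }
    return a;
}
```

With N = 3 and i = 0 (the only iteration the posted program runs before its `break`), the indices written are 1, 2, 4, 5, 7 and 8, so a correct device should print " -1 0 0 -1 1 1 -1 2 2".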
-
Thursday, February 2, 2012 1:47 AM, Owner
Hi Ken
Can you also try on the REF accelerator (Software Adapter)? That is the correctness accelerator that everything else is measured against.
If you have a piece of code that doesn't fail on REF (and you already said it doesn't fail on hardware from another vendor), then you are right that would point to a driver bug for the hardware that it is failing on (and you should ensure you have their latest public driver).
I tried your code on our latest bits, and this code still causes a TDR on NVIDIA Quadro 2000M hardware, so I am leaning towards your conclusion.
Cheers
Daniel
http://www.danielmoth.com/Blog/
Edited by DanielMoth (Microsoft Employee, Owner), Thursday, February 2, 2012 2:36 AM
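The comparison against a correctness baseline described above can be sketched as a small host-side check. This is only an illustration under assumptions: `first_mismatch` is a hypothetical helper, not part of C++ AMP, and it assumes the results from the suspect accelerator and from the baseline (REF, or a CPU computation) have already been copied into host vectors.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the position of the first element where the
// result from the accelerator under test differs from the baseline result,
// or -1 if the two buffers are identical. A length difference counts as a
// mismatch at the end of the shorter buffer.
std::ptrdiff_t first_mismatch(const std::vector<int>& got,
                              const std::vector<int>& expected)
{
    const std::size_t n = std::min(got.size(), expected.size());
    for (std::size_t p = 0; p < n; ++p)
    {
        if (got[p] != expected[p])
            return static_cast<std::ptrdiff_t>(p);
    }
    if (got.size() != expected.size())
        return static_cast<std::ptrdiff_t>(n);
    return -1;
}
```

A return value of -1 on REF and on another vendor's hardware, combined with a failure or a mismatch on one specific card, is the pattern that would point at that card's driver rather than at the program.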
-
Thursday, February 2, 2012 3:03 AM
The code runs fine on the "Software Adapter" accelerator. I updated the driver (now Win7 64-bit, version 286.19), but the problem is still there on the NVIDIA GTX 470. --Ken
-
Thursday, February 2, 2012 4:31 AM, Owner
Thanks Ken
One of my colleagues also reproed this on a GTX 480, so there is consistency here. We'll add it to our bug report list for the IHV and if you have your own mechanism for reporting these bugs to NVIDIA, please go ahead and do that too.
Thanks for reporting it to us.
Cheers
Daniel
http://www.danielmoth.com/Blog/
Marked as answer by Ken Domino, Thursday, February 2, 2012 11:48 AM

