How to implement a for loop with a step in C++ AMP
-
July 5, 2012, 1:00
Hi.
I've succeeded in implementing a for loop in C++ AMP.
But what if the for loop has a step of 2 or more?
e.g. for (int i = 5; i < 100; i += 2) { (job) }
For now, I implement this case like this:
parallel_for_each(extent, [=](index<1> idx) restrict(amp)
{
    int i = idx[0];
    if (i >= 5 && i % 2 == 1)
    {
        (job)
    }
});
I think this code is a bit awkward.
Is there a more elegant way to write it, or is this the only way?
Thanks for reading.
All replies
-
July 5, 2012, 1:44
This is a common indexing problem, where the parallel_for_each idx represents a single GPU thread but the computation may not use the same thread-like index.
I usually try to think of the extent as the number of worker threads I want to launch and then map the idx from the lambda to my problem's data indexes.
For the problem above you are 'wasting' the first 5 threads and every other one after that. That's not a very efficient use of GPU threads. What I would do is pre-compute the number of threads you will actually need.
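To put a number on the waste described above, here is a small plain C++ sketch (no C++ AMP required) that counts how many of the launched thread indices would actually pass the guard. The helper name useful_threads is hypothetical, and the guard is written as (i - start) % step == 0, which for start = 5 and step = 2 selects the same indices as the original i % 2 == 1 check.

```cpp
#include <cstddef>

// Hypothetical helper: counts how many of `launched` thread indices
// would do real work under the guard used in the question.
// Assumes step > 0.
std::size_t useful_threads(int launched, int start, int step) {
    std::size_t useful = 0;
    for (int i = 0; i < launched; ++i) {
        // Same selection as `i >= 5 && i % 2 == 1` when start = 5, step = 2.
        if (i >= start && (i - start) % step == 0) {
            ++useful;
        }
    }
    return useful;
}
```

Calling useful_threads(100, 5, 2) reports that fewer than half of the 100 launched threads run the job; the rest only evaluate the condition and exit.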
Your loop above executes 48 times (i = 5, 7, ..., 99), which is (100 - 5) / 2 rounded up.
Consider the following for loop as an equivalent:
int num = 100;
int skip = 5;
int step = 2;
int num_iterations = (num - skip + step - 1) / step; // rounds up, so i = 99 is included
for (int idx = 0; idx < num_iterations; idx++) {
    int i = skip + (idx * step);
    (job)
}
And therefore in C++ AMP you can do:
parallel_for_each(extent<1>(num_iterations), [=](index<1> idx) restrict(amp) {
    int i = skip + (idx[0] * step);
    (job)
});
This has the benefit of making sure you don't waste GPU threads.
Let me know if this makes sense.
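As a sanity check on the index mapping, the following plain C++ sketch (it does not use C++ AMP) compares the values of i visited by the serial strided loop with those produced by a dense index idx mapped through start + idx * step. The function names are hypothetical. One detail worth noting: the iteration count has to round up, because plain integer division (100 - 5) / 2 gives 47 and would drop the last iteration, i = 99.

```cpp
#include <vector>

// Number of iterations of: for (int i = start; i < end; i += step).
// Assumes step > 0. Rounds up so a partial final stride still counts.
int strided_count(int start, int end, int step) {
    if (end <= start) return 0;
    return (end - start + step - 1) / step;
}

// Values of i visited by the original serial loop.
std::vector<int> serial_indices(int start, int end, int step) {
    std::vector<int> out;
    for (int i = start; i < end; i += step) out.push_back(i);
    return out;
}

// Values of i produced by a dense index idx in [0, count),
// which is what each GPU thread would compute from idx[0].
std::vector<int> mapped_indices(int start, int end, int step) {
    std::vector<int> out;
    const int count = strided_count(start, end, step);
    for (int idx = 0; idx < count; ++idx) out.push_back(start + idx * step);
    return out;
}
```

If serial_indices(5, 100, 2) and mapped_indices(5, 100, 2) compare equal, the dense extent covers exactly the same work as the original loop.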
- Proposed as answer by Zhu, Weirong, July 5, 2012, 18:27
- Marked as answer by HotInCool, July 5, 2012, 23:08
-
July 5, 2012, 1:45
If you don't use "i" inside your job, you can simply run 48 consecutive iterations (95 / 2, rounded up) as-is:
extent<1> num((100 - 5 + 1) / 2);
If you do use "i", iterate over [0, 48) and calculate "i" from the index:
int i = 5 + (idx[0] * 2);
- Proposed as answer by Zhu, Weirong, July 5, 2012, 18:27
- Marked as answer by HotInCool, July 5, 2012, 23:08
-
July 5, 2012, 16:58
Thank you, JoeM and Ethatron!
Wow, perfect answers. Yes, it makes a lot of sense.
I have tried your code and it works as well as mine.
There's no performance difference between my code and your code.
I thought my code would be slightly slower, but it was about the same.
Anyway, I'll use your code because it looks cooler :)
-
July 5, 2012, 23:08
"There's no performance difference between my code and your code."
I'm so sorry, I was completely wrong. I was looking at the wrong result.
Yes, my code is 2 times slower.
Thanks for the good responses.
I feel like such a donkey...

