I recently ran a test in a simple OpenMP-based application, and OpenMP couldn't create more than 64 threads.
Also, please follow the discussion at http://software.intel.com/en-us/forums/showthread.php?t=103375 if interested.
Here is the test code:
#include <stdio.h>
#include <omp.h>

int main( void )
{
    int iShowNumOfThreads = 1;   // shared flag, unsynchronized as in the original test
    omp_set_num_threads( 1024 );
    #pragma omp parallel num_threads( 1024 )
    {
        if( iShowNumOfThreads == 1 ) {
            iShowNumOfThreads = 0;
            printf( "Number of threads created: %d\n", omp_get_num_threads() );
        }
        for( int i = 0; i < 16777216; i++ ) {   // busy work in every thread
            double dA = ( 2 * 4 * 8 * 16 );
        }
    }
    printf( "Done\n" );
    return 0;
}
How can I create as many OpenMP threads as possible? For example, more than 32,768?
- Edited by S. Kostrov, March 6, 2012 2:05
Hi Sergey, the 64-thread limit is a Windows OS limitation, not a limitation of OpenMP per se. In particular, the OS calls to WaitAll and WaitOne have a 64-thread limit. Just tested on Windows 7, and I know this was true for Windows Server 2008 and earlier...
The Intel OpenMP library allows up to 32,768 threads to be created in a parallel region.
Did you follow the link I provided? Please take a look. As soon as I applied a "hack" in the VS Debugger, the Microsoft OpenMP library ( vcompd.dll ) was able to create more than 1,024 threads.
Also, where did you see a limitation for '...OS calls to WaitAll and WaitOne...'? What Win32 API functions are you talking about? Could you give me exact names, please?
- Edited by S. Kostrov, March 7, 2012 7:34
Sorry, I was mixing up my .NET and native code :-) The function I'm referring to with the limitation is WaitForMultipleObjects, documented here. The question of whether MAXIMUM_WAIT_OBJECTS is really 64 threads is discussed on stackoverflow.com. My understanding is that the OpenMP implementation on Windows relies upon WaitForMultipleObjects to do its fork/join efficiently, hence the 64-thread limitation. I took a quick look at your "hack" but don't really understand it, or at least I don't understand the point. If you can make it work by hacking the DLLs through the debugger, go for it :-)
There are ways around the 64-thread limitation by building a tree of WaitForMultipleObjects calls, but the key is whether Microsoft's OpenMP implementation will go this route or not. My guess is not. OpenMP is moving forward and is now at version 3.0, while support in Visual C++ remains at 2.0. MSFT has been putting their energy into the new concurrency features in C++11, PPL, and AMP, and simply may not have the cycles to also work on OpenMP 3.0.
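To make the workaround idea concrete, here is a minimal sketch of waiting on more than 64 handles by splitting them into batches. It is not how any OpenMP runtime is known to do it. It uses a flat sequence of batches rather than a tree, which is enough for a join where every thread has to finish anyway. To keep the sketch portable, the actual wait is passed in as a callback: the names wait_all_fn, wait_all_batched, MAX_WAIT_BATCH and fake_wait_all are all hypothetical, and on Windows the callback would presumably wrap WaitForMultipleObjects( count, handles, TRUE, INFINITE ).

```c
#include <stddef.h>

/* Assumed to mirror MAXIMUM_WAIT_OBJECTS (64) on Windows. */
#define MAX_WAIT_BATCH 64

/* Hypothetical callback: blocks until all `count` handles are signaled and
   returns nonzero on success. On Windows it could wrap
   WaitForMultipleObjects( count, handles, TRUE, INFINITE ). */
typedef int ( *wait_all_fn )( size_t count, void *const *handles );

/* Joins `count` handles by waiting on consecutive batches of at most 64.
   Returns 1 if every batch was waited on successfully, 0 otherwise. */
int wait_all_batched( void *const *handles, size_t count, wait_all_fn wait_all )
{
    while( count > 0 )
    {
        size_t batch = ( count < MAX_WAIT_BATCH ) ? count : MAX_WAIT_BATCH;
        if( !wait_all( batch, handles ) )
            return 0;
        handles += batch;   /* move on to the next group of handles */
        count -= batch;
    }
    return 1;
}

/* Stand-in for the real wait, used only to show the call pattern:
   it records how many times it was called and the largest batch seen. */
static size_t g_calls = 0;
static size_t g_largest_batch = 0;

static int fake_wait_all( size_t count, void *const *handles )
{
    ( void )handles;
    g_calls++;
    if( count > g_largest_batch )
        g_largest_batch = count;
    return 1;
}
```

For 130 handles this makes three wait calls (64, 64 and 2 handles). A tree of waits, as suggested in the post, would be needed instead if the goal were to wake up as soon as any one of the handles is signaled.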
Thank you, Joe! I'm currently waiting for a response from the Visual Studio team. Let's see what they say.
- Edited by S. Kostrov, March 8, 2012 5:34
In short, yes, you are right about the limitation.
Here is the reference on MSDN: http://msdn.microsoft.com/en-US/library/cz27w838(v=vs.80).aspx which says:
Number of threads: ...
In Visual C++, for a non-nested parallel region, 64 threads (the maximum) will be provided. [RP: note that even in the case of nested parallelism, total max threads for the process is 64].
OMP_NUM_THREADS environment variable: ...
In Visual C++, if value specified is zero or less, the number of threads is equal to the number of processors. If value is greater than 64, the number of threads is 64.
Rahul V. Patil
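The OMP_NUM_THREADS rule quoted from MSDN can be written down as a small function. This is only a model of the documented behavior, not code from the Visual C++ runtime; the names effective_omp_threads and VC_OMP_MAX_THREADS are made up for illustration, and the processor count is passed in by the caller rather than queried.

```c
/* Upper bound quoted from the MSDN page for Visual C++. */
#define VC_OMP_MAX_THREADS 64

/* Hypothetical helper modeling the documented OMP_NUM_THREADS rule:
   a value of zero or less gives the number of processors,
   a value above 64 gives 64, anything else is used as-is. */
int effective_omp_threads( int requested, int processors )
{
    if( requested <= 0 )
        return processors;
    if( requested > VC_OMP_MAX_THREADS )
        return VC_OMP_MAX_THREADS;
    return requested;
}
```

With the 1024 threads requested in the original test, effective_omp_threads( 1024, 8 ) evaluates to 64, which matches the observed result.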
Here are some technical details for comparison:
The maximum number of OpenMP threads for the Intel C++ compiler ( XE v12.1.3 ) on 32-bit Windows XP is 16,384.