Problem with the timings of a program that uses 1-8 threads on a server that has 4 dual-core CPUs?

  • Question

  • Hello,

    I am running a program on a server at my university that has 4 Dual-Core AMD Opteron(tm) Processor 2210 HE CPUs; the OS is Linux version 2.6.27.25-78.2.56.fc9.x86_64. My program implements Conway's Game of Life and is parallelized with pthreads and with OpenMP. I timed the parallel part of the program with the gettimeofday() function, using 1-8 threads, but the timings don't seem right. I get the longest time with 1 thread (as expected), and then the time gets smaller. But the shortest time I get is with 4 threads.

    Here is an example using a 1000x1000 array:

    1 thread: ~9.62 sec; 2 threads: ~4.73 sec; 3 threads: ~3.64 sec; 4 threads: ~2.99 sec; 5 threads: ~4.19 sec; 6 threads: ~3.84 sec; 7 threads: ~3.34 sec; 8 threads: ~3.12 sec.

    The timings above are for the pthreads version. With OpenMP the timings are smaller but follow the same pattern.

    I expected the time to keep decreasing from 1 to 8 threads because of the 4 dual-core CPUs. I thought that because there are 4 CPUs with 2 cores each, 8 threads could run at the same time. Does it have to do with the operating system that the server runs?

    I also tested the same programs on another server that has 7 Dual-Core AMD Opteron(tm) Processor 8214 CPUs and runs Linux version 2.6.18-194.3.1.el5. There the timings are what I expected: they get smaller from 1 thread (the longest) to 8 threads (the shortest execution time).

    The program implements the Game of Life correctly with both pthreads and OpenMP; I just can't figure out why the timings look like the example I posted. So, in conclusion, my questions are:

    1) Does the number of threads that can run at the same time on a system depend on the number of cores in the CPUs? Or does it depend only on the number of CPUs, even though each CPU has more than one core? Or does it depend on all of the above plus the operating system?

    2) Does it have to do with the way I divide the 1000x1000 array among the threads? But if it did, the OpenMP code wouldn't give the same pattern of timings, would it?

    3) What could be the reason I get such timings?

    Excuse my English, I am from Europe. Thanks in advance.

     

     

    Sunday, July 25, 2010 4:43 PM

Answers

  • If I am reading your question correctly, I believe you are asking a Windows native C++ forum how to make Linux scale.

    Of course our answer would be to try Windows and Visual Studio.  :-) 

    Beyond that, you have to be careful how you partition your data, since your hardware will expose any limitations your algorithm and compiler have when it comes to false sharing of your data. You might find that you have to arrange that array carefully to work well with cache coherency, so that different threads are not repeatedly writing to the same cache lines.
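
    As a rough illustration of the partitioning point, here is a minimal pthreads sketch, not the poster's actual code. It assumes a row-major grid of bytes (0 = dead, 1 = alive) and a 64-byte cache line; the names count_live_cells, band_t, padded_count_t, CACHE_LINE and MAX_THREADS are all hypothetical. Each thread gets a contiguous band of rows and writes its result into its own padded slot, so two threads should not end up writing to the same cache line.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CACHE_LINE 64    /* assumed cache-line size in bytes */
#define MAX_THREADS 64   /* arbitrary upper bound for this sketch */

/* Per-thread result, padded so each slot fills one (assumed) cache line. */
typedef struct {
    long live;
    char pad[CACHE_LINE - sizeof(long)];
} padded_count_t;

/* Work description for one thread: a contiguous band of rows. */
typedef struct {
    const unsigned char *grid;  /* rows * cols cells, row-major */
    size_t cols;
    size_t row_begin;           /* first row of the band (inclusive) */
    size_t row_end;             /* last row of the band (exclusive) */
    padded_count_t *out;        /* this thread's private slot */
} band_t;

static void *count_band(void *arg)
{
    band_t *b = arg;
    long local = 0;  /* accumulate in a local, write the shared slot once */

    for (size_t r = b->row_begin; r < b->row_end; r++)
        for (size_t c = 0; c < b->cols; c++)
            local += b->grid[r * b->cols + c];

    b->out->live = local;
    return NULL;
}

/* Counts live cells using nthreads threads, one row band per thread. */
long count_live_cells(const unsigned char *grid, size_t rows, size_t cols,
                      int nthreads)
{
    pthread_t tid[MAX_THREADS];
    band_t band[MAX_THREADS];
    _Alignas(CACHE_LINE) padded_count_t counts[MAX_THREADS];
    long total = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (int t = 0; t < nthreads; t++) {
        band[t].grid = grid;
        band[t].cols = cols;
        band[t].row_begin = rows * (size_t)t / (size_t)nthreads;
        band[t].row_end = rows * (size_t)(t + 1) / (size_t)nthreads;
        band[t].out = &counts[t];
        int rc = pthread_create(&tid[t], NULL, count_band, &band[t]);
        assert(rc == 0);
        (void)rc;
    }

    for (int t = 0; t < nthreads; t++) {
        pthread_join(tid[t], NULL);
        total += counts[t].live;
    }
    return total;
}
```

    This would typically be compiled with something like gcc -std=c11 -pthread. A real Game of Life step also writes the next-generation array, and the same idea applies there: giving each thread whole rows keeps its writes mostly on cache lines that no other thread touches.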

    Herb Sutter has some great articles about false sharing and performance.

    Good luck.

    Dana Groff

    • Proposed as answer by Dana Groff Tuesday, July 27, 2010 2:23 AM
    • Marked as answer by Dana Groff Saturday, July 31, 2010 4:54 AM
    Tuesday, July 27, 2010 2:23 AM