How do I set thread affinity? (Specify a core for a thread to execute exclusively on.)

    Question

  • How do I set which CPU core a thread executes on? I need to know this for 2-16 core rigs (don't ask.)
    Tuesday, January 15, 2008 10:02 PM

Answers

  • I did some forum searching online. C# managed code doesn't seem to support the concept you are trying to implement, so you need to use interop to do the job.

    Here is a code example; it looks as if it should work, but I haven't tested it. Hope this works for you.

    using System;
    using System.Runtime.InteropServices;
    using System.Diagnostics;
    using System.Threading;

    namespace Tester
    {
        class Program
        {
            // Native (OS) thread ID of the calling thread; managed thread IDs are different.
            [DllImport("kernel32")]
            static extern int GetCurrentThreadId();

            static void Main()
            {
                Thread t = new Thread(new ThreadStart(DoWork));
                t.Start();
                t.Join();
            }

            static void DoWork()
            {
                // Find the ProcessThread matching the current native thread, then pin it.
                foreach (ProcessThread pt in Process.GetCurrentProcess().Threads)
                {
                    int utid = GetCurrentThreadId();
                    if (utid == pt.Id)
                    {
                        pt.ProcessorAffinity = (IntPtr)1; // bit 0 set: first CPU only
                        Console.WriteLine("Set");
                    }
                }
            }
        }
    }
    Thursday, January 17, 2008 12:56 AM

All replies

  • Do a Google search for some coding examples using the Process class.

    Process has a writeable property called ProcessorAffinity.

    You can set the affinity, I'm sure, in some fashion going that route.
     GL
    Wednesday, January 16, 2008 12:15 AM
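    The Process-level route suggested above can be sketched in a few lines. This is a minimal, untested sketch; note that the ProcessorAffinity property on the Process class restricts every thread of the process, not a single thread:

```csharp
using System;
using System.Diagnostics;

class ProcessAffinityDemo
{
    static void Main()
    {
        Process p = Process.GetCurrentProcess();

        // ProcessorAffinity is a bitmask: bit 0 = first CPU, bit 1 = second, etc.
        // 0x3 restricts all threads of this process to the first two CPUs.
        p.ProcessorAffinity = (IntPtr)0x3;

        Console.WriteLine("Process affinity mask: {0}", p.ProcessorAffinity);
    }
}
```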
  • I am looking at this http://msdn2.microsoft.com/en-us/library/system.threading.thread.beginthreadaffinity.aspx, however I cannot seem to find how to actually set the core in use with it. It only shows how to request security permissions.
    Wednesday, January 16, 2008 9:09 PM
  • Thank you, I had also been searching tonight and came upon information concerning Kernel32.dll. I think that info was outdated, though; this seems to be up to date. Thank you again, I will get to reverse engineering and implementing it, and tell you how it goes when I'm done.



    It works PERFECTLY!!
    Friday, January 18, 2008 3:48 AM
  • It's not safe to set thread affinity in managed code.  The CLR is free to use different native threads for the same managed thread, which means you can end up setting affinity on a transient native thread and won't get the results you're expecting.

     

    Also, why do you think you need to set thread affinity? 

     

    Friday, January 18, 2008 4:54 PM
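    The managed-vs-native thread issue described above is exactly what Thread.BeginThreadAffinity addresses. A minimal, untested sketch of the usual pattern (note it only keeps the managed thread on one OS thread; it does not choose a CPU):

```csharp
using System.Threading;

class AffinitySensitive
{
    public static void DoWork()
    {
        // Ask the CLR not to migrate this managed thread to a different
        // native thread until EndThreadAffinity is called.
        Thread.BeginThreadAffinity();
        try
        {
            // ... code that depends on the identity of the physical OS thread,
            // e.g. native calls that keep per-thread state ...
        }
        finally
        {
            Thread.EndThreadAffinity();
        }
    }
}
```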
  • Because .NET keeps dividing up the work of a single thread between the CPU cores, thus causing a performance loss (although it makes Windows more responsive). It is more efficient to manually set which work is done on which cores if you can balance the load properly. This way the thread doesn't have to pause, switch cores, and resume over and over.
    Friday, January 18, 2008 9:19 PM
  • Windows handles assigning CPUs to a thread during a thread context switch, not .NET. It has a very complex algorithm to decide which threads from which processes get which CPU.  It can do this because it has intimate knowledge of all the processes and all their threads.  You can't know any information about other processes, so changing the affinity of a thread based solely on your process's criteria will likely cause adverse effects in the way the Windows scheduler schedules threads, causing an entire system slowdown, not just your application.

     

    Friday, January 18, 2008 9:31 PM
  • Unless my script-execution-time measurement is incorrect, the program operates more efficiently with 1 thread per core.

    In case it makes a difference, I'm using an Athlon X2 5200+ (3 GHz).
    Friday, January 18, 2008 9:34 PM
  • Of course it does, but that's at the expense of all other processes running in the system.  Prior to changing the affinity, Windows had to schedule a dozen or more processes with many threads each between two cores.  It decided that something else in the system had a higher priority (like a driver or a system component) and gave it time instead of one of your threads.  Only the Windows scheduler can know that process X's thread Y is more important than your process's thread Z; when you override that by setting thread affinity arbitrarily (simply to steal the CPU from other applications), you affect all other processes in the system.  Worst case, this means a driver can't get the CPU to process data and data loss occurs.

     

    It's recommended that you simply let Windows do the scheduling.

    Friday, January 18, 2008 9:47 PM
  • A few points I need to make. This program won't be executing at the same time as others that require a lot of power. The thread priority is normal, so it won't steal time from high-priority processes. Your logic is incorrect on it stealing time from a driver, because that is a process that couldn't be interrupted by a thread of normal priority.
    Friday, January 18, 2008 9:51 PM
  • The opposite is true too.  If you say your thread X can only run on CPU 1 (which is what affinity is), that means Windows can't let your thread run until that CPU is available.  If a real-time thread has the CPU, it could be quite a while before your thread is given a time slice.

     

    Friday, January 18, 2008 9:58 PM
  • We have already confirmed (and beta testers have confirmed) that this program runs better this way, meaning that is also incorrect. This is why kernel32 has these functions: some applications just run better with a thread dedicated to 1 CPU. Microsoft wouldn't give this DLL this ability if it didn't yield better results in certain situations.
    Friday, January 18, 2008 10:03 PM
  •  Lumina wrote:
    A few points I need to make. This program won't be executing at the same time as others that require a lot of power. The thread priority is normal, so it won't steal time from high-priority processes
    If your program is running faster than it was before, it must be stealing CPU from other processes; Windows will give your thread the CPU if nothing else is using it.  If your thread is running on CPU 1, for example, Windows can't just stop it.  It only switches thread contexts on every quantum (varies depending on version, but 20-40 milliseconds, I believe).  If you force your threads onto CPU 1 and CPU 2, Windows is forced to run your threads on those CPUs on the next quantum, meaning no other thread can get a CPU for the duration of the quantum--thus "stealing" time from other processes.  Do you honestly think Windows isn't efficiently allocating CPU time to threads?

     

    BTW, it's not my logic, it's the logic of the people who designed and wrote Windows, I'm simply repeating it.

    Friday, January 18, 2008 10:03 PM
  • My point is that this is irrelevant. It will not interrupt anything of critical importance to the point that something malfunctions; that is why the thread priority is average. This program is a benchmarking application, and no other programs will normally be running during benchmarking. It can't interrupt high-priority programs, and even if it were managed the way Windows had automatically done it, that would mean more switching and more waiting for all the programs than before. In a normal run-of-the-mill application that would be good, yes, but this is a benchmark, not a calculator.
    Friday, January 18, 2008 10:08 PM
  • I should have been more clear: there's a "possibility" of this happening as well.

     

    See also:

    http://www.flounder.com/affinity.htm

    http://www.bluebytesoftware.com/blog/PermaLink,guid,8c2fed10-75b2-416b-aabc-c18ce8fe2ed4.aspx

    http://www.bluebytesoftware.com/blog/PermaLink,guid,f8404ab3-e3e6-4933-a5bc-b69348deedba.aspx

    http://msdn2.microsoft.com/en-us/library/ms686247.aspx:

    "Setting an affinity mask for a process or thread can result in threads receiving less processor time, as the system is restricted from running the threads on certain processors. In most cases, it is better to let the system select an available processor."

     

    Friday, January 18, 2008 10:16 PM
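    For reference, the SetThreadAffinityMask function documented at that last link can be called from C# via P/Invoke. A minimal, untested sketch (Windows-only; the mask 0x1, meaning the first CPU, is just an example value):

```csharp
using System;
using System.Runtime.InteropServices;

class NativeAffinity
{
    // Returns the previous affinity mask, or IntPtr.Zero on failure.
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr SetThreadAffinityMask(IntPtr hThread, IntPtr dwThreadAffinityMask);

    // Pseudo-handle for the calling thread; it does not need to be closed.
    [DllImport("kernel32.dll")]
    static extern IntPtr GetCurrentThread();

    static void Main()
    {
        IntPtr previous = SetThreadAffinityMask(GetCurrentThread(), (IntPtr)0x1);
        Console.WriteLine(previous == IntPtr.Zero ? "Failed" : "Pinned to first CPU");
    }
}
```

    Keep in mind the caveat raised earlier in the thread: the CLR may move managed code between native threads, so the pinned native thread is not guaranteed to stay paired with your managed thread.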
  •  Lumina wrote:
    My point is that this is irrelevant. It will not interrupt anything of critical importance to the point that something malfunctions; that is why the thread priority is average. This program is a benchmarking application, and no other programs will normally be running during benchmarking. It can't interrupt high-priority programs, and even if it were managed the way Windows had automatically done it, that would mean more switching and more waiting for all the programs than before. In a normal run-of-the-mill application that would be good, yes, but this is a benchmark, not a calculator.
    Sorry, you're wrong.  Simply read the documentation and the text from other experts in the field (which I referenced).  There are at least a dozen processes running in Windows even when you're not running other "programs".  They don't run simply so your application doesn't get CPU; they run for a specific reason.  Read the documentation for SetThreadAffinityMask; as I pointed out in my other message, it specifically states "Setting an affinity mask for a process or thread can result in threads receiving less processor time...".
    Friday, January 18, 2008 10:21 PM
  • Great answer, thank you.  Also, for scientific applications 1 thread/processor is desired because the computer will be a stand-alone one (i.e., no one will be using any other applications on it).

    Thank you, great solution and discussions.
    Sunday, October 05, 2008 8:47 PM
  • I meant, great answer from mcox05 and great question and discussion by Licht.
    Sunday, October 05, 2008 8:48 PM
  • All of you professors and bookworms out there: Licht is actually 100% right.

    Yes, the manual says that you should not manually set the processor affinity, and for the most part that is true when you are talking about dual-core or quad-core single-CPU machines or old SMP architectures.
    However, when you get into multi-CPU, multi-core Opteron or Core i7 infrastructure, the arguments simply do NOT hold true. Specific examples are quad-CPU, quad-core Opterons.
    On an SMP Xeon with its shared bus you can pretty much let the OS do its own thing. But on an Opteron or a Core i7, for certain types of memory-intensive multi-threaded applications, you have no choice but to set the processor affinity.

    WHY ?

    The answer is ccNUMA

    A thread running on a core of one CPU pays a significant price to access memory that is local to another CPU. So when a thread is created and memory is allocated for that thread's work, this memory is quite likely to be local to that CPU (unless the machine is under memory pressure). Later on, if the thread gets swapped to a different core on another CPU, it will be making cross-CPU calls to get at its memory.

    ccNUMA supersedes the old SMP (AMD Opteron and Intel Core i7). The old argument of "let the OS decide" no longer applies to high-load multi-threaded applications running on multi-CPU machines.

    On a 2-socket Opteron machine the cost of accessing non-local memory is twice as high. On a 4-socket Opteron machine the cost is between 2 and 3 times higher, depending on which processor the memory is local to with respect to the HyperTransport ring.

    In case you're wondering why to go with NUMA architectures instead of SMP: trust me, the old SMP simply cannot handle certain types of memory-intensive applications. The web portal at http://www.chillx.com maxes out a 16-core Xeon (Tigerton) box (SMP), but runs at 5% CPU on a 16-core Opteron box (ccNUMA).

    To answer Licht's question: unfortunately, managed threads are not OS threads, and they are not paired up 1-to-1 either. From what I've seen, the CLR also multiplexes managed threads against OS threads that it creates to handle the workload. Given that CLR threads are logical threads within the context of the CLR, it is not possible to use WINAPI functions to set their affinity. It is possible, of course, to lock a process to a given CPU core, but for individual threads I don't see anything.
    Thread.SetProcessorAffinity only applies to the XBOX 360, which exposes 6 hardware threads.
    However, it looks like the IHost interfaces might have something for this purpose.

    There also seems to be another approach as well:
    We could call BeginThreadAffinity and EndThreadAffinity to make the CLR logical thread stick to a physical thread for the duration between begin and end. However, that still does not give us the ability to specify which CPU core.


    Saturday, September 05, 2009 10:03 AM
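    Combining the two mechanisms this post mentions -- Thread.BeginThreadAffinity to keep the managed thread paired with one OS thread, plus a P/Invoke to SetThreadAffinityMask to choose the core -- might look like the following untested sketch (RunPinned and cpuIndex are illustrative names, not an existing API):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

static class PinnedRunner
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr SetThreadAffinityMask(IntPtr hThread, IntPtr mask);

    [DllImport("kernel32.dll")]
    static extern IntPtr GetCurrentThread();

    // cpuIndex is 0-based: 0 = first core, 1 = second core, ...
    public static void RunPinned(int cpuIndex, Action work)
    {
        Thread.BeginThreadAffinity(); // keep this managed thread on one OS thread
        try
        {
            SetThreadAffinityMask(GetCurrentThread(), (IntPtr)(1L << cpuIndex));
            work();
        }
        finally
        {
            Thread.EndThreadAffinity();
        }
    }
}
```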
  • Thanks MCox, it worked fine for me. However, I have a couple more questions for you if you don't mind.

    1. I would like to spawn threads depending on the core count. On a 2-core machine I want to spawn off only 2 threads, and on a 16-core machine 16 threads. What is the equivalent of GetProcessAffinityMask() in C#?

    2. I want to pass in a parameter for your DoWork() thread. I googled a lot on this but could not get a satisfactory answer.
    3. Lastly, if say 16 threads are searching for an answer and one of them finds it, I would like that thread to let all the other threads know, so that they can abandon whatever they are doing and return gracefully.

    thanks

    Wednesday, January 13, 2010 5:18 PM
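    On question 1, Environment.ProcessorCount gives the core count from managed code, and Process.GetCurrentProcess().ProcessorAffinity exposes the process mask without P/Invoke. On question 3, one common pattern is a shared volatile flag; a minimal, untested sketch (TryStep is a hypothetical stand-in for one unit of search work):

```csharp
using System;
using System.Threading;

class Search
{
    // volatile ensures all worker threads promptly see the updated flag.
    static volatile bool found;

    static void Worker(object id)
    {
        while (!found)
        {
            if (TryStep()) // hypothetical: does one unit of work, true on success
            {
                found = true; // signals every other worker to wind down
                Console.WriteLine("Thread {0} found the answer", id);
            }
        }
    }

    static bool TryStep() { return false; } // placeholder for the real search step
}
```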
  • I got the parameter passing working. Here is the code. Now, if some kind soul could answer the other two questions, that would be awesome.

    Edit: No kind soul replied, but I got it working anyway. Thought of sharing it here so some other poor soul like me can benefit. What this does is spawn one thread per processor (core). It automatically detects the core count and does its work. Thanks to the original author MCox; I just added value to it. I have tested it on various core counts and it works fine. Enjoy.

    using System;
    using System.Runtime.InteropServices;
    using System.Diagnostics;
    using System.Threading;

    namespace Tester
    {
        class Program
        {
            [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
            static extern int GetCurrentThreadId();
            [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
            static extern bool GetProcessAffinityMask(IntPtr currentProcess, ref Int64 lpProcessAffinityMask, ref Int64 lpSystemAffinityMask);
            [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
            public static extern int GetCurrentProcessorNumber();

            static void Main()
            {
                Int64 lpProcessAffinityMask = 0, lpSystemAffinityMask = 0 ;
                int i = 0, j = 0;
                Int64 procCount = Environment.ProcessorCount;
                Thread[] t = new Thread[procCount];

                IntPtr currentProcess = Process.GetCurrentProcess().Handle;
                Int64 currProcMask;

                GetProcessAffinityMask(currentProcess, ref lpProcessAffinityMask, ref lpSystemAffinityMask);

                Console.WriteLine("ProcessAffinityMask = {0}, SystemAffinityMask = {1}", lpProcessAffinityMask, lpSystemAffinityMask);

                for (i = 0; i < procCount; i++)
                {
                    currProcMask = lpProcessAffinityMask & (1L << i); // use the process mask, not the system mask
                    if (currProcMask != 0)
                    {
                        t[j] = new Thread(new ParameterizedThreadStart(DoWork));
                        object[] parameters = new object[] { currProcMask };
                        t[j++].Start(parameters);
                    }
                }
                for (i = 0; i < j; i++)
                {
                    t[i].Join();
                }
            }

            static void DoWork(object parameters)
            {

                object[] parameterArray = (object[])parameters;
                Int64 currProcMask = (Int64)parameterArray[0];

                foreach(ProcessThread pt in Process.GetCurrentProcess().Threads)
                {
                    int utid = GetCurrentThreadId();
                    if (utid == pt.Id)
                    {
                        pt.ProcessorAffinity = (IntPtr)currProcMask; // pin this thread to the single CPU in currProcMask
                        Thread.Sleep(10);
                        Console.WriteLine("Running on processor {0}", GetCurrentProcessorNumber());
                    }
                }

            }
        }
    }

    Wednesday, January 13, 2010 9:15 PM