.NET Thread Pool spawning a lot of threads

  • Question

  • I have a .NET 4.5 application running on a VM with a single-core 2.2 GHz CPU. In this application I spawn a large number of TPL tasks, about 3 million, all at once (code snippet below).

    foreach (var ip in IPs) // The IPs collection has about 3 million entries.
    {
        Task.Factory.StartNew(Connect, new TaskArguments
        {
            IP = ip,
            TimeOutInMilliseconds = 3000
        });
    }

    Each task does some CPU-bound work and some I/O, and on average it takes about 2-3 seconds for one task to finish.

    I have read that the more threads a single CPU has to manage, the more time it spends on context switching, which is not good. I have also read that the .NET thread pool/TPL takes these factors into account when deciding how many threads to spawn. Hence my choice to use the ThreadPool here.

    Consistent with this behavior, when my application starts it spawns about 15-20 threads and stays at that level for some time. CPU utilization goes to about 80-90%. All good so far. But when I check the thread count after 15+ hours of execution, it is over 1000. I am wondering why the thread count is so high now. Because of this, I am seeing several problems, summarized below:

    1. The computer is much less responsive, possibly because it has so many threads running.
    2. The memory size of my application has grown to about 400 MB.
    3. My application seems to be hung. It is not making progress, and it is not releasing any threads or memory.
    4. The CPU utilization of my application is down to about 0-2%. With so many threads and so little CPU utilization, this is undesirable.

    Kindly let me know what the implications are of spawning 3 million tasks at once, and whether I should look for an alternative design instead. I also need to understand what might be stopping the .NET thread pool from releasing these threads.

    Friday, May 9, 2014 8:22 PM

All replies

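  • Parallel.ForEach(IPs, ip =>
    {
        // Call the same Connect method with the same arguments as the original loop.
        Connect(new TaskArguments
        {
            IP = ip,
            TimeOutInMilliseconds = 3000
        });
    });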

    This will manage your threads better. Please try this.

    Your code creates every Task up front, regardless of how many are already queued. The example above does not start more work until resources are available.
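
    As a rough sketch of how that limit could also be set explicitly: ParallelOptions.MaxDegreeOfParallelism is an existing TPL setting, while the BoundedScan class, its Run method, and the peak counter are hypothetical names added here only for illustration. The counter is there to show that the number of concurrently running bodies stays within the configured limit.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedScan
{
    // Hypothetical helper: runs `work` for every item with at most
    // `maxParallel` bodies executing at once. Returns the peak
    // concurrency that was actually observed.
    public static int Run<T>(IEnumerable<T> items, int maxParallel, Action<T> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int current = 0, peak = 0;
        object gate = new object();

        // Blocks the calling thread until every item has been processed.
        Parallel.ForEach(items, options, item =>
        {
            lock (gate) { current++; if (current > peak) peak = current; }
            try { work(item); }
            finally { lock (gate) { current--; } }
        });
        return peak;
    }
}
```

    In the question's scenario the call might look like BoundedScan.Run(IPs, 16, ip => Connect(new TaskArguments { IP = ip, TimeOutInMilliseconds = 3000 })), where 16 is an arbitrary value that would need tuning.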

    As a side note, if you are running on a single core, running CPU-bound work in parallel gains nothing, and it will generally perform worse than running it sequentially because of the extra context switching. Parallel code really gets its advantage from running on multiple cores.

    Alternatives would be to run sequentially, or get more cores on your machine to run in parallel.
    Friday, May 9, 2014 8:38 PM
  • The ThreadPool is designed for tasks that take less than 1/2 second. If a work item has been in the queue for more than 1/2 second, the ThreadPool will create another thread.

    That's not to say you should never put a work item into the ThreadPool queue that will take more than 1/2 second; just realize that you're outside of the primary scenario when doing so.
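
    To watch this behavior while the application runs, something like the following could be logged periodically. ThreadPool.GetMinThreads, GetMaxThreads and GetAvailableThreads are existing APIs; the PoolInfo class and its Describe method are hypothetical names for this sketch.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Hypothetical helper: returns a one-line summary of the pool's
    // worker-thread limits and how many worker threads are in use.
    public static string Describe()
    {
        int minWorker, minIo, maxWorker, maxIo, freeWorker, freeIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);

        // "In use" is the configured maximum minus the currently available count.
        return string.Format("worker threads: min={0}, max={1}, in use={2}",
            minWorker, maxWorker, maxWorker - freeWorker);
    }
}
```

    A steadily rising "in use" figure alongside low CPU utilization would suggest that work items are blocked waiting rather than computing.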

    It looks like your code needs some sort of semaphore to limit the number of outstanding jobs at any given time.
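
    A minimal sketch of that idea, assuming SemaphoreSlim is an acceptable kind of semaphore here. SemaphoreSlim.WaitAsync and Task.Run exist in .NET 4.5; the Throttle class, the RunAsync method, and the maxOutstanding parameter are hypothetical names for illustration, not something from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Hypothetical helper: starts one task per item, but waits for a free
    // slot before queuing each one, so at most `maxOutstanding` tasks
    // exist at any moment instead of millions being queued up front.
    public static async Task RunAsync<T>(IEnumerable<T> items, int maxOutstanding, Action<T> work)
    {
        using (var slots = new SemaphoreSlim(maxOutstanding))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                await slots.WaitAsync();          // wait without blocking a thread
                var captured = item;
                running.Add(Task.Run(() =>
                {
                    try { work(captured); }
                    finally { slots.Release(); }  // free the slot even on failure
                }));

                // Drop tasks that finished successfully so the list stays small;
                // faulted tasks are kept so WhenAll can surface their exceptions.
                running.RemoveAll(t => t.Status == TaskStatus.RanToCompletion);
            }
            await Task.WhenAll(running);
        }
    }
}
```

    The method returns a Task, so the caller can continue with other work and await or check the result later. The right value for maxOutstanding depends on how much of each job is I/O wait and would have to be found by measurement.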

    Monday, May 12, 2014 8:02 PM
  • Thanks, Idaho. Parallel.ForEach() is a blocking call: it does not return until it has executed all elements. So I won't be able to use it.


    Shadab B

    Thursday, May 15, 2014 9:49 PM
  • @Shadab B

    If you don't want a blocking call, then I would highly suggest looking into async operations. You could pass chunks of the list to an asynchronous method that then invokes Parallel.ForEach. This keeps the blocking call off the main thread.
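
    One way this suggestion might look, as a sketch only: Task.Run and Parallel.ForEach are existing APIs, while the BackgroundScan class, the RunAsync method, and the chunkSize parameter are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BackgroundScan
{
    // Hypothetical helper: processes `items` chunk by chunk on a background
    // task. Parallel.ForEach still blocks, but it blocks that background
    // task rather than the thread that called RunAsync.
    public static Task RunAsync<T>(IList<T> items, int chunkSize, Action<T> work)
    {
        return Task.Run(() =>
        {
            for (int start = 0; start < items.Count; start += chunkSize)
            {
                // Copy the next slice of the list into its own chunk.
                int count = Math.Min(chunkSize, items.Count - start);
                var chunk = new List<T>(count);
                for (int i = 0; i < count; i++)
                {
                    chunk.Add(items[start + i]);
                }

                // Returns only after every element of this chunk is done.
                Parallel.ForEach(chunk, work);
            }
        });
    }
}
```

    The caller gets a Task back immediately and can await it or attach a continuation. Note that this still ties up one pool thread for the whole run, because the Task.Run body blocks inside Parallel.ForEach.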


    Monday, June 2, 2014 10:05 PM