task_group - VS10 - creates large number of threads with default policy

  • Question

  • Hi,

    I use task_group to process hundreds of requests per second in parallel at peak time. Each request takes no more than 10-50 ms to process.

    What I've observed is that if I submit, say, 100 requests at once, wait for them to be processed, and repeat this 4-5 times, I end up with around 50 threads on my 8-core machine. Since I use the default scheduler, I would not expect to end up with many more threads than the machine's hardware concurrency.

    Are there things that could trigger this behavior? Note that no task waits on another task to finish; the tasks are independent.

    Thanks,

    -Ghita

    Wednesday, June 6, 2012 4:08 PM

Answers

  • Just found out that somewhere inside there was some code that performed concurrency-aware locking (reader_writer_lock) and inadvertently created a lock convoy. That was the reason PPL was creating so many threads: the dowork() function, which was supposed to be independent, was taking a concurrency-aware lock.

    Thanks for your support.

    • Marked as answer by raiderG Thursday, June 7, 2012 7:47 AM
    Thursday, June 7, 2012 7:47 AM

All replies

  • I ran the following test() function in a loop and monitored the number of threads. On my machine it was close to the number of cores (11 threads on 8 cores - the main thread, 2 background threads and 8 threads doing the actual work).

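    #include <ppl.h>   // Concurrency::task_group

    using namespace Concurrency;

    void dowork();     // the independent unit of work, defined elsewhere

    void test()
    {
        task_group tg;
        // Schedule 100 independent tasks, then block until all have finished.
        for (int i = 0; i < 100; i++) tg.run([]() { dowork(); });
        tg.wait();
    }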

    Do you have any cooperative blocking in the tasks? Thread creation is usually in response to cooperative blocking (Concurrency::critical_section, Concurrency::event etc).

    --Krishnan (Microsoft)

    Wednesday, June 6, 2012 6:21 PM
  • Thanks for the prompt response, Krishnan.

    Actually, dowork() is called through the chain task_group -> task (inside task_group) -> wait for future.

    The "wait for future" part is implemented using task_group/agents to provide std::future-like functionality (for use with VS10).

    To schedule a doWork(), I first need to schedule a task inside the task_group and then, inside that task, wait for a future (represented by the doWork() body).

    I would expect, though, that even if I must wait inside my task_group task for something external, the system would not create that many threads. Or am I wrong?

    Wednesday, June 6, 2012 6:29 PM
  • I was using something similar to the future implementation from http://msdn.microsoft.com/en-us/library/dd764564.aspx

    Also I should note that I don't do tg.wait() until my program exits, because I return the doWork() results in a callback.

    • Edited by raiderG Wednesday, June 6, 2012 6:32 PM
    Wednesday, June 6, 2012 6:31 PM
  • The "wait for future" would incur cooperative blocking, as the implementation in http://msdn.microsoft.com/en-us/library/dd764564.aspx uses task.wait. This would trigger the creation of new threads to maintain the specified level of concurrency (in this case, equal to the number of cores). The Concurrency Runtime caches the created threads so that they can be reused. Thus the number of threads you are seeing would correspond to the peak, when there were probably a lot of blocking operations.

    For the next version of VS, the Concurrency Runtime has thread throttling mechanisms to prevent explosive growth in the number of threads.

    Thank you,

    --Krishnan (Microsoft)

    Wednesday, June 6, 2012 6:51 PM
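
To make the explanation above concrete, here is a toy model in portable C++ (not PPL, and not the Concurrency Runtime's actual algorithm). It assumes that the scheduler adds one thread for each cooperatively blocked task so that the target number of threads stays runnable, and that created threads stay cached for the whole observation window. The function name peak_thread_count and the sample numbers are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model only; this is NOT the Concurrency Runtime's real algorithm.
// Assumption: the scheduler wants `target` runnable threads at all times, so
// every cooperatively blocked task costs one extra thread, and threads that
// were created are cached rather than destroyed during the observed window.
std::size_t peak_thread_count(std::size_t target,
                              const std::vector<std::size_t>& blocked_over_time)
{
    std::size_t pool = target;  // threads created so far
    for (std::size_t blocked : blocked_over_time) {
        // Grow the pool if needed; never shrink it (cached threads stay).
        pool = std::max(pool, target + blocked);
    }
    return pool;
}
```

For example, with a target of 8 and hypothetical blocked-task samples of 0, 10, 42 and 3, the model reports 50 threads: the count stays at its peak even after most tasks have unblocked.
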
  • Thanks for the clarification. I also feel that throttling makes sense, because I am waiting inside my task_group task for a CPU-intensive operation to finish, and creating additional threads for every new work item seems to create real problems in my case.

    As a workaround, I think I will have to do the processing inside my task_group task without "depending"/waiting on additional tasks. This should keep thread creation within reasonable limits.

    Wednesday, June 6, 2012 6:58 PM
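
The workaround described above could look roughly like the sketch below. It uses portable C++ with std::thread as a stand-in for task_group, so it illustrates the idea rather than showing PPL code; do_work, process_inline and on_done are hypothetical names. Each worker runs the request body directly and hands the result to a callback, so no task ever waits on another task.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the thread's dowork(): pure computation,
// no locks and no waits on other tasks.
int do_work(int request) { return request * 2; }

// Processes every request on a fixed number of worker threads. Each worker
// calls do_work() directly and passes the result to on_done. The callback
// may run on several threads at once, so it must be thread-safe.
// Returns the number of worker threads that were used.
std::size_t process_inline(const std::vector<int>& requests,
                           std::size_t worker_count,
                           const std::function<void(int)>& on_done)
{
    std::atomic<std::size_t> next(0);  // index of the next unclaimed request
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < worker_count; ++w) {
        workers.emplace_back([&]() {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= requests.size()) return;  // nothing left to claim
                on_done(do_work(requests[i]));     // result goes to the callback
            }
        });
    }
    for (std::thread& t : workers) t.join();
    return workers.size();
}
```

Because nothing inside a worker blocks, the thread count in this sketch is fixed at worker_count no matter how many requests are queued.
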
  • And another question, regarding the caching of threads: for how long does the scheduler keep the threads created at the "peak"?
    Wednesday, June 6, 2012 7:00 PM