agent::start() does not schedule ::run() method
-
Thursday, October 20, 2011 7:53 PM
I have been tearing my hair out to the point of baldness for the past 2 days over this. What would cause an agent to NOT get scheduled after a call to start()?
I have a very plain vanilla agent-based class that performs some activity in run() and exits. From external code I do an equally plain vanilla myagentclass->start(). The run() method is subsequently **never** called. This happens after I create/start/stop/destroy a few other instances of the same agent class. No matter how many instances I create, why should it matter? Why in God's name would the scheduler not honor my start() call?
If I debug the start() method, it does execute the line:
CurrentScheduler::ScheduleTask(proc, this);
After that, nothing!
What am I missing?
UPDATE (a few mins later):
===================
This is incredibly insane. I think I found out what the problem is but I am not able to explain it. My scenario is like this:
External Code =(starts)==> Agent1 =(starts)==> Agent2
External code starts Agent1 and Agent1 in turn starts Agent2.
Agent2, however, goes and does a ::WaitForMultipleObjects() on a few event objects as soon as it enters its run() method.
This seems to be a problem for any kind of subsequent start() call on newer instances of Agent1.
Once I unblock Agent2 from its wait state, magically all those calls that never got scheduled start executing now.
What is going on?
- Edited by Dilip_K_R Thursday, October 20, 2011 8:18 PM More findings
All replies
-
Saturday, October 29, 2011 12:42 AM Owner
Hi Dilip,
That seems strange. Is it possible to share the full code snippet, or a very narrow set of repro steps in code? Also, when the new instances are created, can you comment on the load on the CPU? Is it at 100%?
Thanks
Rahul
Rahul V. Patil
-
Saturday, October 29, 2011 4:29 PM
Dilip,
How many cores are on your machine? Also, what does a call to Concurrency::CurrentScheduler::Get()->GetNumberOfVirtualProcessors() return? The Concurrency Runtime is cooperative in nature: if the max concurrency of the scheduler is 1 and 1 task is executing, a new agent will not be started until that task blocks, yields or finishes. Blocking is accomplished by APIs like receive(), wait(), event.wait(), etc. Blocking on a Win32 event, however, is not detected by the Concurrency Runtime's scheduler.
It's likely that the virtual processor(s) of the scheduler are executing some task/agent that is preventing the scheduler from invoking run() on your agent. You could also check in the debugger whether you have any threads with FreeThreadProxy::Dispatch on the stack and let us know what they are doing.
Hope this helps.
--Geni
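The behaviour Geni describes can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyScheduler and all of its members are hypothetical names, it is single-threaded, and it is not how the Concurrency Runtime is implemented. It only shows why a wait the scheduler cannot see leaves later start() calls sitting in the queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy model of a cooperative scheduler (hypothetical, NOT the ConcRT code).
class ToyScheduler {
public:
    explicit ToyScheduler(std::size_t virtual_processors)
        : free_slots_(virtual_processors) {
        assert(virtual_processors > 0);
    }

    // Models agent::start(): the agent is queued and runs only if a
    // virtual processor is currently free.
    void start(const std::string& agent) {
        pending_.push_back(agent);
        dispatch();
    }

    // Models a running task that blocks through a runtime API, yields or
    // finishes: the scheduler notices and reuses the virtual processor.
    void block_cooperatively() {
        ++free_slots_;
        dispatch();
    }

    // Models a Win32 wait such as WaitForMultipleObjects: the scheduler is
    // not told, so the virtual processor stays occupied and nothing runs.
    void block_in_kernel() {}

    const std::vector<std::string>& started() const { return started_; }
    std::size_t pending() const { return pending_.size(); }

private:
    // Hand free virtual processors to queued agents, oldest first.
    void dispatch() {
        while (free_slots_ > 0 && !pending_.empty()) {
            started_.push_back(pending_.front());
            pending_.pop_front();
            --free_slots_;
        }
    }

    std::size_t free_slots_;
    std::deque<std::string> pending_;
    std::vector<std::string> started_;
};
```

With one virtual processor, starting Agent2, calling block_in_kernel(), and then starting another Agent1 leaves that Agent1 pending. It is dispatched only once block_cooperatively() is called, which mirrors the "magically start executing" moment in the original post.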
-
Wednesday, November 2, 2011 9:20 AM
Hi Geni and Rahul
Unfortunately I have moved on because I gave up following up on responses after a couple of days. I had to remove concurrency framework from my project and ended up standardizing on boost::thread. Geni's explanation seems to make the most sense to me. I no longer have any time to reproduce this but at least I now know what I should NOT be doing!
-
Wednesday, November 2, 2011 10:16 AM
>>Dilip,
>>How many cores are on your machine? Also, what does a call to Concurrency::CurrentScheduler::Get()->GetNumberOfVirtualProcessors() return? The Concurrency runtime is co-operative in nature - that is if the max concurrency of the scheduler is 1, and 1 task is executing, a new agent will not be started until that task blocks, yields or is finished. Blocking is accomplished by APIs like receive(), wait(), event.wait(), etc. Blocking on a Win32 event however, is not detected by the Concurrency Runtime's scheduler.
>>[...]
>>Hope this helps.
>>--Geni
Hi Geni
One quick update. Earlier I was waiting on the Concurrency library's event::wait_for_multiple method. The docs say that I am dealing with a manual reset event. But it behaved quite flakily when I used that method to make several threads wait on a group of event objects. I couldn't figure out what the problem was after a while. I therefore switched to WFMO which got me further but promptly ran into agents not starting properly because the framework doesn't like agents waiting on Win32 APIs.
It's possible I am trying to fit a square peg in a round hole -- maybe my use case is just not appropriate for the concurrency framework.
-
Wednesday, November 2, 2011 7:01 PM Owner
Hi Dilip,
One simple thing you could try is to trivially oversubscribe around the wait. For example, you would add:
Context::Oversubscribe(true);
::WaitForMultipleObjects(...);
Context::Oversubscribe(false);
This would take care of agents not starting properly when waiting on Win32 API objects. One caveat, of course, is that if you oversubscribe too many contexts at the same time (say, hundreds), you may end up hurting performance.
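One way to keep the two Oversubscribe calls paired, even if the code in between throws, is a small RAII guard. The following is a minimal sketch, not something from the runtime itself: ScopedOversubscribe, FakeContext and blocking_wait_stub are hypothetical names, and FakeContext merely stands in for Concurrency::Context so that the sketch builds without MSVC.

```cpp
#include <cassert>

// RAII guard: calls Ctx::Oversubscribe(true) on construction and
// Ctx::Oversubscribe(false) on destruction. Ctx must expose a static
// Oversubscribe(bool), as Concurrency::Context does in <concrt.h>.
template <typename Ctx>
class ScopedOversubscribe {
public:
    ScopedOversubscribe() { Ctx::Oversubscribe(true); }
    ~ScopedOversubscribe() { Ctx::Oversubscribe(false); }

    // Copying would unbalance the begin/end calls, so it is disabled.
    ScopedOversubscribe(const ScopedOversubscribe&) = delete;
    ScopedOversubscribe& operator=(const ScopedOversubscribe&) = delete;
};

// Hypothetical stand-in for Concurrency::Context: it only counts how many
// oversubscription scopes are currently open.
struct FakeContext {
    static int& depth() {
        static int d = 0;
        return d;
    }
    static void Oversubscribe(bool begin) {
        depth() += begin ? 1 : -1;
        assert(depth() >= 0);  // every "false" must match an earlier "true"
    }
};

// Stand-in for the blocking Win32 wait; it reports the oversubscription
// depth observed while "waiting".
inline int blocking_wait_stub() { return FakeContext::depth(); }

inline int wait_with_oversubscription() {
    ScopedOversubscribe<FakeContext> guard;  // Oversubscribe(true)
    return blocking_wait_stub();             // runs while oversubscribed
}                                            // guard destructor: Oversubscribe(false)
```

In a real ConcRT build the guard would presumably be instantiated as ScopedOversubscribe<Concurrency::Context> and the stub replaced by the actual ::WaitForMultipleObjects call. The caveat about oversubscribing hundreds of contexts at once applies unchanged.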
Still, we are curious about what the wait_for_multiple problem is. When you say flaky, do you mean that certain tasks are not picked up predictably? How many agents are you creating? How many processors are on your box?
If I were to speculate that you have a 4-core box and 200 long-running agents, then you are likely going to see some of the issues you described (as Geni explained above). Here is another simple thing for you to try: call Context::Yield() before waiting. This will make it more predictable that all agents start, but you may need additional mechanisms depending on your workload.
Context::Yield();
event::wait_for_multiple(...);
Do let us know if either of these worked, along with how many agents and cores you are using.
Thanks!
Rahul V. Patil
- Proposed as answer by Rahul V. Patil (Moderator) Wednesday, January 4, 2012 6:06 PM
- Marked as answer by DanielMoth (Microsoft Employee, Owner) Saturday, June 23, 2012 6:21 AM
-
Friday, November 11, 2011 2:44 PM
Oversubscription would definitely help the topic starter.
We had a similar issue in our program. Moreover, since we defaulted to UMS threads scheduler and developed the product on 64-bit Windows 7, we never faced the problem during development (this scheduler is capable of detecting waits on kernel resources). Finally, when we started to test the program on 32-bit XP, the problem started to appear and was fixed by oversubscription.
Since MS is abandoning the UMS scheduler in the next release, and it is only supported on 64-bit Windows 7, I suggest everyone test their code with the standard scheduler and on single-core machines (or VMs) to find all such cases.
Too bad the Concurrency Runtime does not respect process affinity (it seems this has been changed for Windows 8?), so an actual single-core VM or real OS (or a special bootcfg option) must be used for testing.
-
Friday, November 11, 2011 6:30 PM Owner
Hi Barfy,
>>RE: Too bad Concurrency Runtime does not respect the process affinity (seems like it has been changed for Windows 8?), so actual single core VM or real OS (or special bootcfg option) must be used for testing
With Visual Studio 11 preview, the runtime now respects process affinity (as long as the affinity is set prior to any use of the runtime) - this will work on Vista, Windows 7, Windows "8" and Windows 2008 + servers. I think you had provided the feedback in the past.
Thanks
Rahul
Rahul V. Patil

