Parallel.For memory leaks?
-
Thursday, April 03, 2008 8:53 AM
I have a cache which clones cached objects before returning them.
The sequential code I had is:
for (int i = 0; i != univers.Count; i++)
{
    T o = univers.Values[i];
    tmp[i] = clone(o);
}
To test the new parallel extensions library, I modified that code to the following:
Parallel.For(0, univers.Count, 1, i =>
{
    T o = univers.Values[i];
    tmp[i] = clone(o);
});
This is a real-world, working financial application, and we use that cache heavily.
I asked the cache 100 times for a universe of ~20000 objects. On a dual-core machine, the parallel version statistically takes 30% less time than the sequential version.
Given the very small change to the code, I was really happy with the 30% improvement, since there is some locking involved in clone(o)...
On the other hand, I was surprised to see that memory usage grows indefinitely.
I used the SciTech memory profiler, and my conclusion is that the Action (the lambda expression) passed as a parameter was kept alive by the Task object.
Since that lambda expression carries the tmp array, all my cloned objects stay alive indefinitely...
Has anyone noticed such a problem?
For information, I use version 1.0.2873.22603, which I got with the Dec07 CTP.
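The retention described above can be illustrated without the parallel library at all. The following is a minimal sketch, not the CTP's code: the class and method names are hypothetical, and an object[] stands in for the T[] tmp array. It shows that an array captured by a lambda stays reachable for as long as something references the delegate.

```csharp
using System;

static class ClosureCaptureSketch
{
    // Hypothetical helper: returns a delegate that captures a freshly allocated
    // array, plus a weak reference for observing whether the array is still alive.
    public static Action<int> MakeBody(int size, out WeakReference weakTmp)
    {
        object[] tmp = new object[size];      // stands in for the T[] tmp array
        weakTmp = new WeakReference(tmp);     // does not keep tmp alive by itself
        return i => tmp[i] = new object();    // the lambda closes over tmp
    }

    static void Main()
    {
        WeakReference weakTmp;
        Action<int> body = MakeBody(20000, out weakTmp);
        body(0);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        // The delegate is still referenced, so tmp is reachable through its closure.
        Console.WriteLine("alive while delegate is referenced: " + weakTmp.IsAlive);
        GC.KeepAlive(body);

        body = null;
        GC.Collect();
        GC.WaitForPendingFinalizers();
        // Usually the array has been collected by now; exact timing depends on
        // the runtime and on debug versus release builds.
        Console.WriteLine("alive after delegate is dropped: " + weakTmp.IsAlive);
    }
}
```

If a long-lived object such as a Task holds the delegate, the array and everything stored in it stay alive with it.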
All Replies
-
Thursday, April 03, 2008 12:45 PM
I see you referencing the tmp[] array in the action, but I don't see where it is defined. I'm assuming that you have defined it outside the scope of the loop? Wouldn't you have that same problem with a sequential loop? I'll have to write some test code to investigate.
Since you brought the topic up, here's what I can tell looking at the Dec 2007 CTP code using Reflector:
- Parallel.For() uses a self-replicating task for its work (no surprise there).
- The Task class implements IDisposable. The Dispose() method on a Task eventually gets down to disposing a manual reset handle.
- Parallel.For() doesn't seem to call Dispose() on the task it creates (hmmmm).
- System.Action<T> is just a delegate for void Method(T obj), so a dispose pattern isn't really an option there.
I'm not sure if I have the whole story, but that's what I see so far. Please correct my misunderstandings ...
Remember that this is a CTP, and things aren't fully baked yet. Don't worry too much about the memory pressure just yet. I'm sure that this will all get worked out in future CTPs and betas.
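For comparison, the "wait, then dispose" pattern that the observations above suggest is missing could look like the sketch below. It assumes the released System.Threading.Tasks API (Task.Factory.StartNew), not the CTP's Task.Create, and the class and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

static class DisposeTaskSketch
{
    // Hypothetical helper: runs the work on a task, waits for it, and disposes
    // the task so the resources it owns are released deterministically.
    public static void RunAndDispose(Action work)
    {
        // Dispose() runs when the using block exits. By then Wait() has either
        // returned or thrown, so the task is already in a completed state.
        using (Task t = Task.Factory.StartNew(work))
        {
            t.Wait();
        }
    }

    static void Main()
    {
        int[] result = new int[1];
        RunAndDispose(() => result[0] = 42);
        Console.WriteLine(result[0]);
    }
}
```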
-
-
Thursday, April 03, 2008 1:31 PM
Just to answer your question: tmp is simply declared outside the for scope:
T[] tmp = new T[univers.Count];
In sequential code, tmp goes out of scope and is collected by the GC.
In parallel code, tmp is captured by the delegate's closure. The delegate is an Action<T>, which seems to be kept alive by the Task...
I hadn't had time to use Reflector, but since you did, I tried to figure out the problem myself.
Actually, For(...) mainly calls ForWorker<>(...), which calls:
Task.Create(....).Wait();
Task.Create returns a new Task (the one referencing the Action<T>).
That Task is not disposed within the ForWorker<> method.
Will it be disposed asynchronously by someone else? That seems strange...
Anyway, the allocated Task goes out of scope just after Wait() and should be collected unless something else still references it: the TaskCoordinator?
What's strange is that my example is very 'simple'. Anyone who has tried to run extensive tests with the library will have run into this memory leak...
I'll try to make a standalone test application...
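A standalone check along these lines might look like the following sketch. It is an assumption-laden stand-in, not the actual test: it uses the released Parallel.For(int, int, Action<int>) overload rather than the CTP overload with a step argument, replaces clone(o) with a plain allocation, and all names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

static class LeakCheckSketch
{
    // Fills a fresh array in parallel, the same shape as the cache loop,
    // and returns how many slots were filled.
    public static int FillOnce(int count)
    {
        object[] tmp = new object[count];
        Parallel.For(0, count, i => { tmp[i] = new object(); });  // stand-in for clone(o)

        int filled = 0;
        foreach (object o in tmp)
        {
            if (o != null) filled++;
        }
        return filled;
    }

    static void Main()
    {
        // 100 requests for ~20000 objects, as in the original measurement.
        for (int round = 1; round <= 100; round++)
        {
            FillOnce(20000);
            if (round % 10 == 0)
            {
                // Force a full collection so only reachable objects are counted.
                long bytes = GC.GetTotalMemory(true);
                Console.WriteLine("round " + round + ": " + bytes + " bytes");
            }
        }
    }
}
```

If the reported byte count keeps climbing from round to round, something is still holding the arrays; if it stays roughly flat, they are being collected.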
Thanks,
Hicham BOUHMADI.
-
Thursday, April 03, 2008 2:52 PM Owner
Thanks, Hicham. That the Task isn't disposed by the Parallel.For loop was a bug in the CTP; we've fixed it recently. You may also be running into a set of other issues in the CTP related to runaway thread counts, which could potentially lead to unbounded memory consumption. When we release our next preview, I hope you'll see these issues disappear; if you don't, please let us know.
Regardless, I'm glad to see that you're pleased with the speedup in your example, even with these early bits.
Thanks for the report.
-
Thursday, April 03, 2008 3:04 PM
Thanks for the reply.
I am not used to seeking out new libraries and beta testing them, but I think the functional programming and parallel computing direction that software design is taking is one of the most important challenges nowadays...
So I'm waiting for the release of your library... and the many other good features you can add to it...
Hicham BOUHMADI.

