I was playing with "Samples for Parallel Programming with the .NET Framework 4 (http://code.msdn.microsoft.com/ParExtSamples)" and just tested the release versions of Raytracer_CSharp and Raytracer_FSharp. The speed of both was the same in single-threaded mode and, interestingly, the memory usage of the F# version was lower than that of the C# version. But when I switched to parallel mode, the C# version was faster than the F# version. I had a look at the CPU consumption and found that the C# version was using 92% to 96% of CPU time, while the F# version was using about 82% to 87%. Any explanation for this behavior?
I looked into this and, after a bit of digging, discovered that the difference is that the Color and Vector types in the C# code are structs, but in the F# code they are classes. Every Vector or Color operation in the F# version therefore allocates a new object on the heap. This leads to more GC, which interrupts the raytracing code more often.
You can make the F# code the same by adding [<Struct>] above each of the Color and Vector type definitions.
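As a minimal sketch of what that change looks like, here is a stand-in Vector type with the attribute applied. The member names and operators below are assumptions for illustration and are not copied from the sample; the only point is where [<Struct>] goes.

```fsharp
// Hypothetical stand-in for the sample's Vector type; members are assumed.
[<Struct>]
type Vector(x: float, y: float, z: float) =
    member this.X = x
    member this.Y = y
    member this.Z = z
    // Each operator returns a new value; with [<Struct>] the result is a
    // value type, so no heap object is created for it.
    static member (+) (a: Vector, b: Vector) =
        Vector(a.X + b.X, a.Y + b.Y, a.Z + b.Z)
    static member (*) (k: float, v: Vector) =
        Vector(k * v.X, k * v.Y, k * v.Z)

// Quick check that the attribute took effect.
printfn "Vector is a value type: %b" typeof<Vector>.IsValueType
```

Without the attribute, the same definition compiles to a class, and every + or * would produce a heap allocation.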
Coincidentally, to hunt this problem down, I used the new VS2010 Concurrency Profiler, to grab traces of the two versions of the app. There was a very clear difference between the two, where the worker threads doing the work for the Parallel.For were constantly sleeping in the F# version, but were almost pure execution in the C# version. Selecting the periods of sleeping on these threads interspersed between the raytracing execution showed stack traces which always included a Vector or Color operation causing an allocation which ultimately put the thread to sleep when a GC was needed.
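If you do not have the profiler handy, a rough way to see the same effect is to count generation 0 collections around a tight loop of vector additions. This is only a sketch under assumed names (VecClass, VecStruct and gen0During are made up for this example and are not part of the sample), and the exact counts will vary by machine and runtime.

```fsharp
open System

// Hypothetical class-based vector: every (+) allocates a new heap object.
type VecClass(x: float, y: float, z: float) =
    member this.X = x
    member this.Y = y
    member this.Z = z
    static member (+) (a: VecClass, b: VecClass) =
        VecClass(a.X + b.X, a.Y + b.Y, a.Z + b.Z)

// Same shape, but a value type.
[<Struct>]
type VecStruct(x: float, y: float, z: float) =
    member this.X = x
    member this.Y = y
    member this.Z = z
    static member (+) (a: VecStruct, b: VecStruct) =
        VecStruct(a.X + b.X, a.Y + b.Y, a.Z + b.Z)

// Runs the work and reports how many gen 0 collections happened meanwhile.
let gen0During (work: unit -> float) =
    let before = GC.CollectionCount(0)
    let result = work ()
    let after = GC.CollectionCount(0)
    after - before, result

let n = 10000000

let classRun () =
    let step = VecClass(1.0, 2.0, 3.0)
    let mutable acc = VecClass(0.0, 0.0, 0.0)
    for i in 1 .. n do
        acc <- acc + step
    acc.X

let structRun () =
    let step = VecStruct(1.0, 2.0, 3.0)
    let mutable acc = VecStruct(0.0, 0.0, 0.0)
    for i in 1 .. n do
        acc <- acc + step
    acc.X

let classGcs, _ = gen0During classRun
let structGcs, _ = gen0During structRun
printfn "class version:  %d gen 0 collections" classGcs
printfn "struct version: %d gen 0 collections" structGcs
```

The class version should report noticeably more collections than the struct version. In the parallel raytracer, each of those collections is a point where the worker threads get paused.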
Proposed as answer by Josh Phillips, Friday, November 06, 2009 8:59 PM