Matthew Watson wrote:
That's interesting! But it doesn't really answer the question about why the C# code is so much slower than the C++ code in the sample programs that I posted... Where is the overhead coming from?

Based on running this in the CLRProfiler, it looks like the naive C# implementation incurs some overhead from SecurityPermissions. Dropping down to unsafe/fixed seems to remove that overhead. Without it, the remaining performance difference is a factor of about 3 (release build).
Based on the debug build disassembly (below), I'd expect similar performance from the two debug builds, and that is what I observed. The meat of the loop, the array access and multiplication, disassembles almost identically.
The addresses in the C# disassembly are pseudo-memory addresses (the start of the app's memory space is location zero), which may account for the slight performance difference. Also, the C++ compiler is more mature than the C# compiler, so the release build may produce somewhat more optimized code.
C# post-jit disassembly:
for (int index = 0; index < size; index++)
000000b6 mov dword ptr [ebp-40h],0
000000bd nop // unnecessary instruction as far as I can tell
000000be jmp 000000DB
value += (*pArr2++) * (*pArr1++);
000000c0 mov edi,dword ptr [ebp-3Ch] // body of loop starts here
000000c3 add dword ptr [ebp-3Ch],4
000000c7 mov esi,dword ptr [ebp-38h]
000000ca add dword ptr [ebp-38h],4
000000ce fld dword ptr [edi]
000000d0 fmul dword ptr [esi]
000000d2 fadd qword ptr [ebp-14h]
000000d5 fstp qword ptr [ebp-14h] // stores the updated sum back to value; body of loop ends here
for (int index = 0; index < size; index++)
000000d8 inc dword ptr [ebp-40h]
000000db mov eax,dword ptr [ebp-40h]
000000de cmp eax,dword ptr ds:[00A15198h]
000000e4 jl 000000C0 // repeat the loop as long as index < size
C++ disassembly:
for (int index = 0; index < arrLen; index++)
00414052 mov dword ptr [index],0
00414059 jmp Class1::LookUp+64h (414064h)
0041405B mov eax,dword ptr [index]
0041405E add eax,1
00414061 mov dword ptr [index],eax
00414064 mov eax,dword ptr [this]
00414067 mov ecx,dword ptr [index]
0041406A cmp ecx,dword ptr [eax+8] // compare index to arrLen
0041406D jge Class1::LookUp+8Ch (41408Ch) // exit loop if index >= arrLen
{
value += arr2[index] * array[index];
0041406F mov eax,dword ptr [this] // body of loop starts here
00414072 mov ecx,dword ptr [eax+4]
00414075 mov edx,dword ptr [index]
00414078 mov eax,dword ptr [index]
0041407B mov esi,dword ptr [array]
0041407E fld dword ptr [ecx+edx*4]
00414081 fmul dword ptr [esi+eax*4]
00414084 fadd qword ptr [value]
00414087 fstp qword ptr [value] // body of loop ends here
}
0041408A jmp Class1::LookUp+5Bh (41405Bh)
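
For reference, here is a minimal C++ sketch of the two loop shapes that the listings above correspond to. It is reconstructed from the source lines embedded in the disassembly, not copied from the original sample programs: the function names dot_indexed and dot_pointer are made up for illustration, and the element types (float arrays, double accumulator) are inferred from the dword loads and qword stores in the listings.

```cpp
#include <cstddef>

// Indexed form, matching the source line in the C++ listing:
//     value += arr2[index] * array[index];
// Hypothetical free function; the original is a member of Class1.
double dot_indexed(const float* array, const float* arr2, int arrLen) {
    double value = 0.0;  // qword accumulator, as in "fadd qword ptr [value]"
    for (int index = 0; index < arrLen; index++) {
        value += arr2[index] * array[index];
    }
    return value;
}

// Pointer-walking form, matching the source line in the C# unsafe/fixed listing:
//     value += (*pArr2++) * (*pArr1++);
// Written in C++ here only so both shapes can be compared side by side.
double dot_pointer(const float* pArr1, const float* pArr2, int size) {
    double value = 0.0;
    for (int index = 0; index < size; index++) {
        value += (*pArr2++) * (*pArr1++);
    }
    return value;
}
```

Both functions compute the same sum of products. The only difference is how the element addresses are formed (base plus index times 4, versus a pointer bumped by 4 each iteration), which is the same difference visible between the two loop bodies in the disassembly.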