Help: different JIT results on different machines

  • Question

  • I have code that was developed on a 32-bit XP system using VS 2008. Frameworks 1.1, 2.0 SP 1, 3.0 SP 1 and 3.5 and Windows SDKs 5.0, 6.0 and 6.0a are installed. (Prior to VS 2008, development was done with VS 2005, so none of the current setup was a clean install.) The code is marked to target Framework 3.5.

     

    The code works on this system.

     

    I move the binaries (debug build with the .exe module set for platform x86) to a second system and they fail.  After catching the error with WinDbg, it appears that the THIS pointer for a method is corrupted.  Below is the resulting JIT'd code on the XP system.

     

    Code Snippet

    public string GetResponseMessage (byte[] sessionKey, ulong responseNumber)
    {
    00000000 55                    push ebp
    00000001 8B EC                 mov ebp,esp
    00000003 57                    push edi
    00000004 56                    push esi
    00000005 53                    push ebx
    00000006 83 EC 5C              sub esp,5Ch
    00000009 33 C0                 xor eax,eax
    0000000b 89 45 B8              mov dword ptr [ebp-48h],eax
    0000000e 89 45 BC              mov dword ptr [ebp-44h],eax
    00000011 89 45 F0              mov dword ptr [ebp-10h],eax
    00000014 33 C0                 xor eax,eax
    00000016 89 45 E4              mov dword ptr [ebp-1Ch],eax
    00000019 89 55 C0              mov dword ptr [ebp-40h],edx
    0000001c 8B D9                 mov ebx,ecx
    0000001e 83 3D 38 1A C1 03 00  cmp dword ptr ds:[03C11A38h],0
    00000025 74 05                 je 0000002C
    00000027 E8 8B 86 74 75        call 757486B7
    0000002c 33 D2                 xor edx,edx
    0000002e 89 55 B4              mov dword ptr [ebp-4Ch],edx
    00000031 33 D2                 xor edx,edx
    00000033 89 55 B0              mov dword ptr [ebp-50h],edx
    00000036 33 D2                 xor edx,edx
    00000038 89 55 AC              mov dword ptr [ebp-54h],edx
    0000003b C7 45 A8 00 00 00 00  mov dword ptr [ebp-58h],0
    00000042 90                    nop
    DateTime start_time = DateTime.Now;
    00000043 8D 4D A0              lea ecx,[ebp-60h]
    00000046 E8 B5 B0 9C 74        call 749CB100
    0000004b 8D 7D B8              lea edi,[ebp-48h]
    0000004e 8D 75 A0              lea esi,[ebp-60h]
    00000051 F3 0F 7E 06           movq xmm0,mmword ptr [esi]
    00000055 66 0F D6 07           movq mmword ptr [edi],xmm0

     

     

     

    Below is the resulting JIT'd code on the second system, which is a 64-bit (not Itanium) Windows Server 2003 install.  (The code should be running in 32-bit mode.) Notice that the stack offsets and even the instructions are different.  I do not know enough x86 assembler to make a reasonable comment as to whether the server version of the code is correct, but the results of the code are not.  Could someone please comment on the correctness of the code on the server?  The server does not have Framework 1.1, but has 2.0 SP 1, 3.0 SP 1 and 3.5.  It does not have any Windows SDKs or Visual Studio installed.

     

    Code Snippet
    05f88fe8 55              push    ebp
    05f88fe9 8bec            mov     ebp,esp
    05f88feb 57              push    edi
    05f88fec 56              push    esi
    05f88fed 53              push    ebx
    05f88fee 83ec2c          sub     esp,2Ch
    05f88ff1 33c0            xor     eax,eax
    05f88ff3 8945ec          mov     dword ptr [ebp-14h],eax
    05f88ff6 8945f0          mov     dword ptr [ebp-10h],eax
    05f88ff9 8955dc          mov     dword ptr [ebp-24h],edx
    05f88ffc 8bd9            mov     ebx,ecx
    05f88ffe 833d082edf0100  cmp     dword ptr ds:[1DF2E08h],0
    05f89005 7405            je      05f8900c
    05f89007 e83bf31974      call    mscorwks!JIT_DbgIsJustMyCode (7a128347)
    05f8900c 33d2            xor     edx,edx
    05f8900e 8955d8          mov     dword ptr [ebp-28h],edx
    05f89011 33d2            xor     edx,edx
    05f89013 8955d4          mov     dword ptr [ebp-2Ch],edx
    05f89016 33d2            xor     edx,edx
    05f89018 8955d0          mov     dword ptr [ebp-30h],edx
    05f8901b c745e800000000  mov     dword ptr [ebp-18h],0
    05f89022 90              nop
    05f89023 8d4de0          lea     ecx,[ebp-20h]
    05f89026 e8651d4273      call    mscorlib_ni+0x2ead90 (793aad90) (System.DateTime.get_Now(), mdToken: 060002cf)
    05f8902b 8d7dec          lea     edi,[ebp-14h]
    05f8902e 8d75e0          lea     esi,[ebp-20h]
    05f89031 a5              movs    dword ptr es:[edi],dword ptr [esi]
    05f89032 a5              movs    dword ptr es:[edi],dword ptr [esi]

     

     

    Right after the code shown, another method is called.  That method is the one that aborts.  The stack walkback at that point indicates that the frames are not valid, and the debugger shows invalid information for the THIS pointer and the other 2 parameters to the called method.
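
    The assumption that the process really runs in 32-bit mode on the 64-bit server can be checked from inside the process itself. The following is only a minimal sketch: the class and method names are hypothetical, and it relies on nothing beyond IntPtr.Size and Environment.Version, both of which exist in Framework 2.0 and later.

```csharp
using System;

static class BitnessCheck
{
    // Maps a pointer size in bytes to a readable label.
    public static string Describe(int pointerSize)
    {
        if (pointerSize == 4) return "32-bit";
        if (pointerSize == 8) return "64-bit";
        return "unknown";
    }

    static void Main()
    {
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
        Console.WriteLine("Process: " + Describe(IntPtr.Size));
        Console.WriteLine("CLR version: " + Environment.Version);
    }
}
```

    Logging these two values at startup on both machines would show whether the x86 platform setting took effect and which runtime build is actually loaded.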

     

    Sunday, April 20, 2008 8:01 PM

Answers

  • You cannot compare JITted code between machines.  The JIT compiler customizes code generation based on the CPU architecture.  That's obvious towards the end of the dump: the XP JITter uses SSE instructions, the server JITter doesn't.  That's going to affect stack offsets too.

    I don't see any call to a method that might cause the crash.  The "this" pointer is stored in the ecx register; it's handled the same way in both snippets.  Assuming a JIT bug is probably not going to be productive; it is by far the most tested code in the framework.
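
    To illustrate the point about the end of the two dumps: both sequences copy the same 8 bytes (the DateTime returned by get_Now) from one stack slot to another. The XP code does it with a single 64-bit move through xmm0, while the server code uses two 32-bit movs instructions. The sketch below is only a rough C# analogy of those two strategies, not the JIT's actual logic, and the class and method names are made up for illustration.

```csharp
using System;

static class CopyWidthDemo
{
    // One 64-bit load and store, analogous to the movq pair in the XP listing.
    public static byte[] CopyAsOneQword(byte[] src)
    {
        byte[] dst = new byte[8];
        long whole = BitConverter.ToInt64(src, 0);
        Buffer.BlockCopy(BitConverter.GetBytes(whole), 0, dst, 0, 8);
        return dst;
    }

    // Two 32-bit loads and stores, analogous to the two movs instructions.
    public static byte[] CopyAsTwoDwords(byte[] src)
    {
        byte[] dst = new byte[8];
        int first = BitConverter.ToInt32(src, 0);
        int second = BitConverter.ToInt32(src, 4);
        Buffer.BlockCopy(BitConverter.GetBytes(first), 0, dst, 0, 4);
        Buffer.BlockCopy(BitConverter.GetBytes(second), 0, dst, 4, 4);
        return dst;
    }

    static void Main()
    {
        // Any 8-byte pattern works; DateTime.ToBinary() is used only as sample data.
        byte[] src = BitConverter.GetBytes(DateTime.Now.ToBinary());
        string a = BitConverter.ToString(CopyAsOneQword(src));
        string b = BitConverter.ToString(CopyAsTwoDwords(src));
        Console.WriteLine(a == b ? "identical" : "different");
    }
}
```

    Both methods leave the same bytes in the destination, which is why the difference in instructions by itself does not point to a miscompile.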
    Monday, April 21, 2008 11:33 AM
    Moderator

All replies

  • Thanks for your comments on the JIT'd code. I had no idea that the stack offsets would be affected by over 30 hex due to the different instructions used.

     

    As for your comments on not blaming the JITter: I understand, but I have been a professional programmer for over 30 years on 4 widely different platforms and have managed to break a compiler or two on each one, including MS VC++ at least 3 times.  Blaming the compiler is not the first place to look by any means, but let us say that experience has taught me that I at least need to keep it in the back of my mind.  (The last time, I wasted days before I started to look closely at the VC++ generated code.  It only took MS 2 levels to fix it.)

     

    Anyway, I will do some more looking and see if I can find out where THIS is going wrong.  Looks like I may have to learn a little more x86 assembler yet.

     

    Monday, April 21, 2008 1:38 PM