GetThreadContext returns stale register values on WOW64

    Question

  • Hi,

    While using Boehm GC (http://www.hboehm.info/gc/) I've noticed issues on WOW64 (Windows 10 x64) that do not occur on Windows 10 x86.

    After doing some research I've found that there is a known incompatibility on WOW64: http://zachsaw.blogspot.com/2010/11/wow64-bug-getthreadcontext-may-return.html

    http://www.nynaeve.net/?p=129 details how WOW64 saves the compatibility mode (32-bit) registers while the thread is in long mode (64-bit).

    As I understand it, WOW64 GetThreadContext decides, based on the CS selector, whether to return the thread's actual context or a saved 32-bit context. The thread is already using the 64-bit CS while the 32-bit context is being saved. This opens a race condition: if SuspendThread is called before the 32-bit context has been fully saved, GetThreadContext returns a fully or partially stale context.

    The problem is that 32-bit code has no access to the 64-bit context so there is no way to notice or avoid this race condition.

    I believe that the race condition could be easily eliminated in WOW64 by either one of the following methods:

    1. The 32-bit context could be saved while still in 32-bit mode, before taking the far jump to 64-bit mode. This would ensure that the saved 32-bit context is always up to date whenever the CS selector is the 64-bit one, fixing WOW64 GetThreadContext. Malicious 32-bit code executing in WOW64 can easily make a far jump to 64-bit mode anyway, so in my opinion saving the 32-bit context in 64-bit mode provides no security improvement. This seems to me the easiest solution.
    2. WOW64 GetThreadContext could check RIP (instruction pointer) and not use the saved 32-bit context while the context saving code is still executing.
    3. Alternatively the saved 32-bit context could be marked as stale when leaving 64-bit mode and marked as up-to-date when save is completed.

    I've attached test code that reproduces the issue. I was able to reproduce it on physical and virtual machines running various versions of Windows x64. Note that the code demonstrates a stale ESP being returned, but the issue affects other registers in the CONTEXT structure too.

    Could you please comment on whether this issue is known to Microsoft and whether there is a fix planned for the issue.

    Thank you.

    Best Regards,
    Kornel

    Test code for reproducing the issue:

    #define WIN32_LEAN_AND_MEAN
    #include <SDKDDKVer.h>
    #include <stdio.h>
    #include <tchar.h>
    #include <windows.h>
    
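    // Shared between the worker thread (writer) and main (reader).
    volatile DWORD savedEsp;	// ESP just before calling SwitchToThread; 0 once the call has returned
    volatile DWORD prevEsp;	// ESP recorded after the most recent SwitchToThread call returned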
    
    void SwitchToThreadWithStack(DWORD stackSize)
    {
    	__asm
    	{
    		sub esp, stackSize;
    		mov savedEsp, esp;
    		call SwitchToThread;
    		mov savedEsp, 0;
    		mov prevEsp, esp;
    		add esp, stackSize;
    	}
    }
    
    DWORD WINAPI ThreadProc(LPVOID lpParameter)
    {
    	while (true)
    	{
    		SwitchToThreadWithStack(8);
    		SwitchToThreadWithStack(32);
    		SwitchToThreadWithStack(4);
    		SwitchToThreadWithStack(16);
    	}
    
    	return 0;
    }
    
    int main()
    {
    	HANDLE hThread = CreateThread(NULL, 0, &ThreadProc, NULL, 0, NULL);
    	if (hThread == NULL)
    	{
    		printf("ERROR: CreateThread failed.\r\n");
    		return 1;
    	}
    
    	while (true)
    	{
    		if (SuspendThread(hThread) == (DWORD)-1)
    		{
    			printf("ERROR: SuspendThread failed.\r\n");
    			return 1;
    		}
    
    		CONTEXT context;
    		context.ContextFlags = CONTEXT_CONTROL;
    		if (!GetThreadContext(hThread, &context))
    		{
    			printf("ERROR: GetThreadContext failed.\r\n");
    			return 1;
    		}
    
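    		// While savedEsp is nonzero the worker is inside SwitchToThread, so its real ESP
    		// cannot be above savedEsp. A higher reported ESP means the context is stale.
    		if (savedEsp != 0 && prevEsp != 0 && context.Esp > savedEsp)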
    		{
    			if (context.Esp <= prevEsp)
    			{
    				printf("Stale ESP\tContext ESP: 0x%x\tSaved ESP: 0x%x\tPrevious ESP: 0x%x\r\n", context.Esp, savedEsp, prevEsp);
    			}
    			else
    			{
    				printf("ERROR: Unknown ESP.\r\n");
    				return 1;
    			}
    		}
    
    		if (ResumeThread(hThread) == (DWORD)-1)
    		{
    			printf("ERROR: ResumeThread failed.\r\n");
    			return 1;
    		}
    
    		SwitchToThread();
    	}
    
    	return 0;
    }

    • Edited by Kornel Pal Tuesday, January 5, 2016 4:50 PM
    Monday, January 4, 2016 6:20 AM

All replies

  • Hi Kornel,

    I will try to find a senior engineer to help with this question.

    Thanks for your understanding. It might take a few days, so please be patient.

    --James


    Tuesday, January 5, 2016 6:19 AM
    Moderator
  • Hi James,

    Thank you for taking care of my issue.

    Please let me know if you need more information.

    Best Regards,
    Kornel

    Monday, January 18, 2016 8:47 AM
  • Hi.

    Yes, we are aware of the condition. You can overcome this issue by utilizing the CONTEXT_EXCEPTION_REQUEST and making sure the response has CONTEXT_EXCEPTION_REPORTING set and both CONTEXT_EXCEPTION_ACTIVE and CONTEXT_SERVICE_ACTIVE not set.
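
    For readers applying this, a minimal sketch of the check might look like the following. It assumes the target thread is already suspended. ContextIsReliable and GetReliableControlContext are hypothetical helper names, not Windows APIs, and the fallback flag values are the ones from winnt.h, included only so the flag test compiles without the SDK.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
typedef unsigned int DWORD;  // stand-in so the flag logic builds off-Windows
#endif

#ifndef CONTEXT_EXCEPTION_REQUEST
// Values as defined in winnt.h; used only when the SDK header does not provide them.
#define CONTEXT_EXCEPTION_ACTIVE    0x08000000u
#define CONTEXT_SERVICE_ACTIVE      0x10000000u
#define CONTEXT_EXCEPTION_REQUEST   0x40000000u
#define CONTEXT_EXCEPTION_REPORTING 0x80000000u
#endif

// Hypothetical helper: applies the rule from the reply above to the
// ContextFlags value that GetThreadContext wrote back.
bool ContextIsReliable(DWORD returnedFlags)
{
    // Without the reporting bit the system told us nothing, so do not trust it.
    if ((returnedFlags & CONTEXT_EXCEPTION_REPORTING) == 0)
        return false;

    // Both "active" bits must be clear.
    return (returnedFlags & (CONTEXT_EXCEPTION_ACTIVE | CONTEXT_SERVICE_ACTIVE)) == 0;
}

#ifdef _WIN32
// Hypothetical wrapper: hThread must already be suspended by the caller.
bool GetReliableControlContext(HANDLE hThread, CONTEXT* context)
{
    assert(context != nullptr);

    // Ask for the control registers plus the exception/service state report.
    context->ContextFlags = CONTEXT_CONTROL | CONTEXT_EXCEPTION_REQUEST;
    if (!GetThreadContext(hThread, context))
        return false;

    return ContextIsReliable(context->ContextFlags);
}
#endif
```

    In the test program from the question, this would replace the plain CONTEXT_CONTROL request. When the helper returns false, the ESP comparison would simply be skipped for that iteration.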

    Cheers.

    Pedro


    PedroT

    Monday, February 8, 2016 11:16 PM
  • Hi Pedro,

    Thank you for the follow-up; I appreciate you pointing to CONTEXT_EXCEPTION_REQUEST as a workaround.

    While it may be acceptable for a self-contained test case – like the one in my original post – to ignore invalid thread contexts, unfortunately that is not the case for a real world application that is using Boehm GC.

    As I understand it, checking CONTEXT_EXCEPTION_ACTIVE and CONTEXT_SERVICE_ACTIVE reveals when the thread is executing user-mode code in long mode, executing kernel-mode code, or waiting in kernel mode. As a performance-degrading workaround, the thread could be resumed and suspended again in the hope of capturing a fresh compatibility mode context. Of these cases, only "executing user-mode code in long mode" can yield a stale context, but the "waiting in kernel mode" case (such as WaitForMultipleObjects or Sleep) is immune to the resume-suspend round trip, which rules this out as a feasible workaround. (An alertable wait could be cancelled by an APC, but that would alter program behavior and would require all waits to be alertable.)
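
    The resume-and-suspend round trip described above could be sketched roughly as follows. This is only an illustration: CaptureWithRetry, its two callables and maxAttempts are hypothetical names. On Windows the first callable would wrap GetThreadContext with CONTEXT_EXCEPTION_REQUEST plus the flag check, and the second would wrap ResumeThread, a yield, and SuspendThread.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch. The target thread is assumed to be suspended on entry
// and is left suspended on return.
//   captureIsReliable: captures the context and returns true if it can be trusted.
//   resumeBriefly:     resumes the thread, lets it run a little, suspends it again.
bool CaptureWithRetry(const std::function<bool()>& captureIsReliable,
                      const std::function<void()>& resumeBriefly,
                      int maxAttempts)
{
    assert(maxAttempts > 0);

    for (int attempt = 0; attempt < maxAttempts; ++attempt)
    {
        if (captureIsReliable())
            return true;

        // Let the thread make progress before the next try, but do not
        // resume it again after the final failed attempt.
        if (attempt + 1 < maxAttempts)
            resumeBriefly();
    }

    // A thread blocked in a kernel wait never changes state between attempts,
    // so the loop can only give up here.
    return false;
}
```

    The bounded loop makes the limitation visible: for a thread sitting in a long wait, every attempt fails and the caller is left without a usable context.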

    A more feasible workaround would require either a way to distinguish the "executing user-mode code in long mode" case from the other cases, or a way to obtain the actual long mode thread context.

    Could you please comment on whether a fix for the race condition in saving the compatibility mode context is planned, so that WOW64 behaves compatibly with the x86 version of Windows?

    Thank you.

    Best Regards,
    Kornel

    Thursday, February 11, 2016 8:44 PM