WOW64 OS bug: Old 32-bit XP apps failing under Win7 WOW64

    Question

  • I have a 32-bit app which has been running under XP for years.

    Running it under Win7-x64 WOW64 however, I find that it crashes.

    After hours of debugging, I found that WOW64 clobbers the ESP value reported by GetThreadContext() while a thread is running kernel code (ESP is restored upon returning). The main thread in the app relies on GetThreadContext() to get the current stack pointer of each worker thread, and it suspends all worker threads before calling it. If the main thread just so happens to suspend a worker thread while it's running the code below (note that my 32-bit debugger can't single-step into 64-bit code)

    wow64cpu!X86SwitchTo64BitMode:
    73472320 jmp 0033:7347271E 
    

    the reported ESP changes to a value that is a higher address than the actual stack pointer. I've seen this happen for SetEvent and SwitchToThread (as these are the most frequently called kernel functions in the app).

     

    This means that either SuspendThread is suspending a thread in a way that is incompatible with native x86, or the thread's context in WOW64 is not being protected when the code jumps to translation mode. Either way, it's a bug.
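
    To make the consequence concrete: on x86 the stack grows toward lower addresses, so anything that walks a suspended thread's stack from the reported ESP up to the stack base skips whatever lies below that ESP. The following is a minimal, portable C sketch of that arithmetic only. It does not call any Windows API, and the function names and addresses are hypothetical, not taken from the app.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the stack grows downward, so the live part of a
   thread's stack is the address range [sp, stack_base). */
size_t live_stack_bytes(uintptr_t sp, uintptr_t stack_base)
{
    assert(sp <= stack_base);   /* sp can never be above the stack base */
    return (size_t)(stack_base - sp);
}

/* Bytes of live stack that a consumer misses if it trusts reported_sp
   while the thread's real stack pointer is actual_sp. A reported value
   at or below the real one hides nothing, so the result is 0. */
size_t missed_stack_bytes(uintptr_t actual_sp, uintptr_t reported_sp)
{
    return reported_sp > actual_sp ? (size_t)(reported_sp - actual_sp) : 0;
}
```

    For example, with a real stack pointer of 0x0012F000 and a reported ESP of 0x0012FF00, missed_stack_bytes() returns 0xF00: that many bytes of live stack would be invisible to the main thread.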

    Thursday, November 11, 2010 2:07 AM

All replies

  • Hi ZachSaw,

    Welcome to MSDN forums!

    I checked our internal databases and found several related known issues.  However, most of them should have been fixed.  

    If it is convenient, could you please share the 32-bit app with us for further investigation?   You can contact me directly at v-micsun@microsoft.com.   

    Have a nice weekend!

    Thanks
    Lingzhi Sun


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
    Friday, November 12, 2010 1:30 AM
  • Hi Lingzhi,


    I've sent you the 32-bit app.

    So far, from my communication with Ken Johnson (of Microsoft) aka skywing, the following hypothesis has been established (quote from his email):

    there’s an issue with get and set context operations against amd64 Wow64 threads returning bad information in circumstances when the thread is running long mode code in user mode.  This relates to us pulling the Wow64 context from the TLS slot (as described in the article) before that context structure has been updated with current contents.

    I'll send you another app tonight which is the debug build. It has the necessary traps to do a Software Interrupt (int 3) when it finds this anomaly in a suspended thread. Since the offending thread is suspended, you could perhaps then take a look at it to see if its 32-bit context is indeed stale.

    For what it's worth, the WOW64 machines I've tested on are a Core2Duo E7200 with 4GB RAM and a Core2Duo T8100 with 2GB RAM, both running Win7-x64 with the latest updates.


    Thanks,

    Zach


    Friday, November 12, 2010 3:42 AM
  • I've looked further into the issue and I can now confirm that it is a WOW64 OS bug.

    The stale contents from GetThreadContext() come from the previous system call out (a looong way up in the stack really - it's not as if it's a few instructions ago) when it should've returned contents from the *current* system call out (or to be precise, just before the call out to long mode takes place). Like Skywing said, you pulled the context before it's updated with the current contents.


    With that said, we can now conclude that it is indeed an OS bug.
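
    The behaviour described above can be illustrated with a toy model in portable C. This is only a sketch under stated assumptions: it is not how WOW64 is actually implemented, it calls no Windows API, and every name, the two-step ordering, and the addresses are hypothetical. It assumes a saved 32-bit context that is refreshed on each call-out, and a context query that reads the saved copy whenever the thread is in long mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t live_esp;     /* the thread's real 32-bit stack pointer */
    uint32_t saved_esp;    /* snapshot kept for context queries */
    bool     in_long_mode; /* true while the thread runs 64-bit code */
} ModelThread;

/* Assumed step 1 of a call-out: the thread switches to 64-bit code. */
void model_switch_to_long_mode(ModelThread *t) { t->in_long_mode = true; }

/* Assumed step 2 of a call-out: the 32-bit state is recorded. */
void model_save_context(ModelThread *t) { t->saved_esp = t->live_esp; }

/* The call-out finishes and the thread is back in 32-bit code. */
void model_return_to_32bit(ModelThread *t) { t->in_long_mode = false; }

/* Context query: uses the snapshot while the thread is in long mode. */
uint32_t model_get_context_esp(const ModelThread *t)
{
    return t->in_long_mode ? t->saved_esp : t->live_esp;
}

/* A thread suspended between step 1 and step 2 of its second call-out. */
uint32_t model_stale_read_scenario(void)
{
    ModelThread t = { 0x0012FF00u, 0u, false };

    /* First call-out, shallow in the stack, completes normally. */
    model_switch_to_long_mode(&t);
    model_save_context(&t);
    model_return_to_32bit(&t);

    /* The thread calls much deeper; the stack grows downward. */
    t.live_esp = 0x0012F000u;

    /* Second call-out: "suspended" before the snapshot is refreshed. */
    model_switch_to_long_mode(&t);
    assert(model_get_context_esp(&t) > t.live_esp); /* stale, higher address */
    return model_get_context_esp(&t);
}
```

    In this model the query returns 0x0012FF00, the value from the previous call-out, although the real stack pointer is 0x0012F000.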

     


    Saturday, November 13, 2010 5:41 AM
  • I've reported this issue as suggested to https://connect.microsoft.com/VisualStudio/feedback/details/621594/getthreadcontext-may-return-stale-contents-from-previous-call-out-to-long-mode

    Although this is not a VisualStudio problem, there's no category to submit a bug against the Windows OS.

     

    p.s. I'm pretty sure VisualStudio team is just going to close the report as "external" - they did that 2 years ago for my bug report against Vista OS. Where *DO* we report a bug against an OS?????

    Saturday, November 13, 2010 9:43 AM
  • Hi Zach,

    I also noticed that the Connect team has handled several OS-related issues before.   I will also keep an eye on our internal database to track the status of this issue.   If there are any updates, I will keep you informed.

    Have a nice day!

    Thanks
    Lingzhi Sun


    Monday, November 15, 2010 1:36 AM
  • Thanks Michael, you've been very helpful.

    In case anyone wants to find out more about this bug and follow my conversation with Alexey Pakhunov of Microsoft, feel free to take a look at my blog post - http://zachsaw.blogspot.com/2010/11/wow64-bug-getthreadcontext-may-return.html

    Monday, November 15, 2010 3:04 AM
  • :)  Actually, I think you are really helpful! 

    I believe your blog article will definitely be beneficial to other community members.

    Good day!

    Thanks
    Lingzhi Sun


    Tuesday, November 16, 2010 1:36 AM