System.ExecutionEngineException in a Windows Store App? - Why?

    Question

  • Hoping for some help from the powers that be or other resident experts, as I think this is one tricky problem.

    I'm getting some seemingly random System.ExecutionEngineException events in my Windows Store app. Even though the error message itself is less than useless, when I enable native code debugging there does seem to be a connection to the use of streams and possibly thread blocking. The call stack almost always contains a reference to SHCreateStreamOnFileW or similar stream-related functions (although that particular call is unsupported in WinRT, I'm assuming the kernel is calling it under the hood; I certainly am not). In my own code I do use CreateStreamOverRandomAccessStream, though, so there's an obvious possible connection there.

    I cut back on what was probably overly conservative use of EnterCriticalSection calls and that seems to have eliminated *almost* - but not all - of these insidious errors. Therefore I'm wondering if there's a connection between these errors and possible thread blocking or competing locks? They also definitely seem to occur only when disk access is sluggish. So again, I'm thinking some issue with thread blocking while waiting for the disk. But just a guess.   
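
One way to read "cutting back" on critical sections is to stop holding a lock across blocking I/O. The following is only a minimal sketch of that idea in portable C++, using std::mutex in place of EnterCriticalSection; SharedLog, slow_read, read_and_record and run_workers are hypothetical names, not anything from the original app.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical shared state guarded by a single lock.
struct SharedLog {
    std::mutex lock;
    std::vector<std::string> entries;
};

// Hypothetical stand-in for a slow, blocking disk read.
inline std::string slow_read(int id) {
    return "chunk-" + std::to_string(id);
}

// Do the blocking work with no lock held, then lock only to publish the result.
inline void read_and_record(SharedLog& log, int id) {
    std::string data = slow_read(id);             // no lock held during the "disk" wait
    std::lock_guard<std::mutex> guard(log.lock);  // released automatically at scope exit
    log.entries.push_back(std::move(data));
}

// Runs `count` worker threads and returns how many entries were recorded.
inline std::size_t run_workers(int count) {
    SharedLog log;
    std::vector<std::thread> workers;
    for (int i = 0; i < count; ++i)
        workers.emplace_back(read_and_record, std::ref(log), i);
    for (auto& w : workers) w.join();
    return log.entries.size();
}
```

Calling run_workers(8) should record eight entries regardless of how the threads interleave, because the only shared write happens under the lock.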

    What's particularly confusing is that the docs say this exception is obsolete, so I don't understand why I'd be getting it in a Windows Store app.

    Any insight would be appreciated!

    Thanks,
    Peter

    Tuesday, January 27, 2015 3:48 PM

All replies

  • Hi Peter,

    "Obsolete" does not necessarily mean the API is unavailable. It may only mean that using it is discouraged, or that a warning is displayed when it is used.

    As far as I can see from the documentation, this looks like a CLR issue rather than a coding issue; it is an internal error. There is probably something wrong with your environment. If you run the app on another machine, is the same exception thrown?

    --James



    Wednesday, January 28, 2015 2:45 AM
    Moderator
  • You know what? I was just telling one of my testers today that I've only ever had the issue on my one laptop and on no other machine, and he's never experienced it. (Outside the dev environment it just manifests as a random app crash, and he hasn't reported anything like that.) Thing is, the laptop is basically fine otherwise. Certainly no random app crashes except in my own app.

    A few other data points. It only manifests in release mode so far; I've never seen it in debug mode. (Maybe code optimization could be contributing?) I am using a fair amount of C#/C++ interop and a lot of multithreading. When the crashes do occur, it is always while the app is doing things that use the C++ modules, and specifically modules that use COM streams (IStream, which I implement in C# to wrap .NET streams) and synchronous disk read operations. But the modules are always called from worker threads. They never block the UI thread, so that shouldn't be the issue.

    Even if it is my environment, there should be a graceful way to handle the error. As I said, other apps don't just crash.

    I'm trying to think of what other relevant information I can give without getting into gory and irrelevant detail. Again, any other insights appreciated. I'll try to grab a stack history next time it happens (I have to be in native mode debugging to get any useful info). That will reveal more. Thanks so much. Peter
    Wednesday, January 28, 2015 4:59 AM
  • Upon further research, the most common report of this bug seems to be associated with hard-to-find errors in unmanaged code that are memory related and causing the CLR to think (probably correctly) that the heap has become corrupted. For example, accessing memory after deallocation, going beyond the bounds of allocation, etc. 

    Sure enough, I found a mistake in my C++ code that could theoretically be the problem. I was reporting the incorrect length of a string to a third party library - it wanted byte count, and I was giving it character count. Since I was reporting the string as too short, it's not obvious to me how it could have caused memory corruption, but wrong is wrong. Crossing my fingers that that was the problem.
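
To make the byte-count versus character-count distinction concrete, here is a small sketch. It assumes UTF-16 text (the usual wide-string encoding on Windows); the post does not say which library or encoding was involved, and char_count and byte_count are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of UTF-16 code units in the string (the "character count" in this sketch).
inline std::size_t char_count(const std::u16string& s) {
    return s.size();
}

// Number of bytes those code units occupy, excluding any terminator.
inline std::size_t byte_count(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}
```

For an API that expects a length in bytes, passing char_count(s) where byte_count(s) is wanted reports only half of the data in this encoding.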

    The joys of unmanaged code...

    Wednesday, January 28, 2015 4:16 PM
  • One more update. 

    More evidence it is in fact the machine. Getting strange out of memory errors from MS office applications, which is absurd since it's an 8GB laptop.

    And once again, if memory is getting corrupted, that ExecutionEngineException will rear its ugly head.

    Still wish I could handle it better than just crashing. If it is happening on my machine - and I take good care of my toys - it's going to happen on a customer's.

    Wednesday, January 28, 2015 11:45 PM
  • Hi Peter,

    Thanks for the updates so far. Is it possible to handle the exception through the UnhandledException event in App.xaml.cs?

    Something like this:

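    UnhandledException += (s, e) =>
    {
        // Log e.Message / e.Exception here.
        // Setting e.Handled = true asks the runtime to keep the app running.
    };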

    --James




    Friday, January 30, 2015 8:14 PM
    Moderator
  • Execution engine exceptions usually destroy the process.

    They can be caused by invalid memory allocation, memory corruption, or overuse of the GC class.

    Friday, January 30, 2015 9:07 PM
  • Hey James,

    No, it is not caught in UnhandledException. As mcosmin correctly observes, it kills the process. I don't even get detailed exception information from Visual Studio. I just get told that I have the exception and can't do anything other than stop the process.

    New data point - it seems to almost always happen within the 1st minute of app launch. I had the app running all day being used by dozens of people on 5 computers and it never crashed except a couple of times right after launch. But still a random time after launch. It's not as though I can pinpoint code in the launch sequence that's doing it. Could be 5 seconds in; 10 seconds in; etc. 

    So I was wondering could it be something in the JIT compiler??

    Saturday, February 07, 2015 1:14 AM
  • Right, mcosmin, and I always do have unmanaged code running during any crash instance. That is really the only constant, other than that it always happens very early in the life cycle.

    Definitely not using the GC class (have really never found much use for it). Still thinking it must be in the unmanaged code or the interop.

    Saturday, February 07, 2015 1:16 AM
  • Might also be a thread race somewhere which corrupts memory. There is really no universal fix to this. Execution engine is the type of exception that is very stealthy and very hard to catch and debug.

    Even the debugger delaying execution and interfering with thread synchronization can cause it. I've seen it happen. Even a breakpoint stopping execution can do it. What usually seems to cause it is one thread getting somehow "stuck" in its execution, and another thread that is waiting for the stuck thread throwing the exception because it never gets to run (yes, I am looking at IAsyncActions there). This is basically OS-level stuff, which is very hard to track down (even CPU overheating can do it).

    This is why I said the GC class may do it. Calling GC.Collect repeatedly causes this exception with almost 100% reproducibility. If you allocate large amounts of memory, it may cause the GC to kick in even if you don't explicitly call GC.Collect.

    This exception usually doesn't happen in release mode unless there is a terrible mistake somewhere in the code. And when I say terrible, I mean something which is very hard to track down.




    • Edited by mcosmin Thursday, February 26, 2015 7:31 AM
    Wednesday, February 25, 2015 7:41 AM
  • Hey thanks for the insight. 

    Since I last posted, I concluded that the problem was in my SQLite interop. Specifically, I am using a custom SQLite VFS which uses the WinRT file i/o rather than CreateFile2. This allows me to read/write to databases anywhere the app has access to - not just the application data folder.

    After opening the SQLite database, I would use CreateStreamOverRandomAccessStream to get a COM IStream. I was then using the IStream functions in my VFS implementation to write and read into the buffers being given to me by SQLite.

    (For those who don't know, an SQLite VFS is a lot like a COM interface; you implement the I/O functions yourself and SQLite calls them. That's how SQLite is able to be ported to so many darned platforms with relative ease).
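
The "you implement the I/O functions and the library calls them" pattern can be sketched with a plain table of function pointers. This is a deliberately simplified, hypothetical illustration: mini_io_methods, memory_file, mem_read, mem_write and round_trip are made-up names, and the real sqlite3_vfs and sqlite3_io_methods structures have many more members and different signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical, simplified I/O table (NOT the real SQLite structures).
struct mini_io_methods {
    int (*read)(void* file, void* dest, std::size_t amount, std::size_t offset);
    int (*write)(void* file, const void* src, std::size_t amount, std::size_t offset);
};

// Host-side "file": just bytes in memory for this sketch.
struct memory_file {
    std::vector<unsigned char> bytes;
};

// Host implementation of read: returns 0 on success, 1 if the range is out of bounds.
inline int mem_read(void* file, void* dest, std::size_t amount, std::size_t offset) {
    auto* f = static_cast<memory_file*>(file);
    if (offset > f->bytes.size() || amount > f->bytes.size() - offset) return 1;
    if (amount == 0) return 0;
    std::memcpy(dest, f->bytes.data() + offset, amount);
    return 0;
}

// Host implementation of write: grows the file as needed, returns 0 on success.
inline int mem_write(void* file, const void* src, std::size_t amount, std::size_t offset) {
    auto* f = static_cast<memory_file*>(file);
    if (amount == 0) return 0;
    if (offset + amount > f->bytes.size()) f->bytes.resize(offset + amount);
    std::memcpy(f->bytes.data() + offset, src, amount);
    return 0;
}

// "Library" side: it only ever touches the file through the table.
inline bool round_trip() {
    mini_io_methods io{mem_read, mem_write};
    memory_file f;
    const char msg[] = "hello";
    if (io.write(&f, msg, sizeof msg, 0) != 0) return false;
    char out[sizeof msg] = {};
    if (io.read(&f, out, sizeof out, 0) != 0) return false;
    return std::memcmp(msg, out, sizeof msg) == 0;
}
```

Swapping mem_read and mem_write for functions backed by another storage API changes where the bytes live without the calling side noticing, which is the property the post describes.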

    On a hunch, I stopped using IStream entirely and only used the proper Windows Runtime APIs. Unfortunately this means intermediate steps of copying to and from IBuffers. But, I have not had the error appear since making this change.
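
The intermediate copy step might look roughly like the sketch below, written in portable C++ with a std::vector standing in for the data pulled out of an IBuffer (the actual WinRT calls are not shown). copy_from_staging is a hypothetical helper. Zero-filling the unread tail on a short read follows what SQLite's xRead documentation asks of a VFS; that the original code did this is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies up to `amount` bytes from a staging buffer into the caller's buffer.
// If fewer bytes are available, the rest of `dest` is zero-filled.
// Returns the number of bytes actually copied from `staged`.
inline std::size_t copy_from_staging(const std::vector<unsigned char>& staged,
                                     void* dest, std::size_t amount) {
    const std::size_t available = std::min(staged.size(), amount);
    auto* out = static_cast<unsigned char*>(dest);
    if (available > 0) std::memcpy(out, staged.data(), available);
    if (available < amount) std::memset(out + available, 0, amount - available);
    return available;
}
```

The helper never writes more than `amount` bytes into `dest`, which is the property that matters when the destination buffer belongs to someone else.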

    My conclusion is that there is a bug somewhere in the interop between IRandomAccessStream and IStream. I've scoured my own code and it's not me.

    Keeping fingers crossed this is over!

    Wednesday, February 25, 2015 7:28 PM
  • COULD THIS HAVE BEEN IT?

    Today I noticed a line of legacy (really super-legacy) C++ code associated with my SQLite interop (I write my own SQLite VFS for WinRT, rather than utilize the default one; this way I can access databases anywhere declared in the manifest):

    sqlite3_vfs *CreateWinRTVFS()
    {
        static sqlite3_vfs WinRTvfs = { ... };
        return &WinRTvfs;
    }

    Normally I'd say the "static" there should save me, but lord only knows with Windows Runtime. If "static" didn't save me, I could absolutely see this causing memory corruption problems. 

    Needless to say I promptly changed this to use "new". 

    Could this have been it?
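
For comparison, here is a minimal sketch of the two variants in standard C++. fake_vfs is a hypothetical stand-in for sqlite3_vfs, and both function names are made up. In standard C++ a function-local static has static storage duration, so its address stays valid after the function returns; whether anything else in the original build interfered is not something this sketch can show.

```cpp
#include <cassert>

// Hypothetical stand-in for sqlite3_vfs.
struct fake_vfs {
    int version;
    const char* name;
};

// Function-local static: initialized once and alive until the program exits,
// so returning its address is valid (it is not a stack object).
inline fake_vfs* create_static_vfs() {
    static fake_vfs vfs = {1, "demo"};
    return &vfs;
}

// Heap-allocated alternative: a new object on every call, owned by the caller.
inline fake_vfs* create_heap_vfs() {
    return new fake_vfs{1, "demo"};
}
```

Every call to create_static_vfs returns the same address, while each create_heap_vfs call returns a distinct object that has to be deleted (or intentionally kept for the life of the process).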

    Friday, February 27, 2015 1:46 AM
  • Maybe. If it fixed your problem and didn't cause other problems, then that was it. As I said, this thing is stealthier than a stealth plane.
    Friday, February 27, 2015 11:12 AM