C# application works incorrectly when optimized in a 64 bit environment

  • Question

  • I have a large and complicated C#-based .NET 2.0 application that has been running well in a variety of 32-bit environments.  I recently moved it to a 64-bit machine so we could take advantage of the much larger virtual address space.  After a few start-up glitches involving some third-party libraries the application appeared to work flawlessly.  Recently, we noticed an anomaly in a moderately-sized computation when performed on the 64-bit machine.  On all the 32-bit machines this same computation runs fine.  After some experimentation I discovered that the problem occurs with the Release build only, and not the Debug build.  Upon further investigation it appears that I can make the problem come and go by turning the compiler's optimization switch on and off.

    At a fairly high level, I know that in the failing case a list of items is missing a member.  I haven't had any success trying to debug the application to get more information.  The computation is sufficiently complicated that it might take some time to try and narrow down the problem.  I also know that the result is very sensitive to the environment, in the sense that running the same computation from a different code path (i.e., batch vs. interactive) appears to work fine.

    Of course, even if I find this one case I am now very worried that there are some serious and subtle bugs in the 64-bit optimizer.  I've done extensive web searching and failed to find any comments about any C#/CLR optimization problems.  This is scarily reminiscent of the old days where you'd have to debug your C++ code twice -- once in debug mode and then again in release mode. 

    I'd appreciate any suggestions, or even better, a pointer to a hot-fix.

    Thanks,
    --Howard
    Saturday, August 1, 2009 7:53 PM

All replies

  • Is your app multi-threaded?  My guess, without seeing an example or any of the code, is that it's not an optimization bug but rather how you have implemented the computation. Either you're running into a race condition with multi-threading, or you're using something like an empty loop for timing and it's being optimized out because it really does nothing. 

    You're going to need to provide more detail to get a detailed answer. 
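    To illustrate the two patterns I mean, here is a minimal sketch (the class and method names are made up for the example, not taken from your code). A flag shared between threads is declared volatile so the JIT cannot keep it in a register, and the wait uses Thread.Sleep rather than an empty counting loop that an optimizer is free to shorten or drop.

```csharp
using System;
using System.Threading;

public static class WorkerSignal
{
    // 'volatile' forces every read to go to memory, so an optimized build
    // cannot hoist the read out of the wait loop below.
    private static volatile bool _done;

    // Returns true if the worker's signal was observed before the timeout.
    public static bool RunAndWait(int timeoutMs)
    {
        _done = false;
        Thread worker = new Thread(delegate() { Thread.Sleep(10); _done = true; });
        worker.Start();

        // Wait with a real timing primitive instead of "for (i = 0; i < N; i++) { }".
        DateTime deadline = DateTime.UtcNow.AddMilliseconds(timeoutMs);
        while (!_done && DateTime.UtcNow < deadline)
        {
            Thread.Sleep(1);
        }

        bool observed = _done;
        worker.Join();
        return observed;
    }
}
```

    If your code has a spin-wait on a plain (non-volatile) field, or a delay implemented as an empty loop, those are the first places I would look.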

    Jerry Schulist

    Please remember to mark replies which answer your question as answers and vote for replies which are helpful.
    Saturday, August 1, 2009 10:55 PM
  • Aside from spurious multithreading issues, another possibility is mismatched pointer sizes.

    You wouldn't have any pointer manipulation code in your application or in those third-party libraries, would you?  If such code casts a pointer to int (= Int32) that would work fine on 32-bit systems but lead to occasional bugs or crashes on 64-bit systems since 64-bit pointers don't fit in an Int32 variable.  And you don't get any compiler errors or warnings about such code, either.
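    As a rough sketch of what goes wrong (the helper names are hypothetical, and a long stands in for a 64-bit address so the example runs on any platform): narrowing to Int32 keeps only the low 32 bits, so any address above 4 GB comes back as a different value.

```csharp
using System;

public static class PointerWidthDemo
{
    // Mimics legacy code that stores a pointer-sized value in an Int32.
    // The unchecked cast silently discards the upper 32 bits.
    public static int NarrowToInt32(long address)
    {
        return unchecked((int)address);
    }

    // True only when the address still has the same value after narrowing.
    public static bool SurvivesInt32RoundTrip(long address)
    {
        return (long)NarrowToInt32(address) == address;
    }
}
```

    A low address such as 0x12345678 survives the round trip, while 0x100000010 does not.  That is why such bugs tend to show up only occasionally, depending on where memory happens to be allocated.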

    In any case, it's much more likely that there's a problem in your code rather than in the 64-bit optimizer.
    Sunday, August 2, 2009 7:28 AM
  • Thanks for the suggestion.  My app is not multi-threaded at the level of the failure.  In fact, the failure is 100% reproducible in the 64-bit release version and 100% absent in the 64-bit debug, 32-bit debug and 32-bit release versions.  The third-party libraries don't come into play at this particular point in the computation -- I just mentioned them to complete the picture.

    I am not running any unsafe code or using pointers in any way.

    I know that I didn't provide enough information to solve the problem.  However, I'm hoping to hear whether any others have experienced release/debug inconsistencies, and to get suggestions for how to go about isolating the problematic code.

    Given that the problem is 100% reproducible I may be able to instrument the code to detect the failing case very early and generate some logging of related data.
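    Roughly what I have in mind is sketched below.  BuildItems is only a placeholder for one suspect step of the real computation, and CheckCount is a hypothetical helper.  The sketch also assumes MethodImplOptions.NoOptimization is available in the framework version in use; marking one method at a time with it should let me bisect which method misbehaves when optimized, without turning optimization off for the whole build.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class OptimizationBisect
{
    // Placeholder for one suspect step. NoOptimization asks the JIT not to
    // optimize this method; NoInlining keeps it from being inlined into an
    // optimized caller, where it would be optimized anyway.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static List<int> BuildItems(int count)
    {
        List<int> items = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            items.Add(i);
        }
        return items;
    }

    // Early check: logs as soon as a list is short, tagged with the stage
    // name, so the first bad stage is visible in the log.
    public static bool CheckCount(List<int> items, int expected, string stage)
    {
        if (items.Count == expected)
        {
            return true;
        }
        Console.Error.WriteLine("[{0}] expected {1} items, got {2} (IntPtr.Size={3})",
            stage, expected, items.Count, IntPtr.Size);
        return false;
    }
}
```

    If the missing list member reappears once a particular method carries the attribute, that method (or something inlined into it) is where to dig.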

    While I would almost always agree with the assertion that the problem is in my code, my gut, after 30+ years of experience, says that there might really be a deeper problem here.

    Monday, August 3, 2009 6:19 PM
  • The next step would be to isolate the problematic code down to an example that reproduces the issue 100% of the time, as you describe.  Hard-code any input to remove dependencies on databases/external sources, and remove any unnecessary code that handles other pieces of the app/computation.  Post the example when you have it and we will try to help narrow it down.
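    A skeleton for such a repro might look like the sketch below.  Everything in it is a placeholder (the input values, the Compute body, the class name); the point is the shape: fixed input, one computation, and a line recording the environment the run actually used.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class Repro
{
    // Hard-coded input replaces databases and other external sources.
    private static readonly double[] Input = { 1.0, 2.5, 4.0, 8.5 };

    // Placeholder for the trimmed-down computation under test.
    public static List<double> Compute(double[] input)
    {
        List<double> result = new List<double>();
        foreach (double x in input)
        {
            if (x > 0.0)
            {
                result.Add(x * 2.0);
            }
        }
        return result;
    }

    // Records process bitness, debugger state and CLR version for the report.
    public static string DescribeEnvironment()
    {
        return string.Format("bits={0} debugger={1} clr={2}",
            IntPtr.Size * 8, Debugger.IsAttached, Environment.Version);
    }

    // Runs the repro and returns the number of items produced.
    public static int Run()
    {
        Console.WriteLine(DescribeEnvironment());
        List<double> result = Compute(Input);
        Console.WriteLine("items={0}", result.Count);
        return result.Count;
    }
}
```

    Call Repro.Run() from a small console Main and run the Release build from the command line rather than from inside Visual Studio.  By default the debugger suppresses JIT optimization when it launches the process, which can make an optimization-dependent failure vanish while you are debugging.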
    Jerry Schulist

    Monday, August 3, 2009 6:54 PM