Do you have a retry policy in place? This is a recommended best practice as requests can and do fail periodically, especially as resources shift around within the datacenter.
That said, I'm not certain what your specific issue is here. I'd recommend you drill into the role instances and see if you can capture the exact error message that's causing the HTTP 500 response. This will help isolate whether it's the app itself, AppFabric, or some other dependency. Once we have that info, it should be a bit easier to pinpoint the issue.
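For what it's worth, here is a rough sketch of the kind of retry policy I mean: retry a failed call a few times with exponential backoff plus a little jitter. It's in Python purely for illustration. The names (`TransientError`, `call_with_retry`) and the default attempt count and delays are my own assumptions, not anything from your app or from an Azure library, so treat it as a starting point only.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. an HTTP 500)."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on TransientError, wait and try again.

    The wait doubles on each attempt (0.5s, 1s, 2s, ...) and gets up to
    50% random jitter added so that many clients don't retry in lockstep.
    The defaults are illustrative assumptions, not recommended values.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

You'd wrap whatever makes the request in a function that raises `TransientError` on a 500 and pass that function to `call_with_retry`. Only retry operations that are safe to repeat.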
You're clear on the symptom, not on the problem. The symptom is the 500 error.
What I'm wondering is whether, if you remote into the VM instance, you can find anything in the logs (either event or IIS) that indicates what the failure is. You might even find that the instance(s) aren't getting the initial request at all, which would point to something within the Azure Fabric or the internet connection as the cause.
A 500 error is fairly generic and could be caused by a million things (I've even seen browsers cache it on rare occasions). So without knowing more about the details behind the 500 error, I'm not sure I can really point you in any direction.
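If you do get onto the instance and pull the IIS logs, something like the following can help you sift through them. It's a minimal Python sketch that assumes the logs are in the W3C extended format (the one with a `#Fields:` header line) and that `sc-status` is among the logged fields. The function name `find_server_errors` is just something I made up for this example.

```python
def find_server_errors(lines, status="500"):
    """Return the log entries whose sc-status matches the given status.

    Assumes W3C extended log format: a "#Fields:" directive names the
    space-separated columns, and other lines starting with "#" are
    directives to skip. Each hit is a dict of field name -> value.
    """
    fields = []
    hits = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names can change mid-file, so re-read them each time.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if row.get("sc-status") == status:
            hits.append(row)
    return hits
```

Pass it an open log file (or any list of lines). If `sc-substatus` and `sc-win32-status` are being logged, those values on the matching rows usually narrow things down more than the bare 500 does. And if the requests that failed on the client side don't show up in the log at all, that again suggests they never reached the instance.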