While running some tests related to parallel actions against the Azure data store on the dev fabric (via BeginUploadFromStream/EndUploadFromStream), we are experiencing some **very** poor performance from the local dev store.
When the calls run sequentially, everything is fine...but as soon as you invoke multiple calls that overlap, DSService.exe consumes the majority of the local CPU (50% to 75%), with the remaining CPU taken by sqlservr.exe. Also, during this time,
performance comes to a crawl. It should be noted that the WaWorkerHost processes are idle during this time (as they should be while waiting for callbacks).
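For reference, the uploads are issued roughly like this (a simplified sketch, not our actual code; the container name, blob names, and payloads are placeholders, and it assumes the 1.x StorageClient library):

// Sketch of the overlapping upload pattern against the local dev store.
using System;
using System.IO;
using System.Text;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class ParallelUploadRepro
{
    static void Main()
    {
        var account = CloudStorageAccount.DevelopmentStorageAccount;
        var container = account.CreateCloudBlobClient().GetContainerReference("repro");
        container.CreateIfNotExist();

        const int count = 50;
        using (var done = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                CloudBlob blob = container.GetBlobReference("item-" + i);
                var stream = new MemoryStream(Encoding.UTF8.GetBytes("payload " + i));

                // Overlapping calls: start every upload, complete each one in its callback.
                blob.BeginUploadFromStream(stream, ar =>
                {
                    blob.EndUploadFromStream(ar);
                    stream.Dispose();
                    done.Signal();
                }, null);
            }
            done.Wait();
        }
    }
}

Each Begin call returns immediately, so all 50 uploads are in flight at once and completion happens on thread pool callbacks; that's when the CPU spike and the slowdown show up.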
Is anyone else experiencing this? And if so, what is going on? I realize the real Azure data store isn't SQL Server based, but this makes development a bit tough, to say the least.
Sure...and just to add some more information, the results in a separate solution are a little different, but the end result is the same. The async version takes *much* longer to run, and in some cases isn't adding the data at all (timeout?)...either
way, it's screaming deadlock and/or threading issues.
For example, here is a sample run from a unit test. "Example #1" is a synchronous version, looping an insert 50 times. "Example #2" uses the overlapping BeginUploadFromStream / EndUploadFromStream pattern; notice the execution
time. Also notice that the second set of items never gets inserted (the total count should be 100).
Press any key to start...
Example #1 finished in 4274 ms...
Total items = 50
Press any key to continue...
Example #2 finished in 97795ms
Total items = 50
Press any key to continue . . .
Thanks for the help...again, not sure if this is an issue with the dev runtime, or our code - either is possible and completely acceptable! **laugh**.
I have a VS2010 solution ready...where should I send it? I don't see any option here to upload a ZIP file.
Indeed...we set it to at least 25, depending on role size. Just to follow up, the "no data written" issue was our fault. We had the streams passed to the async methods wrapped in "using" blocks, which meant that when the async method
returned (immediately), the stream was disposed. We've fixed that, but we are still seeing very odd delays (almost queue-like behavior) when executing parallel calls against the dev fabric storage.
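In other words, the broken version looked roughly like this (a sketch with hypothetical names, not our exact code): the using block disposed the stream while the upload was still in flight, and the fix is to dispose it in the completion callback instead.

// Sketch of the bug and the fix (Microsoft.WindowsAzure.StorageClient, 1.x SDK).
// "blob" and "payload" stand in for whatever is actually being uploaded.
using System;
using System.IO;
using Microsoft.WindowsAzure.StorageClient;

static class StreamLifetime
{
    // BROKEN: the using block disposes the stream as soon as BeginUploadFromStream
    // returns, while the upload is still running on a worker thread.
    static void UploadBroken(CloudBlob blob, byte[] payload)
    {
        using (var stream = new MemoryStream(payload))
        {
            blob.BeginUploadFromStream(stream, ar => blob.EndUploadFromStream(ar), null);
        } // stream disposed here, usually before the callback has run
    }

    // FIXED: keep the stream alive until EndUploadFromStream completes,
    // then dispose it inside the callback.
    static void UploadFixed(CloudBlob blob, byte[] payload)
    {
        var stream = new MemoryStream(payload);
        blob.BeginUploadFromStream(stream, ar =>
        {
            blob.EndUploadFromStream(ar);
            stream.Dispose();
        }, null);
    }
}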
Is it possible that calls to dev storage are queued on purpose? I know it's a stretch, but we cannot explain the serialized behavior we're seeing. Please note, it doesn't seem to be limited to our code: if we try to use the Neudesic storage explorer
while our test code is running the parallel test, it freezes as well until the "queue is released" (that's the way it behaves, anyway).
Ok, so after enabling SQL logging on the dev fabric, we have found the following error:
"11/22/2010 4:04:04 PM [UnhandledException] EXCEPTION thrown: Microsoft.Cis.Services.Nephos.Common.Protocols.Rest.FatalServerCrashingException: The fatal unexpected exception 'Transaction (Process ID 55) was deadlocked on lock resources with another
process and has been chosen as the deadlock victim. Rerun the transaction.' encountered during processing of request. ---> System.Data.SqlClient.SqlException: Transaction (Process ID 55) was deadlocked on lock resources with another process and has been
chosen as the deadlock victim. Rerun the transaction."
Is this a known issue? We've seen this many times before in other in-house applications that use SQL Server, especially apps that didn't take concurrent calls into consideration.
So...what can be done about this? If needed, we have all the log files gathered (10 of them), plus the solution that caused this condition.
Thoughts? Fixes? Hopefully this is something that can be resolved...is there a chance this issue only existed in the 1.2 tools and is/will be fixed in 1.3?
Can you run a quick test and see whether calling ThreadPool.SetMinThreads to increase the number of idle threads before running your parallel storage client code makes any difference? This was some time ago and I don't remember the exact details, but I vaguely
recall that it made a noticeable difference in this particular case.
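Something along these lines before kicking off the parallel uploads would do as a quick check (a sketch only; the minimums here are example values, not a recommendation):

// Raise the ThreadPool minimums before running the parallel storage test,
// so the async callbacks don't have to wait for the pool to grow on demand.
using System;
using System.Threading;

class MinThreadsCheck
{
    static void Main()
    {
        int worker, io;
        ThreadPool.GetMinThreads(out worker, out io);
        Console.WriteLine("Defaults: worker={0}, IO completion={1}", worker, io);

        // Bump both the worker and IO completion pools (example values).
        ThreadPool.SetMinThreads(50, 50);

        // ...then run the parallel BeginUploadFromStream/EndUploadFromStream test here...
    }
}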