Azure Devfabric storage (DSService) problem...

  • Question

  • While running some tests related to parallel actions against the Azure data store on the devfabric (via BeginUploadFromStream/EndUploadFromStream; roughly the pattern sketched at the end of this post), we are experiencing some **very** poor performance from the local dev store.

    When the calls run sequentially, everything is fine... but as soon as you invoke multiple calls that overlap, DSService.exe consumes the majority of the local CPU (50%-75%), with the remaining CPU taken by sqlservr.exe. Also, during this time, performance comes to a crawl. It should be noted that the WaWorkerHost processes are idle during this time (as they should be while waiting for callbacks).

    Is anyone else experiencing this? And if so, what is going on? I realize the real Azure data store isn't SQL Server based, but this makes development a bit tough, to say the least.

    Ideas? TIA...
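    For reference, this is roughly the pattern in play, trimmed down to a sketch (container/blob names and payloads here are placeholders, not our real code), using the 1.x StorageClient library against development storage:

    // Sketch of the overlapping-upload pattern: kick off all BeginUploadFromStream
    // calls, complete each one with EndUploadFromStream in its callback, and wait
    // for everything to finish.
    using System.IO;
    using System.Text;
    using System.Threading;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class ParallelUploadSketch
    {
        static void Main()
        {
            var account = CloudStorageAccount.DevelopmentStorageAccount;
            var container = account.CreateCloudBlobClient().GetContainerReference("testcontainer");
            container.CreateIfNotExist();

            const int count = 50;
            using (var done = new CountdownEvent(count))
            {
                for (int i = 0; i < count; i++)
                {
                    var blob = container.GetBlockBlobReference("item-" + i);
                    var data = new MemoryStream(Encoding.UTF8.GetBytes("payload " + i));

                    // Start the upload without waiting; completion happens in the callback.
                    blob.BeginUploadFromStream(data, ar =>
                    {
                        ((CloudBlockBlob)ar.AsyncState).EndUploadFromStream(ar);
                        data.Dispose();   // dispose only after the upload has completed
                        done.Signal();
                    }, blob);
                }

                done.Wait();              // block until every callback has fired
            }
        }
    }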

    Thursday, November 18, 2010 9:51 PM

Answers

  • This is a known issue with parallel uploads of blocks in development storage. It will be fixed in a future release of the SDK.
    • Marked as answer by Brad Calder Sunday, March 27, 2011 2:23 AM
    Sunday, March 27, 2011 2:08 AM

All replies

  • Hi SagerCat,

    Please let me know your OS version and computer hardware info. Also, could you please provide a repro project? I will test it on my machine.

    Thanks,


    Mog Liang
    Monday, November 22, 2010 3:16 AM
  • Sure... also, just to add some more information: the results in a separate solution are a little different, but the end result is the same. The async version takes *much* longer to run, and in some cases isn't adding the data at all (a timeout?). Either way, it screams deadlock and/or threading issues.

    For example, here is a sample run from a unit test. "Example #1" is a synchronous version, looping an insert 50 times. "Example #2" uses the overlapping BeginUploadFromStream / EndUploadFromStream pattern; notice the execution time. Also notice that the second set of items never gets inserted (the total count should be 100). A sketch of the timing harness follows the output.

    Press any key to start...
    
    Example #1 finished in 4274 ms...
    Total items = 50
    Press any key to continue...
    
    Example #2 finished in 97795ms
    Total items = 50
    Press any key to continue . . .
    
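    For context, the harness is essentially along these lines (a simplified sketch with a hypothetical TimeAndVerify helper, not our actual test code); it times one example and then counts what landed in the container:

    using System;
    using System.Diagnostics;
    using System.Linq;
    using Microsoft.WindowsAzure.StorageClient;

    static class TimingHarness
    {
        // Runs one example, prints the elapsed time, then counts the blobs that
        // were actually written; roughly how output like the above is produced.
        // uploadBatch is either the synchronous loop (Example #1) or the
        // Begin/End version (Example #2).
        public static void TimeAndVerify(string label, CloudBlobContainer container, Action uploadBatch)
        {
            var sw = Stopwatch.StartNew();
            uploadBatch();
            sw.Stop();

            Console.WriteLine("{0} finished in {1} ms...", label, sw.ElapsedMilliseconds);
            Console.WriteLine("Total items = {0}", container.ListBlobs().Count());
            Console.WriteLine("Press any key to continue...");
            Console.ReadKey();
        }
    }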

    Thanks for the help...again, not sure if this is an issue with the dev runtime, or our code - either is possible and completely acceptable! **laugh**.

    I have a VS2010 solution ready...where should I send it? I don't see any option here to upload a ZIP file.

    Thanks!

    Hardware/Software Info:

    Dell Optiplex 755 / Windows 7 x64 Ultimate / Intel Core2 Quad Q6700 @ 2.66GHz / 8GB RAM

    OS Version: 6.1.7600

    Azure SDK Version: 1.2.10512.1409

    Azure Tools For VS2010: 1.2.30517.1601


    Monday, November 22, 2010 4:58 PM
  • Have you followed the advice on upping the default connection limit in Item 1) of the performance post?
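    That is, something along these lines, executed once before any storage requests are issued (the value here is only illustrative):

    using System.Net;

    static class StorageClientSetup
    {
        // Raise the per-endpoint HTTP connection limit; the client default of 2
        // will serialize parallel requests before they ever reach storage.
        public static void Init()
        {
            ServicePointManager.DefaultConnectionLimit = 48;
        }
    }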
    Monday, November 22, 2010 7:02 PM
    Answerer
  • Indeed... we set it to at least 25, depending on role size. Just to follow up, the "no data written" issue was our fault. We had the streams passed to the async methods wrapped in "using" blocks, which meant that when the async method returned (immediately), the stream was disposed; a sketch of this is at the end of this post. We've fixed that, but we are still seeing very odd delays (almost queue-like behavior) when executing parallel calls against the dev fabric storage.

    Is it possible that calls to dev storage are queued on purpose? I know it's a stretch, but we cannot explain the serialized behavior we're seeing. Please note, it doesn't seem to be limited to our code: if, while our test code is running the parallel test, we try to use the Neudesic storage explorer, it freezes as well until the "queue is released" (that's the way it behaves, anyway).

    Thoughts / comments / help appreciated...
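    To illustrate the disposal issue mentioned above (a minimal sketch with hypothetical names, not our actual code):

    using System.IO;
    using Microsoft.WindowsAzure.StorageClient;

    static class UploadPatterns
    {
        // Buggy version: the 'using' block disposes the stream as soon as
        // BeginUploadFromStream returns, while the upload is still in flight.
        public static void UploadBuggy(CloudBlockBlob blob, byte[] payload)
        {
            using (var data = new MemoryStream(payload))
            {
                blob.BeginUploadFromStream(data,
                    ar => ((CloudBlockBlob)ar.AsyncState).EndUploadFromStream(ar),
                    blob);
            } // stream disposed here, possibly before the upload has completed
        }

        // Fixed version: the stream stays alive until the callback has called
        // EndUploadFromStream, and only then gets disposed.
        public static void UploadFixed(CloudBlockBlob blob, byte[] payload)
        {
            var data = new MemoryStream(payload);
            blob.BeginUploadFromStream(data, ar =>
            {
                ((CloudBlockBlob)ar.AsyncState).EndUploadFromStream(ar);
                data.Dispose();
            }, blob);
        }
    }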

    Monday, November 22, 2010 7:28 PM
  • Adam Sampson has a post showing how to get additional SQL logging while using development storage.
    Monday, November 22, 2010 8:37 PM
    Answerer
  • OK, so after enabling SQL logging on the dev fabric, we found the following error:

    "11/22/2010 4:04:04 PM [UnhandledException] EXCEPTION thrown: Microsoft.Cis.Services.Nephos.Common.Protocols.Rest.FatalServerCrashingException: The fatal unexpected exception 'Transaction (Process ID 55) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.' encountered during processing of request. ---> System.Data.SqlClient.SqlException: Transaction (Process ID 55) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."

    Is this a known issue? We've seen this many times before in other in-house applications that use SQL Server, especially apps that didn't take concurrent calls into consideration.

    So...what can be done about this? If needed, we have all log files gathered (10 of them), plus the solution that caused this condition.

    Thoughts? Fixes? Hopefully, this is something that can be resolved...is there a chance this issue only existed in the 1.2 tools, and is/will be fixed in 1.3?

    Monday, November 22, 2010 9:50 PM
  • Any thoughts on this one yet? We would really like to at least know our code works locally before uploading it to Azure (which I'm sure doesn't have this same issue).
    Monday, November 29, 2010 12:13 PM
  • Can you run a quick test and see if calling ThreadPool.SetMinThreads to increase the number of idle threads (something like the snippet below) before you run your parallel storage client code makes any difference? This was some time ago and I don't remember the exact details, but I vaguely remember that this made a noticeable difference in this particular case.
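    Something along these lines before the uploads start (the numbers are purely illustrative):

    using System.Threading;

    static class ThreadPoolTweak
    {
        // Raise the minimum worker and I/O completion thread counts so the
        // thread pool does not throttle thread injection while the async
        // storage callbacks are queuing up.
        public static void Apply()
        {
            ThreadPool.SetMinThreads(workerThreads: 64, completionPortThreads: 64);
        }
    }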


    Tuesday, November 30, 2010 3:55 AM
  • Thanks for the reply...no change on our dev machines. We still see the same behavior. Other ideas?
    Tuesday, November 30, 2010 1:54 PM