Stream does not support concurrent IO read and write operations


  • 28 June 2011 17:08
     

    I am trying to write blobs in parallel, and partway through I run into this error. I am writing about 4000 different blobs, and the error occurs after about 2000 of them have been written. It occurs randomly, so the data is probably not the cause, but if there are restrictions I should know about, I would like to hear about them too. The code looks like:

     

     Parallel.ForEach(basePoly, polygon =>
     {
         string id = Guid.NewGuid().ToString() + polygon.lowerBBox.X + polygon.lowerBBox.Y;
         PolygonEntity pe = new PolygonEntity(polygon);
         try
         {
             BlobHelper blob = new BlobHelper(connectionString);
             string blobContent = pe.PolygonData;
             blob.PutBlob(DataConstants.MAP_CONTAINER_BLOB, id, blobContent);
             QueueObject.PutMessage(DataConstants.WORKER_QUEUE, new CloudQueueMessage(id));
         }
         catch (Exception e)
         {
             System.Diagnostics.Trace.Write(e.Message);
         }
     });
    
       
    
    


     

    The error occurs in the PutBlob call, which looks like:

     public bool PutBlob(string containerName, string blobName, string content)
     {
         try
         {
             CloudBlobContainer container = BlobClient.GetContainerReference(containerName);
             CloudBlob blob = container.GetBlobReference(blobName);

             blob.UploadText(content); // <-- This is where I get the error
             return true;
         }
         catch (StorageClientException ex)
         {
             if ((int)ex.StatusCode == 404)
             {
                 return false;
             }
             throw;
         }
     }
    
    

    The polygon entity class is my serializer-deserializer. The class (partially) looks like:

     public class PolygonEntity
     {
         public string PolygonData { get; set; }

         public PolygonEntity(Polygon polygon)
         {
             ConvertFromPolygon(polygon);
         }

         public PolygonEntity(string polygon)
         {
             PolygonData = polygon;
         }

         public void ConvertFromPolygon(Polygon polygon)
         {
             using (MemoryStream ms = new MemoryStream())
             {
                 BinaryFormatter bf = new BinaryFormatter();
                 bf.Serialize(ms, polygon);
                 PolygonData = System.Convert.ToBase64String(ms.ToArray());
             }
         }
     }
    
    


    Thanks in advance for your help.
     


    Dinesh Agarwal

All replies

  • 29 June 2011 10:25
    Moderator
     
     

    Hi Dinesh,

    From your code, I can see that you are reusing a CloudBlobClient instance within the Parallel loop. Because the CloudBlobClient class is not thread safe, reusing it across multiple threads or in a parallel loop may cause requests and responses to get mixed up, which I think is the cause of the "Stream does not support concurrent IO read and write operations" error.

    Please revise the code to:

    public bool PutBlob(string containerName, string blobName, string content)
    {
        try
        {
            // Create a new blob client.
            CloudBlobClient blobClient = ...;
            CloudBlobContainer container = blobClient.GetContainerReference(containerName);

    Thanks.


    Wengchao Zeng
    Please mark the replies as answers if they help or unmark if not.
    If you have any feedback about my replies, please contact msdnmg@microsoft.com.
    Microsoft One Code Framework
  • 29 June 2011 23:37
     
     

    Could you provide the exact exception or a stack trace for this error? There is no reason why uploading multiple blobs simultaneously should fail. The error string you mentioned is not one generated by the Storage Client itself, but may have something to do with other streams in the system. Something to consider is that the application above does not limit concurrency: if you run this on a list of 4k blobs, it can potentially attempt a very large number of simultaneous operations, which may not be optimal. Consider enforcing some limit on parallelism, or possibly scheduling the operations yourself.
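
    One way to enforce the limit on parallelism suggested above is the ParallelOptions.MaxDegreeOfParallelism setting of the standard .NET Task Parallel Library. The following is only a minimal sketch, not code from this thread: the class name BoundedUploadSketch, the UploadAll helper, and the upload delegate are hypothetical, and Thread.Sleep stands in for the real blob upload. The helper also records the highest number of simultaneous calls it observed, so the cap can be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedUploadSketch
{
    // Runs `upload` once per item with at most `maxParallel` calls in flight.
    // Returns the highest number of simultaneous calls that was observed.
    public static int UploadAll<T>(IEnumerable<T> items, Action<T> upload, int maxParallel)
    {
        object gate = new object();
        int inFlight = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(items, options, item =>
        {
            // Count this call as in flight and remember the peak.
            lock (gate)
            {
                inFlight++;
                if (inFlight > peak) peak = inFlight;
            }
            try
            {
                upload(item);
            }
            finally
            {
                lock (gate) { inFlight--; }
            }
        });
        return peak;
    }

    public static void Main()
    {
        // Thread.Sleep is a placeholder for the real upload (hypothetical workload).
        int peak = UploadAll(Enumerable.Range(0, 40), _ => Thread.Sleep(20), 4);
        Console.WriteLine("peak concurrent uploads: " + peak);
    }
}
```

    In the code from the question, the delegate would wrap the PutBlob call. A suitable value for the cap depends on blob size and available bandwidth and would have to be found by experiment.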

     

    Also, as a side note, the UploadText function simply calls encoding.GetBytes() and then wraps the resulting byte array in a memory stream to upload. From what I am seeing, you would be better served by sending your memory stream or byte array directly to CloudBlob.UploadFromStream / UploadByteArray.
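
    A further cost of the string route is the Base64 step in PolygonEntity.ConvertFromPolygon: Base64 turns every 3 input bytes into 4 characters, so the text is about one third larger than the serialized bytes. Assuming the text is then encoded with one byte per character (as UTF-8 or ASCII would do for Base64 output), the upload grows by the same factor. The sketch below only illustrates that arithmetic; the class and method names are hypothetical.

```csharp
using System;

public static class Base64OverheadSketch
{
    // Length of the Base64 text produced for `byteCount` input bytes:
    // every group of 3 bytes (rounded up) becomes 4 characters.
    public static long Base64Length(long byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void Main()
    {
        byte[] payload = new byte[1000];
        string text = Convert.ToBase64String(payload);

        Console.WriteLine(text.Length);        // 1336
        Console.WriteLine(Base64Length(1000)); // 1336
    }
}
```

    Uploading the serialized bytes directly avoids both the size increase and the extra string allocation.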

     

    joe

  • 5 July 2011 06:17
    Moderator
     
     

    Hi,

    I will mark the reply as the answer. If you find it does not help, please feel free to unmark it and follow up.

    Thanks.


    Wengchao Zeng
  • 7 July 2011 00:54
     
     

    Hi,

    Thank you for the solution. I have now changed the methods to static methods and thus they are thread safe now but the error is still there. 

    I see another solution has been posted so I will work on that now and let you know if that helps. I will post the stack trace soon. 

    BTW, the code works fine for smaller files, such as 50 MB or so, but with large files (800 MB) it throws the error.

     


    Dinesh Agarwal
  • 7 July 2011 02:24
     
     

    Stack trace:

       at Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.get_Result()

       at Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.ExecuteAndWait()

       at Microsoft.WindowsAzure.StorageClient.CloudBlob.UploadFromStream(Stream source, BlobRequestOptions options)

       at Microsoft.WindowsAzure.StorageClient.CloudBlob.UploadByteArray(Byte[] content, BlobRequestOptions options)

       at GPC_Overlay.BlobHelper.PutBlob(String connectionstring, String containerName, String blobName, String content) in C:\Projects\Azure\WindowsFormsApplication1\ClassLibrary1\BlobHelper.cs:line 483

       at GISWebRole._Default.<>c__DisplayClass5.<_launchOverlayJob>b__4(Polygon polygon) in C:\Projects\Azure\WindowsFormsApplication1\GISWebRole\Default.aspx.cs:line 172

       at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()

     

    Hope that helps.


    Dinesh Agarwal
  • 7 July 2011 05:35
    Moderator
     
     

    Hi Dinesh,

    Please try creating a new CloudBlobClient instance every time you call the blob.UploadText(content) method.

    Alternatively, instead of calling blob.UploadText(content), please create a new Stream object and call the UploadFromStream method, as Joe suggested.

    Thanks.


    Wengchao Zeng
  • 7 July 2011 16:59
     

    Dear Mr. Zeng,

    The PutBlob method is a static method now, and it does create a new CloudBlobClient object. The new method looks like:

     public static bool PutBlob(string connectionstring, string containerName, string blobName, string content)
     {
         try
         {
             CloudStorageAccount account = CloudStorageAccount.Parse(connectionstring);
             CloudBlobClient blobClient = account.CreateCloudBlobClient();
             blobClient.RetryPolicy = RetryPolicies.Retry(4, TimeSpan.Zero);
             CloudBlobContainer container = blobClient.GetContainerReference(containerName);
             CloudBlob blob = container.GetBlobReference(blobName);

             blob.UploadText(content);
             return true;
         }
         catch (StorageClientException ex)
         {
             if ((int)ex.StatusCode == 404)
             {
                 return false;
             }
             throw;
         }
     }
    

    Moreover, my code can be downloaded from gpcoverlay.codeplex.com, try to run it with polygon1xml in the first select list and parcelxml in the second.


    Dinesh Agarwal

  • 8 July 2011 18:07
     
     

    The error you are hitting is not one generated internally by the Storage Client Library. I would recommend that you refactor your code to limit concurrency in some manner and see whether it still reproduces.

     

    joe

  • 8 July 2011 19:16
     
     
    Thanks Joe. Can you be a little more specific about how I should limit concurrency? A relevant link might help too. I have tried to find something useful on my own, but I have not been able to find a way to do this.
    Dinesh Agarwal
  • 11 July 2011 18:39
     

    Dear Zeng,

    I have changed my PutBlob method to:

     public static bool PutBlob(string connectionstring, string containerName, string blobName, Polygon polygon)
     {
         //try
         {
             CloudStorageAccount account = CloudStorageAccount.Parse(connectionstring);
             CloudBlobClient blobClient = account.CreateCloudBlobClient();
             blobClient.RetryPolicy = RetryPolicies.Retry(4, TimeSpan.Zero);
             CloudBlobContainer container = blobClient.GetContainerReference(containerName);
             CloudBlob blob = container.GetBlobReference(blobName);

             using (MemoryStream ms = new MemoryStream())
             {
                 BinaryFormatter bf = new BinaryFormatter();
                 bf.Serialize(ms, polygon);
                 ms.Seek(0, SeekOrigin.Begin);
                 //PolygonData = System.Convert.ToBase64String(ms.ToArray());
                 blob.UploadFromStream(ms);
             }
             return true;
         }
         /* catch (StorageClientException ex)
         {
             if ((int)ex.StatusCode == 404)
             {
                 return false;
             }
             throw;
         }*/
     }
    
    

    Also I changed this:

     System.Threading.ThreadPool.SetMaxThreads(1, 4);
     System.Threading.ThreadPool.SetMaxThreads(1, 20);
     Parallel.ForEach(basePoly, polygon =>
     //foreach (Polygon polygon in basePoly)
     {
         string id = Guid.NewGuid().ToString() + polygon.lowerBBox.X + polygon.lowerBBox.Y;
         //PolygonEntity pe = new PolygonEntity(polygon);
         //try
         {
             //string blobContent = pe.PolygonData;
             //string blobContent = PolygonEntity.ConvertFromPolygonStatic(polygon);
             BlobHelper.PutBlob(connectionString, DataConstants.MAP_CONTAINER_BLOB, id, polygon);
             QueueHelper.PutMessageStatic(connectionString, DataConstants.WORKER_QUEUE, new CloudQueueMessage(id));
         }
         /*catch (Exception e)
         {
             System.Diagnostics.Trace.Write(e.Message);
         }*/
     //}
     });
    
    

    I still get the same error at the same location, both in the dev environment and in the cloud. The error occurs only for large data files; for smaller ones it always works fine. I would greatly appreciate it if someone could suggest what else I should try. Thanks in advance.
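
    A note on the two ThreadPool.SetMaxThreads calls in the snippet above: the method returns a bool, and the .NET documentation states that it rejects values smaller than the number of processors on the machine (or smaller than the current minimums), in which case the pool is left unchanged. On a multi-core machine a worker limit of 1 may therefore be ignored silently. The sketch below is an assumption-laden probe, not part of the original code: the class and method names are hypothetical, and it restores the previous limits so that it has no lasting effect.

```csharp
using System;
using System.Threading;

public static class ThreadPoolLimitSketch
{
    // Asks the runtime whether it would accept the given limits,
    // then restores the previous limits so the probe has no lasting effect.
    public static bool TrySetMaxThreads(int workers, int io)
    {
        int oldWorkers, oldIo;
        ThreadPool.GetMaxThreads(out oldWorkers, out oldIo);

        bool accepted = ThreadPool.SetMaxThreads(workers, io);
        if (accepted)
        {
            ThreadPool.SetMaxThreads(oldWorkers, oldIo);
        }
        return accepted;
    }

    public static void Main()
    {
        int minWorkers, minIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        Console.WriteLine("processors: " + Environment.ProcessorCount);
        Console.WriteLine("min threads: " + minWorkers + " worker, " + minIo + " I/O");

        // Same arguments as in the snippet above.
        Console.WriteLine("SetMaxThreads(1, 4) accepted: " + TrySetMaxThreads(1, 4));
    }
}
```

    If the call reports false, the Parallel.ForEach loop runs with the default pool limits, which would be consistent with the error persisting after this change.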

     



     


    Dinesh Agarwal
  • 13 July 2011 20:31
    Owner
     
     
    I think this problem is described in the following blog:

    http://blogs.msdn.com/b/windowsazurestorage/archive/2011/02/23/windows-azure-storage-client-library-parallel-single-blob-upload-race-condition-can-throw-an-unhandled-exception.aspx

    Please modify the code as suggested in the blog post. If you still see the problem, please consider opening an incident so that we can use the code to see what the issue could be and log bug(s) if needed.
    --Trevor H.
    Send files to Hotmail.com: "MS_TREVORH"
  • 18 July 2011 23:47
     
     

    Dear Trevor,

     

    Thank you for looking into the matter and providing this solution. I tried setting the parallel thread count to 1, but I still get the same error. If you want, I can share the code with you. The file that throws this error is about 780 MB, so if you want to try it out I can share my storage account credentials with you over email. Let me know if you can have a look at my code. Thank you.

     

    Regards,

     


    Dinesh Agarwal
  • 19 July 2011 16:28
     
     
    If you can refactor the solution into a simple repro app I can take a look at it. 
  • 8 September 2011 23:25
     
     

    Hi Joe,

    I have refactored it into something very simple. I now create a list of strings, each of which is to be written as a blob. I no longer see the same error after switching to a new storage account, but the code still crashes at the same point. I would appreciate it if you could have a look at it. Please give me your email address and I will send it to you.

    - Dinesh

     


    Dinesh Agarwal
  • 9 September 2011 16:53
     
     

    Please send it to joegiard at microsoft dot com, and I'll have a look.

     

    joe

     

  • 9 September 2011 19:13
     
     

    Thanks Joe,

     

    Just sent it to you.


    Dinesh Agarwal
  • 17 November 2011 01:20
     
     

    Just to update everyone: the error was due to Parallel.ForEach creating too many requests, which led to throttling by the storage service. I was able to fix this by using a ThreadPool instead of Parallel.ForEach.
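
    The post does not show the ThreadPool-based code, so the following is only a guess at what such a fix could look like: work items are queued with ThreadPool.QueueUserWorkItem, a SemaphoreSlim caps how many are pending at once, and a CountdownEvent waits for the last one to finish. All names (ThreadPoolUploadSketch, UploadAll, the upload delegate) are hypothetical, and Thread.Sleep stands in for the blob upload.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThreadPoolUploadSketch
{
    // Queues one work item per entry, but allows only `maxInFlight` items
    // to be queued or running at any time. Returns the exceptions thrown
    // by `upload` (an empty list when every call succeeded).
    public static List<Exception> UploadAll<T>(IList<T> items, Action<T> upload, int maxInFlight)
    {
        var errors = new List<Exception>();
        if (items.Count == 0) return errors;

        using (var slots = new SemaphoreSlim(maxInFlight))
        using (var done = new CountdownEvent(items.Count))
        {
            foreach (T item in items)
            {
                slots.Wait();      // blocks the producer while all slots are taken
                T current = item;  // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        upload(current);
                    }
                    catch (Exception ex)
                    {
                        lock (errors) { errors.Add(ex); }
                    }
                    finally
                    {
                        slots.Release(); // free the slot before signalling completion
                        done.Signal();
                    }
                });
            }
            done.Wait();           // wait for the remaining uploads to finish
        }
        return errors;
    }

    public static void Main()
    {
        var ids = new List<int>();
        for (int i = 0; i < 20; i++) ids.Add(i);

        // Thread.Sleep is a placeholder for the real upload (hypothetical workload).
        List<Exception> errors = UploadAll(ids, id => Thread.Sleep(10), 4);
        Console.WriteLine("failed uploads: " + errors.Count);
    }
}
```

    Collecting the exceptions instead of letting them escape keeps one failed upload from tearing down the process, since an unhandled exception on a pool thread is fatal.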

    I have a related question now: could anyone send too many requests via the REST APIs and get my application throttled, similar to a DoS attack?


    Dinesh Agarwal
  • 26 February 2012 01:27
     
     

    To summarize:

    The OP had an original issue in how they scheduled requests that was causing the exception originally mentioned.

    As far as throttling goes, see this post for more on the different storage abstractions and their scalability targets: http://blogs.msdn.com/b/windowsazurestorage/archive/2010/05/10/windows-azure-storage-abstractions-and-their-scalability-targets.aspx

    To answer the OP's last question, the blog post above mentions the scalability targets. Please note that these are not hard limits, and the service will continue to serve requests when additional capacity is available, but since Azure Storage is a multi-tenant system there are times when requests over these targets may be throttled to ensure fairness between tenants.

    That being said, the client is in control of all requests against their account, so this should be manageable. If you are exposing a public blob or container, it is possible that the resource may become overloaded with too many requests, which will count towards these scalability targets and potentially lead to throttling. If you need to expose the same data to a very large number of clients simultaneously, I would recommend either splitting copies of the data over containers or even accounts, depending on the throughput you are looking for, or potentially even using the Azure CDN, which can enable broad distribution scenarios.

    Joe Giardino