Table Storage - handling retry error where item already exists

  • Question

  • I'm using Table Storage and have read that I should expect to see 'item exists' errors when a connection is severed during an insert (because Table Storage received and processed the request but the response never reached me). What I haven't seen is any best practice for handling this kind of error. If in my scenario it is normal for the item to already exist when I go to insert, how do I tell the aforementioned error condition apart from the record truly already existing?

    Coming from a relational DB background, this scenario feels very unnatural to me and I'm not sure how I should deal with it. Am I forced to query for the record before inserting, so I can tell whether it already existed or my insert succeeded and I just didn't get the response?

    What are other people doing to deal with this?

    Wednesday, June 1, 2011 1:51 AM

Answers

  • A few reasons for duplicate inserts:

    1> Network errors can make an insert appear to have failed even though it succeeded on the server.

    2> Certain HTTP stack implementations may also silently retry operations after failures, in conjunction with Expect: 100-continue.

    3> Clock skew between the different nodes inserting entities can also cause duplicates.

    One way to tell these cases apart is to give every insert a unique transaction ID and store that ID with the entity. On a "Conflict" error, read the existing entity and compare its transaction ID with the one you tried to write: if it is the same ID, your earlier attempt succeeded and the error can be ignored; otherwise throw an error.
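    The transaction ID check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: FakeTable, ConflictException, LogEntity and IdempotentInsert are hypothetical names, and FakeTable is an in-memory stand-in so the sketch runs without a storage account. With the real client, the conflict would surface as a DataServiceRequestException carrying HTTP status 409, and the lookup would be a point query on PartitionKey and RowKey.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: TransactionId is an extra property stored with the row.
public class LogEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public string Content { get; set; }
    public Guid TransactionId { get; set; }
    public LogEntity Clone() { return (LogEntity)MemberwiseClone(); }
}

// Stand-in for the 409 Conflict returned on a duplicate insert.
public class ConflictException : Exception { }

// Minimal in-memory stand-in for a table (an assumption, not the real API).
public class FakeTable
{
    private readonly Dictionary<string, LogEntity> rows = new Dictionary<string, LogEntity>();
    public bool DropNextResponse; // simulates a connection severed after the write

    public void Insert(LogEntity e)
    {
        string key = e.PartitionKey + "|" + e.RowKey;
        if (rows.ContainsKey(key)) throw new ConflictException();
        rows[key] = e.Clone(); // the write is applied on the "server"
        if (DropNextResponse)
        {
            DropNextResponse = false;
            throw new TimeoutException("response lost");
        }
    }

    public LogEntity Get(string partitionKey, string rowKey)
    {
        LogEntity e;
        return rows.TryGetValue(partitionKey + "|" + rowKey, out e) ? e : null;
    }
}

public static class IdempotentInsert
{
    // Returns true if the row is stored with our transaction ID,
    // false if another write already owns the key (a genuine conflict).
    public static bool Insert(FakeTable table, LogEntity entity, int maxAttempts)
    {
        entity.TransactionId = Guid.NewGuid(); // the same ID is reused on every retry
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                table.Insert(entity);
                return true;
            }
            catch (TimeoutException)
            {
                // Outcome unknown: the write may or may not have been applied. Retry.
            }
            catch (ConflictException)
            {
                // Compare the stored ID with ours to classify the conflict.
                LogEntity existing = table.Get(entity.PartitionKey, entity.RowKey);
                return existing != null && existing.TransactionId == entity.TransactionId;
            }
        }
        throw new TimeoutException("insert did not complete after retries");
    }
}
```

    The extra read happens only on the conflict path, so the normal insert still costs a single round trip.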

     Hope this helps!

    Thanks,

    Jai

    Thursday, June 2, 2011 5:31 AM

All replies

  • My suggestion is to insert the entity with retries. If the insert fails with an "entity already exists" exception, I then try to update that entity, to be sure that everything was saved correctly.

    try
    {
        // entity insertion code
    }
    catch (Exception itemExists)
    {
        try
        {
            // entity update with retries
        }
        catch (Exception someError)
        {
            throw new Exception("Item exists and cannot be modified. Try again later, or whatever message.", someError);
        }
    }

    And there is a thread with a similar Q/A:

    http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/7f11e2d7-97ea-4103-a796-b897a44accd9

     


    Windows Azure Consultant http://cloudikka.wordpress.com/ (Don't open this link, if you don't understand czech language)
    • Edited by dropoutcoder Wednesday, June 1, 2011 8:22 AM extend information
    Wednesday, June 1, 2011 8:16 AM
  • Hi Curious,

    I don't clearly understand your question. Could you please describe it with some code?

    If you mean that you want to handle the exception thrown when inserting an existing entity by calling the TableServiceContext.SaveChangesWithRetries method, you can catch the DataServiceRequestException and check its HTTP error status code. The code is:

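    using System;
    using System.Data.Services.Client;
    using System.Data.Services.Common;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class HandleTableStorageItemAlreadyExists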
    {
        public void Run()
        {
            CloudStorageAccount account = CloudStorageAccount.Parse("UseDevelopmentStorage=true");
            CloudTableClient tableClient = account.CreateCloudTableClient();

            string tableName = "MyTable";
            tableClient.CreateTableIfNotExist(tableName);

            TableServiceContext context = tableClient.GetDataServiceContext();
               
            context.AddObject(tableName, new MyItem() { PartitionKey = "1", RowKey = "2", Content = "Hello" });
              
            try
            {
                // Throws if an entity with the same PartitionKey and RowKey already exists.
                context.SaveChangesWithRetries();
            }
            catch (DataServiceRequestException ex)
            {
                // The HTTP status code is carried by the inner DataServiceClientException.
                DataServiceClientException innerException = ex.InnerException as DataServiceClientException;
                if (innerException != null && innerException.StatusCode == 409)
                {
                    Console.WriteLine("The specified entity already exists.");
                    Console.WriteLine("Message:\n" + innerException.Message);
                }

                // Don't reuse the DataServiceContext.
                context = null;
            }
        }
    }
    [DataServiceKey("PartitionKey", "RowKey")]
    public class MyItem
    {
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }
        public DateTime Timestamp { get; set; }
        public string Content { get; set; }
    }

    HTTP Error 409 means there is a conflict with the existing content. For more information, please refer to http://www.checkupdown.com/status/E409.html.

    If I misunderstood you, please feel free to let me know.

    Thanks,

    Updated: Drop the DataServiceContext if a DataServiceRequestException is caught.


    Wengchao Zeng
    Microsoft One Code Framework
    Wednesday, June 1, 2011 8:37 AM
  • We do have a feature request to provide "Upsert" functionality, which will help in this case. Is that what you are requesting here?

     

    Thanks,

    Jai

    Wednesday, June 1, 2011 3:44 PM
  • Jai, no, Upsert is not really what I am after here. Let me explain.

    My service does a lot of table inserts and it is considered an error condition if I try to insert and the record already exists. So I need to be able to tell the difference between the insert failing because the record truly does already exist and it failing because I never received the response to the first insert attempt.

    For a little more detail: my service is logging lots of usage data. I have chosen to use an incoming request parameter that is unique per caller as the PartitionKey, and the current time down to the microsecond as the RowKey. So generally speaking I should never encounter a conflict on insert (unless two requests from the same Id come in at the same microsecond). For me a genuine conflict is an error condition, because I would lose data. However, if I fall into the scenario where I don't receive the response and try again (using a retry policy), I can safely ignore the error because the data was in fact inserted. How do I tell them apart?
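    For readers following along, the key scheme described above might look roughly like the sketch below. It is only an illustration: UsageKeys and RowKeyFor are hypothetical names, and the zero-padding width is an assumption chosen so that string order matches time order. It also shows where a genuine conflict comes from: two timestamps inside the same microsecond map to the same RowKey.

```csharp
using System;
using System.Globalization;

public static class UsageKeys
{
    // Hypothetical helper: RowKey = UTC time in microseconds since 0001-01-01,
    // zero-padded to a fixed width so that string order matches time order.
    public static string RowKeyFor(DateTime utc)
    {
        long microseconds = utc.Ticks / 10; // 1 tick = 100 ns, so 10 ticks = 1 microsecond
        return microseconds.ToString("D18", CultureInfo.InvariantCulture);
    }
}
```

    The PartitionKey would simply be the per-caller Id from the incoming request.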

    I am also wondering if there are other situations where I could get into this scenario. For example, say I am doing load testing, could Azure Table storage cause timeouts (which trigger me not getting the response) if it has to load balance partitions to another server? Are there other scenarios that could cause it? I ask because I am seeing these errors during my load testing (3 failures in 15K requests) and it doesn't seem reasonable to have that high a failure rate if the service and storage account are in the same affinity group. Should this be a VERY rare occurrence?

    Wednesday, June 1, 2011 11:38 PM
  • So Petr, is it accurate to say you advocate not using a Retry Policy and doing the retries manually so I have more control?
    Wednesday, June 1, 2011 11:39 PM
  • Wengchao, it is not that I don't know how to trap the error. The problem is that I cannot tell two different error conditions apart. See my response to Jai for details.
    Wednesday, June 1, 2011 11:40 PM