Insert or Update Entity

  • Question

  • We use table storage to track the status of certain worker processing operations. The status can be NotStarted, Started, or Completed. When processing starts we insert an entity with a null "Completed" field, and when it finishes we do an unconditional update with Completed set to the time of completion.

    Doing this with an Insert followed by an Update/Merge doesn't work for us. We use asynchronous storage calls, and when processing finishes quickly (say, in under a few tens of milliseconds) the Update/Merge request often arrives before the Insert. To work around the problem we now create two entities, one when processing starts and one when it ends. Querying is done frequently, and since there are no batch queries yet, this doubles the number of storage transactions.

    Is there a way to force an Update to insert the entity if it doesn't exist, or is there no way around this? Am I missing something obvious?

    Wednesday, August 18, 2010 3:33 PM

All replies

  • Hello, you can write simple retry logic. Take your first approach, and if the update fails because the insert operation has not completed yet, simply wait a second and try the update again.
    Lante, shanaolanxing This posting is provided "AS IS" with no warranties, and confers no rights.
    • Proposed as answer by Patriek van Dorp Thursday, August 19, 2010 12:45 PM
    • Marked as answer by Yi-Lun Luo Tuesday, August 24, 2010 9:22 AM
    Thursday, August 19, 2010 1:44 AM
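The retry suggestion above can be sketched roughly as follows. This is a minimal Python sketch, not Azure Table Storage client code: `InMemoryTable`, `EntityNotFound`, and `merge_with_retry` are hypothetical names, the table is an in-memory stand-in, and a short exponential backoff is swapped in for the fixed one-second wait.

```python
import threading
import time


class EntityNotFound(Exception):
    """Raised by the stand-in table when a merge targets a missing entity."""


class InMemoryTable:
    """Hypothetical in-memory stand-in for a table; not the real storage API."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, entity):
        if key in self.rows:
            raise KeyError(f"entity {key!r} already exists")
        self.rows[key] = dict(entity)

    def merge(self, key, changes):
        if key not in self.rows:
            raise EntityNotFound(key)
        self.rows[key].update(changes)


def merge_with_retry(table, key, changes, attempts=5, delay=0.01):
    """Retry a merge until the entity exists; return the number of tries used."""
    for attempt in range(attempts):
        try:
            table.merge(key, changes)
            return attempt + 1
        except EntityNotFound:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay * (2 ** attempt))  # back off before retrying


# Simulate the Insert arriving about 20 ms after the Update/Merge was issued.
table = InMemoryTable()
late_insert = threading.Timer(
    0.02, table.insert, args=("job-1", {"Started": "t0", "Completed": None})
)
late_insert.start()
tries = merge_with_retry(table, "job-1", {"Completed": "t1"})
late_insert.join()
```

The caller blocks while the retries run, so the wait and the attempt limit would need tuning against the expected gap between the two requests.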
  • I understand, but I would like this to be an option on the server. This is a high-throughput, multi-tenant application, and the status entities are short-lived, throw-away data. In the scenario you are proposing, I would have threads waiting to repeat a storage operation while one or more queued jobs are already running.

    Thursday, August 19, 2010 3:29 PM
  • The challenge here is that "the server" is actually a cluster of Azure storage processing nodes, and your requests may be processed by different nodes. This is also why there's no guarantee that operations will complete in the order they were requested.

    So I'd go with Yi-Lun's approach.

     

    • Proposed as answer by Yi-Lun Luo Tuesday, August 24, 2010 9:22 AM
    • Marked as answer by Yi-Lun Luo Tuesday, August 24, 2010 9:23 AM
    Thursday, August 19, 2010 3:42 PM
  • Thanks Brent. I understand that. So, back to my original question: what I'd like to have on the server is an operation with Update OR Insert semantics, one that updates the entity if it exists and inserts it if it doesn't, precisely because I don't know in which order requests will be processed and I don't want to rely on any order. Waiting a second after a failed update doesn't guarantee the order either, and it would keep threads spinning longer than they do today. I can imagine this may not be straightforward to implement because of the distributed nature of Azure storage. So I guess the question is: is this feature out of the question, has it been requested before, is anything in the works, or is it just a bad idea?
    Thursday, August 19, 2010 4:15 PM
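The Update OR Insert semantics asked for above could look roughly like this. It is a sketch only, in Python against an in-memory stand-in: `InMemoryTable` and `upsert_merge` are hypothetical names, not an existing storage operation. It also assumes the "start" write no longer carries a null Completed field, since a merge that arrives late would otherwise overwrite the completion time.

```python
from itertools import permutations


class InMemoryTable:
    """Hypothetical in-memory stand-in for a table; not the real storage API."""

    def __init__(self):
        self.rows = {}

    def upsert_merge(self, key, changes):
        """Insert the entity if it is missing, otherwise merge the fields in."""
        self.rows.setdefault(key, {}).update(changes)


# The two status writes touch disjoint fields, so neither can undo the other.
writes = [
    {"Started": "2010-08-18T15:33:00Z"},
    {"Completed": "2010-08-18T15:33:01Z"},
]


def apply_in_order(order):
    """Apply the writes in the given arrival order and return the final entity."""
    table = InMemoryTable()
    for changes in order:
        table.upsert_merge("job-1", changes)
    return table.rows["job-1"]


# Try both arrival orders: start-then-complete and complete-then-start.
results = [apply_in_order(order) for order in permutations(writes)]
```

Under these assumptions both arrival orders end with the same entity, so no thread has to wait and retry.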
  • vbori -

    The lack of an upsert operation was one of the complaints about the Azure Table Service that someone raised the other day. There is a very low-ranked request for upsert on My Great Windows Azure Idea.

    • Marked as answer by Yi-Lun Luo Tuesday, August 24, 2010 9:22 AM
    Thursday, August 19, 2010 4:21 PM