locked
Gracefully handling collisions on Unique Keys when saving data? By merging? RRS feed

  • Question

  • If I have a hypothetical table named 'Product' that has say, 3 columns: Id (Primary Key), Name (Unique key), and Owner, and I attempt to create two new mapping objects with the same name and save them:

          Product prod1 = new Product("MyProduct", "John Smith");

          dbContext.Products.Add(prod1);
          dbContext.SaveChanges();

          // ...

          Product prod2 = new Product("MyProduct", "Jane Doe");

          dbContext.Products.Add(prod2)
          dbContext.SaveChanges();

    I get an error on the second submit due to the duplication of the Unique key (in this case column Name).This is expected... But what I'm looking for is a way to have the second change take place, and just perform an update on the existing row in the DB (overwrite). Deleting the existing and adding the Product again from scratch (getting a new ID) is not desirable, because I have many other rows/objects referring to the Product by Id. The reason the scenario arises where I have a collision is because the 'Product' and all of it's associated objects come from a provider, and I do not want to expose the provider to the DB context.

    Is there a way in Entity to accomplish what I am looking for? Preferably with Attributes/Code First? To recap, if I save two objects with colliding values based on UniqueKeys, I'm looking for a way to update the first saved object when I try to save the colliding second object, rather than having an exception thrown.

    Wednesday, January 18, 2012 3:10 AM

Answers

  • Hi,

    Why don't you just query the database rather than always creating a new object ? This way you'll either update an existing object or you'll create a new one exactly as in almost all EF applications in existance.

    Or if you meant that they are all new objects but that you the source can expose its insertion and then subsequent changes, you could cache the key (or the object itself) so that you know if this particular object has been already processed previously (and again retrieve from the db or cache and update it rather than creating a new object).

     


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    • Edited by Patrice ScribeMVP Wednesday, January 18, 2012 12:34 PM
    • Proposed as answer by Allen_MSDN Thursday, January 19, 2012 6:17 AM
    • Marked as answer by Allen_MSDN Monday, January 23, 2012 2:55 AM
    Wednesday, January 18, 2012 12:33 PM
  • You could override SaveChanges and do the same thing but at some point you DO have to check if it already exists in the db so it seems more logical to do that when the object is created rather than when it is saved.

    I'm not sure about your architecture. It doesn't seems to me it implies that the source provider needs some knowledge about the db. My understanding is that the provider will provide a list of objects (either existing or not and perhaps even with duplicates hopefully in the correct order). It doesn't have to know anything about the db.

    Then you'll process this list to update your own db (and only this code needs to do handle the context).

    As I said earlier as an object with the MyProduct key is already here you could just update this object rather than creating a new one. to me the pseudo code could look like assuming you want to always create new objects :

    foreach(src in sourceObjects)
         if(src.Key in CreatedObjects) get  dst from CreatedObjects else new dst
         update dst from src
         add dst to CreatedObjects

    So if the source provides several MyProduct instances ou'll update only one instance (and the last instance wins).

    Now as you didn't checked if an object already exists or not in the db, before calling save changes you'll have to browse them all to mark them accordingly (currently they are all marked as added).

    foreach(o in CreatedObjects)
        if key found in db mark o as modified rather than as added

    At this step you should be able to call save changes. As suggested earlier I would just do that inside the loop so that they mark as needed from the beginning rather than marking them all as "added" before switching some of them to "modified".
        

    Humm. you could have several providers that would be updating the same objects ? What is your overall goal ? It starts looking like a replication ? Depending on your needs a solution such as Sync Framework could be perhaps better ?  Also some db have built in replication capabilities...


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".

    • Edited by Patrice ScribeMVP Thursday, January 19, 2012 12:11 PM Humm...
    • Marked as answer by Allen_MSDN Monday, January 23, 2012 2:55 AM
    Thursday, January 19, 2012 12:05 PM

All replies

  • Hi,

    Why don't you just query the database rather than always creating a new object ? This way you'll either update an existing object or you'll create a new one exactly as in almost all EF applications in existance.

    Or if you meant that they are all new objects but that you the source can expose its insertion and then subsequent changes, you could cache the key (or the object itself) so that you know if this particular object has been already processed previously (and again retrieve from the db or cache and update it rather than creating a new object).

     


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    • Edited by Patrice ScribeMVP Wednesday, January 18, 2012 12:34 PM
    • Proposed as answer by Allen_MSDN Thursday, January 19, 2012 6:17 AM
    • Marked as answer by Allen_MSDN Monday, January 23, 2012 2:55 AM
    Wednesday, January 18, 2012 12:33 PM
  • The reasoning of having all new objects was to entirely separate a complex object graph generated by a provider from the DB itself, so the provider would not have to have any knowledge of the DB, nor would the objects in the graph, until they are returned from the provider and then handled in the main back-end system. This would allow me to truly isolate the DB and the object graph creation. I was curious to see if any such collision handling existed in EF, as if it did, it would eliminate the step of forcing the provider to check for existing objects or having to update the object classes with checks for existing objects on their creation.... It would also help to avoid the scenario where I have two providers building object graphs at the same time that have collisions prior to being saved.

    ... that is the reasoning. But from your answer the collision handling is not supported directly through Entity. So it looks like additional handling it is :)
    Wednesday, January 18, 2012 7:59 PM
  • Hi CodeLabRat,

    Welcome to MSDN Forum.

    I agree with Patrice. You can cache the Unique key and when you add the new record, you can check if the record is already in the database.

    Best Regards


    Allen Li [MSFT]
    MSDN Community Support | Feedback to us
    Thursday, January 19, 2012 6:17 AM
  • You could override SaveChanges and do the same thing but at some point you DO have to check if it already exists in the db so it seems more logical to do that when the object is created rather than when it is saved.

    I'm not sure about your architecture. It doesn't seems to me it implies that the source provider needs some knowledge about the db. My understanding is that the provider will provide a list of objects (either existing or not and perhaps even with duplicates hopefully in the correct order). It doesn't have to know anything about the db.

    Then you'll process this list to update your own db (and only this code needs to do handle the context).

    As I said earlier as an object with the MyProduct key is already here you could just update this object rather than creating a new one. to me the pseudo code could look like assuming you want to always create new objects :

    foreach(src in sourceObjects)
         if(src.Key in CreatedObjects) get  dst from CreatedObjects else new dst
         update dst from src
         add dst to CreatedObjects

    So if the source provides several MyProduct instances ou'll update only one instance (and the last instance wins).

    Now as you didn't checked if an object already exists or not in the db, before calling save changes you'll have to browse them all to mark them accordingly (currently they are all marked as added).

    foreach(o in CreatedObjects)
        if key found in db mark o as modified rather than as added

    At this step you should be able to call save changes. As suggested earlier I would just do that inside the loop so that they mark as needed from the beginning rather than marking them all as "added" before switching some of them to "modified".
        

    Humm. you could have several providers that would be updating the same objects ? What is your overall goal ? It starts looking like a replication ? Depending on your needs a solution such as Sync Framework could be perhaps better ?  Also some db have built in replication capabilities...


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".

    • Edited by Patrice ScribeMVP Thursday, January 19, 2012 12:11 PM Humm...
    • Marked as answer by Allen_MSDN Monday, January 23, 2012 2:55 AM
    Thursday, January 19, 2012 12:05 PM