EF 4.1.x DbContext API: Are there risks of conflicting data that could lead to weird behaviour when using the .Find(params object[] keyValues) method in load-balanced web apps (e.g. Windows Azure)?

  • Question

  • Hi,

    First of all, let me congratulate the whole Entity Framework team for the great job done so far!! Please, keep up the great work!

     

    I am using EF 4.1 in a Windows Azure application (Web Roles and Worker Roles) and I sometimes use the "new" .Find(params object[] keyValues) method to obtain an entity from the store (or from the context).

    I would like to raise a concern about using this method in a Windows Azure app (or any other load-balanced application). Please correct me if I am wrong or if I am making wrong assumptions here.

     

    So, here's the summary you provided for this method:

    Finds an entity with the given primary key values.  If an entity with the given primary key values exists in the context, then it is returned immediately without making a request to the store. Otherwise, a request is made to the store for an entity with the given primary key values and this entity, if found, is attached to the context and returned. If no entity is found in the context or the store, then null is returned.
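
    To make the quoted summary concrete, here is a minimal sketch of the behaviour it describes. The `Blog` and `BlogContext` types and the `FindDemo` class are hypothetical names made up for illustration; only `DbContext`, `DbSet<T>` and `Find` come from EF 4.1. What it tries to show is that the "exists in the context" lookup is scoped to one `DbContext` object, not to the process or the machine.

    ```csharp
    using System;
    using System.Data.Entity;

    // Hypothetical model, for illustration only.
    public class Blog
    {
        public int Id { get; set; }       // primary key by Code First convention
        public string Title { get; set; }
    }

    public class BlogContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
    }

    public static class FindDemo
    {
        public static void Run(int blogId)
        {
            using (var db = new BlogContext())
            {
                // Not tracked yet: Find queries the store and attaches the result.
                Blog first = db.Blogs.Find(blogId);

                // Already tracked by this context: returned without a store query.
                Blog second = db.Blogs.Find(blogId);

                // Same object instance if the blog exists (both null otherwise).
                Console.WriteLine(ReferenceEquals(first, second));
            }

            using (var other = new BlogContext())
            {
                // A different context starts with nothing tracked,
                // so this call goes to the store again.
                Blog fresh = other.Blogs.Find(blogId);
                Console.WriteLine(fresh == null ? "not found" : fresh.Title);
            }
        }
    }
    ```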

    Example:

    Let's consider this "Blogs WebApp" that Julie Lerman demonstrates in this video (5m12s) you have on your EF team blog homepage:

    http://msdn.microsoft.com/en-us/data/gg715119

    Now imagine that this is a high-traffic web app deployed to 20(!) Azure instances (web roles), and that lots of users are viewing one really popular blog post.

    Let's say that we now have a user (blog editor) hitting this "Edit Blog" page and that he will be editing this popular blog entry many times.

    Is there a risk of editing and saving a stale version of the entity if it is already loaded in the context of the instance that processes a save request? Subsequently, when another instance picks up another request for the same entity (another save), couldn't this lead to conflicting data between the store and the instances' contexts?

    I hope I was clear enough and that you can enlighten me on how this per-instance "temporary in-memory/context caching" of entities (I know this is not the right term for it) could lead to conflicting or stale data being saved to the store.

    And regarding possible future features that enable some type of out-of-the-box, built-in, easily configurable entity-set caching: how will you take load-balanced scenarios (e.g. Azure) into consideration? Will AppFabric Cache or Memcached be the way to go here?

     

    Hope to hear back from you soon. Thanks!

     

    Cheers,

    Carlos Sardo





    Thursday, October 13, 2011 8:33 PM

All replies

  • Hi Carlos,

    If I understand you correctly, you are describing a common issue that does not only happen on systems like Azure (with multiple instances).

    I am using a DbContext per request (I think that's the suggested and "safest" way). So I think it makes no difference whether these DbContext instances run on different Azure instances, because they are "per request".

    That means every user request gets a fresh context.
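
    A context per request could look roughly like the sketch below. `Blog`, `BlogContext` and `BlogEditor.Rename` are hypothetical names standing in for whatever code handles the save request; the only EF 4.1 calls used are `Find` and `SaveChanges`.

    ```csharp
    using System.Data.Entity;

    // Hypothetical model, for illustration only.
    public class Blog
    {
        public int Id { get; set; }
        public string Title { get; set; }
    }

    public class BlogContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
    }

    public static class BlogEditor
    {
        // Assumed to be called once per incoming save request.
        public static bool Rename(int blogId, string newTitle)
        {
            // The context lives only for this request, so nothing tracked by
            // an earlier request (on this or any other instance) is reused.
            using (var db = new BlogContext())
            {
                Blog blog = db.Blogs.Find(blogId); // fresh context: reads from the store
                if (blog == null)
                {
                    return false;
                }

                blog.Title = newTitle;
                db.SaveChanges();
                return true;
            }
        }
    }
    ```

    With this shape, Find can only return an already-tracked entity if the same request loaded it earlier.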

    But that doesn't mean it is safe. I would describe it as "by design" that massively parallel access to the same entities can end in the well-known update anomalies (to put it in database language).
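
    One common way to at least detect such an anomaly is optimistic concurrency. The sketch below assumes a SQL Server `rowversion` column mapped with `[Timestamp]`, hypothetical `Blog` / `BlogContext` / `BlogEditor` types, and a client that sends back the row version it originally loaded with the edit page. It is only an illustration of the idea, not a complete solution.

    ```csharp
    using System.ComponentModel.DataAnnotations;
    using System.Data.Entity;
    using System.Data.Entity.Infrastructure;

    // Hypothetical model, for illustration only.
    public class Blog
    {
        public int Id { get; set; }
        public string Title { get; set; }

        [Timestamp] // concurrency token; maps to rowversion on SQL Server
        public byte[] RowVersion { get; set; }
    }

    public class BlogContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
    }

    public static class BlogEditor
    {
        // rowVersionSeenByEditor: the value the editor received with the edit page.
        public static bool TryRename(int blogId, string newTitle, byte[] rowVersionSeenByEditor)
        {
            using (var db = new BlogContext())
            {
                Blog blog = db.Blogs.Find(blogId);
                if (blog == null)
                {
                    return false;
                }

                // Check against what the editor saw, not what this request just read.
                db.Entry(blog).Property(b => b.RowVersion).OriginalValue = rowVersionSeenByEditor;
                blog.Title = newTitle;

                try
                {
                    // The UPDATE only matches the row if its version is unchanged.
                    db.SaveChanges();
                    return true;
                }
                catch (DbUpdateConcurrencyException)
                {
                    // Someone else saved in between; the caller decides how to resolve it.
                    return false;
                }
            }
        }
    }
    ```

    The check happens in the store itself, so it does not matter which instance handles which request.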

    I would also be interested to know if we understood this correctly :)

    My first idea for a solution (if you need synchronization) is a shared synchronization or locking mechanism. On Windows Azure you would need a sync mechanism that is shared and persistent across all instances (for example, table storage).

    I am looking forward to reading other suggestions. This is an interesting question :)

    Regards

    Holger

     

     

    Friday, October 14, 2011 12:53 PM