none
Includes on Entity Framework Entities and their performance RRS feed

  • Question

  • Hello.

    I have been working with EF for several months now, and just found out few bugs/features that hold me back.

    I have following model:

    My goal here is to load entity A with all the others included in "real time".

    But whatever I try, I end up with results that do not satisfy me. Especialy when I scale up this problem with adding few more entities to it, which is the case in my work project.

     

    Few weeks ago I found out that SQL commands created by EF change a lot depending on whether referential constraints are set or not on 1-0..1 and 1..1 associations. So there are 4 possibilities how to set those associations(A<->B, B<->BPlus)

    I am working with object queries so my request to DB looks like this

    context.ASet.Include("B").Include("B.C").Include("B.BPlus").ToArray();
    

    Option 1: No referential constraints set.

    resulting SQL command has 10 outer joins and 7 of them are joining B entity and 2 are joining BPlus entity, and last one is joining C entity. So instead of three joins I get 10. Which scaled up ends in unmanagable SQL query which has unreal time damands.

     

    Option 2: A<->B has referential constraint set with A as Principal, B<->BPlus has no referential constraint set

    resulting SQL command has 9 outer joins. 6 joining B entity, 2 for BPlus and one for C

     

    Option 3: A<->B has referential constraint, B<->BPlus has referential constraint with B as Principal

    SQL command has "only" 7 joins now(but still 4 more than needed). 3 of which are for B entity, 3 for BPlus and one for C entity.

     

    Option 4: A<->B has no referential constraint set, B<->BPlus has referential constraint with B as Principal

    SQL command has 9 joins again. 4 for B entity, 4 for BPlus entity and one for C entity.

     

    I can supply the sql commands if anyone is interested, or you can try it on your own.

    Can someone tell me how to force EF to create optimal queries or is it impossible?

     

    I hope that someone will know the answer.

     

    Tomi

     


    TomiGaspar
    Tuesday, October 18, 2011 8:54 AM

Answers

  • Sorry, I didn't catch the bug vs feature question was to find out if you can do something about it. IMO no, you have not control on this.

    The last thing I see would be to start from B to see if it mitigates the issue. Sorry for the poor help.


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    • Marked as answer by tomigaspar Tuesday, October 25, 2011 1:14 PM
    Thursday, October 20, 2011 7:29 AM

All replies

  • Hi,

    There is likely a tradeoff between doing a complex request to retrieve all in one go and several simpler ones.

    Have you tried to load them explicitely either by using the Load method or by issuing a request to load each table (you want all rows from A and then all related rows or only a single A entity and its childs ?). You could likely hide this behind your own API. The idea is that issuing several simpler queries could be more efficient than issuing a single complex query...

    Quite theorical as I never had to load a hierarchy once for all but this is what I would likely investigate if I have to...

     

     

     


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    Tuesday, October 18, 2011 9:10 AM
  • Hello Patrice.

    My work scenario is much bigger than one presented here. I have tried loading entities in smaller parts, but in order to do that you have to have MergeTracking on, which is a performance hit on its own, but also then I need to retrieve all entities in that table, and that is also time consuming. So answer to your question is: I tried loading it through several simpler queries, and it is implemented so right now, but if I could load them via includes I could save some time, only if it would work right.

     


    TomiGaspar
    Tuesday, October 18, 2011 9:19 AM
  • Hi

    Why you had to Include all together while retrieving itself? You can just enable LazyLoading and load main entities where as on request only other entities will be loaded right?

    Does this answer?


    If this post answers your question, please click "Mark As Answer". If this post is helpful please click "Mark as Helpful".
    Tuesday, October 18, 2011 9:49 AM
  • Hi Kris444.

    Imagine you have to do lazy loading for thousands of objects, which is my case. Database trips will take forever. I tried. Absolutely not an option.


    TomiGaspar
    Tuesday, October 18, 2011 10:31 AM
  • Or perhaps not, the query will be more complex, it will return more rows and it will likely have more work to materialize thiose entities so IO if you are want to load too much related entities, Include likely ends up being slower.

    I though you have those few tables but if you are using much more why do you need to load them all in memory ? Ah, I saw in the other thread that you are really loading numerous (thousands of objects).

    Usually :
    - if this is for user interaction you could consider filtering those data to avoid loading data that won't ever be used
    - if this is to process them automatically you could consider doing this server side rather than client side (avoiding to move them client side to process them and send them back)
    - if you really have some kind of special needs you have also http://en.wikipedia.org/wiki/In-memory_database.

    IMO the best you could do with EF would be Context.A.ToList(), Context.B.ToList(), Context.C.TOList() and context.BPlus.ToList(). How does it perform compared with the current situation ?

     

     

     

     


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    Tuesday, October 18, 2011 5:29 PM
  • Hi Patrice.

    To make things clearer. My work scenario is this: I have to load about 2000 to 3000 objects for client application, each of these objects is stored in about 3 or 4 tables, which are connected to other tables to store specific structure, because there are different options and configurations(comming from client app) how these data can be loaded. So in the end I have about 6 queries where I would need to include 7 or so entities, which would be 7 joins if I would do it on my own(T-SQL), but EF creates approximately 42 joins. This is killing performance. Right now I am loading it piece by piece, about 3 tables at a time, but this costs time on MergeTracking. I only want to know if this is a bug, or I am setting something wrong in model?
    I have a workaround.

    I just want to know whether it can be made in way its supposed to be made?!

    Can anyone answer this question?

    Tomi


    TomiGaspar
    Wednesday, October 19, 2011 7:35 AM
  • What if you compare context.ASet.Include("B").Include("B.C"). with context.ASet.Include("B.C") ? Does it change the underlying generated SQL ?

    You may want also to see if it seems reported at http://connect.microsoft.com.

    Not sure if it's a bug or some lack of expressiveness making impossible to generate a somewhat more optimized SQL statement (though in pre EF days we wouldn't even have thought to use a single query to load a bunch of related entities). I'll try to get some time to give this a closer look based on your results on the above comparison...


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    Wednesday, October 19, 2011 8:33 AM
  • There is no difference in generated SQL between those two queries.

    I went through connect before, and I did it again, and havent found anything concerning this problem.

    So I created a new bug report.

    And in pre EF times I was using joins to load related objects/data/classes/tables whatever you call them:-)

    Tomi

     


    TomiGaspar
    Wednesday, October 19, 2011 10:51 AM
  • I gave a quick look. It seems an optimization issue. This is more a feeling than anything else but it seems to me that each individual need gets its own join so if the same join is needed to satisfy several objectives, the same join is generated multiple times rather than being reused...


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    Wednesday, October 19, 2011 12:42 PM
  • Ok, but optimization seems to be setup corectly for associations with 1-*, 0..1-*  or *-*. So is this a bug or is it some feature for which I havent found switch to turn it on?

    here is also link to connect bug report>

    https://connect.microsoft.com/VisualStudio/feedback/details/695675/unefficient-sql-generated-by-ef-on-1-1-and-1-0-1-associations-with-include

     

    it is mostly a copy of this article...


    TomiGaspar
    Wednesday, October 19, 2011 12:54 PM
  • Sorry, I didn't catch the bug vs feature question was to find out if you can do something about it. IMO no, you have not control on this.

    The last thing I see would be to start from B to see if it mitigates the issue. Sorry for the poor help.


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".
    • Marked as answer by tomigaspar Tuesday, October 25, 2011 1:14 PM
    Thursday, October 20, 2011 7:29 AM
  • We will see how it end on connect. Hopefully someone from EF team can take a look at it and tell me what the exact problem is.
    Thanks for help Patrice.
    TomiGaspar
    Thursday, October 20, 2011 7:36 AM