Large data models in Entity Framework

  • Question

  • We are midway through migrating an application to C# in Visual Studio 2012, using Entity Framework (currently version 5) to integrate with an SQL database. Recently we noticed a significant delay on the first query that accesses the database through Entity Framework. There are presently 250+ entities defined on a Code First basis against a single database context, which will rise to over 600 when the application is complete.

     

    Following some internet research, we wrote a test application that performs a simple query on limited data (see the attached EFPerformanceOnLargeModels.zip from the link at the bottom). It reproduces the problem, which appears to be an inherent limitation of Entity Framework on large data models that use a single database context containing 100+ entities. The delay on the first query increases exponentially as more entities are added.

     

    Timings from our test application show that the first query from a small (two-entity) database context returns almost instantly, whereas the first query from a large (600-entity) database context took 28 seconds to return results. The second query showed insignificant delay in all tests. As I understand from research, the delay is partly due to Entity Framework compiling views on the first query. One suggested improvement was to pre-compile the views, which we have done in our test application through both “Entity Framework Power Tools” and “Code First View Generation T4 Templates for C#”. With pre-compiled views, the first query still took 9 seconds to return.

    0.5 seconds to initialise for 2 entity context

    28.2 seconds to initialise for 600 entity context

    9.7 seconds to initialise for 600 entity context with pre-generated views – Entity Framework Power Tools

    9.8 seconds to initialise for 600 entity context with pre-generated views – Code First View Generation T4 Templates for C#
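
    For anyone wanting to take comparable measurements, here is a minimal sketch of timing a first and a second call with System.Diagnostics.Stopwatch. It is not the code from the attached test application. FirstQueryTimer and Measure are hypothetical names, and the delegate below only simulates a one-off initialisation cost with Thread.Sleep; in a real test the delegate body would be replaced by a query against the database context.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class FirstQueryTimer
{
    // Runs the given action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Assumption: a stand-in for a real query. The first call pays a
        // simulated one-off cost, later calls do not. Replace the body of
        // this delegate with a query against your own context.
        bool initialised = false;
        Action query = () =>
        {
            if (!initialised)
            {
                Thread.Sleep(200); // simulated one-off initialisation work
                initialised = true;
            }
        };

        TimeSpan first = Measure(query);
        TimeSpan second = Measure(query);

        Console.WriteLine("First query:  {0:F0} ms", first.TotalMilliseconds);
        Console.WriteLine("Second query: {0:F0} ms", second.TotalMilliseconds);
    }
}
```

    Each scenario should be measured in a fresh process, because the one-off cost is only paid on the first query in a given process, so a second measurement in the same run would not show it.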

     

    Unfortunately a first query time of 9 seconds would still not be acceptable in our application.

     

    Questions

    First of all, is there anything we have done wrong or missed which would further reduce the delay on the first query?

     

    We would ideally like any advice you can give on best practices/approach to implementing large data models.

    1. Is Entity Framework still the best approach for the scale of enterprise application we are developing?
    2. If so, should the model be separated out into multiple database contexts, e.g. a separate database context for each module/domain? Could you also advise on the recommended maximum number of entities per database context?
    3. If it's best to have multiple database contexts, and given that we currently have a shared project containing all entity structures, is it best to duplicate and separate out the entity structures into their individual module/domain projects? Separating them out would result in quite a lot of duplication, as entity structures are shared across modules/domains. However, it would allow navigation properties to be defined on a per module/domain basis, keeping the number of entities per database context to a minimum. Is there another approach which would limit duplication?

     

    Thanks in advance for any help you can give on this.



    EFPerformanceOnLargeModels

    Wednesday, January 23, 2013 9:55 AM

Answers

  • Hi,

    At this point we don't have any recommendations particular to your model that could speed up the initialization process. We know that this is a Code First-specific performance issue, and switching to Database First or Model First might alleviate most of these pains. Note that with EF5 you can take a Database First approach (have an EDMX file in your project) but still generate DbContext-based containers with POCO entities.

    The Entity Framework team has been working to better optimize this section of the model initialization code for large models such as yours. We haven't yet committed to a delivery date for these improvements, but you can follow the status of this issue in our CodePlex work item 848.

    Wednesday, February 27, 2013 1:03 AM

All replies

  • Hi Neil,

    Welcome to the MSDN forum.

    I am trying to involve a senior expert into your thread. Please wait for the response. Sorry for any inconvenience.

    Have a nice day.


    Alexander Sun [MSFT]

    Thursday, January 24, 2013 7:55 AM
  • Please refer to:

    http://blogs.msdn.com/b/adonet/archive/2008/11/24/working-with-large-models-in-entity-framework-part-1.aspx

    http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/82cc2ccb-5d4d-43fb-85e0-3a45d133f43f

    Thursday, February 7, 2013 3:28 PM
  • Thank you for your response.

    I was hoping there would be more recent information, as the first link is dated 2008, but I guess the content still applies. Unfortunately it confirms what we expected: we need to separate our database context into smaller, more manageable models.

    If only there was a way to persist a large generated model to disk, then a single one off time it takes to generate would be more acceptable.     

    I hadn’t realised that wrapping the context in a using statement helps with performance. Thanks for the info.

     

    For reference, I did some more analysis: I ran a revised test application 300 times, starting with 2 entities and adding 2 more entities to the model on each run, and recorded the time taken to return from the first query (without pre-compiled views). I plotted the results in the chart below. This was on a faster machine than the previous tests, but the idea was to make an informed decision on the maximum number of entities per context we can have while keeping a first-query time we are happy with. As you can see, the delay rises quite rapidly once more than 100 entities are involved.

    Monday, February 11, 2013 8:19 AM
  • “If only there was a way to persist a large generated model to disk, then a single one off time it takes to generate would be more acceptable.”

    That's kind of what pre-generated views do. But in this case it appears that the work pre-generated views don't cover is still taking a fair amount of time.

    I have passed this on to the EF team perf engineer. I'm not sure he will be able to do anything to help you in the short term, but it might help us to know which parts of EF are making the first query take 9 seconds, and to try to improve that code path in future versions.

    Thursday, February 14, 2013 12:38 AM
    Moderator