Ask a questionAsk a question
 

AnswerProblem loading high number of objects.

  • Monday, August 18, 2008 3:48 PMDiego Juez Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hi all,

    I´ve downloaded the AdventureWorks Db and tried a simple example of EF. I've created a model with just two table: Customer and SalesOrderHeader. I've also asigned all the SalesOrderHeader registers to one customer. So there are about 30k customer with no SalesOrderHeader, and 1 customer with 30k SalesOrderHeader.

    If I run the following code the CPU usage rises to 100% for 5 minutes!!!

    private static void TestAdventureWorksExamples() {
                using (var context = new AdventureWorksEntities()) {
                    var customerList = (from customers in context.Customer.Include("SalesOrderHeader") select customers).ToList();
                    foreach (var customer in customerList) {
                        Console.WriteLine("Id={0}, CustomerType={1}, Sales Order Number={2}", customer.CustomerID, customer.CustomerType, customer.SalesOrderHeader.Count);
                    }
                }
            }


    Is there any thing i'm missing? I've also tried the following code with same result:

    Code Snippet

    using (var context = new AdventureWorksEntities())

    {

    var customerList = (from customers in context.Customer select customers).ToList();

    foreach (var customer in customerList)

    {
    if(!customer.SalesOrderHeader.IsLoaded)
    {
    customer.SalesOrderHeader.Load();
    }
    Console.WriteLine("Id={0}, CustomerType={1}, Sales Order Number={2}", customer.CustomerID, customer.CustomerType, customer.SalesOrderHeader.Count);

    }
    }
    }




Answers

  • Tuesday, August 19, 2008 9:06 PMJeff Derstadt - MSFTModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer

    You can turn off change tracking by executing your query with the MergeOption.NoTracking option. You will still get an object graph back from your query, the changes just won't be tracked but you could call Attach after your query returns.

     

    I will make sure we investigate these performance scenarios, so thank you for the profile breakdown. Internally we do use dictionaries, but some of the collections are stored as lists which as you are seeing do not perform well when they scale to those large numbers but seem to do better with smaller numbers. Perhaps an adpative approach would work better for us.

     

    Jeff

     

All Replies

  • Monday, August 18, 2008 7:29 PMJeff Derstadt - MSFTModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    These are the right things to be trying, and are the two ways you can load a collection of related entities.

     

    One way to narrow your performance, is to see what the performance of the query is on the server. You can do this by calling customer.SalesOrderHeader.GetSourceQuery().ToTraceString() and then playing with this query in your SQL browser.

     

    The problem could also be related to the number of items in a single EntityCollction. While there is no purported maximum, it could be that performance does not scale for this scenario. Do you forsee this as a common scenario (30k related items all loaded at once?). If so, we can certainly look into this on our side.

     

    Jeff

  • Tuesday, August 19, 2008 4:23 PMTronTron Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    We have seen similiar behavior with a load a 12k or more objects clocking the CPU for a minute.  After profiling saw alot of time(57%) being spent in the ObjectStateManager.UpdateRelationships
    methods.  ObjectStateManager. TryUpdateExistingRelationships (52%) so on down to EntityKey.InternalEquals and and its equality functions.  The lowest level function is the guid(our keys).equals(object) at 12%.

    Seems internally you are using lists in TryUpdateExistingRelationships the change tracking where maybe a dictionary(or dictionary of dictionaries) would be better?  Though with possible composite keys you decided to use the list iteration and entitykey comparison.  Maybe some optimization if the entity key has only one member.

    We an change how much we load and can use other methods, but seems like should perform better in these scenarios.  Could also turn off changetrackign I guess which may alleviate this, but would break up our object graph I assume.
  • Tuesday, August 19, 2008 9:06 PMJeff Derstadt - MSFTModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer

    You can turn off change tracking by executing your query with the MergeOption.NoTracking option. You will still get an object graph back from your query, the changes just won't be tracked but you could call Attach after your query returns.

     

    I will make sure we investigate these performance scenarios, so thank you for the profile breakdown. Internally we do use dictionaries, but some of the collections are stored as lists which as you are seeing do not perform well when they scale to those large numbers but seem to do better with smaller numbers. Perhaps an adpative approach would work better for us.

     

    Jeff

     

  • Wednesday, August 20, 2008 10:56 AMDiego Juez Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Thanks to all for your help. I executed the code with no tracking and now it works fine, it takes a few seconds to complete.

    This could be a usual scenario, but i don't think the state of all the 30k related objects will be necesary.

    Thanks again for your help.