Microsoft Developer Network > Forums Home > Archived Forums Forums > LINQ Project General > Group By/Having LINQ performance trouble
Ask a questionAsk a question
 

AnswerGroup By/Having LINQ performance trouble

  • Monday, November 02, 2009 10:44 AMalaatm Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I am having a hard time figuring out why my LINQ query takes 6min to finish while my T-SQL that does the same thing in a few milli seconds.

    I have the following tables:
    ItemMaster
    SOHeader
    SOLineItem

    ItemMaster has a 1 to M relation with SOLineItem on the ItemNum column
    SOHeader has a 1 to M relation with SOLineItem on the SONum column

    I want to extract all items that were never sold after 01-01-2008
    In T-SQL, I do the following:

    SELECT 
    
    	MAX(DocDate) AS LastOrderDate,
    
    	ItemNum
    
    FROM
    
    	SOLineItem,
    
    	SOHeader
    
    WHERE
    
    	SOLineItem.SONum = SOHeader.SONum
    
    GROUP BY
    
    	ItemNum
    
    HAVING
    
    	MAX(DocDate) < '2008-01-01'
    
    ORDER BY
    
    	ItemNum
    
    
    This takes less than a second to return my 15500 rows. Now I do the following in LINQ:

    var q = from li in db.SOLineItems
    
            group li by li.ItemMaster.ItemNum into items
    
            let lastOrderDate = items.Max(p => p.SOHeader.DocDate)
    
            where lastOrderDate < new DateTime(2008, 1, 1)
    
            select new {
    
              LastOrderDate = lastOrderDate,
    
              ItemNum = items.Key
    
            };
    
    
    This takes about 6 minutes to return my 15500 rows.

    Even worse if I use:

    group li by li.ItemMaster into items
    instead of
    group li by li.ItemMaster.ItemNum into items

    In case I wanted to return more info about that item in my select statement.

    Any suggestions on my LINQ query?
    Thanks

Answers

  • Monday, November 09, 2009 4:28 PMalaatm Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer

    Ok I figured out what was wrong. I was preparing a test project and a database to send to you but was schocked that it executed almost as fast as ado.net! The one thing that I did not mention is that I was working with views and in my test I used tables. I went back to the views and they were using LTRIM and RTRIM SQL function on all varchar type columns. I removed the trim functions and my LINQ queries execute as fast as if I was using ado.net.

    Thanks all and sorry for wasting your time.

All Replies

  • Monday, November 02, 2009 10:48 AMalaatm Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Just to add some more info here, when I comment out the "where" in my LINQ query it runs very fast in less than a second. So I am assuming I am doing this the wrong way. Any tips will be appreciated.

    I also forgot to post the generated SQL from my LINQ query in my original post:

    SELECT 
    1 AS [C1], 
    [Project2].[C1] AS [C2], 
    [Project2].[ItemNum] AS [ItemNum]
    FROM ( SELECT 
    	[Distinct1].[ItemNum] AS [ItemNum], 
    	(SELECT 
    		MAX([Extent3].[DocDate]) AS [A1]
    		FROM  (SELECT 
          [SOLineItem].[SONum] AS [SONum], 
          [SOLineItem].[LINum] AS [LINum], 
          [SOLineItem].[SOType] AS [SOType], 
          [SOLineItem].[ItemNum] AS [ItemNum], 
          [SOLineItem].[Qty] AS [Qty], 
          [SOLineItem].[QtyAlloc] AS [QtyAlloc], 
          [SOLineItem].[QtyPrevInv] AS [QtyPrevInv], 
          [SOLineItem].[StoreID] AS [StoreID], 
          [SOLineItem].[UnitPrice] AS [UnitPrice], 
          [SOLineItem].[ItemCode] AS [ItemCode], 
          [SOLineItem].[IsHistory] AS [IsHistory]
          FROM [dbo].[SOLineItem] AS [SOLineItem]) AS [Extent2]
    		LEFT OUTER JOIN (SELECT 
          [SOHeader].[SONum] AS [SONum], 
          [SOHeader].[SOType] AS [SOType], 
          [SOHeader].[OrigNum] AS [OrigNum], 
          [SOHeader].[DocDate] AS [DocDate], 
          [SOHeader].[CustomerID] AS [CustomerID], 
          [SOHeader].[IsHistory] AS [IsHistory]
          FROM [dbo].[SOHeader] AS [SOHeader]) AS [Extent3] ON [Extent2].[SONum] = [Extent3].[SONum]
    		WHERE ([Extent2].[ItemNum] = [Distinct1].[ItemNum]) OR (([Extent2].[ItemNum] IS NULL) AND ([Distinct1].[ItemNum] IS NULL))) AS [C1]
    	FROM ( SELECT DISTINCT 
    		[Extent1].[ItemNum] AS [ItemNum]
    		FROM (SELECT 
          [SOLineItem].[SONum] AS [SONum], 
          [SOLineItem].[LINum] AS [LINum], 
          [SOLineItem].[SOType] AS [SOType], 
          [SOLineItem].[ItemNum] AS [ItemNum], 
          [SOLineItem].[Qty] AS [Qty], 
          [SOLineItem].[QtyAlloc] AS [QtyAlloc], 
          [SOLineItem].[QtyPrevInv] AS [QtyPrevInv], 
          [SOLineItem].[StoreID] AS [StoreID], 
          [SOLineItem].[UnitPrice] AS [UnitPrice], 
          [SOLineItem].[ItemCode] AS [ItemCode], 
          [SOLineItem].[IsHistory] AS [IsHistory]
          FROM [dbo].[SOLineItem] AS [SOLineItem]) AS [Extent1]
    	)  AS [Distinct1]
    )  AS [Project2]
    WHERE [Project2].[C1] < convert(datetime, '2008-01-01 00:00:00.000', 121)
    
    
  • Monday, November 02, 2009 11:15 AMKristoferA - Huagati Systems Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    DateTime cutoffDate = new DateTime(2008,1,1);
    
    var q =
    from itm in db.ItemMaster
    from li in (
      from li in db.SOLineItems
      where li.SONum == itm.SONum
        && li.DocDate >= cutoffDate
      select li
      ).DefaultIfEmpty()
    where li.SONum == null
    group itm by itm.ItemNum into ig
    select new { ItemNum = ig.Key, LastOrderDate = ig.Max(d => d.SOLineItems.DocDate) }
    

    Kristofer - Huagati Systems Co., Ltd.
    Cool tools for Linq-to-SQL and Entity Framework:
    huagati.com/dbmltools (add-in with new features for Visual Studio 2008's L2S and EF designers)
    huagati.com/L2SProfiler (Query profiler for Linq-to-SQL and LLBLGen Pro)
  • Thursday, November 05, 2009 6:28 AMLingzhi SunMSFT, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Hi alaatm,

     

    How is the problem?  I think Kristofer’s post is helpful, you can modify your LINQ query.  However, are you using Entity Framework?  If so, the DefaultIfEmpty is not supported in the current version of EF, http://msdn.microsoft.com/en-us/library/bb738638.aspx.   But you can still follow direction of Kristofer’s idea.  

     

    If you have any questions, please feel free to let me know.

     

    Have a nice day!

     

     

    Best Regards,
    Lingzhi Sun


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
  • Thursday, November 05, 2009 1:28 PMalaatm Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Yea, I am using EF so the above LINQ query does not work. Also, the ItemMaster entity does not include SONum since both tables are not related.
    I tried a lot of adjustments but the execution is extremly slow. Currently, I am just retrieving the data and doing the filtering on the client side as follows:

    var q = from li in db.SOLineItems
            group li by li.ItemMaster.NSN into items
            let lastOrderDate = items.Max(p => p.SOHeader.DocDate)
            select new SlowMovingItem {
              ItemNum = items.Key,
              LastActivityDate = lastOrderDate
            };
    DateTime cutoffDate = new DateTime(2008, 1, 1);
    var soList = q.AsEnumerable().Where(p => p.LastActivityDate < cutoffDate).OrderBy(p => p.ItemNum);
    
    

    This works well but I was wondering how to achieve this all on the SQL server side.
  • Friday, November 06, 2009 3:11 PMLingzhi SunMSFT, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Hi alaatm,

     

    Here is some really helpful articles about Entity Framework performance,

    http://blogs.msdn.com/adonet/archive/2008/03/27/ado-net-entity-framework-performance-comparison.aspx

    http://blogs.msdn.com/adonet/archive/2008/02/04/exploring-the-performance-of-the-ado-net-entity-framework-part-1.aspx

    http://blogs.msdn.com/adonet/archive/2008/02/11/exploring-the-performance-of-the-ado-net-entity-framework-part-2.aspx

     

    In sum, you can try pre-compiled query, entity SQL, or NoTracking merge option to improve the performance. 

     

    If the performance is still a problem, could you please send me a demo project and db file for further investigation?   My mail address is v-micsun @ microsoft.com. 

     

    Have a nice weekend!

     

     

    Best Regards,
    Lingzhi Sun


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
  • Monday, November 09, 2009 4:28 PMalaatm Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer

    Ok I figured out what was wrong. I was preparing a test project and a database to send to you but was schocked that it executed almost as fast as ado.net! The one thing that I did not mention is that I was working with views and in my test I used tables. I went back to the views and they were using LTRIM and RTRIM SQL function on all varchar type columns. I removed the trim functions and my LINQ queries execute as fast as if I was using ado.net.

    Thanks all and sorry for wasting your time.