none
How to simplify LINQ generated SQL for simple join? RRS feed

  • Question

  • Using the AdventureWorks db for this example, I've dropped both the ProductSubCategory and Product tables into a datacontext.

     

    I'm trying to:

     

    return all ProductSubCategories and associated Products where Product.ListPrice > 2500

     

    Ideally, the generated SQL should look like: 

     

    Code Snippet

    SELECT

    [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate],

    [t1].[ProductID], [t1].[Name] AS [Name2], [t1].[ProductNumber], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductSubcategoryID] AS [ProductSubcategoryID2], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[DiscontinuedDate], [t1].[rowguid] AS [rowguid2], [t1].[ModifiedDate] AS [ModifiedDate2], (

    SELECT COUNT(*)

    FROM [Production].[Product] AS [t3]

    WHERE [t3].[ProductSubcategoryID] = [t0].[ProductSubcategoryID]

    ) AS [value]

    FROM [Production].[ProductSubcategory] AS [t0]

    INNER JOIN [Production].[Product] AS [t1] ON [t1].[ProductSubcategoryID] = [t0].[ProductSubcategoryID]

    WHERE ([t1].[ListPrice] > 2500)

    ORDER BY ...

    So far, the only way I've been able to get the correct query result in LINQ has been:

     

    Code Snippet

    ...

    db.DeferredLoadingEnabled = false;

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubcategory>(ps => ps.Products);

    options.AssociateWith<ProductSubcategory>(

    ps => from p in ps.Products

    where p.ListPrice > 2500

    select p);

    db.LoadOptions = options;

     

    IEnumerable<ProductSubcategory> query =

    from ps in db.ProductSubcategories

    where ps.Products.Count > 0

    select ps;

     

     

    However, the resulting SQL looks like:

     

    Code Snippet

    SELECT [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate], [t1].[ProductID], [t1].[Name] AS [Name2], [t1].[ProductNumber], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductSubcategoryID] AS [ProductSubcategoryID2], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[DiscontinuedDate], [t1].[rowguid] AS [rowguid2], [t1].[ModifiedDate] AS [ModifiedDate2], (

    SELECT COUNT(*)

    FROM [Production].[Product] AS [t3]

    WHERE ([t3].[ListPrice] > @p0) AND ([t3].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))

    ) AS [value]

    FROM [Production].[ProductSubcategory] AS [t0]

    LEFT OUTER JOIN [Production].[Product] AS [t1] ON ([t1].[ListPrice] > @p0) AND ([t1].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))

    WHERE ((

    SELECT COUNT(*)

    FROM [Production].[Product] AS [t2]

    WHERE ([t2].[ListPrice] > @p0) AND ([t2].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))

    )) > @p1

    ORDER BY [t0].[ProductSubcategoryID], [t1].[ProductID]

     

     

    I'd appreciate any help on this.  TIA.

    Monday, May 5, 2008 6:35 PM

Answers

  • ok, i know why it didnt work for you and it worked for, i never wanted to filter the child objects just to get the parent depending on a condition in the childs, well i have been doing some testing all night long and HERE IS THE SOLUTION:

     

    IList<ProductSubCategory> productSubs = null;

     

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubCategory>(ps => ps.Products);

    options.AssociateWith<ProductSubCategory>(ps => ps.Products.Where(ps.Product.ListPrice > 2500);

     

    db.LoadOptions = options;

     

    productSubs = db.ProductSubCategory.Where(

    ps => ps.Products.Contains( ps.Products.Where(p => p.ListPrice > 2500 ).SingleOrDefault())).ToList();

     

     

    now this will return the correct hierarchy along with filtered child objects, notice the AssociateWith, this tells the DB to filter the child objects retrieved.

     

    Hope this solves your problem.

    Sunday, May 11, 2008 7:08 AM

All replies

  • I'm not sure if this is what you're looking for, but I can see two options:

     

    1) You use grouping in your query:

     

    Code Snippet

    IEnumerable<IGrouping<ProductSubcategory,Product>> query = db.Products

    .Where(p => p.ListPrice > 2500)

    .GroupBy(p => p.ProductSubcategory);

     

     

     

    2) You select products and categories into an enumeration of (possible anonymous) objects:

     

    Code Snippet

    var query = db.Products

    .Where(p => p.ListPrice > 2500)

    .Select(p => new

    {

    Product = p,

    ProductSubcategory = p.ProductSubcategory

    });

     

    Monday, May 5, 2008 8:18 PM
  • The ProductSubCategory is the parent and the Product is the child.  Sorry if that wasn't clear.

     

    I need to return a collection of ProductSubCategories (IEnumerable<ProductSubCategory>), so anonymous types are not an option.

     

    Again, my goal is to produce the simple SQL below with just the INNER JOIN:

     

    Code Snippet

    SELECT

    ...

    FROM [Production].[ProductSubcategory] AS [t0]

    INNER JOIN [Production].[Product] AS [t1] ON [t1].[ProductSubcategoryID] = [t0].[ProductSubcategoryID]

    WHERE ([t1].[ListPrice] > 2500)

    ORDER BY ...

     

     

    I would think there's a straightforward solution to such simple SQL.
    Monday, May 5, 2008 8:30 PM
  • The additional Count(*) subquery exists because of you predicate "ps.Products.Count > 0". 

    Tuesday, May 6, 2008 2:26 AM
    Moderator
  • Right.  I'm aware of the source of the count subquery.  Without adding "ps.Products.Count > 0", the sql ends up as:

     

    Code Snippet

    SELECT [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate], [t1].[ProductID], [t1].[Name] AS [Name2], [t1].[ProductNumber], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductSubcategoryID] AS [ProductSubcategoryID2], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[DiscontinuedDate], [t1].[rowguid] AS [rowguid2], [t1].[ModifiedDate] AS [ModifiedDate2], (

    SELECT COUNT(*)

    FROM [Production].[Product] AS [t2]

    WHERE ([t2].[ListPrice] > @p0) AND ([t2].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))

    ) AS [value]

    FROM [Production].[ProductSubcategory] AS [t0]

    LEFT OUTER JOIN [Production].[Product] AS [t1] ON ([t1].[ListPrice] > @p0) AND ([t1].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))

    ORDER BY [t0].[ProductSubcategoryID], [t1].[ProductID]

     

     

    code:

    Code Snippet

    using (AdventureWorksDbDataContext db = new AdventureWorksDbDataContext()) {

    db.DeferredLoadingEnabled = false;

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubcategory>(ps => ps.Products);

    options.AssociateWith<ProductSubcategory>(ps => from p in ps.Products

    where p.ListPrice > 2500

    select p);

    db.LoadOptions = options;

    IEnumerable<ProductSubcategory> query =

    from ps in db.ProductSubcategories

    select ps;

    ...

    }

     

     

    The LEFT OUTER JOIN ends up returning ALL ProductSubCategories instead of just the ones containing Products with ListPrice > 2500.  Adding the "ps.Products.Count > 0" was just a way to limit the ProductSubCategories to those with Products matching my ListPrice criteria.

     

    All I want is the SQL above with an INNER JOIN rather than a LEFT OUTER JOIN.

    Tuesday, May 6, 2008 2:55 AM
  • Since you are directly accessing from the DataContext, could you do a SelectMany?

    var subsCats = db.Products
    .SelectMany(p=>p.SubCategories
    .Where(sc => p.ListPrice > 2500));


    Friday, May 9, 2008 3:29 PM
  • I couldn't get your example to work exactly.  I think I wasn't clear about the AdventureWorks schema:

     

    [ProductSubcategory]

    ProductSubcategoryId INT PK

    ...

    [Product]

    ProductId INT PK

    ProductSubcategoryId INT FK

    ...

     

    So each ProductSubcategory has zero, one, or more Products.  So from your query, there is no p.ProductSubcategories from Product.

     

    I did try the following:

     

    Code Snippet

    var query = db.ProductSubcategories

    .SelectMany(

    sc => sc.Products.Where(p => p.ListPrice > 2500)

    );

     

     

    which results in the following SQL:

     

    Code Snippet

    SELECT ...

    FROM [Production].[ProductSubcategory] AS [t0], [Production].[Product] AS [t1]

    WHERE ([t1].[ListPrice] > 2500) AND ([t1].[ProductSubcategoryID] = [t0].[ProductSubcategoryID])

     

     

    The SQL is pretty much what I'm looking for.  HOWEVER, the resulting collection is IEnumerable<Product> instead of IEnumerable<ProductSubcategory> with each ProductSubcategory object containing a collection of Products.

     

    VERY close, but not quite what I'm looking for.  I tried (unsuccessfully) adding DataLoadOptions to bring back the desired hierarchy.  I'm just not yet familiar enough with LINQ syntax to accomplish this.

    Friday, May 9, 2008 10:15 PM
  • ahh, Im sorry, I did missunderstand the relationship.

    does this work?

      db.Products
                        .Where(p => p.ListPrice > 2500)
                        .Select(p => p.ProductSubcategory);

    edit: sorry, i think you already have this query in a post above.


    here is another way to get the same info, it doesnt do an inner join though.

    db.ProductSubcategories
                        .Where(ps => ps.Products.Count(p => p.ListPrice > 2500) > 0);
    Saturday, May 10, 2008 2:53 PM
  • Ok, one more time.  I think this might be what your looking for:

    db.Products
                        .Where(p => p.ListPrice > 2500)
                        .Join(db.ProductSubcategories, p => p.ProductSubcategoryID, ps => ps.ProductSubcategoryID,
                        (p, ps) => new { ps })
                        .Select(x => x.ps);


    the sql statement generated is:

    SELECT [t1].[ProductSubcategoryID], [t1].[ProductCategoryID], [t1].[Name], [t1].
    [rowguid], [t1].[ModifiedDate]
    FROM [Production].[Product] AS [t0]
    INNER JOIN [Production].[ProductSubcategory] AS [t1] ON [t0].[ProductSubcategory
    ID] = ([t1].[ProductSubcategoryID])
    WHERE [t0].[ListPrice] > @p0



    Saturday, May 10, 2008 3:49 PM
  • The SQL is PERFECT, but...

     

    The returned object hierarchy is flattened, meaning that for each Product, the parent ProductSubCategory is returned.  I'm looking for the one-to-many relationship to be reflected in the objects returned by the LINQ query.

     

    For example, the query should return two ProductSubCategories ("Road Bike" & "Mountain Bike") with each ProductSubCategory containing multiple child products.

     

    Your sample query is returning one ProductSubCategory / Product hierarchy for each product. 

     

    I tried to add a DataLoadOption but that brings me back to a LEFT OUTER JOIN situation where all ProductSubCategories are returned which disregards the Product.ListPrice > 2500 WHERE condition.

     

    Any suggestions?

    Saturday, May 10, 2008 5:38 PM
  • if i got it right, this will return ProductSubCategories along with the corresponding products.

     

    IList<ProductSubCategory> productSubs = null;

     

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubCategory>(ps => ps.Products);

     

    productSubs = ProductSubCategory.Where(

    ps => ps.Products.Contains( ps.Products.Where(p => p.ListPrice > 2500 ).SingleOrDefault())).ToList();

     

     

    Saturday, May 10, 2008 5:52 PM
  •  

    Your query does return the hierarchy correctly, but the child Product objects are not limited by the p.ListPrice > 2500.

     

    Here's the SQL produced by your query:

    Code Snippet

     

    SELECT [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate], [t1].[ProductID], [t1].[Name] AS [Name2], [t1].[ProductNumber], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductSubcategoryID] AS [ProductSubcategoryID2], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[DiscontinuedDate], [t1].[rowguid] AS [rowguid2], [t1].[ModifiedDate] AS [ModifiedDate2], (

    SELECT COUNT(*)

    FROM [Production].[Product] AS [t4]

    WHERE [t4].[ProductSubcategoryID] = [t0].[ProductSubcategoryID]

    ) AS [value]

    FROM [Production].[ProductSubcategory] AS [t0]

    LEFT OUTER JOIN [Production].[Product] AS [t1] ON [t1].[ProductSubcategoryID] = [t0].[ProductSubcategoryID]

    WHERE EXISTS(

    SELECT NULL AS [EMPTY]

    FROM [Production].[Product] AS [t2]

    WHERE (EXISTS(

    SELECT NULL AS [EMPTY]

    FROM [Production].[Product] AS [t3]

    WHERE ([t2].[ProductID] = [t3].[ProductID]) AND ([t3].[ListPrice] > @p0) AND ([t3].[ProductSubcategoryID] = [t0].[ProductSubcategoryID])

    )) AND ([t2].[ProductSubcategoryID] = [t0].[ProductSubcategoryID])

    )

    ORDER BY [t0].[ProductSubcategoryID], [t1].[ProductID]

     

     

    Saturday, May 10, 2008 7:10 PM
  • thats strange, that query worked for me before , as you can notice the where: ([t3].[ListPrice] > @p0)

     

    anyway, try this:

     

    ProductSubcategory.Where(ps => ps.Products.All(p => p.ListPrice > 2500)).ToList();

    Saturday, May 10, 2008 7:23 PM
  • I assume that you mean:

     

    var query = db.ProductSubcategories.Where(ps => ps.Products.All(p => p.ListPrice > 2500)).ToList();

     

    which results in no rows.  Here's the SQL:

    Code Snippet

     

    SELECT [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate]

    FROM [Production].[ProductSubcategory] AS [t0]

    WHERE NOT (EXISTS(

    SELECT NULL AS [EMPTY]

    FROM [Production].[Product] AS [t1]

    WHERE (((CASE WHEN [t1].[ListPrice] > 2500 THEN 1ELSE 0 END)) = 0

    ) AND ([t1].[ProductSubcategoryID] = [t0].[ProductSubcategoryID])

    )

    )

     

     

    Saturday, May 10, 2008 7:37 PM
  • ok, i know why it didnt work for you and it worked for, i never wanted to filter the child objects just to get the parent depending on a condition in the childs, well i have been doing some testing all night long and HERE IS THE SOLUTION:

     

    IList<ProductSubCategory> productSubs = null;

     

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubCategory>(ps => ps.Products);

    options.AssociateWith<ProductSubCategory>(ps => ps.Products.Where(ps.Product.ListPrice > 2500);

     

    db.LoadOptions = options;

     

    productSubs = db.ProductSubCategory.Where(

    ps => ps.Products.Contains( ps.Products.Where(p => p.ListPrice > 2500 ).SingleOrDefault())).ToList();

     

     

    now this will return the correct hierarchy along with filtered child objects, notice the AssociateWith, this tells the DB to filter the child objects retrieved.

     

    Hope this solves your problem.

    Sunday, May 11, 2008 7:08 AM
  • Yazeed, thanks for your input.  I think that you missed part of my original post.  I don't have a problem returning the correct objects.  The problem I'm having is returning the correct objects AND getting the desired SQL.

     

    Your query results in the following SQL:

    Code Snippet

     

    SELECT [t0].[ProductSubcategoryID], [t0].[ProductCategoryID], [t0].[Name], [t0].[rowguid], [t0].[ModifiedDate], [t1].[ProductID], [t1].[Name] AS [Name2], [t1].[ProductNumber], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductSubcategoryID] AS [ProductSubcategoryID2], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[DiscontinuedDate], [t1].[rowguid] AS [rowguid2], [t1].[ModifiedDate] AS [ModifiedDate2], (
        SELECT COUNT(*)
        FROM [Production].[Product] AS [t4]
        WHERE ([t4].[ListPrice] > 2500) AND ([t4].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))
        ) AS [value]
    FROM [Production].[ProductSubcategory] AS [t0]
    LEFT OUTER JOIN [Production].[Product] AS [t1] ON ([t1].[ListPrice] > 2500) AND ([t1].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))
    WHERE EXISTS(
        SELECT NULL AS [EMPTY]
        FROM [Production].[Product] AS [t2]
        WHERE (EXISTS(
            SELECT NULL AS [EMPTY]
            FROM [Production].[Product] AS [t3]
            WHERE ([t2].[ProductID] = [t3].[ProductID]) AND ([t3].[ListPrice] > 2500) AND ([t3].[ListPrice] > 2500) AND ([t3].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))
            )) AND ([t2].[ListPrice] > 2500) AND ([t2].[ProductSubcategoryID] = ([t0].[ProductSubcategoryID]))
        )
    ORDER BY [t0].[ProductSubcategoryID], [t1].[ProductID]

     

     

    The whole point of my post is to get the proper hierarchical result AND produce a SQL statement resembling:

     

    Code Snippet

    SELECT ...

    FROM [Production].[ProductSubcategory] AS [t0]

    INNER JOIN [Production].[Product] AS [t1] ON ([t0].[ProductSubcategoryID]) = [t1].[ProductSubcategoryID]

    WHERE [t1].[ListPrice] > 2500

     

     

     

    Sunday, May 11, 2008 11:29 PM
  • oh sorry for that, i thought you had a problem to get the right hierarchy, however, my last post does get you the right hierarchy but a complex query i guess Smile

     

    Monday, May 12, 2008 4:40 PM
  • Right, the problem is really how to simplify the SQL.  Fwiw, my initial code sample (below) returns the correct result with a simpler query than your sample.  The problem is that the SQL is a bit more complicated than what should be necessary.

     

    Code Snippet

    db.DeferredLoadingEnabled = false;

    DataLoadOptions options = new DataLoadOptions();

    options.LoadWith<ProductSubcategory>(ps => ps.Products);

    options.AssociateWith<ProductSubcategory>(

    ps => from p in ps.Products

    where p.ListPrice > 2500

    select p);

    db.LoadOptions = options;

     

    IEnumerable<ProductSubcategory> query =

    from ps in db.ProductSubcategories

    where ps.Products.Count > 0

    select ps;

     

     

    Monday, May 12, 2008 5:25 PM