locked
Why do these two queries have different results? RRS feed

  • Question

  • Hello team,

    I am doing two queries,

    These two bring back different results.

    SELECT i.IngredientName,ic.IngredientClassDescription, COUNT(ri.RecipeID) As CountOfMeatRecipes
    FROM ((Ingredient_Classes ic INNER JOIN Ingredients i ON ic.IngredientClassID = i.IngredientClassID)
                                INNER JOIN Recipe_Ingredients ri ON ri.IngredientID = i.IngredientID)
                                WHERE ic.IngredientClassDescription = 'meat'
                                GROUP By i.IngredientName, ic.IngredientClassDescription

    Brings 4 rows



    SELECT i.IngredientName,ic.IngredientClassDescription,
    (SELECT COUNT(ri.RecipeID)
     FROM Recipe_Ingredients ri WHERE ri.IngredientID = i.IngredientID)AS CountOfRecipes
     FROM Ingredient_Classes ic INNER JOIN Ingredients i ON ic.IngredientClassID = i.IngredientClassID
                                WHERE ic.IngredientClassDescription = 'meat'

    Brings 11 rows

    One is done as join and Group by and the second query is done by a subquery (chapter 13 from SQL Queries for mere   mortal)

    Author asks why these two queries have different results?

    Thanks,

    GGGGGNNNNN


    GGGGGNNNNN

    Thursday, November 6, 2014 11:53 PM

Answers

  • >>
    -I understand the reason for the first query better. I think the ingredientDesciption is filtered after joining to Ingredients and then to Recipe_Ingredients tables, is this logic correct? Each time the ingredientDescription is filtered.
    <<

    Yes.  However, it is probably more correct to say the ingredients (rather than IngredientDescriptions) are filtered.  Remember than the INNER JOIN is a FILTER since the INNER JOIN requires MATCHING records from the 2 Tables used in the Inner Join.  In particular for the Inner Join between [Recipe_Ingedients] and [Ingredients] in the first Query/SQL, if a IngredientID does not exist in the Table [Recipe_Ingredients] (i.e. the ingredient is NOT USED in recipes), the IngredientID (and hence, ingredient) is filtered OUT and not returned.

    >>
    -What do you mean when you say: "you parsed the query"? Do you mean you split it to smaller queries?
    <<

    That means I add line-breaks and indents to see the different components (SELECT clause, FROM clause, WHERE clause, etc...), the different Joins and the overall logic of the SQL String.  You find that the book does the same thing for clarity.

    >>
    -If yes, then for the second query, I came to capture the recipes that their count is zero? What query should we write to capture the recipeIds whose count is zero?
    <<

    Sure can by adding the same SubSQL as another condition in the WHERE clause like:

    SELECT i.IngredientName,ic.IngredientClassDescription, 
      (  SELECT COUNT(ri.RecipeID) 
         FROM Recipe_Ingredients ri 
         WHERE ri.IngredientID = i.IngredientID ) AS CountOfRecipes
      
    FROM Ingredient_Classes ic INNER JOIN Ingredients i 
      ON ic.IngredientClassID = i.IngredientClassID 
      
    WHERE (ic.IngredientClassDescription = 'meat')
      And
      ( ( SELECT COUNT(ri.RecipeID) 
          FROM Recipe_Ingredients ri 
          WHERE ri.IngredientID = i.IngredientID ) = 0 )

    However, if you want to return the "meat" ingredients that are NOT ACTUALLY USED in the recipes, it is more efficient to use the "Frustrated Outer Join" technique (search the Web for more info) with the SQL:

    SELECT i.IngredientName,ic.IngredientClassDescription FROM ( Ingredients i INNER JOIN Ingredient_Classes ic ON i.IngredientClassID = ic.IngredientClassID ) LEFT JOIN Recipe_Ingredients ri ON i.IngredientID = ri.IngredientID WHERE (ic.IngredientClassDescription = 'meat')
    And (ri.IngredientID Is Null)


    The logic of this SQL is basically to return all  "meat" ingredients except when the "meat" ingredient is NOT used in recipes  (i.e. the condition ri.IngredientID Is Null).  This should return 7 rows as per the other SQL.

     

    >>
    -In term of the joins and tables, why the second query returns the count of RecipeIds whose count is zero?
    <<

    The FROM clause + the WHERE clause construction return ALL "meat" ingredient in the Table [Ingredients] (returning ALL 11 "meat" ingredients).  The SubQuery/SubSQL involving the Table [Recipe_Ingredients] only supply the count of usage of each "meat" ingredient.  If the ingredient is not used (i.e. corresponding IngredientID does not exist in [Recipe_Ingredients]), the the count is zero.  In this (second) SQL, think of the SubQuery/SubSQL as a calculated value and it does NOT affect the filtering/selection of the records in the main SQL.

     

    I am sure the authors explain all this in the book.  Did you read the explanation accompanying each SQL carefully???

          


    Van Dinh

    • Marked as answer by cloudsInSky Monday, November 10, 2014 8:39 PM
    Monday, November 10, 2014 3:50 PM
  • >>
    When you say parse, you mean you break the query into pieces by SELECT, FROM, Where?
    <<

    Yes.  Parsing into different clauses and components (e.g. the SubSQL as a calculated value in the SELECT clause in the second SQL) help you to think logically in your head what each clause/component does in the whole Query.  For an example of a moderately complex Query/SQL where I use parsing, see recent MSDN thread Query that shows which prices have changed in the most current daily refresh,

     


    Van Dinh

    • Marked as answer by cloudsInSky Tuesday, November 11, 2014 7:24 PM
    Monday, November 10, 2014 9:06 PM

All replies

  • GGGGGNNNNN,

    First query return grouped values, secound returns all values from ingredients table with CountOfRecipes.

    Firs query return also only records from ingredients wich have matching records in table recipe_ingredients


    Michał

    Friday, November 7, 2014 7:08 AM
  • I parse the 2 Queries as:

    SELECT i.IngredientName,ic.IngredientClassDescription, 
      COUNT(ri.RecipeID) As CountOfMeatRecipes
      
    FROM 
    (
      (  Ingredient_Classes ic INNER JOIN Ingredients i 
           ON ic.IngredientClassID = i.IngredientClassID )
      INNER JOIN Recipe_Ingredients ri 
        ON ri.IngredientID = i.IngredientID
    )
      
    WHERE ic.IngredientClassDescription = 'meat'
      
    GROUP BY i.IngredientName, ic.IngredientClassDescription

    and

    SELECT i.IngredientName,ic.IngredientClassDescription, 
      (  SELECT COUNT(ri.RecipeID) 
         FROM Recipe_Ingredients ri 
         WHERE ri.IngredientID = i.IngredientID ) AS CountOfRecipes
      
    FROM Ingredient_Classes ic INNER JOIN Ingredients i 
      ON ic.IngredientClassID = i.IngredientClassID 
      
    WHERE ic.IngredientClassDescription = 'meat'

    The first Query/SQL returns the "meat" ingredients that are ACTUALLY USED in the recipes and the number of times each of the selected "meat" ingredient used.

    The  second Query/SQL returns ALL "meat" ingredients from your Table [Ingredients] and for each ingredient, how many times it is actually used in the Recipes.

    Basically, you have 11 "meat" ingredients in the Table [Ingredients] and out of these 11, only 4 are actually used in the recipes.  Thus, the second Query/SQL returns 11 rows, one for each "meat" ingredient while the first Query/SQL returns 4 rows, one for each USED "meat" ingredient.


    Van Dinh





    • Edited by Van Dinh Friday, November 7, 2014 9:49 PM
    Friday, November 7, 2014 8:30 PM
  • Hello Van,

    Thank you for the response. This is definitely a great response. I have problems in understanding that in term of how joining the same tables brings different results.

    I completely I understand:

    The first Query/SQL returns the "meat" ingredients that are ACTUALLY USED in the recipes and the number of times each of the selected "meat" ingredient used.

    The  second Query/SQL returns ALL "meat" ingredients from your Table [Ingredients] and for each ingredient, how many times it is actually used in the Recipes

    -I understand the reason for the first query better. I think the ingredientDesciption is filtered after joining to Ingredients and then to Recipe_Ingredients tables, is this logic correct? Each time the ingredientDescription is filtered.

    -What do you mean when you say: "you parsed the query"? Do you mean you split it to smaller queries?

    -If yes, then for the second query, I came to capture the recipes that their count is zero? What query should we write to capture the recipeIds whose count is zero?

    -In term of the joins and tables, why the second query returns the count of RecipeIds whose count is zero? I can't understand this part at all. How the records are mapping to each other?

    This is definitely an answer but I want to hear more about it from you.

    Thanks a lot of your help.

    GGGGGNNNNN


    GGGGGNNNNN

    Monday, November 10, 2014 6:17 AM
  • >>
    -I understand the reason for the first query better. I think the ingredientDesciption is filtered after joining to Ingredients and then to Recipe_Ingredients tables, is this logic correct? Each time the ingredientDescription is filtered.
    <<

    Yes.  However, it is probably more correct to say the ingredients (rather than IngredientDescriptions) are filtered.  Remember than the INNER JOIN is a FILTER since the INNER JOIN requires MATCHING records from the 2 Tables used in the Inner Join.  In particular for the Inner Join between [Recipe_Ingedients] and [Ingredients] in the first Query/SQL, if a IngredientID does not exist in the Table [Recipe_Ingredients] (i.e. the ingredient is NOT USED in recipes), the IngredientID (and hence, ingredient) is filtered OUT and not returned.

    >>
    -What do you mean when you say: "you parsed the query"? Do you mean you split it to smaller queries?
    <<

    That means I add line-breaks and indents to see the different components (SELECT clause, FROM clause, WHERE clause, etc...), the different Joins and the overall logic of the SQL String.  You find that the book does the same thing for clarity.

    >>
    -If yes, then for the second query, I came to capture the recipes that their count is zero? What query should we write to capture the recipeIds whose count is zero?
    <<

    Sure can by adding the same SubSQL as another condition in the WHERE clause like:

    SELECT i.IngredientName,ic.IngredientClassDescription, 
      (  SELECT COUNT(ri.RecipeID) 
         FROM Recipe_Ingredients ri 
         WHERE ri.IngredientID = i.IngredientID ) AS CountOfRecipes
      
    FROM Ingredient_Classes ic INNER JOIN Ingredients i 
      ON ic.IngredientClassID = i.IngredientClassID 
      
    WHERE (ic.IngredientClassDescription = 'meat')
      And
      ( ( SELECT COUNT(ri.RecipeID) 
          FROM Recipe_Ingredients ri 
          WHERE ri.IngredientID = i.IngredientID ) = 0 )

    However, if you want to return the "meat" ingredients that are NOT ACTUALLY USED in the recipes, it is more efficient to use the "Frustrated Outer Join" technique (search the Web for more info) with the SQL:

    SELECT i.IngredientName,ic.IngredientClassDescription FROM ( Ingredients i INNER JOIN Ingredient_Classes ic ON i.IngredientClassID = ic.IngredientClassID ) LEFT JOIN Recipe_Ingredients ri ON i.IngredientID = ri.IngredientID WHERE (ic.IngredientClassDescription = 'meat')
    And (ri.IngredientID Is Null)


    The logic of this SQL is basically to return all  "meat" ingredients except when the "meat" ingredient is NOT used in recipes  (i.e. the condition ri.IngredientID Is Null).  This should return 7 rows as per the other SQL.

     

    >>
    -In term of the joins and tables, why the second query returns the count of RecipeIds whose count is zero?
    <<

    The FROM clause + the WHERE clause construction return ALL "meat" ingredient in the Table [Ingredients] (returning ALL 11 "meat" ingredients).  The SubQuery/SubSQL involving the Table [Recipe_Ingredients] only supply the count of usage of each "meat" ingredient.  If the ingredient is not used (i.e. corresponding IngredientID does not exist in [Recipe_Ingredients]), the the count is zero.  In this (second) SQL, think of the SubQuery/SubSQL as a calculated value and it does NOT affect the filtering/selection of the records in the main SQL.

     

    I am sure the authors explain all this in the book.  Did you read the explanation accompanying each SQL carefully???

          


    Van Dinh

    • Marked as answer by cloudsInSky Monday, November 10, 2014 8:39 PM
    Monday, November 10, 2014 3:50 PM
  • Van,

    Thank you for all these great explanations. As far as I know I didn't get any explanations in regards to what I have put in question here. There might be some scattered explanations here and there in the book that the author expects via those, we should be able to figured out the difference. What I have asked is the question at the end of chapter.

    I think what you have explained here is very understandable for me. The only thing that I didn't understand is the parse. When you say parse, you mean you break the query into pieces by SELECT, FROM, Where?

    When you say: you have parsed the query, your parse is exactly look like what my query is at the first place. Having said that your parse doesn't change anything on the query ( I am not probably right). You might have parsed it visually, might not you?

    I learned a lot from this post. Thanks for all

    Regards,

    GGGGGNNNNN


    GGGGGNNNNN

    Monday, November 10, 2014 8:38 PM
  • >>
    When you say parse, you mean you break the query into pieces by SELECT, FROM, Where?
    <<

    Yes.  Parsing into different clauses and components (e.g. the SubSQL as a calculated value in the SELECT clause in the second SQL) help you to think logically in your head what each clause/component does in the whole Query.  For an example of a moderately complex Query/SQL where I use parsing, see recent MSDN thread Query that shows which prices have changed in the most current daily refresh,

     


    Van Dinh

    • Marked as answer by cloudsInSky Tuesday, November 11, 2014 7:24 PM
    Monday, November 10, 2014 9:06 PM