none
Convert *= / =* to outer joins (ANSI compliant)

    Question

  • Hi,

     

    I'm working on converting *= and =* to 'left outer join' and 'right outer join'.

     

    I noticed the difference in behavior between the =* and the phrase "right outer join" and between *=  and 'left outer join'. The result set is different. Here is an example:

     

    select a.au_id, b.title, c.qty

    from titleauthor a, titles b, sales c

    where (a.title_id =* b.title_id)

    and (a.title_id =* c.title_id)

     

    I try to conver the above to:

     

    select a.au_id, b.title, c.qty

    from titleauthor a

    right outer join titles b

    on (a.title_id = b.title_id )

    right outer join sales c

    on (a.title_id = c.title_id )

     

    The first results into 391 rows in pubs database of sql-server 2000 and the second produces 34 rows. It seems not that straight forward to convert.

     

    The question is: If I want to get 391 row, what should I change/add in the second sql statement?

     

    Thank you. Appreciate your help.

     

    Joan

     

    Tuesday, April 10, 2007 12:35 AM

Answers

  • I think this:

    select a.au_id, b.title, c.qty

    from titleauthor a, titles b, sales c

    where (a.title_id =* b.title_id)

    and (a.title_id =* c.title_id)

    Is actually not a right join between these tables directly, but actually more this query:

    --changed the a b c aliases to just be the table for clarity
    select titleauthor.au_id, titles.title, sales.qty
    from sales
    cross join titles
    left outer join titleauthor
    on titles.title_id = titleauthor.title_id
    and sales.title_id = titleauthor.title_id

    The commas can be translated directly to CROSS JOIN, with the where being the same. So in your query, since there is no link between sales and titles, it does a cross join. You can see this by looking at the plan:

    Code Snippet

    SET SHOWPLAN_ALL ON
    GO

    select a.au_id, b.title, c.qty
    from titleauthor a, titles b, sales c
    where (a.title_id =* b.title_id)
    and (a.title_id =* c.title_id)

    SET SHOWPLAN_ALL OFF
    GO

    |--Hash Match(Right Outer Join, HASH:([a].[title_id])=([b].[title_id]),
    RESIDUAL:([pubs].[dbo].[titleauthor].[title_id] as
    [a].[title_id]=[pubs].[dbo].[titles].[title_id] as [b].[title_id]
    AND [pubs].[dbo].[titleauthor].[title_id] as
    [a].[title_id]=[pubs].[dbo].[sales].[title_id] as [c].[title_id]))
    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[titleidind] AS [a]))
    |--Nested Loops(Inner Join)
    |--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [b]))

    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales] AS [c]))


     Note, no join criteria. Hopefully this helps, it is one of the reasons why the syntax was changed Smile
    Tuesday, April 10, 2007 2:51 AM
    Moderator
  • You mean the query?  It is clear enough, it is just a matter of what you are asking.  The ANSI Style is far more clear to express queries as they are going to be/can be expressed.

    The old style was ambiguous like this and was based on the concept that you (logically) first cross join each of the tables in the FROM, then for each row check the criteria.  Works like a champ for INNER joins, but OUTER joins not so much. 
    Tuesday, April 10, 2007 3:01 AM
    Moderator
  • This query may fix your problem..

     

    Code Snippet

    Select
       CU.Name
     , CI.Amount
     , CI.PolicyItemId 
     , CI.timestamp
     , CU.CreditSurchargeId
     , CU.ValueType_ES
     , CU.IsDefault
     , CU.IsModifiable
     , CU.IsCredit
     , CP.ObjectCategoryId
     , CU.Type_ES
     , CP.DisplayOrder 
     , CP.Amount
     , CP.Effectivedateid
    from
     CreditSurchargePolicyLine CP

      JOIN CreditSurchargeUnit CU ON
       CP.CreditSurchargeId = CU.CreditSurchargeId
       AND CP.ObjectSubjectId = CU.ObjectSubjectId

      LEFT OUTER JOIN CreditSurchargePolicyLineItem CI ON
       CU.CreditSurchargeId = CI.CreditSurchargeId
       And CI.PolicyItemId = 30153677
       And CI.PolicyLineItemId = 1

    where
    CP.PolicyLineId = 4
    and CP.ObjectSubjectId = 4
    and CP.Status_ES = 'A'
    and CP.EffectiveDateId = 1
    and CP.ObjectCategoryId in (0, 1)
    and CP.CreditSurchargeId <> 79 
    and CP.CreditSurchargeId <> 78 
    and CP.CreditSurchargeId <> 83 

     

     

    Tuesday, April 10, 2007 4:23 AM

All replies

  • I think this:

    select a.au_id, b.title, c.qty

    from titleauthor a, titles b, sales c

    where (a.title_id =* b.title_id)

    and (a.title_id =* c.title_id)

    Is actually not a right join between these tables directly, but actually more this query:

    --changed the a b c aliases to just be the table for clarity
    select titleauthor.au_id, titles.title, sales.qty
    from sales
    cross join titles
    left outer join titleauthor
    on titles.title_id = titleauthor.title_id
    and sales.title_id = titleauthor.title_id

    The commas can be translated directly to CROSS JOIN, with the where being the same. So in your query, since there is no link between sales and titles, it does a cross join. You can see this by looking at the plan:

    Code Snippet

    SET SHOWPLAN_ALL ON
    GO

    select a.au_id, b.title, c.qty
    from titleauthor a, titles b, sales c
    where (a.title_id =* b.title_id)
    and (a.title_id =* c.title_id)

    SET SHOWPLAN_ALL OFF
    GO

    |--Hash Match(Right Outer Join, HASH:([a].[title_id])=([b].[title_id]),
    RESIDUAL:([pubs].[dbo].[titleauthor].[title_id] as
    [a].[title_id]=[pubs].[dbo].[titles].[title_id] as [b].[title_id]
    AND [pubs].[dbo].[titleauthor].[title_id] as
    [a].[title_id]=[pubs].[dbo].[sales].[title_id] as [c].[title_id]))
    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[titleidind] AS [a]))
    |--Nested Loops(Inner Join)
    |--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [b]))

    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales] AS [c]))


     Note, no join criteria. Hopefully this helps, it is one of the reasons why the syntax was changed Smile
    Tuesday, April 10, 2007 2:51 AM
    Moderator
  • Hi,

     

    Is it because the question is not clear?

     

    Joan

    Tuesday, April 10, 2007 2:55 AM
  • You mean the query?  It is clear enough, it is just a matter of what you are asking.  The ANSI Style is far more clear to express queries as they are going to be/can be expressed.

    The old style was ambiguous like this and was based on the concept that you (logically) first cross join each of the tables in the FROM, then for each row check the criteria.  Works like a champ for INNER joins, but OUTER joins not so much. 
    Tuesday, April 10, 2007 3:01 AM
    Moderator
  • Hi Louis,

     

    Thank you so much!

     

    I'll try to fix my query. I might have question later.

     

    Thanks again.

     

    Joan

     

     

     

    Tuesday, April 10, 2007 3:09 AM
  • Hi Louis,

     

    Here I get another example. The old code is as follows:

     

    Select CU.Name, CI.Amount, CI.PolicyItemId, CI.timestamp, CU.CreditSurchargeId, CU.ValueType_ES  ,CU.IsDefault, CU.IsModifiable, CU.IsCredit, CP.ObjectCategoryId, CU.Type_ES, CP.DisplayOrder  ,CP.Amount, CP.Effectivedateid
    from CreditSurchargePolicyLine CP, CreditSurchargeUnit CU, CreditSurchargePolicyLineItem CI
    where CP.PolicyLineId = 4
    and CP.ObjectSubjectId = 4
    and CP.Status_ES = 'A'
    and CP.CreditSurchargeId = CU.CreditSurchargeId
    and CP.ObjectSubjectId = CU.ObjectSubjectId
    and CU.CreditSurchargeId *= CI.CreditSurchargeId
    and CI.PolicyItemId = 30153677
    and CI.PolicyLineItemId = 1
    and CP.ObjectCategoryId in (0, 1)
    and CP.CreditSurchargeId <> 79 
    and CP.CreditSurchargeId <> 78 
    and CP.CreditSurchargeId <> 83
    and CP.effectivedateID = 1

     

    I converted as follows:

     

    Select CU.Name, CI.Amount, CI.PolicyItemId, CI.timestamp, CU.CreditSurchargeId, CU.ValueType_ES, CU.IsDefault, CU.IsModifiable, CU.IsCredit, CP.ObjectCategoryId, CU.Type_ES, CP.DisplayOrder  ,CP.Amount, CP.Effectivedateid
    from CreditSurchargePolicyLine CP JOIN CreditSurchargeUnit CU ON CP.CreditSurchargeId = CU.CreditSurchargeId
    AND CP.ObjectSubjectId = CU.ObjectSubjectId
    LEFT OUTER JOIN CreditSurchargePolicyLineItem CI ON (CU.CreditSurchargeId = CI.CreditSurchargeId)
    where CP.PolicyLineId = 4
    and CP.ObjectSubjectId = 4
    and CP.Status_ES = 'A'
    and CI.PolicyItemId = 30153677
    and CI.PolicyLineItemId = 1
    and CP.EffectiveDateId = 1
    and CP.ObjectCategoryId in (0, 1)
    and CP.CreditSurchargeId <> 79 
    and CP.CreditSurchargeId <> 78 
    and CP.CreditSurchargeId <> 83 

     

    The old code returns 23 rows, which it should be. But the new code returns 11 rows, which is not correct. But I can't tell what is wrong with the new code.

     

    Thank you!

     

    Joan

    Tuesday, April 10, 2007 3:50 AM
  • This query may fix your problem..

     

    Code Snippet

    Select
       CU.Name
     , CI.Amount
     , CI.PolicyItemId 
     , CI.timestamp
     , CU.CreditSurchargeId
     , CU.ValueType_ES
     , CU.IsDefault
     , CU.IsModifiable
     , CU.IsCredit
     , CP.ObjectCategoryId
     , CU.Type_ES
     , CP.DisplayOrder 
     , CP.Amount
     , CP.Effectivedateid
    from
     CreditSurchargePolicyLine CP

      JOIN CreditSurchargeUnit CU ON
       CP.CreditSurchargeId = CU.CreditSurchargeId
       AND CP.ObjectSubjectId = CU.ObjectSubjectId

      LEFT OUTER JOIN CreditSurchargePolicyLineItem CI ON
       CU.CreditSurchargeId = CI.CreditSurchargeId
       And CI.PolicyItemId = 30153677
       And CI.PolicyLineItemId = 1

    where
    CP.PolicyLineId = 4
    and CP.ObjectSubjectId = 4
    and CP.Status_ES = 'A'
    and CP.EffectiveDateId = 1
    and CP.ObjectCategoryId in (0, 1)
    and CP.CreditSurchargeId <> 79 
    and CP.CreditSurchargeId <> 78 
    and CP.CreditSurchargeId <> 83 

     

     

    Tuesday, April 10, 2007 4:23 AM
  • Thank you ManiD! Thank you all!

     

     

     

     

    Tuesday, April 10, 2007 4:41 AM
  • You are welcome.

     

    Remember when you use left/right outer join, if you want to apply any filter attach that filter on JOIN condition itself - rather than

    at where clause.

     

    -Mani.D

    Tuesday, April 10, 2007 4:51 AM
  • >>Remember when you use left/right outer join, if you want to apply any filter attach that filter on JOIN condition itself - rather than at where clause.<<


    In spirit this is usually right, but this isn't quite true. You have to be careful and cognizant about where to put FILTER criteria, but it can go either place.  You just have to realize that:

    • In the JOIN clause, a condition is applied to the joining of the two sets of data. And from the left side of a LEFT join will be returned no matter what (or the right side of a RIGHT join or both sides of a FULL join for that matter Smile

    • In the WHERE clause, the condition is applied the the set of rows produced from the FROM clause.  So if a row was returned in the FROM clause as the result of an LEFT OUTER JOIN, if you then try to filter out the data by comparing data from the right table, all of the values will be NULL.  So they will be filtered out unless you realize this. 

    Wednesday, April 11, 2007 1:49 AM
    Moderator