none
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

    Question

  • there are two TSQL sentences :sentence 1 and sentence 2

    The sentence 1 can be excuted successfully ,but there will be throw an exception when I excuted the sentence 2,why?

    --sentence 1

    use [AdventureWorks]
    go
    DECLARE @c VARCHAR(10)
    SET @c = ''

    SELECT CustomerID
    FROM dbo.Customers AS C
    WHERE CustomerID IN (SELECT  @c + CustomerID FROM dbo.Orders)
    OPTION (HASH JOIN, RECOMPILE)

    --sentence 2

    use [AdventureWorks]
    go
    DECLARE @c VARCHAR(10)
    SET @c = ''

    SELECT CustomerID
    FROM dbo.Customers AS C
    WHERE CustomerID IN (SELECT  @c + CustomerID FROM dbo.Orders)
    OPTION (HASH JOIN)



    • Edited by Jacky_shen Friday, April 06, 2012 4:24 PM
    Friday, April 06, 2012 3:54 PM

Answers

  • The internal workings of the parser/compiler/optimizer can only be assumed, but it seems to be an order of processing issue here.

    Using DBCC freeproccache has no effect here so existing plans interfering with the @c variable value should not be the issue.

    Seeing that the following code works :

    DECLARE @c INT
    SET @c = 100
    
    SELECT CustomerID
    FROM Sales.Customer
    WHERE CustomerID IN (SELECT  CustomerID FROM Sales.SalesOrderHeader) 
          AND CustomerID > @c
    
    OPTION (HASH JOIN)
    

    and this does not:

    DECLARE @c INT
    SET @c = 100
    
    SELECT CustomerID
    FROM Sales.Customer
    WHERE CustomerID IN (SELECT  @c + CustomerID 
    FROM Sales.SalesOrderHeader) 
    OPTION (HASH JOIN)
    
    and reading this snippet from BOL
    the RECOMPILE query hint uses the current values of any local variables in the query 

    suggests that at the time the engine tries to evaluates that a HASH JOIN is viable it encounters an internal error trying to use the @c which seems to have an undefined value or is simply not allocated in memory.

    So it appears that some optimizer hints are executed/evaluated in a specific order that precedes in this case variable evaluation/allocation.

    I believe the use of RECOMPILE is mandatory in this rare case where local variables play an active part in the hash join operation

    But the usage is tricky and could change from version to version, so maybe its better to consider other means of writing such queries or make use dynamic SQL

    • Marked as answer by Jacky_shen Saturday, April 07, 2012 3:32 PM
    Saturday, April 07, 2012 8:55 AM
  • you said"But if the variable is there, the hash join is not possible for some reason."

    So what's the reason? If I know the reason,I can avoid the error when I use the hash join hint


    I guess that as a long-time SQL Server MVP that I am expected to know the answer. And if I gave it some thought I might be able to come up with something. But the truth is that I don't really care. My guess is that the optimizer has a restricution so that it is not able to build a hash probe from an expression which includes a variable.

    Now, this does not mean that my mumbly answer is not without value. Rather, I think there is a lesson to be learnt. Computing is a lot about abstraction. Ideally, we should just write our SQL queries, and the optimizer will always come up with the best query plan, and we should have no reason to consider what's going on under the covers. Even less have reason to use hints.

    Now, we all know that the world is far from that perfect and if you want to be a successful database professional, you need to be able read and understand query plans. Does that mean that you need to understand every little detail? I say no. When we look at a query plan for a slow query, we need to get some understanding where it is slow (which is far from trivial, since there is not really any information about this in the plan; the percentages you see are only estimates) and more importantly why it is slow.

    The two most common reasons that a plan is slow are:
    1) Lack of good index.
    2) Estimates are way off from the actual values.

    In the first case, we add the index and that's that.

    In the second case, there are a couple of ways to go. One is to make sure that statistics are up to date. There are some situations where statistics quickly go out of date, for instance when you have a monotonically increasing key. There are also situations where the first invocation of a procedure has an unusual value, leading to a plan which is good for that invocation, but not for the rest. If you have joins in the mix, it gets worse, because for join predicates the optimizer only looks at the density.

    Because of these imperfections, there are certainly reasons to use hints. But are all hints equal? To add a hint like HASH JOIN you should be sure that hash join will always be the correct way to run the query. Keep in mind that your data distribution will change over time. If you find that hash join today gives you 20% better performance over a loop join, just forget it. It can be the other way in three months.

    When troubleshooting a performance problem, it can certainly be a good idea to attempt different join hints only to see the effect. But keep in mind that as long as you don't understand your data, it's just a stab in the dark. If you see that a hash join improves your performance drastically, the next point on the list is to understand why. And maybe when you have that understanding the solution is different.

    With this approach, there is little reason to worry about adding a hint that will not work. When you are playing around you will find that the hash join did not work this time, no harm done. If you go around and litter the code with join hints, because you think it's this the fad of the week, it's a different matter. But you should not do that.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by Jacky_shen Saturday, April 07, 2012 3:32 PM
    Saturday, April 07, 2012 9:14 AM
  • By now,I know there are two situations will throw the error,
    one is the example in this thread,and the other is the example Craig described in his article,
    so is there any other situation will throw the same exception? maybe yes,maybe not,but we don't know.

    There certainly are. Far from all queries can be evaluated with a hash join. For instance, consider this query:

    SELECT O.*
    FROM   Orders O
    JOIN   Customers C ON O.CustomerID BETWEEN C.City and C.Region

    Nevermind that the query is completely useless, but if you slap on a HASH JOIN hint, you get an error. This follows from what Craig says in his blog post: a hash join requires at least on equi-join predicate, and here is none.

    If we know the reason under the hook,maybe we can avoid the similar errors.

    Nah. You should always test when adding a join hint. If nothing else, you nede to verify that the hint actually helps performance.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by Jacky_shen Sunday, April 08, 2012 2:05 AM
    Saturday, April 07, 2012 5:12 PM

All replies

  • Have you tried executing

    SET FORCEPLAN OFF

    and then executing the two queries?

    Also, which version of SQL Server are you using?

    Friday, April 06, 2012 4:14 PM
  • Hi kent,

    No,I have not executed "SET FORCEPLAN OFF"


    • Edited by Jacky_shen Thursday, March 21, 2013 6:49 AM
    Friday, April 06, 2012 4:21 PM
  • Well, I got the same problem (using SQL Server 2008 R2) and SQL Server 2012

    Msg 8622, Level 16, State 1, Line 4
    Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

    What is your ultimate goal here?


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    Friday, April 06, 2012 4:33 PM
  • It appears that without the RECOMPILE hint the optimizer is trying to fetch a value for the @c variable from a plan that does not exist ending up with a void value for it and making it impossible to use the HASH JOIN hint concatenating @c with CustomerID
    Friday, April 06, 2012 4:36 PM
  • I think one of the points of the article is to be careful when using optimizer hints; sometimes, such as this time, adding optimizer hints can precipitate screwy or unexpected results.
    Friday, April 06, 2012 5:03 PM
  • Yes, the query with RECOMPILE fails on SQL 2005, and is should also fail on SQL 2008 SP1 and SQL 2008 R2 RTM.

    On SQL 2008 SP3, the RECOMPILE hint causes the variable to be replaced with its actual value, so that the query reads:

    SELECT CustomerID
    FROM dbo.Customers AS C
    WHERE CustomerID IN (SELECT  CustomerID FROM dbo.Orders)
    OPTION (HASH JOIN)

    Which can be done with a hash join.

    But if the variable is there, the hash join is not possible for some reason.

    On SQL 2005 and some versions of SQL 2008, the RECOMPILE hint works differently, so that the variable is handled as a parameter. That is, the value is sniffed, but the plan must work for any value of the parameter.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, April 06, 2012 9:42 PM
  • Hi Naomi ,

    just as kent said,I just wanna avoid the potential error when I use the hint.


    • Edited by Jacky_shen Saturday, April 07, 2012 2:10 AM
    Saturday, April 07, 2012 2:05 AM
  • Hi jtclipper

    it sounds reasonable,thank you.

    Saturday, April 07, 2012 2:06 AM
  • Hi Kent ,

    Yes,you are right.

    • Edited by Jacky_shen Saturday, April 07, 2012 2:17 AM
    Saturday, April 07, 2012 2:06 AM
  • Hi  Erland ,

    thank you.

    you said"But if the variable is there, the hash join is not possible for some reason."

    So what's the reason? If I know the reason,I can avoid the error when I use the hash join hint




    • Edited by Jacky_shen Saturday, April 07, 2012 3:07 AM
    Saturday, April 07, 2012 2:08 AM
  • The internal workings of the parser/compiler/optimizer can only be assumed, but it seems to be an order of processing issue here.

    Using DBCC freeproccache has no effect here so existing plans interfering with the @c variable value should not be the issue.

    Seeing that the following code works :

    DECLARE @c INT
    SET @c = 100
    
    SELECT CustomerID
    FROM Sales.Customer
    WHERE CustomerID IN (SELECT  CustomerID FROM Sales.SalesOrderHeader) 
          AND CustomerID > @c
    
    OPTION (HASH JOIN)
    

    and this does not:

    DECLARE @c INT
    SET @c = 100
    
    SELECT CustomerID
    FROM Sales.Customer
    WHERE CustomerID IN (SELECT  @c + CustomerID 
    FROM Sales.SalesOrderHeader) 
    OPTION (HASH JOIN)
    
    and reading this snippet from BOL
    the RECOMPILE query hint uses the current values of any local variables in the query 

    suggests that at the time the engine tries to evaluates that a HASH JOIN is viable it encounters an internal error trying to use the @c which seems to have an undefined value or is simply not allocated in memory.

    So it appears that some optimizer hints are executed/evaluated in a specific order that precedes in this case variable evaluation/allocation.

    I believe the use of RECOMPILE is mandatory in this rare case where local variables play an active part in the hash join operation

    But the usage is tricky and could change from version to version, so maybe its better to consider other means of writing such queries or make use dynamic SQL

    • Marked as answer by Jacky_shen Saturday, April 07, 2012 3:32 PM
    Saturday, April 07, 2012 8:55 AM
  • you said"But if the variable is there, the hash join is not possible for some reason."

    So what's the reason? If I know the reason,I can avoid the error when I use the hash join hint


    I guess that as a long-time SQL Server MVP that I am expected to know the answer. And if I gave it some thought I might be able to come up with something. But the truth is that I don't really care. My guess is that the optimizer has a restricution so that it is not able to build a hash probe from an expression which includes a variable.

    Now, this does not mean that my mumbly answer is not without value. Rather, I think there is a lesson to be learnt. Computing is a lot about abstraction. Ideally, we should just write our SQL queries, and the optimizer will always come up with the best query plan, and we should have no reason to consider what's going on under the covers. Even less have reason to use hints.

    Now, we all know that the world is far from that perfect and if you want to be a successful database professional, you need to be able read and understand query plans. Does that mean that you need to understand every little detail? I say no. When we look at a query plan for a slow query, we need to get some understanding where it is slow (which is far from trivial, since there is not really any information about this in the plan; the percentages you see are only estimates) and more importantly why it is slow.

    The two most common reasons that a plan is slow are:
    1) Lack of good index.
    2) Estimates are way off from the actual values.

    In the first case, we add the index and that's that.

    In the second case, there are a couple of ways to go. One is to make sure that statistics are up to date. There are some situations where statistics quickly go out of date, for instance when you have a monotonically increasing key. There are also situations where the first invocation of a procedure has an unusual value, leading to a plan which is good for that invocation, but not for the rest. If you have joins in the mix, it gets worse, because for join predicates the optimizer only looks at the density.

    Because of these imperfections, there are certainly reasons to use hints. But are all hints equal? To add a hint like HASH JOIN you should be sure that hash join will always be the correct way to run the query. Keep in mind that your data distribution will change over time. If you find that hash join today gives you 20% better performance over a loop join, just forget it. It can be the other way in three months.

    When troubleshooting a performance problem, it can certainly be a good idea to attempt different join hints only to see the effect. But keep in mind that as long as you don't understand your data, it's just a stab in the dark. If you see that a hash join improves your performance drastically, the next point on the list is to understand why. And maybe when you have that understanding the solution is different.

    With this approach, there is little reason to worry about adding a hint that will not work. When you are playing around you will find that the hash join did not work this time, no harm done. If you go around and litter the code with join hints, because you think it's this the fad of the week, it's a different matter. But you should not do that.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by Jacky_shen Saturday, April 07, 2012 3:32 PM
    Saturday, April 07, 2012 9:14 AM
  • Hi jtclipper,
    Thank you for providing the example and the explanation.
    IF I can make sure your explanation/speculation is absolutely right whenever,I will give the responce to this thread.
    Thanks again.
    Saturday, April 07, 2012 3:23 PM
  • Hi Erland,

    Thanks for your enthusiastic help and detail explanation and advice.
    Absolutely ,your answers is valuable,and also from performance tuning aspect, your advice is helpful and reasonable.
    There are some reasons prompted me to insist on knowwing the answer under the hook.

    By now,I know there are two situations will throw the error,
    one is the example in this thread,and the other is the example Craig described in his article,
    so is there any other situation will throw the same exception? maybe yes,maybe not,but we don't know.
    If we know the reason under the hook,maybe we can avoid the similar errors.
    just as if we know 1/0,2/0 is illegal,we can avoid to write 1/0,2/0.
    But,If we know "Divide by zero" is illegal,we not only can avoid to write 1/0,2/0, we can also avoid  write 3/0,4/0,1*1/0,1*2/0.....various variants.

    The other reasons are just from experiment or test aspect, no really use case.


    • Edited by Jacky_shen Saturday, April 07, 2012 3:28 PM
    Saturday, April 07, 2012 3:24 PM
  • By now,I know there are two situations will throw the error,
    one is the example in this thread,and the other is the example Craig described in his article,
    so is there any other situation will throw the same exception? maybe yes,maybe not,but we don't know.

    There certainly are. Far from all queries can be evaluated with a hash join. For instance, consider this query:

    SELECT O.*
    FROM   Orders O
    JOIN   Customers C ON O.CustomerID BETWEEN C.City and C.Region

    Nevermind that the query is completely useless, but if you slap on a HASH JOIN hint, you get an error. This follows from what Craig says in his blog post: a hash join requires at least on equi-join predicate, and here is none.

    If we know the reason under the hook,maybe we can avoid the similar errors.

    Nah. You should always test when adding a join hint. If nothing else, you nede to verify that the hint actually helps performance.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by Jacky_shen Sunday, April 08, 2012 2:05 AM
    Saturday, April 07, 2012 5:12 PM
  • Got it,thank you so much Erland.
    Sunday, April 08, 2012 2:05 AM