none
Order of Execution of SQL Queries RRS feed

  • Question

  • I am curious about the order of execution of SQL Queries.  I did a bit of research on this.  I found this:

     

    http://stackoverflow.com/questions/4596467/order-of-execution-of-the-query

     

    In that link, someone stated the following:

     

    'SQL has no order of execution. Is a declarative language. The optimizer is free to choose any order it feels appropriate to produce the best execution time. Given any SQL query, is basically impossible to anybody to pretend it knows the execution order. If you add detailed information about the schema involved (exact tables and indexes definition) and the estimated cardinalities (size of data and selectivity of keys) then one can take a guess at the probable execution order.'

     

    Really?  There is NO ORDER???

     

    In the link below, Dave states the following:

     

    http://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/

     

    What actually sets SQL Server apart from other programming languages is the way SQL Server processes its code. Generally, most programming languages process statement from top to bottom. By contrast, SQL Server processes them in a unique order which is known as Logical Query Processing Phase. These phases generate a series of virtual tables with each virtual table feeding into the next phase (virtual tables not viewable). These phases and their orders are given as follows:

     

    1. FROM

    2. ON

    3. OUTER

    4. WHERE

    5. GROUP BY

    6. CUBE | ROLLUP

    7. HAVING

    8. SELECT

    9. DISTINCT

    10. ORDER BY

    11. TOP

     

    Again, I was just curious about this concept.  I’ve been using SQL for a long time.  I’ve thought about this before, but not seriously.  Today, I’m thinking about it from an academic perspective.

     

    Thanks for the insight, everyone!!

     

     


    Ryan Shuell

    Saturday, August 25, 2012 3:09 PM

Answers

  • In one sense, there is a prescribed order.  And the above order is a (simplified) version of the order for a SELECT statement.  At one time, it was pretty complete, but not you have to include function calls, cte's, etc.

    But in another sense, when it comes to executing the query, there is no prescribed order.  When running the query, SQL is allowed to do anything it wants as long as the result is the same as if it had been done in the above order.  So if you do

    Select *
    From Customers c
    Left Outer Join Orders o On c.CustNbr = o.CustNbr
    Where c.SalesmanID = 25;

    SQL is allowed to, and probably will, do the WHERE before the Join.  It's more efficient.  Why match all the the customers with other SalesmanID's to there orders and then throw the result away.  So SQL will probably do the WHERE first, then the LEFT OUTER JOIN ... ON.

    Tom

    • Marked as answer by ryguy72 Saturday, August 25, 2012 11:22 PM
    Saturday, August 25, 2012 3:20 PM
  • There are a couple of things to keep a part here.

    First of all, if you have a suite of statements, these statements are executed in order, and in SQL Server there is no optimisation between statements, but statements are optimised separately. For instance, if you
    say:

    SELECT @cnt = COUNT (*) FROM bigtable
    SELECT @max = MAX(somecolumn) FROM bigtable

    SQL Server will never attempt to resovle this with a single access to bigtable.

    Next, is logical and physical execution order. There is a well-defined logical execution order, as defined by the list starting with FROM. However, the optimizer may recast the computation order, as long it does not affect the final result.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by ryguy72 Saturday, August 25, 2012 11:22 PM
    Saturday, August 25, 2012 5:32 PM

All replies

  • In one sense, there is a prescribed order.  And the above order is a (simplified) version of the order for a SELECT statement.  At one time, it was pretty complete, but not you have to include function calls, cte's, etc.

    But in another sense, when it comes to executing the query, there is no prescribed order.  When running the query, SQL is allowed to do anything it wants as long as the result is the same as if it had been done in the above order.  So if you do

    Select *
    From Customers c
    Left Outer Join Orders o On c.CustNbr = o.CustNbr
    Where c.SalesmanID = 25;

    SQL is allowed to, and probably will, do the WHERE before the Join.  It's more efficient.  Why match all the the customers with other SalesmanID's to there orders and then throw the result away.  So SQL will probably do the WHERE first, then the LEFT OUTER JOIN ... ON.

    Tom

    • Marked as answer by ryguy72 Saturday, August 25, 2012 11:22 PM
    Saturday, August 25, 2012 3:20 PM
  • There are a couple of things to keep a part here.

    First of all, if you have a suite of statements, these statements are executed in order, and in SQL Server there is no optimisation between statements, but statements are optimised separately. For instance, if you
    say:

    SELECT @cnt = COUNT (*) FROM bigtable
    SELECT @max = MAX(somecolumn) FROM bigtable

    SQL Server will never attempt to resovle this with a single access to bigtable.

    Next, is logical and physical execution order. There is a well-defined logical execution order, as defined by the list starting with FROM. However, the optimizer may recast the computation order, as long it does not affect the final result.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Marked as answer by ryguy72 Saturday, August 25, 2012 11:22 PM
    Saturday, August 25, 2012 5:32 PM
  • My thanks go out to both of you!!


    Ryan Shuell

    Saturday, August 25, 2012 11:22 PM
  • Hi!

    I found this discussion and I have a question regarding execution of parts of WHERE clauses.

    If I have the following WHERE clause (LastName is a column, @inName is an input parameter to a procedeure)

    WHERE

    (@inName is null) or

    (@inName is not null and LastName=@inName and myFunction(LastName)='ALLOWED')

    If @inName is null i don't want the search for LastName=@inName nor the function to be executed.

    Is there a way to achieve this without turning to dynamic SQL? Will the optimizer discover that @inName is constant and automatically avoid the

    evaluation of the second line when @inName is null?

    Am I breaking the set based paradigm by wanting to control which of the set based conditions that are executed?

    Regards Ove

    Monday, August 27, 2012 9:24 AM
  • If you are on SQL 2008 SP1 CU 5 (10.0.2746) or later, SQL 2008R2 CU1 (10.50.1702) or later, or any version of SQL 2012, then just add OPTION(RECOMPILE) to the query.  That is something like

    SELECT <column list>
    FROM <whatever>
    WHERE (@inName is null) or
       (@inName is not null and LastName=@inName and myFunction(LastName)='ALLOWED')
    OPTION(RECOMPILE)

    That will force the SELECT to be recompiled every time, and the optimizer will notice the current value in @inName and if it is NULL, the function will not be called.

    Do not do this if you are on earlier versions of SQL 2008 or SQL 2008R2 as doing this in those early versions exposed you to a serious bug.

    Tom


    • Proposed as answer by Ove Kernell Tuesday, August 28, 2012 7:28 AM
    Monday, August 27, 2012 1:04 PM
  • I believe we can see the answer of this question in following T-SQL Query Processing Flow Diagram.

    • Proposed as answer by Johnny.Wang.MS Saturday, December 15, 2012 3:37 AM
    Saturday, December 15, 2012 3:32 AM
  • If you are on SQL 2008 SP1 CU 5 (10.0.2746) or later, SQL 2008R2 CU1 (10.50.1702) or later, or any version of SQL 2012, then just add OPTION(RECOMPILE) to the query.  That is something like

    SELECT <column list>
    FROM <whatever>
    WHERE (@inName is null) or
       (@inName is not null and LastName=@inName and myFunction(LastName)='ALLOWED')
    OPTION(RECOMPILE)

    That will force the SELECT to be recompiled every time, and the optimizer will notice the current value in @inName and if it is NULL, the function will not be called.

    Do not do this if you are on earlier versions of SQL 2008 or SQL 2008R2 as doing this in those early versions exposed you to a serious bug.

    Tom


    OR you could use the self eliminating parameter like 

    SELECT <column list>
    FROM <whatever>
    WHERE   (@inName is  null OR ( LastName=@inName and myFunction(LastName)='ALLOWED') )

    this is safe for all version starting from SQL2000 (I am not sure if this works on earlier SQLs as I only used it on 2000 and up)

    the condition will return all records from " <whatever> " if parameter is null as "WHERE" will be "true"

    or it will evaluate the "OR part of condition and return results based on that."


    • Edited by VBP1 Thursday, July 18, 2013 12:09 PM mistype
    Thursday, July 18, 2013 12:08 PM
  • OR you could use the self eliminating parameter like 

    SELECT <column list>
    FROM <whatever>
    WHERE   (@inName is  null OR ( LastName=@inName and myFunction(LastName)='ALLOWED') )

    this is safe for all version starting from SQL2000 (I am not sure if this works on earlier SQLs as I only used it on 2000 and up)

    This is exactly what Tom suggested, except that you left out OPTION (RECOMPILE). And in which case, the query  no longer fulfils the criteria set up in the original post. The result will be desired. However, the query plan will still include an operator to serarch for LastName, because the optimizer does not know which value @inName has at run-time.

    By adding OPTION (RECOMPILE) - and if you are one the versions that Tom lists in his post - the optimizer will hand @inName as a constant, which means that if @inName, the table will be scanned.  And if @inName is non-NULL, SQL Server will seek the index on LastName.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Thursday, July 18, 2013 8:35 PM