If/else statements and stored procedure performance

    Question

  • I am looking for an explanation and am trying to understand the internals better. This is in reference to this article.

    http://www.sqlmag.com/Article/ArticleID/44717/sql_server_44717.html

     

    I agree that an SP whose different branches (multiple if/else blocks) could each call for a different plan should be split into multiple SPs. However, I can't agree with the statement that if/else branching in an SP will cause a recompile. I am setting up the test data for this here, so please bear with me.

    Code Snippet

    use tempdb

    go

    if object_id('dbo.Customer', 'U') is not null

          drop table dbo.Customer

    go 

    create table dbo.Customer

    (

          customerID        int identity(1,1) not null,

          FirstName         varchar(20) not null,

          LastName          varchar(20) not null,

          CreateDateTime    datetime not null

    )

     

    alter table dbo.Customer add constraint PK_Customer primary key clustered (customerID)   

    alter table dbo.Customer add constraint DF_Customer_CreateDateTime default getdate() for CreateDateTime

     

    insert dbo.Customer (FirstName, LastName)

          values ('John', 'Patterson')

    go

    insert dbo.Customer (FirstName, LastName)

          values ('Joe', 'Somebody')

    go

    insert dbo.Customer (FirstName, LastName)

          values ('Peter', 'Anderson')

    go

    insert dbo.Customer (FirstName, LastName)

          values ('Kelly', 'Nobody')

    go

     

    if object_id('dbo.Customer_Get', 'P') is not null

          drop procedure dbo.Customer_Get

    go

     

    create procedure dbo.Customer_Get @getAll int, @customerID int = null

    as

    begin

    if (@getAll is not null)

          select FirstName, LastName from dbo.Customer

    else

          select FirstName, LastName from dbo.Customer where customerID = @customerID

    end

    go

     

     

    ...cntd (Please look at the reply for complete text)

    Tuesday, November 04, 2008 4:02 AM

Answers

  • Sankar Reddy,

     

    I think that the author meant that if you need SQL Server to generate a different plan for each branch, based on the parameters passed, then you will have to recompile the SP, or in 2005/2008 use OPTION (RECOMPILE) at the statement level. Partitioning the SP will create the plan for each sub-procedure only when that sub-procedure is executed.
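    As an illustration only (not the author's exact code), a statement-level variant of the Customer_Get procedure from the question might look like this, assuming SQL Server 2005 or later:

```sql
create procedure dbo.Customer_Get @getAll int, @customerID int = null
as
begin
    if (@getAll is not null)
        select FirstName, LastName from dbo.Customer;
    else
        -- Recompiled per execution, so the plan reflects the
        -- actual @customerID value rather than a sniffed one.
        select FirstName, LastName
        from dbo.Customer
        where customerID = @customerID
        option (recompile);
end
go
```

    Only the statement carrying the hint is recompiled; the rest of the procedure keeps its cached plan.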

     

    Hope this helps.

     

     

    AMB

     

     

    Thursday, November 06, 2008 2:22 PM
  • Sankar,

     

    I'd start by recommending that you read Erland Sommarskog's article on dynamic search conditions in T-SQL.

     

    http://www.sommarskog.se/dyn-search-2005.html

     

    I was interested by the SQLMag article, the subject of your post, and Alejandro's response, so I took some time to play with this for a little bit.  I found a neat little TVF on the old SQLCAT blog that splits the compiled plan for the SP and the executing SQL text out of the output of sys.dm_exec_sql_text, so you can compare the different executions independently and easily:

     

    http://blogs.msdn.com/sqlcat/archive/2005/09/23/473367.aspx

     

    Using this, I ran a number of tests.  I wasn't satisfied at first with your sample table size, so I ran an insert into your table from AdventureWorks:

     

    Code Snippet

    INSERT INTO tempdb.dbo.Customer(FirstName, LastName)

    select FirstName, LastName

    from (select top 2000 FirstName from Person.Contact order by ContactID) a

    cross join (select top 2000 LastName from Person.Contact order by ContactID) b

     

     

    This builds 4 million records in about 4 minutes on my laptop, which is hardly a performance rig.  It didn't make much difference to the execution plans, but I had to check and make sure you weren't just getting odd results from a small dataset, and you weren't.

     

    They always produced the same plan, whether using OPTION (RECOMPILE) or not, and calling the procedure in different orders with different parameters.  What was really interesting was that when I abstracted the selects into their own separate procedures, the execution plans were identical to the statement-level plans used when running the single procedure with the various parameters.  I even ran it like this:

     

    Code Snippet

    DBCC FREEPROCCACHE

    GO

    exec dbo.Customer_Get @getAll = 1, @customerID = null

    GO

    select pln.*, req.*

    from sys.dm_exec_query_stats req

    CROSS APPLY statement_level_query_plan(plan_handle) as pln

    where statement_text like

    '%' +

    replace(

    left(

    substring((select text from master.sys.dm_exec_sql_text(sql_handle)),

    statement_start_offset/2,

    1+ case when statement_end_offset = -1

    then LEN((select text from master.sys.dm_exec_sql_text(sql_handle))) - statement_start_offset/2

    else statement_end_offset/2 - statement_start_offset/2

    end)

    ,3000)

    , '[','[[]') + '%'

    GO

    DBCC FREEPROCCACHE

    GO

    exec dbo.Customer_Get @getAll = null, @customerID = 2

    GO

    select pln.*, req.*

    from sys.dm_exec_query_stats req

    CROSS APPLY statement_level_query_plan(plan_handle) as pln

    where statement_text like

    '%' +

    replace(

    left(

    substring((select text from master.sys.dm_exec_sql_text(sql_handle)),

    statement_start_offset/2,

    1+ case when statement_end_offset = -1

    then LEN((select text from master.sys.dm_exec_sql_text(sql_handle))) - statement_start_offset/2

    else statement_end_offset/2 - statement_start_offset/2

    end)

    ,3000)

    , '[','[[]') + '%'

    GO

    DBCC FREEPROCCACHE

    GO

    exec dbo.Customer_Get @getAll = null, @lastname = 'Somebody'

    GO

    select pln.*, req.*

    from sys.dm_exec_query_stats req

    CROSS APPLY statement_level_query_plan(plan_handle) as pln

    where statement_text like

    '%' +

    replace(

    left(

    substring((select text from master.sys.dm_exec_sql_text(sql_handle)),

    statement_start_offset/2,

    1+ case when statement_end_offset = -1

    then LEN((select text from master.sys.dm_exec_sql_text(sql_handle))) - statement_start_offset/2

    else statement_end_offset/2 - statement_start_offset/2

    end)

    ,3000)

    , '[','[[]') + '%'

    GO

    DBCC FREEPROCCACHE

     

     

    and the plans were still identical to the ones created initially.  The only difference was the use of FULL optimization over the four-million-row table versus TRIVIAL for the 4-row output from your sample, which is to be expected.  So it raises the question: where would it need to recompile?
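    For reference, the split into separate procedures described above might be sketched like this (the procedure names are illustrative, not the exact ones used in the tests):

```sql
create procedure dbo.Customer_GetAll
as
    -- No filter: the optimizer can only scan.
    select FirstName, LastName from dbo.Customer;
go

create procedure dbo.Customer_GetByID @customerID int
as
    -- Filter on the unique clustered key: a seek, regardless of the value.
    select FirstName, LastName from dbo.Customer where customerID = @customerID;
go
```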

     

    Erland's article provides specific examples where WITH RECOMPILE helps, but the code in those cases is decidedly different from the simple code you are executing here; I plan to do some testing around his code as well.

    Friday, November 07, 2008 4:25 AM
  • Jonathan,

     

    I wouldn't expect different plans in this case if we put each statement in a separate SP. The statement:

     

    select customerid, firstname, lastname from dbo.customers

     

    will use a scan on the table (the clustered index, or a covering nonclustered index) because it has no filter, while the statement:

     

    select customerid, firstname, lastname from dbo.customers where customerid = @customerid

     

    will use a clustered index seek no matter the value of @customerid, because that column is unique, so whatever the value, the expected number of rows is 1.

     

    Let me use another example where we can see that using OPTION (RECOMPILE), or putting the statement in a different SP, gives us a different result. OPTION (RECOMPILE) is good when the distribution of values in the column in question is not balanced: we get a plan based on the statistics for the current parameter value, not the value sniffed at compilation time, and the recompilation happens at the statement level only, not for the whole procedure as it did in SQL Server 2000.

     

    In SSMS, press Ctrl+K to include the actual execution plan, and execute this script.

     

    USE [AdventureWorks]

    GO

     

    DBCC freeproccache

    GO

     

    DECLARE @sql NVARCHAR(4000)

    SET @sql = N'

    if @productid is null

    select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount from Sales.SalesOrderDetail

    else

    select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount from Sales.SalesOrderDetail where productid = @productid

    '

    EXEC sp_executesql @sql, N'@productid int', NULL

    GO

     

    --DBCC freeproccache

    --GO

     

    DECLARE @sql NVARCHAR(4000)

    SET @sql = N'

    if @productid is null

    select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount from Sales.SalesOrderDetail

    else

    select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount from Sales.SalesOrderDetail where productid = @productid

    '

    EXEC sp_executesql @sql, N'@productid int', 870

    GO

     

    Notice that the plan for the second batch uses an index seek on [AdventureWorks].[Sales].[SalesOrderDetail].[IX_SalesOrderDetail_ProductID]; this is because in the first batch the plan chosen for this statement was based on the value NULL. It is like executing:

     

    DBCC freeproccache

    GO

     

    DECLARE @sql NVARCHAR(4000)

     

    SET @sql = N'

    select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount

    from Sales.SalesOrderDetail

    where productid = @productid

    '

     

    EXEC sp_executesql @sql, N'@productid int', NULL

    GO

     

    Now, there are 4,688 rows for [productid] = 870, and if we remove the comment marks ("--") from the intermediate "DBCC FREEPROCCACHE", you will see a new plan, using an index scan. The same plan will be chosen if we use OPTION (RECOMPILE) on the statement with "where productid = @productid".
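    The OPTION (RECOMPILE) variant mentioned above might be sketched like this (illustrative only):

```sql
DECLARE @sql NVARCHAR(4000)
SET @sql = N'
select SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, UnitPriceDiscount
from Sales.SalesOrderDetail
where productid = @productid
option (recompile)
'

-- Each execution is optimized for the actual parameter value,
-- not the value sniffed at first compilation.
EXEC sp_executesql @sql, N'@productid int', @productid = NULL
EXEC sp_executesql @sql, N'@productid int', @productid = 870
```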

     

     

    AMB

    Friday, November 07, 2008 4:35 PM

All replies

  • We have a dummy SP with one branch leading to a clustered index scan and the other leading to a clustered index seek. I have also set up a Profiler trace with the CacheHit, CacheMiss, and Recompile events. Let's run it with both sets of parameters.

    Code Snippet

    exec dbo.Customer_Get @getAll = 1, @customerID = null

    go

    exec dbo.Customer_Get @getAll = null, @customerID = 2

     

     

     

    What I see in profiler trace is listed below.

    Code Snippet

    SP:CacheMiss --exec dbo.Customer_Get @getAll = 1, @customerID = null

    SP:CacheInsert --exec dbo.Customer_Get @getAll = 1, @customerID = null

    SP:CacheMiss --exec dbo.Customer_Get @getAll = null, @customerID = 2

    SP:CacheHit --exec dbo.Customer_Get @getAll = null, @customerID = 2

     

     

    I don't see a recompile. OK, let's have another look from a different angle.

     

    Code Snippet

    select object_name(objectid), *

          from master.sys.dm_exec_query_stats qs CROSS APPLY sys.dm_exec_query_plan (qs.plan_handle)

    where dbid = (select db_id()) and objectid = (object_id('dbo.Customer_Get'));

     

     

      The above query gives us 2 rows. The sql_handle and plan_handle of both rows are the same. What's interesting is the creation_time column: it is also the same. We don't have 2 plan handles, only one covering both execution plans. The last execution time differs, as we expected. If we scroll to the right and look at query_plan, we can see the query plans are exactly the same. During the first compilation alone, SQL Server builds the plan for all branches; it keeps using that same plan, and I don't see a recompile happening.

     

    I am trying to understand what is meant by "recompile" in this context. Certainly it is not parsing the whole query, building the tree, and coming up with the plan again. Does "recompile" mean that only one plan per objectid is stored for the SP, so that when a different branch (branch B) is taken, SQL Server looks for branch B in the cache, gets a cache miss, and then uses the plan generated for branch B during the initial compilation? Any comments if I got this wrong?
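    One way to watch plan reuse rather than recompilation is to check usecounts in the plan cache; a sketch (the LIKE filter is just an illustration):

```sql
-- usecounts should climb with each execution of the procedure,
-- with no additional compiled-plan entry appearing for it.
select cp.usecounts, cp.cacheobjtype, cp.objtype, st.[text]
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.[text] like '%Customer_Get%'
  and st.[text] not like '%dm_exec_cached_plans%'; -- exclude this query itself
```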

     

    Tuesday, November 04, 2008 4:04 AM
  • Alejandro & Jonathan,

     

    Thanks for taking the time to reply to this thread. Alejandro's initial succinct answer and the follow-up reply hit the nail on the head.

     

    Erland's article and the white paper Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 are good reading material on this topic.

     

    Thanks,

    Sankar

    Saturday, November 08, 2008 5:37 AM
  • Sankar Reddy,

     

    No doubt those two white papers (and yes, I consider Erland's article a white paper in this regard) are very good material on this theme. I also recommend reading:

     

    Statistics Used by the Query Optimizer in Microsoft SQL Server 2005

     

    Inside Microsoft® SQL Server(TM) 2005: Query Tuning and Optimization

     

    Plan Cache Concepts Explained

     

     

    AMB

    Saturday, November 08, 2008 4:49 PM