Query runs a long time when I include actual execution plan

    Question

  • I'm trying to troubleshoot why a query runs in 1 second vs. 18 seconds in 2 very similar environments on different servers. The database where it runs fast is a restore of a backup of the database where it runs slow. To help me figure out what was going on, I decided to include the actual execution plan and do some analysis between the 2 runs. When I included the actual execution plan on the "fast" server, the query took 20 seconds. It's a little too coincidental that it's so close to the 18 seconds it takes on the "slow" server. So I'm wondering: what could be causing the slowdown simply by enabling "Include Actual Execution Plan"? And is there any other way to compare actual plans besides setting it in the SSMS GUI?
    • Edited by fd1212 Monday, November 19, 2012 11:58 PM
    • Moved by Tom Phillips (Moderator) Tuesday, November 20, 2012 2:51 PM Query optimization question (From: SQL Server Database Engine)
    Monday, November 19, 2012 11:57 PM

Answers

  • There are only a couple of ways to get an actual execution plan: 1) from Management Studio, via 'Include Actual Execution Plan', or 2) from Profiler (or a server-side trace), by adding the 'Performance - Showplan XML Statistics Profile' event.

    Actual execution plans are much more useful than estimated ones, so it's good to get hold of them. The top-level percentage is also just an estimate (even in actual execution plans), so whilst you can use it as a guide, it can be misleading. The plans held in the cache will never include actuals information, because that is per execution, and a plan may be used for multiple executions. You can get some execution stats from sys.dm_exec_query_stats, but really the best way to collect this type of information is using Profiler (or a server-side trace). The new-fangled way of doing this is using Extended Events, which, although available in SQL 2008 (R2), weren't that easy to use. There is a much fuller implementation of these in SQL 2012, including a GUI.
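
    For example, a quick look at the cumulative execution stats from the DMV mentioned above might look like this (a minimal sketch; the TOP/ORDER BY choices are just one reasonable way to slice it):

    SELECT TOP (20)
        qs.execution_count,
        qs.total_worker_time / 1000  AS total_cpu_ms,      -- worker time is in microseconds
        qs.total_elapsed_time / 1000 AS total_elapsed_ms,
        qs.total_logical_reads,
        SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
            ((CASE qs.statement_end_offset
                  WHEN -1 THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
              END - qs.statement_start_offset) / 2) + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_elapsed_time DESC;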

    If you're working on a dev or test box, fire up Profiler, use the blank template and add the SP:StmtCompleted and SQL:StmtCompleted events. These are available under the 'Stored Procedures' and 'TSQL' sections respectively. Include the CPU, Duration, Reads and Writes columns. Add a filter for your SPID (the session in Management Studio you are connected with, i.e. the one you are going to run the proc in).

    Traces can record a lot of information, so be sure to add the SPID filter; optionally add filters for Duration (e.g. > 1000, i.e. longer than 1 second) or other columns of interest to you.

    Start the trace, then run the proc. This should give you a line-by-line breakdown of your stored proc with a duration for each line. If you like, repeat this process on both servers and save the trace files to a table for comparison.
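
    Loading a saved trace file into a table can be done with fn_trace_gettable (a sketch; the file path and table name are placeholders):

    SELECT *
    INTO dbo.TraceForComparison
    FROM sys.fn_trace_gettable(N'C:\Traces\SlowServer.trc', DEFAULT);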

    The actual execution plan and a detailed trace should help you track down why your proc is behaving differently on the two servers. Areas where I would pay close attention (some of which have already been mentioned):

    • server cores and memory, max memory
    • tempdb setup ( eg how many data files does tempdb have )
    • server-level 'Max Degree of Parallelism'
    • server service pack level ( available with @@version - are they identical? )
    • server load - is one under heavy load
    • fragmentation / statistics up-to-date
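
    A quick way to compare several of the items above on both servers (a minimal sketch using standard catalog views):

    SELECT @@VERSION AS server_version;    -- build / service pack level

    SELECT name, value_in_use
    FROM sys.configurations
    WHERE name IN ('max degree of parallelism',
                   'max server memory (MB)',
                   'cost threshold for parallelism');

    SELECT COUNT(*) AS tempdb_data_files   -- tempdb data file count
    FROM tempdb.sys.database_files
    WHERE type_desc = 'ROWS';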

    You can also use the free version of the tool 'SQL Sentry Plan Explorer'.

    Apologies for some repetition of some material in this lengthy chain.

    HTH

    Tuesday, November 20, 2012 5:15 PM

All replies

  • Try:

    1) In a query window:

    SET STATISTICS PROFILE ON
    GO
    -- your T-SQL here
    SET STATISTICS PROFILE OFF
    GO

    2) Via SQL Profiler: select 'Show all events', and under the Performance category select 'Showplan Statistics Profile' and/or 'Showplan XML Statistics Profile'.



    Please click the Mark as Answer button if a post solves your problem!



    Tuesday, November 20, 2012 1:38 AM
  • Hi Fd1212,

    That is expected, because if the 2 servers have different hardware (CPU, RAM, disk/SAN) configurations, different sp_configure settings, different 'cost threshold for parallelism' values, etc., timings will differ; additional load also matters.

    The other thing you can do is display the estimated execution plan (it doesn't execute the query, it just creates a plan, so it is not as accurate), but it is better to go with the actual execution plan.


    Thanks, Rama Udaya.K. "You only live once, but if you do it right, once is enough." (Mae West) ---------------------------------------- Please remember to mark the replies as answers if they help and unmark them if they provide no help; vote if they give you information.

    • Proposed as answer by vr.babu Tuesday, November 20, 2012 5:45 AM
    Tuesday, November 20, 2012 2:15 AM
  • The servers are identical in hardware and SAN configuration. The ESTIMATED plan displays similar information on both servers; that's why I went to ACTUAL, hoping it would give more information. But when I use "Include Actual Execution Plan", the procedure on the "fast" server goes from 1 second to 20 seconds. Any other ideas?
    Tuesday, November 20, 2012 3:38 AM
  • I tried the "set statistics profile" option already, but did not find anything useful, probably because it's displaying a lot of "Estimated" results:

    Estimated Rows

    Estimated IO

    Estimated CPU

    I need actuals to do some comparison; the estimates are the same on both servers.

    SQL Profiler displays data at too high a level. The CPU and DURATION are much higher on one server than on the other; that's what started my investigation :)

    I need to know what is causing CPU/DURATION to be higher. Any other ideas?

    Tuesday, November 20, 2012 3:46 AM
  • You should be able to get the plan from

    select DB_NAME(s.dbid) as [db_name],usecounts,cacheobjtype,s.objectid,[text],qp.query_plan
    from sys.dm_exec_cached_plans p
    cross apply sys.dm_exec_sql_text(plan_handle) as s
    cross apply sys.dm_exec_query_plan(plan_handle) as qp
    where cacheobjtype = 'Compiled Plan'
    and text like '% part of your query here%'



    Regards, Matt Bowler MCITP, My blog | SQL Services

    Tuesday, November 20, 2012 5:46 AM
  • Try:

    1) In a query window:

    SET STATISTICS PROFILE ON
    GO
    -- your T-SQL here
    SET STATISTICS PROFILE OFF
    GO

    2) Via SQL Profiler: select 'Show all events', and under the Performance category select 'Showplan Statistics Profile' and/or 'Showplan XML Statistics Profile'.



    Please click the Mark as Answer button if a post solves your problem!



    I set the filter as you suggested and ran my procedure. I saved the Profiler results into a table and there are over 22,000 rows, all beginning with "ShowPlanXML......". What can I do with this data to help me solve my problem? That amount of data is really not useful unless I know what to look for. Thanks.

    • Edited by fd1212 Tuesday, November 20, 2012 6:34 AM
    Tuesday, November 20, 2012 6:22 AM
  • You should be able to get the plan from

    select DB_NAME(s.dbid) as [db_name],usecounts,cacheobjtype,s.objectid,[text],qp.query_plan
    from sys.dm_exec_cached_plans p
    cross apply sys.dm_exec_sql_text(plan_handle) as s
    cross apply sys.dm_exec_query_plan(plan_handle) as qp
    where cacheobjtype = 'Compiled Plan'
    and text like '% part of your query here%'



    Regards, Matt Bowler MCITP, My blog | SQL Services

    I ran your SQL and got 27 rows back. I'll need to go through those. Does anyone have any ideas on my original issue? Why would including the actual plan in the results increase the run time from 1 second to 20? I have a feeling that could be the cause for other procs I'm seeing with the same issue, where they are much faster in an otherwise identical environment.
    Tuesday, November 20, 2012 6:32 AM
  • Not sure why this would be the case. SQL Server is generating (or using) query plans whether we decide to look at them or not. There is usually a little overhead involved in rendering the plan in SSMS, but I couldn't see that taking 20 seconds.

    Is everything else equal between runs? Is the buffer cache being cleared? Are there other workloads running?

    First step is to compare the plans from the cache - hopefully you can narrow down the results from the query above.

    If the plans are different - we need to try and understand why.

    If the plans are the same - and the SQL Server configurations are the same then we might need to look to OS/network/hardware...


    Regards, Matt Bowler MCITP, My blog | SQL Services

    Tuesday, November 20, 2012 6:57 AM
  • Did you try running it from another SQL Server? Does it also take that long there?

    Also, are the statistics up to date?

    While the query was running, did you check what it was waiting on?
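
    For example (a minimal sketch; the session_id value is a placeholder for the SPID of the session running the query):

    SELECT session_id, status, wait_type, wait_time, last_wait_type,
           cpu_time, total_elapsed_time
    FROM sys.dm_exec_requests
    WHERE session_id = 55;  -- replace 55 with the SPID running the query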


    Thanks, Rama Udaya.K. "You only live once, but if you do it right, once is enough." (Mae West) ---------------------------------------- Please remember to mark the replies as answers if they help and unmark them if they provide no help; vote if they give you information.

    Tuesday, November 20, 2012 10:56 AM
  • If you choose "Include Actual Execution Plan", it makes sense that it takes more time to return the result set, because SSMS needs time to draw the graphical execution plan, especially the first time you run the SQL after enabling the option. However, the time used to draw the graph should not be counted as part of the total execution time.

    The formula: total execution time = SQL Server parse and compile time + SQL Server Execution Times.

    Now run two experiments.

    1) Without including the actual execution plan:

    SET STATISTICS TIME ON
    GO
    -- Your T-SQL code
    SET STATISTICS TIME OFF
    GO

    2) Including the actual execution plan:

    SET STATISTICS TIME ON
    GO
    -- Your T-SQL code
    SET STATISTICS TIME OFF
    GO

    Then, on the Messages tab, you will find the SQL Server parse and compile time and the SQL Server Execution Times.

    Tell me if the total times of the two experiments are similar.



    Please click the Mark as Answer button if a post solves your problem!



    • Edited by Eric--(Feng) Tuesday, November 20, 2012 12:39 PM
    Tuesday, November 20, 2012 12:33 PM
  • If you choose "Include Actual Execution Plan", it makes sense that it takes more time to return the result set, because SSMS needs time to draw the graphical execution plan, especially the first time you run the SQL after enabling the option. However, the time used to draw the graph should not be counted as part of the total execution time.

    The formula: total execution time = SQL Server parse and compile time + SQL Server Execution Times.

    Now run two experiments.

    1) Without including the actual execution plan:

    SET STATISTICS TIME ON
    GO
    -- Your T-SQL code
    SET STATISTICS TIME OFF
    GO

    2) Including the actual execution plan:

    SET STATISTICS TIME ON
    GO
    -- Your T-SQL code
    SET STATISTICS TIME OFF
    GO

    Then, on the Messages tab, you will find the SQL Server parse and compile time and the SQL Server Execution Times.

    Tell me if the total times of the two experiments are similar.



    Please click the Mark as Answer button if a post solves your problem!



    The total time is still much greater when including the actual plan. I won't post the entire timing results, just the part that shows the difference. The rest is pretty much the same other than this snippet:

    Display Actual Plan UNCHECKED:

    ----------------------------------------------

     SQL Server Execution Times:
       CPU time = 0 ms,  elapsed time = 0 ms.
    SQL Server parse and compile time: 
       CPU time = 0 ms, elapsed time = 0 ms.

     SQL Server Execution Times:
       CPU time = 1372 ms,  elapsed time = 1369 ms.

    (53 row(s) affected)

     SQL Server Execution Times:
       CPU time = 1450 ms,  elapsed time = 1446 ms.

     SQL Server Execution Times:
       CPU time = 0 ms,  elapsed time = 0 ms.

    Display Actual Plan CHECKED:

    ----------------------------------------------

     SQL Server Execution Times:
       CPU time = 0 ms,  elapsed time = 0 ms.
    SQL Server parse and compile time: 
       CPU time = 0 ms, elapsed time = 0 ms.

    (53 row(s) affected)

    (1 row(s) affected)

     SQL Server Execution Times:
       CPU time = 19625 ms,  elapsed time = 19665 ms.

     SQL Server Execution Times:
       CPU time = 19688 ms,  elapsed time = 19737 ms.

     SQL Server Execution Times:
       CPU time = 0 ms,  elapsed time = 0 ms.

    I notice there's an additional "1 row affected" when the option is checked.  Must be inserting into some internal table?


    • Edited by fd1212 Tuesday, November 20, 2012 12:59 PM
    Tuesday, November 20, 2012 12:59 PM
  • Please run the following SQL both with and without including the actual plan (I want to check the IO statistics and query plan), and post the entire text query plan with the headers (Rows, Executes, StmtText, StmtId, NodeId, ...), plus all the messages on the Messages tab. If you cannot post it all at once, you can post it in several parts.

    1) Without including the actual plan:

    SET STATISTICS TIME ON
    SET STATISTICS IO ON
    SET STATISTICS PROFILE ON
    GO

    -- Your T-SQL code

    SET STATISTICS PROFILE OFF
    SET STATISTICS IO OFF
    SET STATISTICS TIME OFF
    GO

    2) Including the actual plan:

    SET STATISTICS TIME ON
    SET STATISTICS IO ON
    SET STATISTICS PROFILE ON
    GO

    -- Your T-SQL code

    SET STATISTICS PROFILE OFF
    SET STATISTICS IO OFF
    SET STATISTICS TIME OFF
    GO


    Please click the Mark as Answer button if a post solves your problem!

    Tuesday, November 20, 2012 1:06 PM
  • It's expected behaviour that enabling the actual execution plan adds extra time. How much depends on the query you are running: if it's WHILE/cursor based, SSMS has to draw the plan for each loop iteration, hence the extra time.

    The second (1 row(s) affected) is for the plan output. It is not necessarily an insert that causes the extra message; even a SELECT statement will report rows affected.

    Try the below (an OBJECT_ID guard is added so the DROP doesn't fail on the first run):

    IF OBJECT_ID('dbo.T1') IS NOT NULL DROP TABLE T1
    CREATE TABLE T1 (id int)
    INSERT INTO T1 SELECT 1
    GO 100


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Tuesday, November 20, 2012 1:33 PM
  • What's the best format to post the information, specifically the query plan?

    Note: in addition to the user data that is returned, setting those options returned 3 additional result sets. I compared the result sets between the 2 runs (with and without including the actual plan) and the results ARE IDENTICAL in every row and column (with the exception of the StmtId). Is that information still useful even though they are identical between the 2 tests?


    Tuesday, November 20, 2012 1:42 PM
  • What's the best format to post the information, specifically the query plan?

    Don't worry about the format. For the query plan, just copy it (select 'Copy with Headers') and paste it here; it will be restored to a good format when I pull the text into my SSMS editor.


    Please click the Mark as Answer button if a post solves your problem!

    Tuesday, November 20, 2012 1:45 PM
  • It's expected behaviour that enabling the actual execution plan adds extra time. How much depends on the query you are running: if it's WHILE/cursor based, SSMS has to draw the plan for each loop iteration, hence the extra time.

    The second (1 row(s) affected) is for the plan output. It is not necessarily an insert that causes the extra message; even a SELECT statement will report rows affected.

    Try the below (an OBJECT_ID guard is added so the DROP doesn't fail on the first run):

    IF OBJECT_ID('dbo.T1') IS NOT NULL DROP TABLE T1
    CREATE TABLE T1 (id int)
    INSERT INTO T1 SELECT 1
    GO 100


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    I think this may be the case, and I am chasing a red herring. My initial issue was a proc running 1 sec vs. 18 secs in what I believe to be identical environments. It's just a coincidence that the overhead of creating the actual plan in the "fast" environment caused the proc to run for 20 seconds. I thought that had some significance, but now I'm thinking it's not a problem. I'm going to assume the returned query plan is accurate and the additional time is just overhead. Now I can compare the 2 actual plans and maybe something will jump out at me. I may ask for assistance with the 2 plans to try to find out where the problem lies.
    Tuesday, November 20, 2012 2:05 PM
  • OK, now I have 2 actual query plans from 2 separate queries. One runs in 1 second, the other in 18 seconds (when the plan is not included in the output). When the actual plans are included, the queries take 31 and 41 seconds respectively. Looking at the visual graph, the %'s of each step are the same between the 2 plans. Where can I see actual duration so I can tell which step(s) are taking significantly longer? Percentages are good when analyzing a single plan, but not much help when comparing these 2 plans. Thanks for all your help!
    Tuesday, November 20, 2012 3:21 PM
  • You may look at each query's cost; that will be a starting point. Then you can start analysing the query. Also use STATISTICS IO and TIME to get more information.

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Tuesday, November 20, 2012 3:49 PM
  • You may look at each query's cost; that will be a starting point. Then you can start analysing the query. Also use STATISTICS IO and TIME to get more information.

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    The costs are nearly identical. Interestingly, the plans were pulled from the DM tables/views, which I believed were the actual plans, yet when I hover over the graphical display there are 7 data elements that are all labeled "Estimated".

    Where can I get the STATISTICS IO and TIME that you mention? Does the graphical version of the query plan show duration? That's all I need; I feel like I'm so close.

    Thanks

    Tuesday, November 20, 2012 4:28 PM
  • There are only a couple of ways to get an actual execution plan: 1) from Management Studio, via 'Include Actual Execution Plan', or 2) from Profiler (or a server-side trace), by adding the 'Performance - Showplan XML Statistics Profile' event.

    Actual execution plans are much more useful than estimated ones, so it's good to get hold of them. The top-level percentage is also just an estimate (even in actual execution plans), so whilst you can use it as a guide, it can be misleading. The plans held in the cache will never include actuals information, because that is per execution, and a plan may be used for multiple executions. You can get some execution stats from sys.dm_exec_query_stats, but really the best way to collect this type of information is using Profiler (or a server-side trace). The new-fangled way of doing this is using Extended Events, which, although available in SQL 2008 (R2), weren't that easy to use. There is a much fuller implementation of these in SQL 2012, including a GUI.

    If you're working on a dev or test box, fire up Profiler, use the blank template and add the SP:StmtCompleted and SQL:StmtCompleted events. These are available under the 'Stored Procedures' and 'TSQL' sections respectively. Include the CPU, Duration, Reads and Writes columns. Add a filter for your SPID (the session in Management Studio you are connected with, i.e. the one you are going to run the proc in).

    Traces can record a lot of information, so be sure to add the SPID filter; optionally add filters for Duration (e.g. > 1000, i.e. longer than 1 second) or other columns of interest to you.

    Start the trace, then run the proc. This should give you a line-by-line breakdown of your stored proc with a duration for each line. If you like, repeat this process on both servers and save the trace files to a table for comparison.

    The actual execution plan and a detailed trace should help you track down why your proc is behaving differently on the two servers. Areas where I would pay close attention (some of which have already been mentioned):

    • server cores and memory, max memory
    • tempdb setup ( eg how many data files does tempdb have )
    • server-level 'Max Degree of Parallelism'
    • server service pack level ( available with @@version - are they identical? )
    • server load - is one under heavy load
    • fragmentation / statistics up-to-date

    You can also use the free version of the tool 'SQL Sentry Plan Explorer'.

    Apologies for some repetition of some material in this lengthy chain.

    HTH

    Thanks for your response.   Here's an update, and also a few questions.

    • servers are identical (CPU, RAM, SAN, etc)
    • Both running SQL Server 10.50.2550 64-bit
    • I ran my tests at 3 AM, when the load was less than 1% and no background processes were running
    • Stats are up to date in both databases
    • MAXDOP is set to 6 on each server
    • Each DB is configured with 8 tempdb data files
    • I already did some profiling; that's what led me to conclude it's related to query plans. In addition to the duration being much higher, the CPU and READS were also much higher. Writes were the same.

    One question about using the StmtCompleted events: is a SELECT statement considered 1 statement in its entirety? The procedure only has 1 SELECT statement (with joins, function calls, CROSS APPLYs, etc.). Does the trace get to a more granular level than a SELECT statement? That's why I thought the query plan would be helpful, because it shows all the components of the SELECT. I will try a more detailed trace after hours, but only if it can break the information down to a more granular level.

    I do have actual plans that I got from Management Studio (Include Actual Execution Plan). However, I could not find a statistic that showed duration; all I could find was %. The %'s were the same for both plans, just the cumulative times were way off.

    Thanks again for sticking with this. We have a number of procs that are showing similar behavior. I think if I can figure this one out, or at least identify the cause, I will be able to apply that knowledge to the other procs. I didn't think I'd have this much trouble identifying the problem. The range of tools, stats, etc. in SQL Server is extensive; I just need to find the one that will show the discrepancy. Maybe it's the detailed trace, as you mentioned.

    Tuesday, November 20, 2012 5:49 PM
  • If the proc literally only has the one statement then no, a detailed trace won't help. Post up the XML plans on SkyDrive, Dropbox etc. and post a link. Just to rule out the obvious: have you checked the volumes? You mention stats but not fragmentation. How many CPUs has the box got? Try to capture the wait stats for the proc next time you execute it (e.g. snapshot sys.dm_os_waiting_tasks by session_id into a table every second whilst the proc is running).
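
    A minimal sketch of that polling approach (the table name and session_id are placeholders):

    CREATE TABLE dbo.WaitSnapshots (
        capture_time         datetime NOT NULL DEFAULT GETDATE(),
        session_id           smallint,
        wait_type            nvarchar(60),
        wait_duration_ms     bigint,
        blocking_session_id  smallint,
        resource_description nvarchar(3072)
    );

    WHILE 1 = 1  -- stop the loop manually once the proc has finished
    BEGIN
        INSERT INTO dbo.WaitSnapshots
            (session_id, wait_type, wait_duration_ms, blocking_session_id, resource_description)
        SELECT session_id, wait_type, wait_duration_ms, blocking_session_id, resource_description
        FROM sys.dm_os_waiting_tasks
        WHERE session_id = 55;  -- the SPID running the proc (replace)

        WAITFOR DELAY '00:00:01';  -- snapshot every second
    END;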
    Tuesday, November 20, 2012 6:23 PM
  • If the proc literally only has the one statement then no, a detailed trace won't help. Post up the XML plans on SkyDrive, Dropbox etc. and post a link. Just to rule out the obvious: have you checked the volumes? You mention stats but not fragmentation. How many CPUs has the box got? Try to capture the wait stats for the proc next time you execute it (e.g. snapshot sys.dm_os_waiting_tasks by session_id into a table every second whilst the proc is running).

    No additional waits were seen while running the procedure. The DB where it runs quickly is a restore of a backup of the other one, done nightly. Is it safe to assume all the internals (stats, fragmentation, etc.) will be the same in the restored DB? Obviously the volumes are the same. I'll see if I can get the XML on a shared site. Thanks again.


    Tuesday, November 20, 2012 6:34 PM
  • Yes, internals should be the same. We'll keep trying :) Continuing to rule out the obvious: confirm no non-system traces are running (sys.traces)? The high volume of ShowPlan events you mentioned previously suggests scalar functions in the SELECT. We'll know for sure from the plan. These can be performance killers and are made worse by things like long-running detailed traces.
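
    A quick check for non-default traces might look like this (a sketch):

    SELECT id, path, status
    FROM sys.traces
    WHERE is_default = 0;  -- anything returned here is a user or third-party trace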
    Tuesday, November 20, 2012 6:55 PM
  • Yes, internals should be the same. We'll keep trying :) Continuing to rule out the obvious: confirm no non-system traces are running (sys.traces)? The high volume of ShowPlan events you mentioned previously suggests scalar functions in the SELECT. We'll know for sure from the plan. These can be performance killers and are made worse by things like long-running detailed traces.

    No other traces. The SELECT contains 4 CROSS APPLYs to functions that return table variables.
    Tuesday, November 20, 2012 7:04 PM
  • It also contains 3 scalar functions in the select list. I've been doing some research, and I didn't realize scalar functions were so bad. That explains why the original writer has some UDFs that return tables when in reality they will only ever return 1 value. Maybe those were written after he learned scalar functions weren't as efficient.
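
    To illustrate the pattern being discussed (a sketch with hypothetical table and function names, not the actual code from this thread):

    -- Scalar UDF: called once per row of the outer query, and it can inhibit parallelism
    CREATE FUNCTION dbo.CustomerNameScalar (@CustomerID int)
    RETURNS nvarchar(100)
    AS
    BEGIN
        RETURN (SELECT Name FROM dbo.Customer WHERE CustomerID = @CustomerID);
    END
    GO

    -- Inline TVF: a single statement, so the optimizer can expand it into the main plan
    CREATE FUNCTION dbo.CustomerNameInline (@CustomerID int)
    RETURNS TABLE
    AS
    RETURN (SELECT Name FROM dbo.Customer WHERE CustomerID = @CustomerID);
    GO

    -- Row-by-row scalar call vs. set-based CROSS APPLY of the inline version
    SELECT o.OrderID, dbo.CustomerNameScalar(o.CustomerID) AS CustomerName
    FROM dbo.Orders AS o;

    SELECT o.OrderID, c.Name AS CustomerName
    FROM dbo.Orders AS o
    CROSS APPLY dbo.CustomerNameInline(o.CustomerID) AS c;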
    Tuesday, November 20, 2012 7:31 PM
  • Can you run this on both servers:

    select * from sys.configurations
    where name in ('cost threshold for parallelism', 'min memory per query (KB)', 'max server memory (MB)')

    Also, and this might sound silly but I've encountered it before, can you check the power-saving settings for the CPUs on both servers...


    Regards, Matt Bowler MCITP, My blog | SQL Services

    Tuesday, November 20, 2012 7:54 PM
  • I apologize, but I'm going to have to put this on hold. We currently have 260 scalar-valued functions used very heavily throughout the application. There are even scalar functions that call other scalar functions. I have a lot of research and documenting to do over the break, but I think eliminating, or at least identifying, the "worst" functions is now a higher priority than finding out why a query is fast in one environment and slow in another.

    thanks for all your help


    • Edited by fd1212 Tuesday, November 20, 2012 8:15 PM
    Tuesday, November 20, 2012 8:11 PM
  • Hi fd,

    Scalar functions are OK for returning scalar values; they are a way of wrapping up regularly used code in one place. However, as soon as you put them in a resultset, e.g. a SELECT list, or use them in a JOIN or WHERE clause, performance can really suffer, because they are called once per row. They can also prevent parallel plans, and if you are tracing at statement level this will absolutely destroy their performance on a resultset of any meaningful size. They are much maligned: there is an article called 'Death by UDF' in the second SQL Server MVP Deep Dives book, Paul White calls them evil, and Itzik Ben-Gan refers to them in the T-SQL Programming book in a section called 'Performance Issues':

    "You should be aware that invoking scalar UDFs in queries has a high cost when you provide the function with attributes from the outer table as inputs."

    Overall they make your code more sensitive to things like volume, tracing, plan changes and server performance. I myself dislike them intensely, having had to deal with performance issues directly related to them.

    However, take a step back. On one of your servers, this query is performing OK. It's not breaking any world records, but it's working. On another similarly spec'd server it is not performing. There must be a reason why. Before you go ahead refactoring all 260+ of your scalar functions, at high dev and test cost, invoking a huge regression test on your entire application, focus on getting to the bottom of this problem. What if you could solve it by stopping that long-running trace, or simply recompiling the proc, or adding an appropriate hint, or realising there is a huge SAN backup going on? All of these are much lower risk than a large-scale refactor. Longer term, you can look to remove your scalar functions in a controlled way.

    I would also direct you to Jeff Moden's article on a similar topic:

    http://www.sqlservercentral.com/articles/T-SQL/91724/

    I hope that makes sense.


    • Edited by wBob Wednesday, November 21, 2012 12:43 PM hyperlink
    Wednesday, November 21, 2012 12:38 PM
  • Hi fd,

    Scalar functions are OK for returning scalar values; they are a way of wrapping up regularly used code in one place. However, as soon as you put them in a resultset, e.g. a SELECT list, or use them in a JOIN or WHERE clause, performance can really suffer, because they are called once per row. They can also prevent parallel plans, and if you are tracing at statement level this will absolutely destroy their performance on a resultset of any meaningful size. They are much maligned: there is an article called 'Death by UDF' in the second SQL Server MVP Deep Dives book, Paul White calls them evil, and Itzik Ben-Gan refers to them in the T-SQL Programming book in a section called 'Performance Issues':

    "You should be aware that invoking scalar UDFs in queries has a high cost when you provide the function with attributes from the outer table as inputs."

    Overall they make your code more sensitive to things like volume, tracing, plan changes and server performance. I myself dislike them intensely, having had to deal with performance issues directly related to them.

    However, take a step back. On one of your servers, this query is performing OK. It's not breaking any world records, but it's working. On another similarly spec'd server it is not performing. There must be a reason why. Before you go ahead refactoring all 260+ of your scalar functions, at high dev and test cost, invoking a huge regression test on your entire application, focus on getting to the bottom of this problem. What if you could solve it by stopping that long-running trace, or simply recompiling the proc, or adding an appropriate hint, or realising there is a huge SAN backup going on? All of these are much lower risk than a large-scale refactor. Longer term, you can look to remove your scalar functions in a controlled way.

    I hope that makes sense.

    Thanks for the advice, but we won't put resources to this task until we've (I) done adequate analysis. That's why I mentioned I have a lot of research and documenting to do. Those 270 functions are used thousands of times throughout 2,270 stored procedures. Some are perfectly fine (used in a SET command), but they are also used in SELECT lists, WHERE clauses, and JOINs. I'm almost 100% sure some of these are really hurting overall performance due to high CPU and IO; now I need to find out which ones to focus on first. Some may easily be converted to inline, but the majority are multi-statement, data-accessing functions. I modified my test query to remove the scalar function calls and use subqueries instead. I couldn't put the entire logic in the subquery, so it's not giving me the exact result set I need and I still have some work to do with it, but it now runs in 1-2 seconds without the scalar function calls. I'm pretty sure that was my issue causing the difference between the 2 environments. From what I've read, when scalar functions are used in the select list the optimizer basically gives up. Even with identical hardware and databases, I'm guessing the optimizer in one environment defaulted to a more efficient path than in the other DB.
    Wednesday, November 21, 2012 1:08 PM
  • You could verify this with a USE PLAN hint (i.e. copy the XML plan from the good server and use it as a hint), e.g.

    select yourSelect
    option ( use plan N'yourGoodPlan' )

    Or use a plan guide. At least then you would know.
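
    The plan-guide route would look something like this (a sketch; the guide name and statement text are placeholders, and @hints would carry the good server's ShowPlan XML):

    EXEC sp_create_plan_guide
        @name            = N'Guide_GoodPlan',
        @stmt            = N'...your statement text here...',
        @type            = N'SQL',
        @module_or_batch = NULL,
        @params          = NULL,
        @hints           = N'OPTION (USE PLAN N''...good plan XML here...'')';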
    Wednesday, November 21, 2012 1:39 PM
  • Also, did you know you're block-quoting the entire previous post in each of your posts, so the thread is getting a bit bloated now :)

    I believe this might be a feature of the NNTP bridge. Are you using that?

    Wednesday, November 21, 2012 1:45 PM
  • I put the scalar functions back in and used the hint with the good plan.  Still took 18 seconds.  Replaced scalar functions again with subqueries, 1 second.
    Wednesday, November 21, 2012 1:58 PM