query performance issue

  • Question

  • I'm trying to run a query in SSMS. If I run it with an actual value, it runs in about 3 seconds. If I run it with a local variable, it takes around 100 seconds. What can I do to improve performance with the variable?

    runs in 3 seconds

    select * from tablex where columnx = '123'

    runs in 100 seconds

    declare @varx   varchar(3)

    set @varx = '123'

    select * from tablex where columnx = @varx

    Wednesday, February 29, 2012 6:50 PM

All replies

  • Run the same query twice; how long does it take?

    declare @varx   varchar(3)

    set @varx = '123'

    select * from tablex where columnx = @varx

    select * from tablex where columnx = @varx               --> this execution, how long does it take?

    Wednesday, February 29, 2012 7:29 PM
  • Looks like the two queries are using two different query plans. This may be due to outdated statistics (and a wrong selectivity estimate for the values). Update the table and index statistics and try again.
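
    For example, a sketch of what that update could look like (tablex stands in for the real table, as in the original post):

    -- Rebuild all statistics on the table from every row (FULLSCAN),
    -- so the optimizer's selectivity estimates are current
    UPDATE STATISTICS tablex WITH FULLSCAN;

    -- Or refresh statistics for every table in the database:
    EXEC sp_updatestats;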

    - Krishnakumar S

    Wednesday, February 29, 2012 7:33 PM
  • When you have a value, then SQL Server decides on an execution plan based on that value. When you have a variable, the optimizer doesn't know the value of that variable, so it has to guess the selectivity. You can for instance add OPTION(RECOMPILE) at the end of the query to make SQL Server sniff the value of the variable. Here's good info about this behaviour: http://msdn.microsoft.com/en-us/library/ee343986.aspx
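
    Applied to the query from the original post, that would look like this (a sketch, using the same tablex/columnx names):

    DECLARE @varx varchar(3)
    SET @varx = '123'

    -- OPTION (RECOMPILE) compiles the plan at execution time, when the
    -- value of @varx is known, so the optimizer can use it like a literal
    SELECT * FROM tablex
    WHERE columnx = @varx
    OPTION (RECOMPILE)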

    Tibor Karaszi, SQL Server MVP | web | blog

    • Proposed as answer by ConsultAjay Wednesday, February 29, 2012 7:42 PM
    Wednesday, February 29, 2012 7:34 PM
  • it takes the same amount of time both times
    Wednesday, February 29, 2012 7:52 PM
  • I tried OPTION(RECOMPILE) and it runs in 73 seconds that way. I'll try updating the statistics.
    Wednesday, February 29, 2012 8:12 PM
  • Then it is time to work with the execution plans, as suggested. OPTION(RECOMPILE) should make the optimizer sniff the variable value, but possibly this was introduced in some more recent version/build.

    Tibor Karaszi, SQL Server MVP | web | blog

    Wednesday, February 29, 2012 8:25 PM
  • The execution plan recommended an index, and now it runs in around 75 seconds. Would still like to get it to 3 seconds.
    Wednesday, February 29, 2012 9:52 PM
  • Are you selecting all the columns in the table using SELECT * FROM... (avoid using *)? If you have a specific column list, then a covering index or the INCLUDE columns option on the index will help performance. Share some more details on what you're looking to achieve.

    - Krishnakumar S

    Wednesday, February 29, 2012 10:09 PM
  • I'm not doing a select '*' (my example above was just a basic example of what I'm trying to do; the actual query involves several tables and is more involved). It seems like if hard-coding the values runs in 3 seconds, there should be a way to make the same query run in about the same time with variables. That's what I'm looking to achieve. That's all I've been changing, and with the actual values it runs in 3 seconds versus around 100 seconds with variables.
    Thursday, March 1, 2012 12:26 AM
  • You are comparing two similar scripts, one with a hard-coded value and the other with a variable in its place, to improve the performance of the latter query. The problem with this approach is that the selectivity of column values, which Tibor and I mentioned in the previous posts, plays its part and can lead to wrong conclusions.

    Selectivity of a column tells the query optimizer how unique a value (or range of values) is when it is used in a WHERE clause or a JOIN. High selectivity means the value or range of values is highly unique within the table or index. The DBCC SHOW_STATISTICS command gives the statistics information for a table or index, and the 'All density' column in the result gives you an idea of the selectivity. A low density value represents high selectivity, and the query optimizer may use that index to limit the number of rows returned and keep the query cost low.
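
    As a sketch, for the table from the original post (the index name IX_tablex_columnx here is hypothetical):

    -- The DENSITY_VECTOR result set contains the 'All density'
    -- column mentioned above; lower density = higher selectivity
    DBCC SHOW_STATISTICS ('tablex', 'IX_tablex_columnx') WITH DENSITY_VECTOR;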

    When you execute the query select * from tablex where columnx = '123', the query optimizer creates an ad hoc query plan (also called an ad hoc workload) and keeps the plan in cache; it helps the engine fetch the rows with the lowest cost for the columnx value '123'. This is based on the statistics analysis and the selectivity of the value. When you rewrite the query to use a variable, the query optimizer compiles the query to a parameterized plan (also called a prepared workload) so the plan can be reused with other variable values. This plan can be different from the first ad hoc plan, since it must fetch rows for any variable value without too much cost to the server. This decision is also based on the available statistics. Run your two queries in a single batch and review the graphical execution plan. You will get some idea of where the performance hit is for the second query.
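
    For instance, with "Include Actual Execution Plan" enabled in SSMS, a single batch like this (again using the names from the original post) shows both plans side by side with their relative costs:

    -- Literal value: the optimizer knows '123' at compile time
    SELECT * FROM tablex WHERE columnx = '123'

    -- Variable: the optimizer must estimate selectivity without the value
    DECLARE @varx varchar(3)
    SET @varx = '123'
    SELECT * FROM tablex WHERE columnx = @varx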

    The best method to improve the query is to take only your second, parameterized query and analyse its plan to find room for improvements. Some general guidelines: avoid table and clustered index scans, try to get index seeks, find missing indexes and covering indexes, find outdated statistics, find fragmented indexes, etc. The following link is worth a read:

    http://msdn.microsoft.com/en-us/library/ms176005.aspx

    If you share the execution plan someone here can help you to improve the query.

    - Krishnakumar S

    Thursday, March 1, 2012 10:28 AM
  • It is not about whatever index might be recommended when you look at the execution plan. It is about understanding the execution plan, and understanding *why* there is a difference between specifying a value vs. having a variable. When you are comfortable with your knowledge of the difference between a value and a variable, you will be better equipped to handle your specific situation. Having said that:

    What version and build number are you on? OPTION (RECOMPILE) should work for you, assuming a high enough version. Here's an example of exactly this case (look at the execution plans), based on the AdventureWorks database:

    CREATE INDEX x ON Sales.SalesOrderDetail(OrderQty)
    
    --Low selectivity, table scan (clustered index scan)
    SELECT * FROM Sales.SalesOrderDetail
    WHERE OrderQty = 2
    GO
    
    --High selectivity, index seek.
    SELECT * FROM Sales.SalesOrderDetail
    WHERE OrderQty = 34
    GO
    
    --High selectivity, but the optimizer doesn't know that, table scan
    DECLARE @a int = 34
    SELECT * FROM Sales.SalesOrderDetail
    WHERE OrderQty = @a
    GO
    
    --High selectivity, optimizer does know that, thanks to OPTION(RECOMPILE). Index seek
    DECLARE @a int = 34
    SELECT * FROM Sales.SalesOrderDetail
    WHERE OrderQty = @a
    OPTION(RECOMPILE)
    GO
    
    --High selectivity, we tell that to the optimizer. Index seek
    DECLARE @a int = 34
    SELECT * FROM Sales.SalesOrderDetail
    WHERE OrderQty = @a
    OPTION(OPTIMIZE FOR (@a = 34))


    Tibor Karaszi, SQL Server MVP | web | blog

    Thursday, March 1, 2012 10:49 AM
  • Using local variables is not great for performance, because it keeps the query optimizer from making a good choice during planning.

    Try making this a parameterized query instead of using a local variable. Check out the post below.

    http://sqlserverpedia.com/blog/sql-server-bloggers/the-problem-with-local-variables/
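
    A parameterized version of the original query could be sketched with sp_executesql (the parameter name @p is arbitrary):

    -- sp_executesql compiles a parameterized, reusable plan; the optimizer
    -- sniffs the parameter value at first compile, unlike a local variable
    EXEC sp_executesql
        N'SELECT * FROM tablex WHERE columnx = @p',
        N'@p varchar(3)',
        @p = '123';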

    Thursday, March 1, 2012 12:29 PM