none
CTE or other replacement for this Query RRS feed

  • Question

  • TI have this query below and on using a profiler trace, I notice that the last select query is where the bottleneck is or at least the part of the query which is taking the longest.

    SELECT
    
    		A.TACCODE
    		A.PERSONURN
    		B.ADDRESSURN
    		0 AS
     RATING
    into
     #ratinginfo
    FROM
     PERSONPROFILE A
    JOIN
     ADDRESSPROFILE B
    ON
     A.ADDRESSURN = B.ADDRESSURN
    
    
    update
     rt
    set
     RATING = 'A'
    
    from
     #ratinginfo rt
    where
     TACCODE  = 
    (
    select
     MAX
    (TACCODE)
    from
     #ratinginfo
    where
     TACCODE = rt.TACCODE
    )
    
    select
     * from
     #ratinginfo
    order
     by
     PERSONURN, ADDRESSURN
    
    

    select
     * from
     #ratinginfo
    order by PERSONURN, ADDRESSURN

    My thoughts are that an index can be placed on the temp table which will cover the 2 indexex, but on the other hand I am thinking if I can avoid the write to the temp table in the first place, I mnight be able to get some gains there, I have read about CTE's and not too sure if they will be of any benefit here.

    Thanks.

    • Edited by MrFlinstone Monday, December 7, 2009 1:22 PM
    Monday, December 7, 2009 1:19 PM

Answers

  • Thanks for the explanation. Your requests makes a lot more sense now.

    This should do the trick.


    ;WITH cte
    AS
    (
    SELECT
    		A.TACCODE
    		A.PERSONURN
    		B.ADDRESSURN
    		'0' AS RATING,
    		B.SCALE,
    		ROW_NUMBER() OVER(PARTITION BY A.TACCODE ORDER BY B.SCALE DESC) AS seq
    FROM PERSONPROFILE A
    INNER JOIN ADDRESSPROFILE B
    	ON A.ADDRESSURN = B.ADDRESSURN
    )
    SELECT 
    	TACCODE
    	PERSONURN
    	ADDRESSURN
    	CASE WHEN seq = 1 THEN 'A' ELSE RATING END AS RATING,
    	SCALE
    FROM cte

    http://jahaines.blogspot.com/
    • Marked as answer by MrFlinstone Monday, December 7, 2009 6:18 PM
    Monday, December 7, 2009 2:43 PM
    Moderator

All replies

  • I also looked at the tempdb information, I can see that when this query runs, it can have about 7 threads within SQL, and I suspect that each of those threads relates to my tempdb file, as the server has 8 cores and I have 8 files for log file for the tempdb.

    The query causing the bottleneck is

    select
     * from
     #ratinginfo
    order by PERSONURN, ADDRESSURN


    I also noticed that its the query above which is also causing the most de-allocations in tempdb.


    Monday, December 7, 2009 1:21 PM
  • I am not sure I understand your query logic here.  The statement below should always UPDATE the same TACCODE because you are grabbing the max where the values are equal correct?  You should be able to select the rating as simply A instead of 0... I believe, but it is tough to say for sure without knowing your data.

    update
     rt
    set
     RATING = 'A'
    
    from
     #ratinginfo rt
    where
     TACCODE  = 
    (
    select
     MAX
    (TACCODE)
    from
     #ratinginfo
    where
     TACCODE = rt.TACCODE
    )
    




    http://jahaines.blogspot.com/
    Monday, December 7, 2009 1:43 PM
    Moderator
  • You can't do UPDATE to a CTE and then a SELECT. CTE's are like derived tables they exist for the life of the batch. Also, as Adam said, what is the logic? Why are you moving your data to a temp table, then updating from it? Why don't you just update using JOINS?


    Abdallah, PMP, ITIL, MCTS
    Monday, December 7, 2009 1:47 PM
  • TI have this query below and on using a profiler trace, I notice that the last select query is where the bottleneck is or at least the part of the query which is taking the longest.

    SELECT
    
    
    
    		A.TACCODE
    
    		A.PERSONURN
    
    		B.ADDRESSURN
    
    		0 AS
    
     RATING
    
    into
    
     #ratinginfo
    
    FROM
    
     PERSONPROFILE A
    
    JOIN
    
     ADDRESSPROFILE B
    
    ON
    
     A.ADDRESSURN = B.ADDRESSURN
    
    
    
    
    
    update
    
     rt
    
    set
    
     RATING = 'A'
    
    
    
    from
    
     #ratinginfo rt
    
    where
    
     TACCODE  = 
    
    (
    
    select
    
     MAX
    
    (TACCODE)
    
    from
    
     #ratinginfo
    
    where
    
     TACCODE = rt.TACCODE
    
    )
    
    
    
    select
    
     * from
    
     #ratinginfo
    
    order
    
     by
    
     PERSONURN, ADDRESSURN
    
    
    
    

    select
    
     * from
    
     #ratinginfo
    order by PERSONURN, ADDRESSURN

    My thoughts are that an index can be placed on the temp table which will cover the 2 indexex, but on the other hand I am thinking if I can avoid the write to the temp table in the first place, I mnight be able to get some gains there, I have read about CTE's and not too sure if they will be of any benefit here.

    Thanks.


    hi,

    I have a doubt in your query...

    select

    MAX

    (TACCODE)

    from

    #ratinginfo

    where

    TACCODE = rt.TACCODE

    What will be the max(TACCODE) when the same TACCODE is passed in where condition ?

    It will result the same

    MAX(1) will be 1 only...

    I dont understand the meaning of this query!!!

    I believe all your records will be updated with RATING = 'A'

    it is like a direct update statement 

    update #ratinginfo set rating = 'A'

    Thanks

    Thanks, Gnanasekar Babu Note: Please click on the vote button if the answer helps you
    Monday, December 7, 2009 1:57 PM
  • Sorry Guys


    I had a typo in the query, not actually on the DB at the moment.

    My thoughts are that with the use of a CTE and RANKING functions, the query can be rewritten. Its writing into a temp table first so as to join back to itself to get the max scale and update the rating, on checking the temp table, soemtimes I can see that scale is NULL sometimes.

    see below


    SELECT
    		A.TACCODE
    		A.PERSONURN
    		B.ADDRESSURN
    		0 AS RATING,
    		B.SCALE
    into #ratinginfo
    FROM PERSONPROFILE A
    JOIN ADDRESSPROFILE B
    ON A.ADDRESSURN = B.ADDRESSURN
    
    
    update rt
    set RATING = 'A'
    from #ratinginfo rt
    where TACCODE  = 
    (
    select MAX(SCALE)
    from #ratinginfo
    where TACCODE = rt.TACCODE
    )
    
    select * from #ratinginfo
    order by PERSONURN, ADDRESSURN
    


    Monday, December 7, 2009 2:34 PM
  • Thanks for the explanation. Your requests makes a lot more sense now.

    This should do the trick.


    ;WITH cte
    AS
    (
    SELECT
    		A.TACCODE
    		A.PERSONURN
    		B.ADDRESSURN
    		'0' AS RATING,
    		B.SCALE,
    		ROW_NUMBER() OVER(PARTITION BY A.TACCODE ORDER BY B.SCALE DESC) AS seq
    FROM PERSONPROFILE A
    INNER JOIN ADDRESSPROFILE B
    	ON A.ADDRESSURN = B.ADDRESSURN
    )
    SELECT 
    	TACCODE
    	PERSONURN
    	ADDRESSURN
    	CASE WHEN seq = 1 THEN 'A' ELSE RATING END AS RATING,
    	SCALE
    FROM cte

    http://jahaines.blogspot.com/
    • Marked as answer by MrFlinstone Monday, December 7, 2009 6:18 PM
    Monday, December 7, 2009 2:43 PM
    Moderator
  • Thanks for the answer, my problem now is that I am selecting the data into a temp table, and selecting the final results back from the temp table seems to take ages. Is there anything else that I can do ?

    Also, I selected directly from the CTE, that was still taking ages as well. Anything one can do ?
    Monday, December 7, 2009 6:19 PM
  • How many rows are you returning? Do you have proper indexing? Can you share the table structure?

     


    Abdallah, PMP, ITIL, MCTS
    Monday, December 7, 2009 6:29 PM
  • You may have a problem with performance because you are not filtering the PERSONPROFILE table.  So the execution plan will probably yield a scan of PERSONPROFILE (the expense is related to how many rows are in the table and if you have clustered index or not) and a nested loop join on ADDRESSPROFILE, which will seek if you have an index on ADDRESSPROFILE.ADDRESSURN.  Just make sure you have all columns listed in the query that are from ADDRESSPROFILE in your index, either in the index key or in the include clause; otherwise, a key/rid lookup will be used unless the look is more expensive than a scan.  If you post your execution plan we can better assit you.


    http://jahaines.blogspot.com/
    Monday, December 7, 2009 6:47 PM
    Moderator
  • Thanks for the reply, here is the query with all the data I want.


    ;WITH cte
    AS
    (
    SELECT
    		A.TACCODE
    		A.PERSONURN
    		B.ADDRESSURN
    		'0' AS RATING,
    		B.SCALE,
    		B.Address,
    		B.SiteRunCom,
    		B.CampaignCode
    		B.CampaignAddress
    		B.CampaignFlag,
    		A.Suspended
    		ROW_NUMBER() OVER(PARTITION BY A.TACCODE ORDER BY B.SCALE DESC) AS seq
    FROM PERSONPROFILE A
    INNER JOIN ADDRESSPROFILE B
    	ON A.ADDRESSURN = B.ADDRESSURN
    )
    SELECT 
    	TACCODE
    	PERSONURN
    	ADDRESSURN
    	CASE WHEN seq = 1 THEN 'A' ELSE RATING END AS RATING,
    	SCALE,
    	Address,
    	SiteRunCom,
    	CampaignCode
    	CampaignAddress
    	CampaignFlag,
    	Suspended
    
    into #summary
    FROM cte
    
    select * from #summary
    order by TACCODE,SiteRunCom
    


    What I can see is that select * from #summary order by TACCODE,SiteRunCom is taking a long time to complete, not sure if this is i/o/hardware related or a query that can be tuned.

    Thanks.


    Monday, December 7, 2009 9:56 PM
  • how many rows are in #summary? Replace the SELECT * with the columns names. Also, since this is a temp table, you can create an index on it. That will boost performance. Can you show the execution plan. Run the following code and then run the SELECT

    SET SHOWPLAN_TEXT ON


    Abdallah, PMP, ITIL, MCTS
    Monday, December 7, 2009 10:02 PM
  • The number of rows is over 15,000. The rows are quite a lot. I can either use a temp table or query the CTE directly, but the advantage of the temp table is that i can have indexes on it etc. question is, is it worth haviog such a large number of indexes on columns in a temp table, I know I can use an include index, is it worth it ?
    Monday, December 7, 2009 10:28 PM
  • 15,000 rows shouldn't take forever, but apparently they are taking some time. Add a nonclustered index and include all the rows if you want them returned in your query. Also, change the SELECT * as I mentioned before.

    CREATE NONCLUSTERED INDEX IX_Taccode ON #Summary(Taccode,SiteRunCom)
    INCLUDE (PERSONURN,
    	ADDRESSURN,
    	RATING,
    	SCALE ,
    	Address ,
    	CampaignCode,
    	CampaignAddress,
    	CampaignFlag,
    	Suspended)


    Abdallah, PMP, ITIL, MCTS
    Monday, December 7, 2009 10:38 PM
  • I believe you will be better off not using a temp table.  If you do all the logic inline, the optimizer can take advantage of all the indexes and stats in place.  I dont think 15000 rows warrants the use of a temp table in this situation. Make sure you have indexes in place and you should be alright.  If you want to speed this query up, you should try creating the non clustered covering index on ADDRESSPROFILE, but also create a nonclustered index on PERSONPROFILE (make sure to cover all the columns).  The optimizer may be able to perform a merge join, which will be faster than a nested loop join; however, this will not spare you the expensense of the sort operation.
    http://jahaines.blogspot.com/
    Monday, December 7, 2009 10:54 PM
    Moderator
  • I have done that, i.e removed the * and placed the covering index.

    What i find is that running a thread of the query isnt too bad, but once I increase the thread to be more than one, performance goes bad.
    I am suspecting tempdb. what DMV can i use to monitor/determine tempdb performance.
    Tuesday, December 8, 2009 12:28 AM
  • Here is a helpful link that has some of the DMV's to use for tempdb. 


    Abdallah, PMP, ITIL, MCTS
    • Edited by Abdshall Tuesday, December 8, 2009 3:11 AM Edited to eliminate confusion
    Tuesday, December 8, 2009 12:32 AM
  • Thanks for the information.

    I have looked at some DMVs for tempdb, and I can see that there is a high wait rate on tempdb for writes. I noticed that if I use just the CTE without writing into a temp table first, the wait stats for read wasnt too high and writes was none as i wasnt writing anything.

    For the CTE, what i notice is that if i have 1 thread running, it runs really fast, but as soon as the second thread starts, the performance gets really worse. now i am thinking would it be possible to use a derived table ?

    Also, as for using a normal table. The query can be called by a large number of users, and using a permanent table will cause problems as the table will need to be dropped and created, with a temp table, a table is independent to each session.

    Tuesday, December 8, 2009 12:57 AM
  • Abdshall,

    I was not recommending an additional permanent table, I was suggesting MrFlinstone use the cte/derived table solution I posted earlier, but to make sure the proper indexing is in place. 

    <<For the CTE, what i notice is that if i have 1 thread running, it runs really fast, but as soon as the second thread starts, the performance gets really worse. now i am thinking would it be possible to use a derived table ?

    A CTE in the context I posted operates exactly like a derived table.  You need to make sure the proper indexes are in place. If you are having tempdb contention, it is likely that you have not optimally configured tempdb or you have an IO bottleneck or some other type of bottleneck. 

    Optimizing tempdb: http://msdn.microsoft.com/en-us/library/ms175527.aspx
    Performance Tuning: http://technet.microsoft.com/en-us/library/cc966413.aspx
    http://jahaines.blogspot.com/
    Tuesday, December 8, 2009 3:04 AM
    Moderator
  • Abdshall,

    I was not recommending an additional permanent table, I was suggesting MrFlinstone use the cte/derived table solution I posted earlier, but to make sure the proper indexing is in place. 

    <<For the CTE, what i notice is that if i have 1 thread running, it runs really fast, but as soon as the second thread starts, the performance gets really worse. now i am thinking would it be possible to use a derived table ?

    A CTE in the context I posted operates exactly like a derived table.  You need to make sure the proper indexes are in place. If you are having tempdb contention, it is likely that you have not optimally configured tempdb or you have an IO bottleneck or some other type of bottleneck. 

    Optimizing tempdb: http://msdn.microsoft.com/en-us/library/ms175527.aspx
    Performance Tuning: http://technet.microsoft.com/en-us/library/cc966413.aspx
    http://jahaines.blogspot.com/
    Adam, I didn't mean an additional table, I do apologize for the confusion. I'll rephrase the sentence so it's not confusing to the OP and others.



    Abdallah, PMP, ITIL, MCTS
    Tuesday, December 8, 2009 3:09 AM