Can someone help, TOP 100 is significantly faster than TOP 101
-
Friday, December 11, 2009 7:13 PM
Any help appreciated
--fast
select top 100 * from test where c1 < 30000 order by c2--slow
select top 101 * from test where c1 < 30000 order by c2In the Webcast 1 (www.sqlworkshops.com/webcasts) they actually show how to make TOP 101 faster in SQL Server 2008, but they hide their solution for SQL Server 2005. Can someone post a way to make TOP 101 faster in SQL Server 2005, this will help me.
BobB
create table test (c1 int primary key clustered, c2 int, c3 char(2000))
declare @count int
set @count = 1
while @count <= 50000
begin
insert into test values (@count, @count, 'a')
set @count = @count + 1
end- Edited by Robert W Beck Monday, December 14, 2009 6:04 AM additional info
All Replies
-
Friday, December 11, 2009 8:01 PMModeratorThink about what it is doing...
It finds (quickly via an Index Seek) the 30,000 rows (where c1<30000). But since you specified that you want ALL columns (i.e. SELECT *), it pulls in the two integer columns (c1 and c2) and the 2000-character-long c3 column. So that's 30,000 rows * 2008 bytes = 6million bytes that it has to (physically) sort.
According to the documentation for the "Top N Sort" operator (http://technet.microsoft.com/en-us/library/ms189054(SQL.90).aspx), it says that for "small values of N" it does the sort in memory.
I'm guessing that it determined that 100 is considered a "small value" for the 6million bytes that you needed to sort and 101 was NOT a "small value" and so it had to resort to more a more generic method of sorting that probably involved some disk work.
If you change your SELECT * to just SELECT c2 then you won't see a difference between 100 and 101.
It all has to do with the size of the rows... and I guess you found the threshhold.
That's my guess... unless someone else has some input.
--Brad (My Blog)- Proposed As Answer by Tharindu Dhaneenja Sunday, December 13, 2009 3:48 PM
- Unproposed As Answer by Kalman TothMicrosoft Community Contributor, Moderator Sunday, December 13, 2009 10:02 PM
-
Friday, December 11, 2009 8:09 PMI would agree with Brad. I looked at the execution plans for both and the only thing that I found different(significantly) is CompileTime and CompileCPU.
TOP 100 CompileTime="2" CompileCPU="2"
TOP 101 CompileTime="63" CompileCPU="29"
Abdallah, PMP, ITIL, MCTS -
Friday, December 11, 2009 8:10 PMModeratorOne other thing...
Because of the size of the rows you need to sort... you can re-work the query like so and it will go lightning fast for TOP values over 100:
SELECT A.c1,A.c2,B.c3
FROM (SELECT TOP 101 c1,c2 FROM Test WHERE c1<30000 ORDER BY c2) A
JOIN Test B ON A.c1=B.c1
This way, it's only sorting small 8-byte rows and not 2008-byte rows.
--Brad (My Blog)- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Friday, December 11, 2009 8:49 PM
- Edited by Brad_SchulzModerator Friday, December 11, 2009 8:54 PM Oops... Changed A.c3 to B.c3
- Unproposed As Answer by Kalman TothMicrosoft Community Contributor, Moderator Sunday, December 13, 2009 10:04 PM
-
Friday, December 11, 2009 8:13 PMHi
It's very good point, and I agree that there is some kind of a threshold.
However when you switch io statistics on, there is no difference and there is no
refernce of "Worktables" or any other
sort of temprorary tables, and the execution plan is exactly the same.
I also tried to replace top N with top N PERCENT , and then the results (99 and 102, 200, 50 rows)
come not as fast as with TOP 100 , but not as slow as with 101.
There is one moreinteresting point - the response time is equal for any number of returning rows
when you use TOP N PERCENT -
Friday, December 11, 2009 8:33 PMModeratorI looked into it a little more, and here's what I concluded.
The TOP N PERCENT query will sort EVERYTHING and THEN will perform the TOP operation.
The TOP N query will use the special TOP N SORT operator.
I'm guessing that the TOP N SORT operator is very efficient for values of N<=100. For values >100, it is a dog.
The tradition SORT-only operator is very efficient. It doesn't take long for it to sort the 30000 records and then feed the results to the TOP operator.
The following two queries, which both select 101 records, go very fast:
SELECT TOP 0.336667 PERCENT FROM Test WHERE c1<30000 ORDER BY c2
SELECT TOP (101.0/(SELECT COUNT(*) FROM Test WHERE c1<30000)*100) PERCENT *
FROM Test
WHERE c1<30000
ORDER BY c2
But, as we have all found, the following is slow as molasses:
SELECT TOP 101 FROM Test WHERE c1<30000 ORDER BY c2
I think that 100 is an arbitrary tipping point for the TOP N SORT operator. In just doing some testing with 99, 100, and 101, and doing a query on a large table in Adventureworks you can see a difference.
SELECT TOP N * FROM AdventureWorks.Sales.SalesOrderDetail ORDER BY CarrierTrackingNumber
For N = 99 or 100, the Reads are 1246 and the Duration is around 500. Once N=101 or greater, the Reads shoot up to 2240 and the Duration to 1600 or higher.
--Brad (My Blog)- Edited by Brad_SchulzModerator Friday, December 11, 2009 8:52 PM Typo
-
Friday, December 11, 2009 9:04 PMModerator
If you were looking for the ALL/ANY blog entry, it's here: http://bradsruminations.blogspot.com/2009/08/all-any-and-some-three-stooges.html
I try to attach keywords to the blog posts, so you can type "ALL" or "ANY" in the "Search This Blog" Google Search box in the upper-right-hand corner of any blog page.
You can also go thru the Blog Archive (also on the right-hand side under my "About Me" section) and poke through the various months.
--Brad (My Blog) -
Saturday, December 12, 2009 8:38 PMModeratorI found this subject to be really interesting, and I thought I'd blog about it.
But I figured before I do that...
I wrote an e-mail to Conor Cunningham (SQL Server Query Optimization Development Lead at MSFT) about this subject and my suspicions posted above.
He said the person who is responsible for this area of the optimizer is on vacation until January and he'd get back to me then.
I'll try to follow up here once I find out anything definitive.
--Brad (My Blog) -
Saturday, December 12, 2009 9:22 PMModerator
Robert & All,
Certainly attention getting thread title!
Consider this: if we rebuild the index after table population and eliminate buffer reads, it appears there is no significant difference, or at least not as dramatic as without the "equalizers". Can you confirm? Following test is on SQL Server 2008.
Related link: Measuring Query Execution Time
Thanks.
SET NOCOUNT ON create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO ALTER INDEX ALL ON [dbo].[test] REBUILD --fast SET STATISTICS IO ON DBCC DROPCLEANBUFFERS select top 100 * from test where c1 < 30000 order by c2 /* Table 'test'. Scan count 2, logical reads 4086, physical reads 24, read-ahead reads 7503, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ --slow DBCC DROPCLEANBUFFERS select top 101 * from test where c1 < 30000 order by c2 /* Table 'test'. Scan count 2, logical reads 4455, physical reads 11, read-ahead reads 7503, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ SET STATISTICS IO OFF GO DROP TABLE test
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Sunday, December 13, 2009 3:42 PMhi,
This is one of good post I have seen. Needs to thanks Brad b'cos he explain this scenario nicely.I am come up with this solution it’s much better comparing top 101 operation.
----------------------------
SET ROWCOUNT 101
SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2
SET ROWCOUNT 0
----------------------------
Thanks
Tharindu Dhaneenja
Tharindu Dhaneenja (http://spaces.msn.com/dhaneenja) -
Sunday, December 13, 2009 7:44 PMModerator
Brad - I did a run with STATISTICS TIME ON. Look how much the "fast" top 100 slowed down after index rebuild and dbcc dropcleanbuffers.
Tharindu - your method is a little bit slower than the TOP function in the following test.
USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO SET STATISTICS TIME ON select top 100 * from test where c1 < 30000 order by c2 -- CPU time = 62 ms, elapsed time = 85 ms. SET STATISTICS TIME OFF ALTER INDEX ALL ON [dbo].[test] REBUILD GO DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 100 * from test where c1 < 30000 order by c2 -- CPU time = 110 ms, elapsed time = 1458 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 101 * from test where c1 < 30000 order by c2 -- CPU time = 437 ms, elapsed time = 2583 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET ROWCOUNT 101 SET STATISTICS TIME ON SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 -- CPU time = 469 ms, elapsed time = 3049 ms. SET STATISTICS TIME OFF SET ROWCOUNT 0 GO DROP TABLE test
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Sunday, December 13, 2009 8:51 PMModerator
Naomi/Brad,
Following Naomi's proposal to select c1 only, Top 100, Top 101 and Set Rowcount basically the same. As expected.
A good general operating practice for DBA-s to REBUILD indexes every weekend. Otherwise query performance may suffer.
USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO SET STATISTICS TIME ON select top 100 c1 from test where c1 < 30000 order by c2 -- CPU time = 31 ms, elapsed time = 33 ms. SET STATISTICS TIME OFF ALTER INDEX ALL ON [dbo].[test] REBUILD GO DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 100 c1 from test where c1 < 30000 order by c2 -- CPU time = 48 ms, elapsed time = 1305 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 101 c1 from test where c1 < 30000 order by c2 -- CPU time = 31 ms, elapsed time = 1291 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET ROWCOUNT 101 SET STATISTICS TIME ON SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 -- CPU time = 0 ms, elapsed time = 1497 ms. SET STATISTICS TIME OFF SET ROWCOUNT 0 GO DROP TABLE test
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 12:14 AMAnswererI'm going to agree with Brad - it's the result of the Top N Sort operator using TempDB for sorting rather than in memory.
By adding in calls to fn_virtualfilestats for TempDB on a server with no activity, you can see that the first query has no increase in the BytesWritten to TempDB, whereas the second (slow) query does (about 6 MB for the tempdb.mdf and 600KB for the log). -
Monday, December 14, 2009 12:26 AMModerator
I'm going to agree with Brad - it's the result of the Top N Sort operator using TempDB for sorting rather than in memory.
Agreed.
By adding in calls to fn_virtualfilestats for TempDB on a server with no activity, you can see that the first query has no increase in the BytesWritten to TempDB, whereas the second (slow) query does (about 6 MB for the tempdb.mdf and 600KB for the log).
If we rebuild the clustered index and cut out the dead-weight 2K column, as suggested by Naomi/Brad, all 3 queries perform the same. Demo follows.USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO -- REBUILD PRIMARY KEY clustered index ALTER INDEX ALL ON [dbo].[test] REBUILD GO DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 100 c1 from test where c1 < 30000 order by c2 -- CPU time = 47 ms, elapsed time = 1329 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top 101 c1 from test where c1 < 30000 order by c2 -- CPU time = 32 ms, elapsed time = 1282 ms. SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS SET ROWCOUNT 101 SET STATISTICS TIME ON SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 -- CPU time = 31 ms, elapsed time = 1313 ms. SET STATISTICS TIME OFF SET ROWCOUNT 0 GO DROP TABLE test
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 2:16 AMModerator
Interesting thread. My observation is (may be i am worng) it is to do more with ROW SIZE than any number (ie. 100 is the threashold or something like that). If you make column C3 VARCHAR then even TOP 1000 and TOP 1001 is returning in same time.
Madhu
MCITP, MCTS, MCDBA,MCP-- Blog : http://experiencing-sql-server-2008.blogspot.com/- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Wednesday, May 12, 2010 2:36 PM
-
Monday, December 14, 2009 5:39 PMModerator
Using SQL Profiler, this is what I got (with a DROPCLEANBUFFERS and FREEPROCCACHE between each test):
TOP 100 (Before Index Rebuild): Reads=7532, CPU=104, Duration=114
TOP 100 (After Index Rebuild): Reads=7517, CPU=104, Duration=108
TOP 101: Reads=13199, CPU=542, Duration=10769
ROWCOUNT 101: Reads=13176, CPU=370, Duration=1341
Difference before/after index rebuild is negligible.
Again, note the spike in Reads and CPU and Duration between TOP 100 and TOP 101. The #Reads alone indicates that something different is happening (a different algorithm being employed I think) between TOP 100 and TOP 101.
--Brad (My Blog) -
Monday, December 14, 2009 5:48 PMModeratorBrad...hmm... Why are my results so different? Can you post the scipts? Thanks.
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 5:51 PMModerator
Interesting thread. My observation is (may be i am worng) it is to do more with ROW SIZE than any number (ie. 100 is the threashold or something like that). If you make column C3 VARCHAR then even TOP 1000 and TOP 1001 is returning in same time.
Madhu
MCITP, MCTS, MCDBA,MCP-- Blog : http://experiencing-sql-server-2008.blogspot.com/
It's true that if you change the C3 column to a VARCHAR that you won't notice any difference (with your own eyes) in time between doing a TOP 100 or TOP 101.
But there is something there... If you use SQL Profiler, you will see the following differences:
TOP 100: Reads=86, Duration=25ms (Same measurements occur for TOP 99 and TOP 98 and so on downwards)
TOP 101: Reads=165, Duration=55ms (Same measurements occur for TOP 102 and TOP 103 and so on upwards)
This indicates to me that a different algorithm is being employed between <=100 and >=101.
--Brad (My Blog) -
Monday, December 14, 2009 6:05 PMModerator
Brad...hmm... Why are my results so different? Can you post the scipts? Thanks.
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com
Here you go... Again, the numbers come from Profiler, and I set Query Results to "Discard" so there is no time taken up by outputting the results to the results window.
Also, I don't typically run the entire script all at once when I'm doing this sort of thing. I'll create the table, and then I'll just highlight the "Test #1" portion and execute it 5 times or so and take the average measurements. Then I'll do the same for "Test #2" and so on.
USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO --Test #1 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 --Reads=7532, CPU=104, Duration=114 go ALTER INDEX ALL ON [dbo].[test] REBUILD --Test #2 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 --Reads=7517, CPU=104, Duration=108 go --Test #3 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 101 * from test where c1 < 30000 order by c2 --Reads=13199, CPU=542, Duration=10769 go --Test #4 DBCC DROPCLEANBUFFERS dbcc freeproccache go SET ROWCOUNT 101 SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 --Reads=13176, CPU=370, Duration=1341 go SET ROWCOUNT 0 DROP TABLE test GO
--Brad (My Blog) -
Monday, December 14, 2009 6:09 PMModerator
Interesting thread. My observation is (may be i am worng) it is to do more with ROW SIZE than any number (ie. 100 is the threashold or something like that). If you make column C3 VARCHAR then even TOP 1000 and TOP 1001 is returning in same time.
Madhu
MCITP, MCTS, MCDBA,MCP-- Blog : http://experiencing-sql-server-2008.blogspot.com/
It's true that if you change the C3 column to a VARCHAR that you won't notice any difference (with your own eyes) in time between doing a TOP 100 or TOP 101.
But there is something there... If you use SQL Profiler, you will see the following differences:
TOP 100: Reads=86, Duration=25ms (Same measurements occur for TOP 99 and TOP 98 and so on downwards)
TOP 101: Reads=165, Duration=55ms (Same measurements occur for TOP 102 and TOP 103 and so on upwards)
This indicates to me that a different algorithm is being employed between <=100 and >=101.
--Brad (My Blog)
I agree that it seems like the TOP operator has a performance tipping point. I look forward to seeing what the MS Dev team comes back with.
http://jahaines.blogspot.com/- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Wednesday, May 12, 2010 2:37 PM
-
Monday, December 14, 2009 6:16 PMModerator>
>I agree that it seems like the TOP operator has a performance tipping point.
>I look forward to seeing what the MS Dev team comes back with.
>
Conor did tell me that he believed that 100 is a "magic number" and that they've talked about improving this but have just not done it yet.
He wanted to confirm it, though, with the person who's on vacation.
--Brad (My Blog) -
Monday, December 14, 2009 7:19 PMModerator
Thanks Brad. Really exciting stuff. Here is my run on SQL Server 2008 with Profiler data. When Duration is way out of proportion to Reads, it usually means blocking.
Query Reads CPU DurationMsec select top 100 * from test whe 10801 109 110 select top 100 c1 from test wh 8268 31 20 select top 100 * from test whe 8236 172 1684 select top 100 c1 from test wh 8236 48 25 select top 101 * from test whe 13893 375 3023 select top 101 c1 from test wh 8236 15 26 SELECT * FROM TEST WHERE C1 < 13890 391 2417 SELECT c1 FROM TEST WHERE C1 < 8236 32 18
-- TOP 100 / TOP 101 anomaly test USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 50000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO --Test #1 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 go select top 100 c1 from test where c1 < 30000 order by c2 go ALTER INDEX ALL ON [dbo].[test] REBUILD --Test #2 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 go select top 100 c1 from test where c1 < 30000 order by c2 go --Test #3 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 101 * from test where c1 < 30000 order by c2 go select top 101 c1 from test where c1 < 30000 order by c2 go --Test #4 DBCC DROPCLEANBUFFERS dbcc freeproccache go SET ROWCOUNT 101 GO SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 go SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 go SET ROWCOUNT 0 DROP TABLE test GO
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 7:41 PMModeratorLittle bit more excitement... 1 million rows instead of 50K, same script as in my previous post...the table turns....
Query Reads CPU DurationMsec select top 100 * from test whe 16960 563 9002 select top 100 c1 from test wh 8505 47 20 select top 100 * from test whe 8235 62 2025 select top 100 c1 from test wh 8235 32 17 select top 101 * from test whe 13889 297 3644 select top 101 c1 from test wh 8235 32 19 SELECT * FROM TEST WHERE C1 < 13888 344 4905 SELECT c1 FROM TEST WHERE C1 < 8235 31 20
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 8:29 PMModerator
Little bit more excitement... 1 million rows instead of 50K, same script as in my previous post...the table turns....
Query Reads CPU DurationMsec select top 100 * from test whe 16960 563 9002 select top 100 c1 from test wh 8505 47 20 select top 100 * from test whe 8235 62 2025 select top 100 c1 from test wh 8235 32 17 select top 101 * from test whe 13889 297 3644 select top 101 c1 from test wh 8235 32 19 SELECT * FROM TEST WHERE C1 < 13888 344 4905 SELECT c1 FROM TEST WHERE C1 < 8235 31 20
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com
Are those figures from pushing the execute button, walking away, and then coming back and looking at all the figures? What if you change the order of your tests? The first test that gets run could be more inflated.
I did the million-row test also, and it doesn't come out much different from my previous figures on 50,000 rows (except more duration time to traverse the bigger clustered index to find the 30000 recs)... but again, I INDIVIDUALLY EXECUTE each test at least 5 times and take the average... I don't execute the entire thing all at once:
Test #1 (TOP 100 before index rebuild):
SELECT *: Reads=7533, CPU=83, Duration=2060
SELECT c1: Reads=7533, CPU=68, Duration=1985
Test #2 (TOP 100 after index rebuild):
SELECT *: Reads=7517, CPU=99, Duration=2043
SELECT c1: Reads=7517, CPU=63, Duration=1947
Test #3 (TOP 101):
SELECT *: Reads=14683, CPU=552, Duration=16750
SELECT c1: Reads=7575, CPU=46, Duration=2088
Test #4 (ROWCOUNT 101):
SELECT *: Reads=13167, CPU=480, Duration=3967
SELECT c1: Reads=7517, CPU=47, Duration=5704
Also, your SELECT c1 tests were waaaaaay lower in terms of duration than your SELECT * tests because you didn't do the DBCC commands before them... you executed them immediately after the SELECT * tests and they participated in the same cacheing and buffering... thus you got duration times of 20ms or less.
--Brad (My Blog)- Edited by Brad_SchulzModerator Monday, December 14, 2009 8:29 PM Clarification
-
Monday, December 14, 2009 8:46 PMModerator
Brad,
Valid point about the missing DBCC DROPCLEANBUFFERS. I will do a new run.
I am tracing event SQL: BatchCompleted into table TOP100 in Monitor DB.
I use the following query to get the results:
SELECT QUERY = LEFT(convert(VARCHAR,TextData),50), Reads, CPU, DurationMsec = Duration / 1000 FROM TOP100 WHERE LEFT(convert(VARCHAR,TextData),6) = 'select' ORDER BY StartTimeIn my experience I found "reads" better metrics for performance than "duration". In your case, I don't understand why duration is so much out of proportion for TOP 101. If one time, it can be blocking. If systematic, it can be tempdb issue (sorting the 2K deadweight column).
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 8:51 PMModerator
New run for 1 million. DBCC DROPCLEANBUFFERS is used before each SELECT.
Query Reads CPU DurationMsec select top 100 * from test where c1 < 30000 order 16960 156 14332 select top 100 c1 from test where c1 < 30000 order 8505 62 1482 select top 100 * from test where c1 < 30000 order 7561 78 2399 select top 100 c1 from test where c1 < 30000 order 7525 32 2147 select top 101 * from test where c1 < 30000 order 13893 328 3884 select top 101 c1 from test where c1 < 30000 order 8277 63 2200 SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 12028 266 2978 SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 8241 46 1559
-- TOP 100 / TOP 101 anomaly test USE tempdb; SET NOCOUNT ON GO create table test (c1 int primary key clustered, c2 int, c3 char(2000)) declare @count int set @count = 1 while @count <= 1000000 begin insert into test values (@count, @count, 'a') set @count = @count + 1 end GO --Test #1 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 go DBCC DROPCLEANBUFFERS go select top 100 c1 from test where c1 < 30000 order by c2 go ALTER INDEX ALL ON [dbo].[test] REBUILD --Test #2 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 100 * from test where c1 < 30000 order by c2 go DBCC DROPCLEANBUFFERS go select top 100 c1 from test where c1 < 30000 order by c2 go --Test #3 DBCC DROPCLEANBUFFERS dbcc freeproccache go select top 101 * from test where c1 < 30000 order by c2 go DBCC DROPCLEANBUFFERS go select top 101 c1 from test where c1 < 30000 order by c2 go --Test #4 DBCC DROPCLEANBUFFERS dbcc freeproccache go SET ROWCOUNT 101 GO SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 go DBCC DROPCLEANBUFFERS go SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 go SET ROWCOUNT 0 DROP TABLE test GO
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Tuesday, November 09, 2010 6:11 PM
-
Monday, December 14, 2009 9:34 PMModerator>
> In your case, I don't understand why duration is so much out of proportion for TOP 101.
>
Apparently the original poster revised his message to include a link to a set of SQL Workshop Webcasts where this TOP 100 and TOP 101 phenomenon is illustrated... that's apparently where he got his script.
If you look at the Webcast 3 of 3, you will see that SQL2005 is much more inefficient than SQL2008 in the difference between TOP 100 and TOP101.
I'm using 2005 and you're using 2008... and that's why my duration on the TOP 101 is so much higher.
--Brad
--Brad (My Blog) -
Monday, December 14, 2009 10:23 PMModerator
If you look at the Webcast 3 of 3, you will see that SQL2005 is much more inefficient than SQL2008 in the difference between TOP 100 and TOP101.
I'm using 2005 and you're using 2008... and that's why my duration on the TOP 101 is so much higher.
--Brad
DBA-s / Developers / IT managers take notice! Another reason to upgrade to upgrade to SQL Server 2008.
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, December 14, 2009 10:34 PMModerator>
> In the Webcast 1 (www.sqlworkshops.com/webcasts) they actually show how to make TOP 101
> faster in SQL Server 2008, but they hide their solution for SQL Server 2005. Can someone post a
> way to make TOP 101 faster in SQL Server 2005, this will help me.
>Okay, Bob, I found a way... this might be how he did it:
Do the following:
SELECT TOP 101 c1,c2,c3=(SELECT c3 FROM Test B WHERE c1=Test.c1)
FROM Test
WHERE c1<30000
ORDER BY c2
As R Meyyappan in the video suggested, I did not change the TOP 101 or the FROM or the WHERE or the ORDER BY.
In fact, in executing the above, I got the query to be only 60ms in duration (whereas R Meyyappan in the video was excited to get it down to 244ms... and he had 8x the memory than I have on my machine). That's even FASTER than the normal SELECT TOP 100.
Hah! Take that!
The solution above is similar to what I mentioned last Friday as an alternative solution using JOINs.
--Brad (My Blog) -
Monday, December 14, 2009 10:44 PMModeratorBrad I think those results are highly dependent on the number of rows in the outer table. As you may know the inner table has to be seeked/scanned for every row in the outer table, unless of course the optimizer decides to use a merge join. I would imagine the above query will result in a nested loop join which means the out query is filtered for rows where the id is less than 30000 and then a nested loop join is used to filter on each individual id. This means you will have more IO and the results should vary depending on the number of rows returned by the outer query which is a decent amount here, but may be a lot more in other cases.
http://jahaines.blogspot.com/- Edited by Adam HainesModerator Monday, December 14, 2009 10:45 PM syntax
- Edited by Adam HainesModerator Monday, December 14, 2009 10:49 PM edit
-
Monday, December 14, 2009 10:53 PMModeratorWhat's interesting about the query plan for that query is that it does the SEEK into the clustered index (i.e. the WHERE c1<30000) and gets the C1 and C2 values. Then it SORTs those 29999 rows by C2... BUT... it ALSO pulls out the TOP 101 at the same time (even though it's only a SORT operator and not a TOP N SORT operator)... That's the key to why it goes so fast.
Then, for each of those 101 rows (Nested Loops), it does the SEEK to get the C3 column.
Then, for whatever reason, it does the TOP 101 again via a TOP operator. I have no idea why it feels this is necessary to do.
--Brad (My Blog) -
Monday, December 14, 2009 10:58 PMAnswererA completely tongue-in-cheek alternative, just to make it even faster is to create the following index and use the original query.
CREATE NONCLUSTERED INDEX ncixc2 ON Test(c2, c1) INCLUDE (c3)
However, it makes little sense to add 2 GB of index just to save 30ms :)
-
Monday, December 14, 2009 11:07 PMModeratorBrad did you get a TOP operator within the outer portion of the nested loop join? I didnt get that when I just tried it. I got an estimated number of rows that said said 101 were being moved but the actual number was 820. That means the optimizer did 820 index seeks to satisfy the query. Then once the results were joined together, the optimizer used a single top operator, on the 820 rows.
http://jahaines.blogspot.com/ -
Monday, December 14, 2009 11:17 PMModeratorI uploaded the .SQLPLAN (actual query plan) and a JPG of the plan up to my SkyDrive:
http://cid-7ba463f684be1dde.skydrive.live.com/browse.aspx/.Public?uc=2
The SORT operator receives 29999 rows, but outputs only 101 (that's BOTH estimated and actual).
Then that gets fed to Nested Loops, which does the lookup for the C3 values.
And then those 101 rows get fed to a (seemingly redundant) TOP operator, which, of course, receives 101 rows and outputs 101 rows.
Again, this was in SQL2005. I know you work in SQL2008... and perhaps that's why you got a different result.
--Brad (My Blog) -
Monday, December 14, 2009 11:31 PMI tested with SQL Server 2008
TOP 100 -> Cpu 93, Duration 133 and Reads 11018
TOP 101 -> Cpu 234, Duration 2081 and Reads 16672
TOP 101 (Brad's with correlated subquery) -> Cpu 47, Duration 185 and Reads 11672.
So the correlated subquery need less cpu than seeking the table only once!
It must be a way to force the optimizer to do a scan (cheaper for optimizer) and there is. See this rewrite
select top 100 * from test where 0+c1 < 30000 order by c2
select top 101 * from test where 0+c1 < 30000 order by c2
Top 100 (Peso) -> Cpu 94, Duration 145 and Reads 18361.
Top 101 (Peso) -> Cpu 140, Duration 170 and Reads 18361.
Both are slightly faster than Brad's, but uses more CPU. The queries uses a clustered index scan.
As you can see, Reads is NOT a good measure for performance. My rewrite uses 1/10th of the time of original 101 problem, and significantly more Reads.- Edited by SwePesoMVP Monday, December 14, 2009 11:44 PM
- Edited by SwePesoMVP Monday, December 14, 2009 11:45 PM
-
Monday, December 14, 2009 11:43 PMModerator
Really interesting, Peso...
When I run your WHERE 0+C1<30000 on my SQL2005 box, I get the absolute worst performance of all. It takes a full 20 seconds; whereas, the normal vanilla TOP 101 query (i.e. just WHERE C1<30000) took 15-16 seconds.
Again, the difference between SQL2005 and 2008, I guess. But what a stark difference!
--Brad (My Blog) -
Monday, December 14, 2009 11:47 PMI am testing this on 2008 R2 Nov CTP 10.50.1352.12. When you ran the query, did you also get a index scan on 2005, or a table scan?
I posted the results as an input to your forthcoming blog post. The "+0" is only there to get rid of any index use. -
Monday, December 14, 2009 11:55 PMModeratorWhat is the query plan for your clustered index scan method?
It doesn't seem to make sense to me... You reported this, right?:
SELECT TOP 101 * FROM Test WHERE C1<30000 ORDER BY C2
--CPU=234, Duration=2081
SELECT TOP 101 * FROM Test WHERE 0+C1<30000 ORDER BY C2
--CPU=140, Duration=170
In SQL2005, the above create the same query plan, except the first one does a Clustered Index SEEK, and the second one does a Clustered Indes SCAN. Both of them feed the rows to the Top N Sort operator, which sorts and pulls out the TOP 101 rows.
In other words, they both feed 29999 rows to the Top N Sort operator, and that's supposedly the bottleneck.
How come your duration is so incredibly small in the SCAN? Something ridiculously much more efficient in 2008?
--Brad (My Blog) -
Tuesday, December 15, 2009 12:09 AMModerator>
>When you ran the query, did you also get a index scan on 2005, or a table scan?
>
It's the same thing, isn't it? Because the table has a PRIMARY KEY CLUSTERED INDEX on C1... so the index IS the table.
--Brad (My Blog) -
Tuesday, December 15, 2009 12:21 AMModeratorBrad,
Are you supplying the maxdop 1 hint? I get the query plan with the top filter in the sort if and only if i use the maxdop hint. If I do not use the maxdop hint, the number of rows is greater than the specified top amount.
http://jahaines.blogspot.com/- Edited by Adam HainesModerator Tuesday, December 15, 2009 12:21 AM syntax
-
Tuesday, December 15, 2009 12:35 AMModerator
Brad,
Are you supplying the maxdop 1 hint? I get the query plan with the top filter in the sort if and only if i use the maxdop hint. If I do not use the maxdop hint, the number of rows is greater than the specified top amount.
http://jahaines.blogspot.com/
Interesting question.
I am NOT using the MAXDOP hint; however, I did have my Server Properties set to Max Degree of Parallelism = 1.
When I changed it to 0 (and I have 2 processors), the query plan institutes parallelism and the SORT operator now receives 29999 rows, estimates 101, but ACTUALly produced 186. Not as high as 860 like you got, but perhaps you have more processors... or maybe faster ones?
My Nested Loops operator feeds those 186 rows to the Parallelism/GatherStreams operator, and that pares it down to 101 rows, which then gets fed to the (seemingly redundant) TOP operator.
--Brad (My Blog) -
Tuesday, December 15, 2009 1:37 AMModerator
As you can see, Reads is NOT a good measure for performance. My rewrite uses 1/10th of the time of original 101 problem, and significantly more Reads.
Hah.... script pls...hope you are not forgetting DBCC DROPCLEANBUFFERS....
SQL Server 2008 run 1 million rows. As we can see reads is good metrics.
Query Reads CPU DurationMsec select top 100 * from test where c1 < 30000 order by c2 16960 374 19369 select top 100 c1 from test where c1 < 30000 order by c2 8505 0 1432 select top 100 * from test where c1 < 30000 order by c2 8235 62 1744 select top 100 c1 from test where c1 < 30000 order by c2 8235 15 1684 select top 100 * from test where 0+c1 < 30000 order by c2 250792 3281 54936 select top 101 * from test where c1 < 30000 order by c2 13887 328 3683 select top 101 * from test where 0+c1 < 30000 order by c2 250764 1375 47836 select top 101 c1 from test where c1 < 30000 order by c2 8235 47 1475 SELECT * FROM TEST WHERE C1 < 30000 ORDER BY C2 11482 156 2617 SELECT c1 FROM TEST WHERE C1 < 30000 ORDER BY C2 8235 16 1689 SELECT * FROM TEST WHERE 0+C1 < 30000 ORDER BY C2 250780 1437 47891
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Tuesday, December 15, 2009 5:06 AMWhy should I do DBCC DROPCLEANBUFFERS on a production machine?
That would be like letting the engine cool off everytime I start it again. -
Tuesday, December 15, 2009 5:38 AMI just tested on SQL Server Express 2008, and both queries are taking almost the same time( < 1 sec). A clustered index seek is being done and a Sort operation.select top 100 * from test where 0+c1 < 30000 order by c2select top 101 * from test where 0+c1 < 30000 order by c2
I also ran them afterDBCC DROPCLEANBUFFERSdbcc freeproccachegoAnd TOP 100 took 4 seconds, TOP 101 took around 5 seconds. I couldn't get the CPU and the READS since I don't have Profiler.The following is doing Clustered Index Scan, with almost the same time (again, couldn't get the actual duration,read, and CPU)select top 100 * from test where 0+c1 < 30000 order by c2select top 101 * from test where 0+c1 < 30000 order by c2
Abdallah El-Chal, PMP, ITIL, MCTS -
Tuesday, December 15, 2009 7:34 AMModerator
I just tested on SQL Server Express 2008, and both queries are taking almost the same time( < 1 sec). A clustered index seek is being done and a Sort operation.
Abdallah El-Chal, PMP, ITIL, MCTS
Precisely. I don't think there is any issue, interesting thread though. Nobody ever promised that SQL Server will behave exactly linearly as far as performance concerned. Fragmented index, execution plan variation and sorting operation can cause non-linear behavior.
Development Performance Measurent 101:
1. Use non-production machine
2. REBUILD index(s) prior to measurement
3. Use DBCC DROPCLEANBUFFERS before queries
4. Use dbcc freeproccache if compile time is significant before queries
5. SET STATISTICS IO ON/OFF and SET STATISTICS TIME ON / OFF will yield basic performance data
6. Profiler SQL:BatchCompleted event saved to a table will yield reads, writes, CPU and duration data
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com- Edited by Kalman TothMicrosoft Community Contributor, Moderator Tuesday, December 15, 2009 11:34 PM
- Proposed As Answer by Kalman TothMicrosoft Community Contributor, Moderator Tuesday, December 22, 2009 1:11 PM
-
Tuesday, December 15, 2009 7:52 AMDBCC DROPCLEANBUFFERS will tell you more about how fast your disk system is, than the performance of the code.
Disks are much slower than physical RAM. I verified this by adding a ram-drive and moving the test database there.
Now I get same fast speed (with DBCC DROPCLEANBUFFERS on ram disk) as the code before (without DBCC DROPCLEANBUFFERS and on conventional disks).
Does it mean the code is fast? No, it only means the disk system is really fast (it should be because it is a ram drive).
Also statistics io has issues (bugs) with certain join operations/conditions and will not report accurate (if at all) timings. You can see this quite often when worktables are involved. How often do you seen 0 as metric for the statistics io for worktables? -
Tuesday, December 15, 2009 11:08 PM
What a coinccident! I am on the same issue just at the time.
I was considering implementing an algorithm like this:
First populate the N rows to a table variable (with index on the sort column), then iterate through all left rows, adding one row to the table variable if bigger than min of the table, else discard it. This could be either done in sql or clr aggregate function.
Then I thought maybe MS had already done it in the Top N stuff, so started to run a test against it.CREATE TABLE [dbo].[NUM] ([n] int NOT NULL, s varchar(128) NULL, PRIMARY KEY CLUSTERED([n] ASC)) go -- populate data set nocount on declare @n int, @i int set @n=1000000 set @i = 0 while @n>0 begin if @i = 0 begin tran insert into dbo.NUM select @n, convert(varchar,@n + @i * 2) set @n=@n-1 set @i = (@i + 1) % 1000 if @i = 0 commit end GO -- test 1 select top ( XX ) cast(s as int), n from dbo.num order by cast(s as int) desc go -- test 2 set rowcount XX select cast(s as int), n from dbo.num order by cast(s as int) desc go
for test 1, duration < 1s, for any XX <= 100, and the duration is about 12s for any XX >100
for test 2, the duration is fixed at 4s for XX: 10 - 100,000.
The show-plan shows test 1 uses Top N sort op, while the test 2 uses Sort op.
Ok I dont care about the sort op. The only thing I care is if MS has correctly implemented the Ton N Sort.
MSDN stated about "Top N sort":
"Top N Sort is similar to the Sort iterator, except that only the first N rows are needed, and not the entire result set. For small values of N, the SQL Server query execution engine attempts to perform the entire sort operation in memory. For large values of N, the query execution engine resorts to the more generic method of sorting to which N is not a parameter."
As you can see, this statement sound like the algorithm I was intending to write myself. But the later part mentioned a "more generic method of sorting to which N is not a parameter", that exlains why no matter how XX changes for test1 after going beyong 100, the duration is always the same. Test 2 is also insensitive to N.
So MS seems used 3 algorithm, in which two of them are used for "top N", one is for "set rowcount".
I do not think whether to perform it in memory or not will cause such a big difference. It's mainly due to that only one (the fastest one) uses the algorithm of just keeping the top N rows and then evict low ranking items when they fall below the N window. -
Tuesday, December 15, 2009 11:49 PMModerator
Johnx,
I added index REBUILD and DBCC DROPCLEANBUFFERS to your script and the results are strikingly similar to previous results without any drama or excitement.
Index REBUILD ensures that you are not hitting fragmented segments of the B-Tree.
DBCC DROPCLEANBUFFERS ensures that you are not comparing disk reads (slow) with buffer reads (fast).
TOP (100) CPU time = 641 ms, elapsed time = 896 ms. TOP (101) CPU time = 828 ms, elapsed time = 1322 ms. ROWCOUNT 101 CPU time = 734 ms, elapsed time = 1208 ms.
Script executed on SQL Server 2008:
USE tempdb; GO CREATE TABLE [dbo].[NUM] ([n] int NOT NULL, s varchar(128) NULL, PRIMARY KEY CLUSTERED([n] ASC)) go -- populate data set nocount on declare @n int, @i int set @n=1000000 set @i = 0 while @n>0 begin if @i = 0 begin tran insert into dbo.NUM select @n, convert(varchar,@n + @i * 2) set @n=@n-1 set @i = (@i + 1) % 1000 if @i = 0 commit end GO ALTER INDEX ALL ON Num REBUILD GO PRINT 'TOP (100)' DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top ( 100 ) cast(s as int), n from dbo.num order by cast(s as int) desc SET STATISTICS TIME OFF go PRINT 'TOP (101)' DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select top ( 101 ) cast(s as int), n from dbo.num order by cast(s as int) desc SET STATISTICS TIME OFF go PRINT 'ROWCOUNT 101' set rowcount 101 DBCC DROPCLEANBUFFERS SET STATISTICS TIME ON select cast(s as int), n from dbo.num order by cast(s as int) desc SET STATISTICS TIME OFF go DROP TABLE [dbo].[NUM]
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Wednesday, December 16, 2009 1:58 AMI am using a sql 2005.
I also tested the "select top (@n)" variation. The result shows that "select top (@n)" is similar to "set rowcount...".
The reason I tested the "select top (@n)" variation is that I was wondering if We could use plan-force to force it use the faster "Top N Sort". However it seems that "select top (@n)" is quite different from "select top (xx)" where xx is a constant, but similar to "set rowcount; ...". Guess it will not work, so I will not try to test if plan-force can do the job.
Just curious why MS choose not to use the "Top N Sort" algorithm always, instead to choose this so complex arrangement (i.e. some with "Top N Sort", some with the "Sort then Top"). I think, "Top N Sort" should always be used, can someone point out to me a case when "Sort then Top" will perform better than "Top N Sort"? -
Wednesday, December 16, 2009 7:45 AMSET ROWCOUNT x
is a deprecated feature and is very well documented in Books Online that it will be removed in a future version of SQL Server.
Don't build new business logic around this feature. -
Friday, May 14, 2010 7:54 PM
Hi felows,
I don't know if it helps, but I wrote about that some time ago...
Cheers
http://www.simple-talk.com/author/fabiano-amorim/
Fabiano Neves Amorim - MCTS / MCP - SQLServer - http://fabianosqlserver.spaces.live.com/- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Friday, May 14, 2010 8:28 PM
- Unproposed As Answer by Kalman TothMicrosoft Community Contributor, Moderator Friday, May 14, 2010 8:54 PM
-
Friday, May 14, 2010 8:27 PMModerator
" Quando uso TOP 100 o SQL utiliza 550mb de memória, 9875 de CPU e retorna a consulta em 40 segundos, quando uso TOP 101 o SQL utiliza 798mb de memória, 42745 de CPU e retorna a consulta em 1 minuto e 20 segundos. When I use the SQL TOP 100 uses 550MB of memory, CPU 9875 and the query returns in 40 seconds, when I use the SQL TOP 101 uses 798mb of memory, 42 745 CPU and returns the query in 1 minute and 20 seconds. "
SQL Server uses the available resources according to its (secret) internal algorithm. It can be imagined that the second operation may involve tempdb as opposed to memory.
Kalman Toth, SQL Server & Business Intelligence Training; SQL 2008 GRAND SLAM -
Friday, May 14, 2010 8:28 PMModerator
Hi felows,
I don't know if it helps, but I wrote about that some time ago...
Cheers
http://www.simple-talk.com/author/fabiano-amorim/
Fabiano Neves Amorim - MCTS / MCP - SQLServer - http://fabianosqlserver.spaces.live.com/
Hi Fabiano...I like the blog entry. Did you see Adam's blog on the subject as well?: http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will_17.html
How did you hear about the TOP 101 phenomenon that drove you to write about it in February? Did you stumble upon it yourself? Or did you see Ramesh Meyappan's webcasts?
(By the way, I'm enjoying your blog posts at simple-talk.com about the Showplan Operators of the Week).
--Brad (My Blog) -
Friday, May 14, 2010 8:54 PMModerator
Fabiano,
Can you post a query which we all can test? Thanks.
Kalman Toth, SQL Server & Business Intelligence Training; SQL 2008 GRAND SLAM -
Friday, May 14, 2010 9:07 PM
Brad, I haven't see the Adam's blog on the subject... I take a look now and seems to be an very interesting point of view… I wrote about it because I found this problem in one of our customers then I start to search about it.
By the way, Adam’s blog is now on my RSS feeds…
I hate when we need do this kind of stuff to “fix” some SQL behaviors… well, especially because we only do this when some person find this problems…
What a great discovery about the left order join… well… that is a very very great argue with Conor. Congrats for that. I was one of the guys who used to say, “The order doesn’t matter”, now the things has changed a little bit…
I’m glad you are enjoying the “Operator of the Weeks”, much more very interesting things will come J …
Tks.
Fabiano Neves Amorim - MCTS / MCP - SQLServer - http://fabianosqlserver.spaces.live.com/ -
Friday, May 14, 2010 9:18 PM
Sorry SQLUSA I'm afraid I cannot post the original query.
But I think, the Adam’s blog could be useful to do the tests, don’t you think?
Thanks
Fabiano Neves Amorim - MCTS / MCP - SQLServer - http://fabianosqlserver.spaces.live.com/ -
Friday, May 14, 2010 11:09 PMModerator
Thanks Fabiano. I did exactly that, took Adam's code and added a few tricks from my own optimization wizard hat. Results follows. Not exactly linear, but whoever promised B-tree indexing to be linear? Whoever said that queries on fragmented indexes will yield linear results? The execution plans are identical for 100 & 101. If you do SELECT TOP (@top) ... results about the same.
BTW - I did Sybase performance tuning for a week or so in Sao Paolo - huge city! Loved Rio!
TEST 17 ******************************************************* SQL Server Execution Times: CPU time = 156 ms, elapsed time = 1679 ms. TEST 100 ******************************************************* SQL Server Execution Times: CPU time = 203 ms, elapsed time = 1736 ms. TEST 101 ******************************************************* SQL Server Execution Times: CPU time = 281 ms, elapsed time = 2395 ms. TEST 100 ******************************************************* SQL Server Execution Times: CPU time = 124 ms, elapsed time = 1564 ms. TEST 99 ******************************************************* SQL Server Execution Times: CPU time = 125 ms, elapsed time = 1543 ms. TEST 150 ******************************************************* SQL Server Execution Times: CPU time = 203 ms, elapsed time = 2790 ms. TEST 7 ******************************************************* SQL Server Execution Times: CPU time = 125 ms, elapsed time = 1582 ms. TEST 102 ******************************************************* SQL Server Execution Times: CPU time = 312 ms, elapsed time = 2543 ms.
/****************************************************** ORIGINAL CODE by Adam Haines FROM: http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will_17.html ALTER INDEX added by Kalman Toth DBCC added by Kalman Toth Different TOP values added by Kalman Toth ********************************************************/ USE [tempdb] GO SET NOCOUNT ON; GO IF EXISTS(SELECT 1 FROM sys.tables WHERE NAME = 'TestData') BEGIN DROP TABLE dbo.[TestData]; END GO CREATE TABLE dbo.TestData( RowNum INT PRIMARY KEY, SomeId INT, SomeCode CHAR(2000) ); GO ;WITH L0 AS (SELECT 1 AS C UNION ALL SELECT 1) --2 rows ,L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B) --4 rows (2x2) ,L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B) --16 rows (4x4) ,L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B) --256 rows (16x16) ,L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B) --65536 rows (256x256) ,L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B) --4,294,967,296 rows (65536x65536) ,Number AS (SELECT row_number() OVER (ORDER BY (SELECT 0)) AS N FROM L5) INSERT INTO dbo.TestData SELECT N AS RowNumber, ABS(CHECKSUM(NEWID()))%1000000+1 AS SomeId , REPLICATE('a',2000) AS SomeCode FROM Number WHERE [N] <= 50000 GO ALTER INDEX ALL ON dbo.[TestData] REBUILD GO UPDATE STATISTICS dbo.[TestData] WITH FULLSCAN; GO DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 17 *******************************************************' SET STATISTICS TIME ON SELECT TOP 17 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 100 *******************************************************' SET STATISTICS TIME ON SELECT TOP 100 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 101 *******************************************************' --Slow SET STATISTICS TIME ON SELECT TOP 101 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION(MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 100 *******************************************************' SET STATISTICS TIME ON SELECT TOP 100 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 99 *******************************************************' SET STATISTICS TIME ON SELECT TOP 99 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 150 *******************************************************' SET STATISTICS TIME ON SELECT TOP 150 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION(MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+ 'TEST 7 *******************************************************' SET STATISTICS TIME ON SELECT TOP 7 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS DBCC FREEPROCCACHE WITH NO_INFOMSGS PRINT CHAR(10)+'TEST 102 *******************************************************' SET STATISTICS TIME ON SELECT TOP 102 [RowNum],[SomeId],[SomeCode] FROM dbo.[TestData] WHERE [RowNum] < 30000 ORDER BY [SomeId] OPTION (MAXDOP 1) SET STATISTICS TIME OFF GO DROP TABLE dbo.TestData
Kalman Toth, SQL Server & Business Intelligence Training; SQL 2008 GRAND SLAM- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Tuesday, November 09, 2010 6:13 PM

