Filtered Index vs. Full Index
-
Friday, January 04, 2013 7:35 PMModerator
According to the next article, the AddressLine2 filtered index should make the query with AddressLine2 predicate fly faster than a speeding bullet since there are only 362 non null values in 19K. In reality, tiny gain in logical reads. Why? Thanks.
/******************* CREATE TEST TABLE ********************************/ USE tempdb; GO SELECT * INTO Address FROM AdventureWorks2012.Person.Address order by newid(); GO ALTER TABLE Address ADD CONSTRAINT pkBE PRIMARY KEY(AddressID); GO -- select * from Address; SELECT COUNT(*), COUNT(AddressLine2) FROM Address; -- 19614 362 GO /******************* CREATE FULL INDEX & TEST ********************************/ CREATE INDEX idxFull on Address(AddressLine2); GO EXEC sp_spaceused 'Address'; GO /* name rows reserved data index_size unused Address 19614 3232 KB 2752 KB 312 KB 168 KB */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON; SELECT * FROM Address WHERE AddressLine2 = 'Verkaufsabteilung'; SET STATISTICS IO OFF; GO /* Table 'Address'. Scan count 1, logical reads 133, physical reads 0, read-ahead reads 127, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ DROP INDEX Address.idxFull; GO /******************* CREATE FILTERED INDEX & TEST ********************************/ CREATE INDEX idxFiltered on Address(AddressLine2) WHERE AddressLine2 is NOT NULL; GO EXEC sp_spaceused 'Address'; GO /* name rows reserved data index_size unused Address 19614 2936 KB 2752 KB 120 KB 64 KB */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON; SELECT * FROM Address WHERE AddressLine2 = 'Verkaufsabteilung'; SET STATISTICS IO OFF; GO /* Table 'Address'. Scan count 1, logical reads 131, physical reads 0, read-ahead reads 127, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ DROP INDEX Address.idxFiltered; GO DROP TABLE Address; GO
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012- Edited by Kalman TothMicrosoft Community Contributor, Moderator Friday, January 04, 2013 7:36 PM
All Replies
-
Friday, January 04, 2013 7:42 PMModerator
Try only selecting the AddressLine2 column. Or add a few columns and add them into INCLUDE columns.
Also, instead of using a particular value try
where AddressLine2 LIKE 'V%' and AddressLine2 IS NOT NULL.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Friday, January 04, 2013 7:47 PMModerator
Hi Naomi,
The question is not how to optimize the query though.
The question why is the filtered index performance so lackluster? It is a simple key equality predicate?
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012- Edited by Kalman TothMicrosoft Community Contributor, Moderator Friday, January 04, 2013 7:53 PM
-
Friday, January 04, 2013 7:49 PMModerator
Did you check the query plan? I am not sure your index is even used and even if it's used, you still need to do a key lookup to get all the columns. So, first test is to select just the Address2 column to see if the filtered index provides any gain.
Also, 19K rows is a very small table.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Friday, January 04, 2013 9:03 PMModeratorYes Naomi, I checked the query plan. Nothing exciting, just the usual culprit: Key lookup.
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012 -
Friday, January 04, 2013 9:05 PMModeratorExactly my point. I think you will easily see the advantage of that index if you will be able to eliminate key lookup.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog -
Friday, January 04, 2013 9:07 PM
With the test you are running, I would not expect to see a significant difference, I'm a little surprised you saw any difference at all.
When you have the idxFull index and you run your query SELECT * FROM Address WHERE AddressLine2 = 'Verkaufsabteilung';, SQL can see that idxFull is a good index for this query (your where clause is the key to the index and statistics tells SQL only an extremely small percentage of the rows in the table will be returned by this query). So SQL uses this index. But it doesn't have to scan the whole index. It can use the top level and any intermediate levels in the index to go directly to the first leaf entry where AddressLine2 = 'Verkaufsabteilung'. It then uses the clustered index to get the other columns for this row. Then it gets the next row in the index, then repeats the process. It stops as soon as it finds the first row in the index that has AddressLine2 greater than 'Verkaufsabteilung'. So the fact that this index has lots of other rows (including rows where AddressLine2 Is Null) doesn't make any difference. SQL only reads the rows where AddressLine2 = 'Verkaufsabteilung'.
When you have the filtered index idxFiltered, SQL goes though the exact same process. It still only has to look at the rows with AddressLine2 = 'Verkaufsabteilung', so it looks at the same number of rows (there can be noise because this index might have fewer levels in the B-tree and maybe other minor things). So you would expect the statistics to be the same (or essentially the same).
If you want a significant performance difference between a filtered index and a regular index, you would need a case where not only does the filtered index have significantly fewer rows, but the query needs to scan either the whole index or at least a lot of rows which are in the nonfiltered index, but not in the filtered one. For, example, you could try using the following two indexes and the following query against each of the indexes.
CREATE INDEX idxFull on Address(AddressLine1); CREATE INDEX idxFiltered on Address(AddressLine1) WHERE AddressLine2 is NOT NULL; SELECT AddressLine1 FROM Address WHERE AddressLine2 IS NOT NULL AND AddressLine1 Like '1%';
Tom
- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Friday, January 04, 2013 9:14 PM
- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 1:25 AM
-
Friday, January 04, 2013 9:17 PMModerator
Exactly my point. I think you will easily see the advantage of that index if you will be able to eliminate key lookup.
Yes Naomi, that is the unfulfilled dream of the entire RDBMS industry. There is limit to key lookup elimination though. Here is the 20 million rows version.
Yes Tom, but this is supposed to work for regular queries, not only for some made-up queries. The index is smaller indeed. The benefit is missing, gobbled up by a few key lookups.
http://technet.microsoft.com/en-us/library/cc280372.aspx
/******************* CREATE TEST TABLE of 20 million rows ********************************/ USE tempdb; GO SELECT [AddressLine1] ,[AddressLine2] ,[City] ,[StateProvinceID] ,[PostalCode] ,[SpatialLocation] ,[rowguid] ,[ModifiedDate] INTO Address FROM AdventureWorks2012.Person.Address order by newid(); GO ALTER TABLE Address ADD AddressID INT IDENTITY(1,1); GO INSERT Address ([AddressLine1] ,[AddressLine2] ,[City] ,[StateProvinceID] ,[PostalCode] ,[SpatialLocation] ,[rowguid] ,[ModifiedDate]) SELECT [AddressLine1] ,[AddressLine2] ,[City] ,[StateProvinceID] ,[PostalCode] ,[SpatialLocation] ,[rowguid] ,[ModifiedDate] FROM Address; GO 10 ALTER TABLE Address ADD CONSTRAINT pkBE PRIMARY KEY(AddressID); GO -- select * from Address; SELECT COUNT(*), COUNT(AddressLine2) FROM Address; -- 20084736 370688 GO UPDATE TOP (30) Address SET AddressLine2 = 'YellowStone' WHERE AddressLine2 = 'Verkaufsabteilung'; /******************* CREATE FULL INDEX & TEST ********************************/ CREATE INDEX idxFull on Address(AddressLine2); GO EXEC sp_spaceused 'Address'; GO /* name rows reserved data index_size unused Address 20084736 2986136 KB 2774888 KB 210968 KB 280 KB */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON; SELECT * FROM Address WHERE AddressLine2 = 'Yellowstone'; SET STATISTICS IO OFF; GO /* Table 'Address'. Scan count 1, logical reads 155, physical reads 0, read-ahead reads 158, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ DROP INDEX Address.idxFull; GO /******************* CREATE FILTERED INDEX & TEST ********************************/ CREATE INDEX idxFiltered on Address(AddressLine2) WHERE AddressLine2 is NOT NULL; GO EXEC sp_spaceused 'Address'; GO /* name rows reserved data index_size unused Address 20084736 2791040 KB 2774888 KB 15696 KB 456 KB */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON; SELECT * FROM Address WHERE AddressLine2 = 'Yellowstone'; SET STATISTICS IO OFF; GO /* Table 'Address'. Scan count 1, logical reads 153, physical reads 0, read-ahead reads 158, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ DROP INDEX Address.idxFiltered; GO DROP TABLE Address; GO
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012
- Edited by Kalman TothMicrosoft Community Contributor, Moderator Friday, January 04, 2013 10:36 PM
-
Friday, January 04, 2013 11:08 PM
Yes Tom, but this is supposed to work for regular queries, not only for some made-up queries. The index is smaller indeed. The benefit is missing, gobbled up by a few key lookups.
The benefit is there, right before your eyes, but you are not just looking. And even that difference is small, because you did not pick a very good example:
index_size
312 KBindex_size
120 KBHad you instead picked an integer or char(1) column with the same amount of NULL, the space reduction would have been more significant.
A query that uses an Index Seek is not going to be particularly faster with a filtered index. The only time you will see a significant difference is when the search conditions strays into the area that is filtered out...
However, updates are faster, as they only have to write to the index for those exceptional rows.
And there are queries that can be faster with a filtered index. Here is an example:
SELECT a.*, newid() as clef INTO #gusten FROM sys.columns a CROSS JOIN sys.columns B go CREATE UNIQUE INDEX full_ix ON #gusten (object_id, column_id, clef) go SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = 6 AND system_type_id = 37 go SET STATISTICS IO OFF go DROP INDEX #gusten.full_ix CREATE UNIQUE INDEX filter_ix ON #gusten (object_id, column_id, clef) WHERE system_type_id = 37 go SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = 6 AND system_type_id = 37 go SET STATISTICS IO OFF go DROP TABLE #gusten
In this case, the query cannot be resolved with a Seek, but only a Scan. The filter reduces the size of the index, why the scan is a lot faster.
The caveat here is that the optimizer does not match filtered indexes in all instances where it could, so you need to be careful.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 1:25 AM
-
Saturday, January 05, 2013 1:12 AMModerator
Hi Erland,
Testing your covering index method:
o With non-matching predicate, the filter index is slower.
o With matching predicate, the filter index is faster.
Covering index blog:
http://www.sqlusa.com/bestpractices/coveringindex/
SELECT a.*, newid() as clef INTO #gusten FROM sys.columns a CROSS JOIN sys.columns B go CREATE UNIQUE INDEX full_ix ON #gusten (object_id, column_id, clef) go SELECT TOP 1 * FROM #gusten ORDER BY clef desc; GO -- 8 name 3 239 239 ...... EXEC sp_spaceused #gusten; -- index: 26800 KB DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = 3 AND system_type_id = 239 go SET STATISTICS IO OFF go /* 841 MATCHING: Scan count 5, logical reads 7687, */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = -1 AND system_type_id = -1 go SET STATISTICS IO OFF go /* NO MATCHING: Scan count 1, logical reads 3340, */ DROP INDEX #gusten.full_ix GO CREATE UNIQUE INDEX filter_ix ON #gusten (object_id, column_id, clef) WHERE system_type_id = 239 go EXEC sp_spaceused #gusten; -- index: 2416 KB DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = 3 AND system_type_id = 239 go SET STATISTICS IO OFF go /* 841 MATCHING: Scan count 1, logical reads 26, */ DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON go SELECT COUNT(*) FROM #gusten WHERE column_id = -1 AND system_type_id = -1 go SET STATISTICS IO OFF go /* NO MATCHING: Scan count 5, logical reads 7687, */ DROP INDEX #gusten.filter_ix GO DROP TABLE #gusten
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012
- Edited by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 3:27 PM
-
Saturday, January 05, 2013 10:24 AM
Testing your covering index method:
Note that my example has nothing to do with covering index per se. I just wrote COUNT(*) because I was on autopilot. If you change it to SELECT *, you get the same result. That is, the optimizer can opt to scan an index and combine it with key/RID lookup if it deems that this is faster.
o With non-matching predicate, the filter index is slower.
o With matching predicate, the filter index is faster.
Note sure what you mean here. With a non-matching predicate, the filtered index is not applicable and will be discarded.It goes without saying that you should design you filtered indexes so that they match commonly used query predicates in the application.
Another advantage with filtered indexes is that they can be used to implement conditional constraints like SSN must be unique when not null. (Which is how the UNIQUE constraint is supposed to work according to the ANSI standard.) Or for a given customerid there must be at most one row with isdefault = 1.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 3:32 PM
-
Saturday, January 05, 2013 3:32 PMModerator
>scan an index and combine it with key/RID lookup
That was the down fall of the first test in this thread: fast index seek & killer key lookups.
>Another advantage with filtered indexes is that they can be used to implement conditional constraints like SSN must be unique when not null. (Which is how the UNIQUE constraint is supposed to work according to the ANSI standard.)
That is superb information.
>Or for a given customerid there must be at most one row with isdefault = 1.
Explain pls. Thanks.
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012 -
Saturday, January 05, 2013 4:59 PM
>Or for a given customerid there must be at most one row with isdefault = 1.
Explain pls. Thanks.
If you create a table and filtered index like the following a customer can have multiple addresses, but at most one address for each customer can have IsDefault = 1.
use tempdb go Create Table FooCustomerAddresses(CustomerID int, AddressID int, IsDefault tinyint Default (0) Constraint FCAPK Primary Key(CustomerID, AddressID) Constraint FCAIsDefaultCk Check (IsDefault In (0,1))); Create Unique Index FooCAIsDefaultIdx On FooCustomerAddresses (CustomerID) Where IsDefault = 1; -- Following Works Insert FooCustomerAddresses(CustomerID, AddressID, IsDefault) Values (1, 1, 0), (1, 2, 1), (2, 3, 1); go -- Following Fails because CustomerID 1 already has a row with IsDefault = 1 Insert FooCustomerAddresses(CustomerID, AddressID, IsDefault) Values (1, 4, 1); go -- Check the first 3 rows entered OK, 4th row failed Select * From FooCustomerAddresses; go Drop Table FooCustomerAddresses;
Tom
- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 6:59 PM
-
Saturday, January 05, 2013 8:53 PM
That was the down fall of the first test in this thread: fast index seek & killer key lookups.
But the key lookups really has nothing to do it. The point is that with a Seek, there is little different between and a full index (as long as the query fall into the filter, that is). You shave off some of the upper levels in the index, but that's all.
I see that Tom gave an excellent explanation of what I meant with isdefault = 1, so I am not going to repeat it.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se -
Saturday, January 05, 2013 9:06 PMModerator
But the key lookups really has nothing to do it. The point is that with a Seek, there is little different between and a full index (as long as the query fall into the filter, that is). You shave off some of the upper levels in the index, but that's all.
The only relationship to key lookup: 97% of the query cost was key lookup, 3% index seek. Therefore, it is hard to show any performance gain by improving the index seek with filtered index.
Kalman Toth SQL 2008 GRAND SLAM
New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012 -
Saturday, January 05, 2013 9:27 PM
The only relationship to key lookup: 97% of the query cost was key lookup, 3% index seek. Therefore, it is hard to show any performance gain by improving the index seek with filtered index.
You don't get it: generally you don't improve plans with Index Seek a lot with replacing a full index with a filtered index.
Let's say that you will look up Roger Anderson in the phone book. Does that go fast if you have a phone book which only has A to J, or you have the full phone book?
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Saturday, January 05, 2013 11:08 PM

