locked
Large Index Quickly Fragmented RRS feed

  • Question

  • We have a 400GB database in SQL 2008 (compatibility 80) we recently upgrade from SQL 2000. For some reason this database seems to have frequent fragmentation issues. There are quite a few indexes in this database with similar issues listed below, but I will use hist_sku_chn_wk_$pk as an exmaple. The fill factor is set to 80%. It currently has 83,142 pages.

    I ran a script each night for the last four nights to check for fragmentation and ran an ALTER INDEX REORGANIZE for anything 5-30% fragmented and and ALTER INDEX REBUILD ONLINE anything over 30%. Here are the fragmentation levels and tasks run each night. These fragmentation levels have only occurred after we upgraded to SQL 2008. We previoulsy ran the same database with realatively the same workload in SQL 2000 and NEVER saw this level of fragmentation before! I have also run the DBCC UPDATEUSAGE. I'm not sure why else to try.

    4/26: Frag 95% - REBUILD

    4/27: Frag 5% - REORG

    4/28: Frag 65% - REBUILD

    4/29: Frag 71% REBUILD


    Steven Trout

    Monday, April 29, 2013 6:50 PM

Answers

  • CREATE TABLE [dbo].[hist_sku_loc_wk](
     [style_id] [decimal](12, 0) NOT NULL,
     [color_id] [smallint] NOT NULL,
     [size_master_id] [int] NOT NULL,
     [merch_year_wk] [int] NOT NULL,
     [location_id] [smallint] NOT NULL,
     [received_units] [int] NOT NULL,
     [return_to_vendor_units] [int] NOT NULL,
     [distributions_units] [int] NOT NULL,
     [transfer_in_units] [int] NOT NULL,
     [transfer_out_units] [int] NOT NULL,
     [sales_total_units] [int] NOT NULL,
     [return_units] [int] NOT NULL,
     [shrink_actual_units] [int] NOT NULL,
     [adjustments_total_units] [int] NOT NULL,
     CONSTRAINT [hist_sku_loc_wk_$pk] PRIMARY KEY CLUSTERED
    (
     [style_id] ASC,
     [color_id] ASC,
     [size_master_id] ASC,
     [merch_year_wk] ASC,
     [location_id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80)
    )

    GO

     


    Steven Trout

    CONSTRAINT [hist_sku_loc_wk_$pk] PRIMARY KEY CLUSTERED 
    (
     [style_id] ASC,
     [color_id] ASC,
     [size_master_id] ASC,
     [merch_year_wk] ASC,
     [location_id] ASC
    )

    That part explains why you get so much fragmentation.  This is not a monotonically increasing value, so you have random inserts into the leaf level of the clustered index, which causes fragmentation as pages have to split.  I would expect any wide clustering key like this to have fragmentation issues.  Additionally if you update any of these you change the page location for the row in the leaf level of the clustered index.  

    You have other side effects from this as well that you may or may not realize since the clustered index keys are also a part of non-clustered indexes, so each non-clustered index has a 21 byte overhead per entry because your clustered index key is so wide.  I'd recommend reading Kimberly's blog posts http://www.sqlskills.com/blogs/kimberly/ever-increasing-clustering-key-the-clustered-index-debate-again/ and http://www.sqlskills.com/blogs/kimberly/more-considerations-for-the-clustering-key-the-clustered-index-debate-continues/ and the links that she provides in those to understand the effect of this specific key column selection.


    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server

    Monday, April 29, 2013 10:13 PM

All replies

  • Another point. I am using the 'IndexOptimze' script from http://ola.hallengren.com/ to determine fragmentation levels, ect. It had occurred to me as well that perhaps the index is being reported as fragmented incorrectly, but I cannot find anything to support the theory. The script uses sys.dm_db_index_physical_stats to determine the fragmentation levels. The database was restored from SQL 2000 and is currently running in compatibility mode 80 as previously mentioned.


    Steven Trout

    Monday, April 29, 2013 6:53 PM
  • Why does it get fragmented?

    How many rows?  How many daily inserts? How many daily updates of indexed columns? How many daily deletes?

    Do you UPDATE STATISTICS nightly?

    Optimization article: http://www.sqlusa.com/articles/query-optimization/

    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012


    Monday, April 29, 2013 7:14 PM
  • Please post the table and index DDL and describe the distribution and frequency of INSERTs into the PK, and the output of sys.dm_index_physical_stats.

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, April 29, 2013 7:20 PM
  • There are currently 8,296,614 rows in the table. Yes I update stats nightly as well unless I rebuild the index.

    The table is rolled up data by week by UPC by store so the majority of the inserts will be when start a new week. Every other day its mostly updates. There very few deletes. In any case its pretty much the same as it has been over the last 6 years. Business practice is the same. Updates / inserts /deletes would be the same as before.


    Steven Trout

    Monday, April 29, 2013 8:26 PM
  • Running: SELECT * FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL , 'LIMITED');

    Its returning 523 rows (partitions). I'm not familiar with this output. Do you need all rows?


    Steven Trout

    Monday, April 29, 2013 8:44 PM
  • This DMV didn't exist in SQL 2000, and it's now what'd driving your fragmentation diagnostics. 

    Have you partitioned this table? What's the CREATE TABLE DDL along with indexes and keys?

    David


    David http://blogs.msdn.com/b/dbrowne/

    Monday, April 29, 2013 8:50 PM
  • CREATE TABLE [dbo].[hist_sku_loc_wk](
     [style_id] [decimal](12, 0) NOT NULL,
     [color_id] [smallint] NOT NULL,
     [size_master_id] [int] NOT NULL,
     [merch_year_wk] [int] NOT NULL,
     [location_id] [smallint] NOT NULL,
     [received_units] [int] NOT NULL,
     [return_to_vendor_units] [int] NOT NULL,
     [distributions_units] [int] NOT NULL,
     [transfer_in_units] [int] NOT NULL,
     [transfer_out_units] [int] NOT NULL,
     [sales_total_units] [int] NOT NULL,
     [return_units] [int] NOT NULL,
     [shrink_actual_units] [int] NOT NULL,
     [adjustments_total_units] [int] NOT NULL,
     CONSTRAINT [hist_sku_loc_wk_$pk] PRIMARY KEY CLUSTERED
    (
     [style_id] ASC,
     [color_id] ASC,
     [size_master_id] ASC,
     [merch_year_wk] ASC,
     [location_id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80)
    )

    GO

     


    Steven Trout

    Monday, April 29, 2013 9:22 PM
  • Looking at the "Storage" informatin on the table:

    Table is partitioned: True, Vardecimal storage format is endabled: False, Index space 10mb (?!), Row Count: 8296668, Data space: 649mb, Partition scheme: merch_year_wk_ps, Number of partitions: 523, Partition column: merch_year_wk


    Steven Trout

    Monday, April 29, 2013 9:39 PM
  • CREATE TABLE [dbo].[hist_sku_loc_wk](
     [style_id] [decimal](12, 0) NOT NULL,
     [color_id] [smallint] NOT NULL,
     [size_master_id] [int] NOT NULL,
     [merch_year_wk] [int] NOT NULL,
     [location_id] [smallint] NOT NULL,
     [received_units] [int] NOT NULL,
     [return_to_vendor_units] [int] NOT NULL,
     [distributions_units] [int] NOT NULL,
     [transfer_in_units] [int] NOT NULL,
     [transfer_out_units] [int] NOT NULL,
     [sales_total_units] [int] NOT NULL,
     [return_units] [int] NOT NULL,
     [shrink_actual_units] [int] NOT NULL,
     [adjustments_total_units] [int] NOT NULL,
     CONSTRAINT [hist_sku_loc_wk_$pk] PRIMARY KEY CLUSTERED
    (
     [style_id] ASC,
     [color_id] ASC,
     [size_master_id] ASC,
     [merch_year_wk] ASC,
     [location_id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80)
    )

    GO

     


    Steven Trout

    CONSTRAINT [hist_sku_loc_wk_$pk] PRIMARY KEY CLUSTERED 
    (
     [style_id] ASC,
     [color_id] ASC,
     [size_master_id] ASC,
     [merch_year_wk] ASC,
     [location_id] ASC
    )

    That part explains why you get so much fragmentation.  This is not a monotonically increasing value, so you have random inserts into the leaf level of the clustered index, which causes fragmentation as pages have to split.  I would expect any wide clustering key like this to have fragmentation issues.  Additionally if you update any of these you change the page location for the row in the leaf level of the clustered index.  

    You have other side effects from this as well that you may or may not realize since the clustered index keys are also a part of non-clustered indexes, so each non-clustered index has a 21 byte overhead per entry because your clustered index key is so wide.  I'd recommend reading Kimberly's blog posts http://www.sqlskills.com/blogs/kimberly/ever-increasing-clustering-key-the-clustered-index-debate-again/ and http://www.sqlskills.com/blogs/kimberly/more-considerations-for-the-clustering-key-the-clustered-index-debate-continues/ and the links that she provides in those to understand the effect of this specific key column selection.


    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server

    Monday, April 29, 2013 10:13 PM
  • Thank you very much! I will read up on the blog before I ask any other questions, but in your opinion will this cause a problem? Are the high levels of fragmentation I am seeing a concern or are they misleading because of the configuration? Do I need to continue to reindex daily? Thanks again.

    Steven Trout

    Monday, April 29, 2013 10:53 PM
  • ...also I believe the index is setup the same way it was before across the style, color, size, week, location, ect. Its been that way for 6 years without ever seeing the fragmentation we are seeing now daily. Again, I will read the blog tonight.

    Steven Trout

    Monday, April 29, 2013 10:56 PM
  • ...also I believe the index is setup the same way it was before across the style, color, size, week, location, ect. Its been that way for 6 years without ever seeing the fragmentation we are seeing now daily. Again, I will read the blog tonight.

    Steven Trout

    Were you monitoring it previously?  Are you sure that you didn't have a problem with your previous process?  The index allocation semantics at this level of things doesn't change significantly if at all from 2000 to 2012.  My guess would be that there was a problem with the way you were using DBCC SHOWCONTIG in SQL Server 2000 to find the fragmentation percentage because under the same INSERT/UPDATE workload to that table, fragmentation would be a problem in SQL Server 2000 as well.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server

    Monday, April 29, 2013 11:02 PM
  • And, of course, the table wasn't partitioned on SQL 2000.  If you have seperate partitions per week, which it appears you do, then your weekly inserts can be followed by a rebuild of the partition for that week, fixing any fragmentation introduced during the load.  Updates to the non-key columns of this table shouldn't increase fragmentation.

    And the fragmentation and increased loading cost of having meaningful, compound keys is often offset by decreased cost of updating, inserting and querying related data.  

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, April 29, 2013 11:22 PM
  • We have some tables that are 300+ million rows. I couldnt imagine inserting a few thousand rows in a table would cause it to become 90%+ fragmented in one day. I think the issue is in the index partitioning and the way I am reporting the fragmentation NOW. The partition is new to me so I'm learning as I go, but here is my theory. Since we are partitioning the index by week I could see why ONE of the partitions for the new week we are currently updating could become fragmented by inserting data for that week, but not the entire index with millions of rows. Perhaps the fragmentation level I am seeing is being misrepresented by that one index partition being fragmented. Does that sound right?

    Steven Trout

    Tuesday, April 30, 2013 2:24 PM
  • That sounds right.  You need to dig into how you are measuring fragmentation. 

    David


    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, April 30, 2013 2:26 PM
  • So hopefully last question... There is a switch in the script I am running http://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html that allows me to check at the partition level for fragmentation and REBUILD if needed.

    I was all set to start using that, but saw this disclaimer reading the details for the ALTER INDEX using the PARTITION option. It is saying anything over 1000 partitions is an issue or rebuilding an index at the partition level is an issue either way? Do you think its safe to use the 'partition' switch?

    http://technet.microsoft.com/en-us/library/ms188388.aspx

    Caution:
    Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is possible, but is not supported. Doing so may cause degraded performance or excessive memory consumption during these operations. We recommend using only aligned indexes when the number of partitions exceed 1,000.


    Steven Trout

    Wednesday, May 1, 2013 8:19 PM
  • Do you have non-aligned indexes and more than 1000 partitions?

    I suspect that that maintenance script is mis-reporting your real fragmentation somehow.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, May 2, 2013 2:14 PM
  • What exactly is a "non-aligned" index? Looking at some of the indexed they are partitioned by week and we have quite a few weeks in the system so I do see quite a few partitions. How do I know if they are "non-aligned"? I doubt there are any with 1000+ partitions in any case. That would be almost 20 years of data.


    Steven Trout

    Thursday, May 2, 2013 8:13 PM
  • What exactly is a "non-aligned" index? Looking at some of the indexed they are partitioned by week and we have quite a few weeks in the system so I do see quite a few partitions. How do I know if they are "non-aligned"? I doubt there are any with 1000+ partitions in any case. That would be almost 20 years of data.


    Steven Trout

    A non-aligned index either uses a different partition scheme or filegroup than the partitioned table.  If everything uses the same week based partitioning scheme then the indexes are aligned.  You can tell this by looking at the index properties or the DDL CREATE INDEX script for the indexes and table.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server

    Thursday, May 2, 2013 8:40 PM
  • So hopefully last question... There is a switch in the script I am running http://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html that allows me to check at the partition level for fragmentation and REBUILD if needed.

    I was all set to start using that, but saw this disclaimer reading the details for the ALTER INDEX using the PARTITION option. It is saying anything over 1000 partitions is an issue or rebuilding an index at the partition level is an issue either way? Do you think its safe to use the 'partition' switch?

    http://technet.microsoft.com/en-us/library/ms188388.aspx

    Caution:
    Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is possible, but is not supported. Doing so may cause degraded performance or excessive memory consumption during these operations. We recommend using only aligned indexes when the number of partitions exceed 1,000.


    Steven Trout

    The downside of using the option to do index maintenance on the partition level is that rebuild operations will be performed offline. That is a SQL Server limitation.

    Ola Hallengren
    http://ola.hallengren.com

    Friday, May 3, 2013 5:31 PM
  • No problem. The maint is scheduled off hours.

    So there shouldnt be a problem using that parameter on a database with partitioned indexes? Obviously you dont know my environment, but is there anything I really need to watch out for? The whole reason behind this is because I am rebuilding the index on some rather large tables every night and I think its because I'm NOT using the partition parameter currently and it thinks the entire index is highly fragmented due to a single partition being fragmented...


    Steven Trout

    Friday, May 3, 2013 5:41 PM
  • I have not experienced any issues with the partition level parameter.
    http://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html#PartitionLevel

    Ola Hallengren
    http://ola.hallengren.com

    Friday, May 3, 2013 5:50 PM