none
Database Partitioning

    Question

  • I am new to database partitioning.  I have an existing table in my database that I have considered using partitioning.  Is it possible to migrate an existing table to use partitioning or must the table have been originally created to support partitioning?  If it is possible to migrate a table to use partitioning, is there a recommended approach for migrating a table?

    Currently my table has the following columns:

    id - integer - The id of an object

    attr_id - integer - The id of an attribute

    time - big int - The time in milleseconds of an attribute value

    value - double - The value of an attribute

    I am looking at data paritioning to provide me with a sliding view for historic values where I can easily prune off old values without have to use delete and alleviate the overhead of reindexing the table.   I have existing customers who are using this schema so I have to figure out a way to move them forward.

    Any advice from folks who have used partitioning would be greatly appreciated.

    Regards,

    Fred

    Tuesday, April 03, 2012 7:03 PM

Answers

  • Hi Fred,

    Sliding window is one of the cases where partitioning can be beneficial. In your case you can partition by time or (if ID is kind of autoincrementing) perhaps by ID (it will increment same way with time column and while it introduces a little bit of extra complexity, it would reduce the size of the indexes (if time is not in the clustered index yet).

    Speaking of implementation - if you can take system offline, the sequence would be - dropping non-clustered indexes, rebuilding of clustered index and creating new (aligned) non-clustered indexes. If you cannot take system offline, you can use online rebuild. You just need to manage non-clustered indexes - ideally you would like to have them dropped/disabled to speed up the process of rebuilding CI but again, it depends on the other factors.

    Biggest potential issue you'd have is different behavior of aligned non-clustered indexes. You can see some examples here: http://aboutsqlserver.com/2010/12/22/sql-server-and-table-partitioning-part-2-when-partitioning-is-you-enemy/ I'd suggest to carefully test your queries before you push partitioning to production in order to avoid surprises.

    During implementation of the sliding window scenario, make sure that your have right-most partition "reserved" and empty. The goal is to avoid data movement when you split the range and make it as the metadata operation. You'd also need to be sure that your partition-split code handles deadlocks/lock timeouts correctly - even if altering the table is metadata operation, it requires SCH-M schema modification lock which is incompatible with SCH-S schema stability locks acquired by DML statements. That could be an issue in case of OLTP system under heavy load.

    I'd repeat - the most important thing you'll need to do is to test the solution. Make sure that your testing environment has enough data in the table to mimic production environment.


    Thank you!

    My blog: http://aboutsqlserver.com

    Tuesday, April 03, 2012 8:39 PM

All replies

  • Hi Fred,

    Sliding window is one of the cases where partitioning can be beneficial. In your case you can partition by time or (if ID is kind of autoincrementing) perhaps by ID (it will increment same way with time column and while it introduces a little bit of extra complexity, it would reduce the size of the indexes (if time is not in the clustered index yet).

    Speaking of implementation - if you can take system offline, the sequence would be - dropping non-clustered indexes, rebuilding of clustered index and creating new (aligned) non-clustered indexes. If you cannot take system offline, you can use online rebuild. You just need to manage non-clustered indexes - ideally you would like to have them dropped/disabled to speed up the process of rebuilding CI but again, it depends on the other factors.

    Biggest potential issue you'd have is different behavior of aligned non-clustered indexes. You can see some examples here: http://aboutsqlserver.com/2010/12/22/sql-server-and-table-partitioning-part-2-when-partitioning-is-you-enemy/ I'd suggest to carefully test your queries before you push partitioning to production in order to avoid surprises.

    During implementation of the sliding window scenario, make sure that your have right-most partition "reserved" and empty. The goal is to avoid data movement when you split the range and make it as the metadata operation. You'd also need to be sure that your partition-split code handles deadlocks/lock timeouts correctly - even if altering the table is metadata operation, it requires SCH-M schema modification lock which is incompatible with SCH-S schema stability locks acquired by DML statements. That could be an issue in case of OLTP system under heavy load.

    I'd repeat - the most important thing you'll need to do is to test the solution. Make sure that your testing environment has enough data in the table to mimic production environment.


    Thank you!

    My blog: http://aboutsqlserver.com

    Tuesday, April 03, 2012 8:39 PM
  • Thanks Dmitri,

    You've given me quite a bit to look at/investigate.  I appreciate the pointer to the examples you posted - very helpful.  The newest bit of information I've come across is that partitioning requires SQL Server enterprise edition.  My application up to this point has not required enterprise edition and I would surmise many of my customers are running my application not using enterprise edition.  I'm trying to figure out if there is a plan B to work around this.

    Regards,

    Fred

    Wednesday, April 04, 2012 2:53 PM
  • Hi Fred,

    Technically you can use separate tables as replacement of partitioning. For example create tables: Data201201, Data201202, etc and create a view that union data from all tables (using union all). Next, your application can select data from the view. The key point is to have check constraints that clearly define data ranges stored in the table (for example CHECK(time>='2012-01-01' and time <'2012-02-01'). This will allow optimizer to perform "table elimination" very similarly to partition elimination.

    While it introduces some headache, in some regards it could work even better than partitioning. Each table will have own set of statistics that can help optimizer - statistic histogram is limited to 200 steps and can become extremely inaccurate in case if table has a lot of data


    Thank you!

    My blog: http://aboutsqlserver.com

    Wednesday, April 04, 2012 7:13 PM
  • Here is an interesting blog  about partitioned views

    http://sqlblog.com/blogs/maria_zakourdaev/archive/2012/02/28/unfairly-forgotten-partitioning-views-can-help-us-make-our-partitioning-tables-design-better.aspx


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/

    Thursday, April 05, 2012 5:10 AM
    Moderator