locked
Partitioning Advice on PRIMARY FILE GROUP RRS feed

  • Question

  • Hi,

    I wonder if you can help. I would like some advice on partitioning within the PRIMARY file group. Basically I have a lot of experience on partitioning and in the past I have followed best practice is documents such as Partitioned Table and Index Strategies Using SQL Server 2008   and the Analysis Services Performance Guide   so in summary when partitioning in the past I have followed best practice, e.g. have one file group per partition and have your partition key as a component within the primary key of the table you are partitioning.

    However following best practice in this way can require a lot of maintainence, and the system I am performance tuning at the moment has a limited life span.

    So in this context I wonder if there are any advantages of basing partitioning on a PRIMARY file group?

    Kind Regards,

    Kieran.


    Kieran Patrick Wood http://www.innovativebusinessintelligence.com http://uk.linkedin.com/in/kieranpatrickwood http://kieranwood.wordpress.com/

    Wednesday, August 6, 2014 3:06 PM

Answers

  • I don't see any advantage of placing data on Primary file group probably less overhead and simple to create the partition:-).Paritition means large tables and huge tables, think from DR standpoint, definitely placing files in different file group help in terms of backup and restore(Piecemeal restore) strategy, data purge and archiving but this can't be achieved when we place in Primary FG.

    It's not recommended to place the data in Primary file group as it contains sys objects.

    --Prashanth

    Wednesday, August 6, 2014 4:27 PM
  • Partitioning is mainly a manageability feature, not a performance feature. Incorrectly used, and with some help from limitaitons in the optimizer, your performance may go in the other direction.

    Whether you have the recent partition on the primary file group on some other partition has per se no effect on performance. But say that you have some Fusion IO cards that are big enough to fit the recent data, but your budget does not permit you get cards to fit the entire table. Then you could put the recent partition on the fast storage, and the other partitions on slower storage. However, this means that you will regularly have to copy data from one place to another. Which kind of goes against the main idea with partitioning: being able to age out old data fast.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, August 8, 2014 9:30 AM
  • So getting back to my question in the above context: If you slice on a particular partition with a particular partition key value, e.g. PeriodYYYYMM = 201408 on a relational database table, will it make a difference to performance if this table is partitioned by the PRIMARY file group, or should the partitions have a separate FILE GROUP for each partition?

    I would expect no significant performance difference between single versus multiple filegroups if the underlying files are on the same spindles.  Segregating the sequential scan physical I/O during parallel operations will provide the most benefit.  Importantly, consider that other concurrent I/O activity during the scans will reduce performance.


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Friday, August 8, 2014 11:38 AM

All replies

  • I don't see any advantage of placing data on Primary file group probably less overhead and simple to create the partition:-).Paritition means large tables and huge tables, think from DR standpoint, definitely placing files in different file group help in terms of backup and restore(Piecemeal restore) strategy, data purge and archiving but this can't be achieved when we place in Primary FG.

    It's not recommended to place the data in Primary file group as it contains sys objects.

    --Prashanth

    Wednesday, August 6, 2014 4:27 PM
  • OK thanks Prashanth,

    This is helpful thank you.

    So can you  confirm whether or not there is no significant performance gain to be achieved by querying the most recent partitions where the partitions are based on the PRIMARY file group as opposed to an equivalent query on a none partitioned table?

    Kind Regards,

    Kieran.


    Kieran Patrick Wood http://www.innovativebusinessintelligence.com http://uk.linkedin.com/in/kieranpatrickwood http://kieranwood.wordpress.com/


    Thursday, August 7, 2014 6:50 AM
  • Can I have any clarification on this from anyone please?; -

    So can you  confirm whether or not there is no significant performance gain to be achieved by querying the most recent partitions where the partitions are based on the PRIMARY file group as opposed to an equivalent query on a none partitioned table?

    I am trying to negotiate on using best practice for partitioning with my client to fix an urgent performance issue. So clarification on the above would be greatly appreciated

    Kind Regards,

    Kieran.


    Kieran Patrick Wood http://www.innovativebusinessintelligence.com http://uk.linkedin.com/in/kieranpatrickwood http://kieranwood.wordpress.com/


    Friday, August 8, 2014 8:19 AM
  • Partitioning is mainly a manageability feature, not a performance feature. Incorrectly used, and with some help from limitaitons in the optimizer, your performance may go in the other direction.

    Whether you have the recent partition on the primary file group on some other partition has per se no effect on performance. But say that you have some Fusion IO cards that are big enough to fit the recent data, but your budget does not permit you get cards to fit the entire table. Then you could put the recent partition on the fast storage, and the other partitions on slower storage. However, this means that you will regularly have to copy data from one place to another. Which kind of goes against the main idea with partitioning: being able to age out old data fast.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, August 8, 2014 9:30 AM
  • Hi Erland,

    Many thanks for your advice which has helped me set a wider context.

    I'm sorry, I think this question is drifting onto Analysis Services even though I started out asking a question related to Partitioning Advice on PRIMARY FILE GROUP.

    Your answer increases my empathy with people who are skeptical about the performance advantages of partitioning very large tables even when these very large tables are the data source of a cube.

    Is it possible for a question in these exceptional circumstances to be associated with 2 forums, i.e Transact SQL and Analysis Services? - since the scope of this question covers both areas.

    I have had a lot of success in dramatically speeding up the processing of a cube with multi-billion record fact tables in the underlying data warehouse for previous clients. Where this cube had the partition key in the WHERE clause of the Cube partition which was the same as the partition index in the underlying fact table. And where each fact table had a different physical file group for each partition. The cube, data warehouse, transaction logs, and tempdbs were on separate spindles. However the partitions within the cube and the partitions within the data warehouse were not on separate spindles for each of the partitions.

    A proven advantage of implementing the above design strategy is that it increases parallelism. Please see page 86 of the best practice document on Analysis Services 2008 R2 Performance Guide.

    So getting back to my question in the above context: If you slice on a particular partition with a particular partition key value, e.g. PeriodYYYYMM = 201408 on a relational database table, will it make a difference to performance if this table is partitioned by the PRIMARY file group, or should the partitions have a separate FILE GROUP for each partition?

    Kind Regards,

    Kieran.


    Kieran Patrick Wood http://www.innovativebusinessintelligence.com http://uk.linkedin.com/in/kieranpatrickwood http://kieranwood.wordpress.com/

    Friday, August 8, 2014 10:59 AM
  • So getting back to my question in the above context: If you slice on a particular partition with a particular partition key value, e.g. PeriodYYYYMM = 201408 on a relational database table, will it make a difference to performance if this table is partitioned by the PRIMARY file group, or should the partitions have a separate FILE GROUP for each partition?

    I would expect no significant performance difference between single versus multiple filegroups if the underlying files are on the same spindles.  Segregating the sequential scan physical I/O during parallel operations will provide the most benefit.  Importantly, consider that other concurrent I/O activity during the scans will reduce performance.


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Friday, August 8, 2014 11:38 AM
  • Many thanks again Erland, Dan and Prashanth. This is not the answer I was hoping for but it's better for me to be more informed in the longer term after short term pain. So I now understand I need to push for dedicated partitions on the production server and also extend the ETL behind the views which are the data source of the Cube partitions so the views are simplified to the extent that each cube partition is querying a view that is querying a single table partition. All of this will take time and the loss of some functionality of the cube in the interim.

    Many thanks again,

    Kind Regards,

    Kieran.


    Kieran Patrick Wood http://www.innovativebusinessintelligence.com http://uk.linkedin.com/in/kieranpatrickwood http://kieranwood.wordpress.com/


    Friday, August 8, 2014 12:11 PM