locked
Incremental statistics for clustered columnstore index? RRS feed

  • General discussion

  • Hi,

    We have sql server 2014 enterprise edition. Until now we used regular partitioned tables and rows based partitioned clustered indexes (partition key is integer value, the same for all values within partition). Now, because of business requirements change, we'll have 10 times more data, and preliminary analysis showed that having partitioned clustered columnstore index has much better performance vs traditional partitioned clustered index.

    We have table with more than 100 partitions, each partition has more than 300 million rows (very wide rows). We have scenario where we load staging table (with 300+ million rows) (25 minutes, 70GB of data), build clustered columnstore index on it (6-7 minutes), and then slide into the main table. We also sometimes slide partitions out of the main table. Therefore, distribution of partition key values changes on each of these operations. Problem is that if statistics are not up to date, db engine is not aware of new or deleted data, and it might go into scan across all partitions (instead of doing partition elimination).

    Keeping in mind that incremental stats are not supported with columnstore indexes, how to update statistics on this table? 

    Any ideas?

    Thanks


    Pedja

    Tuesday, June 14, 2016 5:08 PM

All replies

  • Could one approach be to create statistics on partitioned key of this table with incremental = ON (not on columnstore index), and then update those stats only incrementally?

    Pedja

    Tuesday, June 14, 2016 5:13 PM
  • Hi Pedja,

    Please review the following blog to get more details about how to use incremental statistics in partition table.

    http://sqlperformance.com/2015/05/sql-statistics/improving-maintenance-incremental-statistics

    And in your scenario, you can update statistics on the table by adding “SAMPLE number PERCENT” or “RESAMPLE”.

    Thanks,
    Lydia Zhang


    Lydia Zhang
    TechNet Community Support


    Thursday, June 16, 2016 3:35 AM
  • Lydia,

    this article does not mention columnstore indexes at all (incremental statistics are not supported on columnstore indexes), and that was my question, what could be approach as alternative. I am aware of incremental statistics on partitioned tables with regular clustered indexes, as I use them in production.

    To further explain, dbcc show_statistics('tablename', <clustered columnstore index name>) does not show anything, even after I run update statistics <tablename> with any sample you want. When I select from the function that is explained in that article ( [sys].[dm_db_stats_properties_internal]), for indexid = 1 (clustered columnstore index), I get no result.

    So I am afraid that down the road, when this table becomes 1-2 TB, when we slide one of these partitions in, queries against it won't do partition elimination, but scan the whole clustered columnstore index...

    Pedja



    Thursday, June 16, 2016 12:52 PM
  • Does anyone have any further suggestions for Pedja?

    Thanks!


    Ed Price, Azure Development Customer Program Manager (Blog, Small Basic, Wiki Ninjas, Wiki)

    Answer an interesting question? Create a wiki article about it!

    Saturday, July 9, 2016 1:23 AM
  • Pedja,

    Have you made any progress with this?

    Thanks!


    Ed Price, Azure Development Customer Program Manager (Blog, Small Basic, Wiki Ninjas, Wiki)

    Answer an interesting question? Create a wiki article about it!

    Sunday, October 30, 2016 1:36 AM