locked
Question on sql statistics RRS feed

  • Question

  • We have in our database nearly 3000 tables
    and statistics are created according to default setting of Sql Server i.e Sampled (and not full scan) should i delete all statistics and rebuild it Full Scan.
    My concern
    1. Will it create insert,update, delete slow(very)
    2. Any other performance issue.

    (If no problem comes then why Sql is not giving Full Scan as default statistics.)
    • Moved by Kalman Toth Saturday, March 1, 2014 5:44 PM Not database design
    • Edited by Kalman Toth Saturday, March 1, 2014 5:45 PM spelling
    Saturday, March 1, 2014 4:55 PM

Answers

  • There is no need to delete statistics to recreate them with fullscan, just run UPDATE STATISTICS WITH FULLSCAN.

    Having statistics updated with fullscan will not incur any impact on INSERT, UPDATE or DELETE statements, once the the statistics have been created as statistics are not maintained with each and every statement.

    However, running UPDATE STATISTICS WITH FULLSCAN on a big table will incur load on the system, as SQL Server scans the table once for each non-indexes column there is statistics.

    And that is the reason, SQL Server defaults to sampled statistics. to reduce the impact of statistics creation. It is not likely that FULLSCAN is always needed, but there are certainly situations where the sampled statistics are insufficient. A compromise is to run WITH FULLSCAN, INDEX. This restricts the operation to indexed column. These statistics are likely to be more important, and they are also faster to update, because here SQL Server only have to scan the index which is smaller and already sorted.

    From this follows that you should run UPDATE STATISTICS outside business hours, and if you don't have all the hours to run fullscan on your 3000 tables you should not do it. But if you identify an occasional table where this gives effect, do it with that one.

    Important to know is that when you create or rebuild in an index, this updates the statistics for that index with fullscan. (Since SQL Server is reading all data anyway.)


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Kalman Toth Sunday, March 2, 2014 1:01 PM
    • Marked as answer by Fanny Liu Wednesday, March 12, 2014 1:31 AM
    Saturday, March 1, 2014 8:15 PM
  • SQL Server works best if the statistics are a good representation of the data. Usually, the default method of sampling works just fine. When it uses that method, it basically assumes that the pages that are sampled will yield a representative distribution and cardinality of the column(s) for which the statistics are collected.

    When the data is changed a lot (Inserts/Updates/Deletes), then the statistics may become stale (=nonrepresentative) and should be updated. SQL Server does this automatically, but in some situation, this auto-update-statistics kicks in too late, and the consequence is poorly performing queries.

    To a lesser extent sampled statistics can give similar problems. If the distribution of the values is very uneven, and there is a high correlation with the clustered index (or heap pages) then column values with a low cardinality may be missed, leading to poorly performing queries if these values are queried.  However, such situation is very uncommon. Usually the default option works just fine.

    The reason it usually works fine, is because we are only talking about nonindexed columns here. Statistics of indexes are based on all values.

    Because of all this,it can make sense to regularly run UPDATE STATISTICS when there is little activity on the server. Whether or not you want to add WITH FULLSCAN to those statements is up to you. The only drawback of that is the extra time it will take. In general I would say that running UPDATE STATISTICS regularly is much more important than whether or not you update them WITH FULLSCAN.


    Gert-Jan

    • Proposed as answer by Kalman Toth Sunday, March 2, 2014 1:01 PM
    • Marked as answer by Fanny Liu Wednesday, March 12, 2014 1:31 AM
    Sunday, March 2, 2014 10:14 AM

All replies

  • If statistics are obsolete, then there would be many issues. For high OLTP system, its good to run the statistics in periodic manner possibly by daily for high workload tables(INSERT/UPDATE/DELETE).

    I highly recommend you to read the below article, there are many question and answers session, which would definitely help you to understand more on statistics:

    https://www.simple-talk.com/sql/performance/sql-server-statistics-problems-and-solutions/

    Saturday, March 1, 2014 5:04 PM
  • see

    https://blogs.manageengine.com/product-blog/appmanager/2013/10/22/optimizing-sql-server-performance-the-story-of-statistics.html


    Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

    Saturday, March 1, 2014 5:08 PM
  • why delete the statistics,just update it.Deleting and updating it might take more time.As other mentioned updated stats leads to good query plan and in turn faster query.

    Also if you rebuild index for a column  stats are automatically updated.


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Saturday, March 1, 2014 5:35 PM
  • There is no need to delete statistics to recreate them with fullscan, just run UPDATE STATISTICS WITH FULLSCAN.

    Having statistics updated with fullscan will not incur any impact on INSERT, UPDATE or DELETE statements, once the the statistics have been created as statistics are not maintained with each and every statement.

    However, running UPDATE STATISTICS WITH FULLSCAN on a big table will incur load on the system, as SQL Server scans the table once for each non-indexes column there is statistics.

    And that is the reason, SQL Server defaults to sampled statistics. to reduce the impact of statistics creation. It is not likely that FULLSCAN is always needed, but there are certainly situations where the sampled statistics are insufficient. A compromise is to run WITH FULLSCAN, INDEX. This restricts the operation to indexed column. These statistics are likely to be more important, and they are also faster to update, because here SQL Server only have to scan the index which is smaller and already sorted.

    From this follows that you should run UPDATE STATISTICS outside business hours, and if you don't have all the hours to run fullscan on your 3000 tables you should not do it. But if you identify an occasional table where this gives effect, do it with that one.

    Important to know is that when you create or rebuild in an index, this updates the statistics for that index with fullscan. (Since SQL Server is reading all data anyway.)


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Kalman Toth Sunday, March 2, 2014 1:01 PM
    • Marked as answer by Fanny Liu Wednesday, March 12, 2014 1:31 AM
    Saturday, March 1, 2014 8:15 PM
  • SQL Server works best if the statistics are a good representation of the data. Usually, the default method of sampling works just fine. When it uses that method, it basically assumes that the pages that are sampled will yield a representative distribution and cardinality of the column(s) for which the statistics are collected.

    When the data is changed a lot (Inserts/Updates/Deletes), then the statistics may become stale (=nonrepresentative) and should be updated. SQL Server does this automatically, but in some situation, this auto-update-statistics kicks in too late, and the consequence is poorly performing queries.

    To a lesser extent sampled statistics can give similar problems. If the distribution of the values is very uneven, and there is a high correlation with the clustered index (or heap pages) then column values with a low cardinality may be missed, leading to poorly performing queries if these values are queried.  However, such situation is very uncommon. Usually the default option works just fine.

    The reason it usually works fine, is because we are only talking about nonindexed columns here. Statistics of indexes are based on all values.

    Because of all this,it can make sense to regularly run UPDATE STATISTICS when there is little activity on the server. Whether or not you want to add WITH FULLSCAN to those statements is up to you. The only drawback of that is the extra time it will take. In general I would say that running UPDATE STATISTICS regularly is much more important than whether or not you update them WITH FULLSCAN.


    Gert-Jan

    • Proposed as answer by Kalman Toth Sunday, March 2, 2014 1:01 PM
    • Marked as answer by Fanny Liu Wednesday, March 12, 2014 1:31 AM
    Sunday, March 2, 2014 10:14 AM
  • > A compromise is to run WITH FULLSCAN, INDEX

    It depends...if you have one hour maintenance night window and STATISTICS UPDATE WITH FULLSCAN, INDEX takes 10 hours, you still have to go with sampling.

    >Usually, the default method of sampling works just fine.

    It depends...the default sampling maybe way too low for AUTO UPDATE STATISTICS in some applications.

    I love how we have to fudge/twist our answers when discussing STATISTICS.


    Kalman Toth Database & OLAP Architect SELECT Video Tutorials 4 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012




    Sunday, March 2, 2014 1:21 PM
  • The reason it usually works fine, is because we are only talking about nonindexed columns here. Statistics of indexes are based on all values.

    When the index is newly created or rebuilt. Auto statistics are still sampled, aren't they?


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, March 2, 2014 1:42 PM
  • A compromise is to run WITH FULLSCAN, INDEX

    It depends...if you have one hour maintenance night window and STATISTICS UPDATE WITH FULLSCAN, INDEX takes 10 hours, you still have to go with sampling.


    Yup. I should have written "A possible compromise" or something like that.

    As always, it's a matter of tradeoffs.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, March 2, 2014 1:43 PM