none
How to find outdated statistics in SQL Server?

    Question

  • My question is: How to find outdated statistics in SQL Server 2005 and above?

    I have seen many solution based on sysindexes.rowmodctr in many blogs/sites as below. 

    SELECT
      id                    AS [Table ID]
    , OBJECT_NAME(id)       AS [Table Name]
    , name                  AS [Index Name]
    , STATS_DATE(id, indid) AS [LastUpdated]
    , rowmodctr             AS [Rows Modified]
    FROM sys.sysindexes 
    WHERE STATS_DATE(id, indid)<=DATEADD(DAY,-1,GETDATE()) 
    AND rowmodctr>10 AND (OBJECTPROPERTY(id,'IsUserTable'))=1

    However, In one of Kimberly Tripp's blog here , it has been changed using sysrowsetcolumns.rcmodified. So, am wondering how the query using rowmodctr is valid for SQL Server 2005 and above. If any other query available, please share with me. Any thoughts would be appreciated deeply.



    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Thursday, July 18, 2013 1:45 PM

Answers

All replies

  • Hello Latheesh,

    Please have a look a TechNet ScriptCenter: Update out-dated statistics

    It uses the sys.dm_db_index_usage_stats => LastUpdate information.


    Olaf Helper

    Blog Xing

    • Marked as answer by Latheesh NK Monday, July 29, 2013 5:55 AM
    Thursday, July 18, 2013 1:49 PM
  • Hi Olaf, Thanks for your quick reply. This would help me for sure.

    But, I am not able to see sysrowsetcolumns.rcmodified is used in your script as well.. Could you please clarify your query on an index update date can be an alternative to sysrowsetcolumns.rcmodified in some way? Perhaps, Please correct me if am wrong.


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.


    • Edited by Latheesh NK Thursday, July 18, 2013 10:52 PM edit
    Thursday, July 18, 2013 2:04 PM
  • Any updates/thoughts please?

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Thursday, July 18, 2013 10:55 PM
  • Please find my understanding as follows: (I request to shed some lights perhaps you have any other thoughts)

    Invalidation of Statistics is for SQL server internal purpose to calculate the Auto update stats on an assumption of the below logic which would help for query performance. 


    n - number of rows in a table.

    However the logic has been changed over different SQL Server version asKim Tripp mentioned in a blog as below:

    "Invalidation: The biggest changes were introduced in SQL Server 2005 where they decided to NO LONGER user the sysindexes.rowmodctr and instead use an internal (and undocumented) columns specific modification counter. Now, statistic invalidation is more isolated to only those columns which are heavily modified. This internal/undoc'ed column is sysrowsetcolumns.rcmodified and can only be seen when connecting to SQL Server using the DAC (Dedicated Admin Connection). "


    If we manually,proactively find a logic to calculate/find the outdated statistics, then its a wild guess which might or might not help(it depends), but which might NOT equivalent to the way how SQL Server think for Auto update stats. 

    If we look at some of ways that we can find the stale stats as follows:

    1. Using sysrowsetcolumns/sysrscols by Ricardo Leka:

    /*SQL 2005*/
    SELECT SO.NAME AS tableName, COL_NAME(sc.object_id, sc.column_id) AS columnName, A.name as stats_name, SSC.*, SSR.* FROM sys.sysrowsetcolumns SSC 
    INNER JOIN sys.sysrowsets SSR ON SSC.rowsetID = SSR.rowsetID 
    INNER JOIN sys.sysobjects SO ON SSR.idmajor = SO.id 
    INNER JOIN sys.stats_columns SC on SSR.idmajor = SC.object_id AND SSC.rowsetcolid = SC.column_id 
    INNER JOIN sys.stats as A ON A.object_id = SO.id 
    WHERE SO.xtype = 'U' 
    ORDER BY so.name, sc.column_id 
    /*SQL 2008*/ 
    SELECT SO.NAME AS tableName, COL_NAME(sc.object_id, sc.column_id) AS columnName, A.name as stats_name, SSC.*, SSR.* FROM sys.sysrscols SSC 
    INNER JOIN sys.sysrowsets SSR ON SSC.rowsetID = SSR.rowsetID 
    INNER JOIN sys.sysobjects SO ON SSR.idmajor = SO.id 
    INNER JOIN sys.stats_columns SC on SSR.idmajor = SC.object_id AND SSC.rscolid = SC.column_id 
    INNER JOIN sys.stats as A ON A.object_id = SO.id 
    WHERE SO.xtype = 'U' 
    ORDER BY so.name, sc.column_id 

    Ref: http://www.toadworld.com/platforms/sql-server/b/weblog/archive/2012/03/23/update-statistics-quando-vai-acontecer.aspx

    2. Using stats update date by Olaf Helper:

     SELECT SCH.name AS SchemaName 
              ,OBJ.name AS ObjectName 
              ,STA.name AS StatName 
              ,STATS_DATE(STA.object_id, STA.stats_id) AS StatUpdateStamp 
        FROM sys.stats AS STA 
             INNER JOIN sys.objects AS OBJ 
                 ON STA.object_id = OBJ.object_id 
             INNER JOIN sys.schemas AS SCH 
                 ON OBJ.schema_id = SCH.schema_id 
             LEFT JOIN 
             (SELECT IUS.object_id 
                    ,MIN(ISNULL(IUS.last_user_update, IUS.last_system_update)) AS LastUpdate 
              FROM sys.dm_db_index_usage_stats AS IUS 
              WHERE database_id = DB_ID() 
                    AND NOT ISNULL(IUS.last_user_update, IUS.last_system_update) IS NULL 
              GROUP BY IUS.object_id 
             ) AS IUS 
                 ON IUS.object_id = STA.object_id 
        WHERE OBJ.type IN ('U', 'V')    -- only user tables and views 
              AND DATEDIFF(d, ISNULL(STATS_DATE(STA.object_id, STA.stats_id), {d N'1900-01-01'}) 
                            , IUS.LastUpdate) > @minDateDiff 
        ORDER BY STATS_DATE(STA.object_id, stats_id) ASC; 

    Ref: http://gallery.technet.microsoft.com/scriptcenter/9aad569c-2d33-4d13-8242-160671260b5f

    3. As usual, looking at sysindexes.rowmodctr by many:

    Paul White says (in comments):

    "Connecting via the DAC to see sys.sysrscols (or sys.sysrowsetcolumns in 2005) I have noticed that issuing a CHECKPOINT seems to cause the in-memory counters to be flushed to persistent storage, so you might want to try that.  With larger values, you might be seeing flushing based on number of changes (this is just speculation on my part) without needing CHECKPOINT.

    That said, in my simplified tests I found it more convenient to query rowmodctr from sysindexes on a regular connection.  So long as one bears in mind the differences between colmodctr and rowmodctr, this can still be an effective technique."

    Ref: http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx

    Fabiano Amorim says (in comments):

    "The new rowmodctr value will frequently help you determine when to update statistics because the behavior is reasonably close to the results of earlier versions.
    http://msdn.microsoft.com/en-us/library/ms190283.aspx"

    Ref:https://www.simple-talk.com/sql/t-sql-programming/13-things-you-should-know-about-statistics-and-the-query-optimizer/

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.


    • Edited by Latheesh NK Friday, July 19, 2013 7:04 AM highlight
    Friday, July 19, 2013 6:46 AM
  • Instead of finding out-of-date statistics, just UPDATE STATISTICS every night:

    http://www.sqlusa.com/articles/query-optimization/


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012


    • Marked as answer by Latheesh NK Monday, July 29, 2013 5:55 AM
    Sunday, July 28, 2013 12:12 AM
    Moderator
  • SELECT 
    object_name(si.[object_id])  AS [TableName]
    , CASE 
    WHEN si.[index_id] = 0 then 'Heap'
    WHEN si.[index_id] = 1 then 'CL'
    WHEN si.[index_id] BETWEEN 2 AND 250 THEN 'NC ' + RIGHT('00' + convert(varchar, si.[index_id]), 3)
    ELSE ''
     END AS [IndexType]
    , si.[name] AS [IndexName]
    , si.[index_id] AS [IndexID]
    , CASE
    WHEN si.[index_id] BETWEEN 1 AND 250 AND STATS_DATE (si.[object_id], si.[index_id]) < DATEADD(m, -1, getdate()) 
    THEN '!! More than a month OLD !!'
    WHEN si.[index_id] BETWEEN 1 AND 250 AND STATS_DATE (si.[object_id], si.[index_id]) < DATEADD(wk, -1, getdate()) 
    THEN '! Within the past month !'
    WHEN si.[index_id] BETWEEN 1 AND 250 THEN 'Stats recent'
    ELSE ''
     END AS [Warning]
    , STATS_DATE (si.[object_id], si.[index_id]) AS [Last Stats Update]
    FROM sys.indexes AS si
    WHERE OBJECTPROPERTY(si.[object_id], 'IsUserTable') = 1
    ORDER BY [TableName], si.[index_id]
    go

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Blog: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance

    • Marked as answer by Latheesh NK Monday, July 29, 2013 5:55 AM
    Sunday, July 28, 2013 9:23 AM
    Answerer
  • Hence , it looks like more of a choice (not an accurate way) on finding out the stats with so much scripts above, Am marking all replys as "Mark As Answer". Thanks to everyone who contributed to the same with many ideas.

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Monday, July 29, 2013 5:55 AM