any other redeeming qualities in temporal tables

    Question

  • Hi, we run SQL Server 2016 Enterprise. We understand temporal tables pretty well, but are wondering if there is anything about the history component that is "fast", perhaps "indexable".

    Why do we ask? Our front-end guys are dead set on doing hard deletes on one supplemental table that will soon be temporal, and they are resistant to doing a bogus update to the master record before doing the supplemental delete. Such a bogus update would increment the master's version #. Our front-end guys feel the bogus update would cause a performance issue and a dependency they don't want, even though it's just one additional update.

    We want to get out of the business of relying on dates for incremental architecture in processes that extract data for reporting purposes (e.g., warehouse-building ETL). We want to start relying on high keys and high version #'s (i.e., SQL rowversion/timestamp values) instead, to help identify master accounts that may have changed in any way (change candidates) across a number of tables since the last time the extract process ran. The reason for the strategy change is that dates are porous enough to introduce an occasional problem in these extracts; records get missed from time to time. Keys and version #'s are more airtight for triggering a list of change-candidate account numbers.

    Our supplemental table has a version # that SQL Server controls and calls a timestamp (rowversion). While the version # doesn't increment as the record is moved to the history component, we may be able to talk our UI guys into doing a bogus update on the record before deleting it, so that an incremented version # appears in the history table. But then there is the challenge of quickly finding history records whose version # is higher than some previously saved high value.
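    A minimal sketch of that bogus-update-then-delete pattern (table and column names here are hypothetical, not our real schema):

    ```sql
    -- Assumes dbo.Supplemental is system-versioned and carries a rowversion
    -- ("timestamp") column. The no-op update bumps the rowversion before the
    -- hard delete, so the bumped value is what lands in the history table.
    BEGIN TRANSACTION;

    UPDATE dbo.Supplemental
    SET    SomeColumn = SomeColumn    -- no-op, but still increments the rowversion
    WHERE  AcctID = @AcctID;

    DELETE dbo.Supplemental
    WHERE  AcctID = @AcctID;

    COMMIT;
    ```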

    That's why I ask whether the history component can be indexed on a column like the version #, or is perhaps extremely fast on its own.

    We don't know what kind of archive strategy we'll employ on the history component of the temporal table, so for the moment we are assuming it will grow quite large, perhaps to a million records, before SQL Server starts automatically purging per the cutoff we designate.

    Thursday, February 08, 2018 1:23 PM

Answers

  • The history data is in a separate table, which you can name if you wish.

    You can work against the history table (almost) as a regular table. That includes SELECT and also CREATE INDEX. You cannot modify the data, though.

    The history table is clustered by default on (sysend, sysstart). But again, you can create other indexes as you wish.

    The history table is also page compressed by default. Again, you can change that if you wish.


    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by db042190 Tuesday, February 20, 2018 2:57 PM
    Friday, February 09, 2018 9:12 AM
    Moderator
  • Yup, compression tends to cost CPU. ;-)

    Or, to be more serious, there is an overhead in cracking the compression every time a page has to be accessed. But you'll fit more of the data in cache.

    Row compression is an option which costs far less CPU. It all boils down to the compression ratio between the two. And willingness to pay CPU for page, if it has significant extra compression over row.

    Anyhow, it is in the end up to you if you want the history data compressed. It is only compressed by default, change that if you want!


    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by db042190 Tuesday, February 20, 2018 2:58 PM
    Friday, February 09, 2018 2:58 PM
    Moderator

All replies

  • Temporal tables are likely too much for what you are looking for, unless you are trying to track every change. 

    If you are just looking for hard deletes, and already have a solution for updates, then just create a DELETE trigger to capture the deleted PK to a new "deleted" table (also add an INSERT trigger to remove any entry in case it gets re-added). Then add a process to read the "deleted" table and do what is needed.
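    A rough sketch of that trigger approach (all names are placeholders, assuming an integer PK):

    ```sql
    -- Table that records which keys have been hard-deleted.
    CREATE TABLE dbo.SupplementalDeleted
    (
        AcctID    int       NOT NULL PRIMARY KEY,
        DeletedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
    );
    GO
    CREATE TRIGGER trg_Supplemental_Delete
    ON dbo.Supplemental
    AFTER DELETE
    AS
        -- Capture deleted keys; skip ones already recorded.
        INSERT INTO dbo.SupplementalDeleted (AcctID)
        SELECT d.AcctID
        FROM   deleted AS d
        WHERE  NOT EXISTS (SELECT 1 FROM dbo.SupplementalDeleted AS x
                           WHERE x.AcctID = d.AcctID);
    GO
    CREATE TRIGGER trg_Supplemental_Insert
    ON dbo.Supplemental
    AFTER INSERT
    AS
        -- If a key is re-added, it is no longer "deleted".
        DELETE x
        FROM   dbo.SupplementalDeleted AS x
        JOIN   inserted AS i ON i.AcctID = x.AcctID;
    GO
    ```

    The extract process can then join to dbo.SupplementalDeleted to pick up delete candidates without scanning the base table.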

    Thursday, February 08, 2018 1:44 PM
    Moderator
  • Thx Tom, I like the answer. Just in case we need closure on the subject, can you tell us: 1) can an index be built on a column in the historic component of a temporal table, and 2) is the underlying temporal history table backed by some sort of speedy engine like you would find in a full columnstore (VertiPaq?), etc.
    Thursday, February 08, 2018 4:48 PM
  • Reading your post with great interest; as excited as I was to hear temporal tables had arrived, I've still never actually tried one. But I'm confused about your delete problem: was that not anticipated in the implementation of temporal tables?

    https://docs.microsoft.com/en-us/sql/relational-databases/tables/modifying-data-in-a-system-versioned-temporal-table

    Data in a system-versioned temporal table is modified using regular DML statements with one important difference: period column data cannot be directly modified. When data is updated, it is versioned, with the previous version of each updated row being inserted into the history table. When data is deleted, the delete is logical, with the row moved into the history table from the current table; it is not permanently deleted.

    Does that not do what you want?

    Do you have a problem because you may update rows in the history?

    Because otherwise I think the key or "timestamp" number would all still be monotonically increasing, too. Unless you do something really unusual like changing the real primary key of rows?

    Josh



    • Edited by JRStern Thursday, February 08, 2018 5:09 PM
    Thursday, February 08, 2018 5:09 PM
  • Thx JR, very astute. There was a miscommunication between our front-end and back-end developers. Our back-end folks asked our front-end folks if they ever "do a delete" in a new app, and got the wrong answer. I've seen this happen before in my career, and I don't blame anybody.

    Your research aligns with my own experimentation.  The version number stays the same when the record is "moved" to the "history table" in a delete. 

    Is that what I want?  No,  but it was what I expected.

    I'd never worked with temporals before either. I set one up and tried a few simple CRUD operations.

    I'm pretty certain (I'll try it and post back here, but it kind of defeats the purpose) that you cannot update the history table of a temporal object. Because the history portion of a temporal table potentially needs to represent the same record many times, I don't think there is really a notion of a primary key anymore, but I'll try updating it too in my next experiment.

    The real takeaway so far for me is: "if you are going to allow hard deletes on a table with a version #, you are basically taking away the gift you gave your users in the first place". And don't expect that turning versioning on for that table necessarily compensates for the lost version #, because the history row retains the value you saw the last time you scraped the data (only correctable if you do an update and then a delete on the same record). And you have now asked your users to visit yet another table to recognize accounts that may have changed.

       
    Thursday, February 08, 2018 5:37 PM
  • No, updates of any kind to the history portion of a temporal object are forbidden:

    Msg 13561, Level 16, State 1, Line 2

    Cannot update rows in a temporal history table...
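    For reference, a minimal statement that produces that error (table and column names here are assumed):

    ```sql
    -- Any direct DML against the history table fails while
    -- SYSTEM_VERSIONING is on for the current table:
    UPDATE dbo.DepartmentHistory
    SET    DeptName = 'whatever'
    WHERE  DeptID = 1;
    -- Msg 13561, Level 16, State 1
    -- Cannot update rows in a temporal history table ...
    ```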

    • Edited by db042190 Thursday, February 08, 2018 5:43 PM showing the error message
    Thursday, February 08, 2018 5:41 PM
  • 1) You can add indexes to temporal tables.  

    2) No, it is a normal table.

    Thursday, February 08, 2018 6:31 PM
    Moderator
    The real takeaway so far for me is: "if you are going to allow hard deletes on a table with a version #, you are basically taking away the gift you gave your users in the first place". And don't expect that turning versioning on for that table necessarily compensates for the lost version #, because the history row retains the value you saw the last time you scraped the data (only correctable if you do an update and then a delete on the same record). And you have now asked your users to visit yet another table to recognize accounts that may have changed.

    Unless I miss my guess, the deleted record will still have a newer timestamp than any history record for that PK, and I think that should be sufficient; the query supports an "as of" qualifier, right? Otherwise you'd need some fancy SQL to get all the newest versions.

    (and like many people I have done it the hard way, before temporal features were available)
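    For what it's worth, the temporal query clauses look like this (table and column names assumed):

    ```sql
    -- State of the table as it existed at a point in time:
    SELECT *
    FROM   dbo.Supplemental
    FOR SYSTEM_TIME AS OF '2018-02-01T00:00:00';

    -- Current plus all historical versions in one result set:
    SELECT *
    FROM   dbo.Supplemental
    FOR SYSTEM_TIME ALL
    ORDER BY AcctID, SysStartTime;
    ```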

    from Tom:

    >updates of any kind to the history portion of a temporal object are forbidden

    Pity that; there are scenarios where it makes sense to modify old versions, when it turns out some old version was erroneous and you need to correct not the current value but the historic value.

    Josh

    Thursday, February 08, 2018 7:09 PM
  • Thx Tom, but what about an index that affects (acts on) the history portion?
    Thursday, February 08, 2018 8:11 PM
  • Thx Josh. No, the saved version # is of no value. True, it's higher than any other for that PK, but it's also <= the high one that controlled your process's last run. So it might as well be invisible.
    Thursday, February 08, 2018 8:14 PM
  • Thx Josh. No, the saved version # is of no value. True, it's higher than any other for that PK, but it's also <= the high one that controlled your process's last run. So it might as well be invisible.

    I see. If that's your scenario, you just want the additions and modifications since the last run.

    But while you don't want to trust your own date fields, can't you trust the "FROM" temporal operator to pick up those rows? Sorry if this is a dumb (wrong) suggestion.

    https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table
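    As a sketch, the change-candidate query might look something like this (table and column names assumed; note it still leans on the system period columns, which is the part being debated here):

    ```sql
    -- Versions created or closed since the last extract run.
    DECLARE @LastRun datetime2 = '2018-02-08T00:00:00';

    SELECT AcctID, SysStartTime, SysEndTime
    FROM   dbo.Supplemental FOR SYSTEM_TIME ALL
    WHERE  SysStartTime >= @LastRun                 -- inserted/updated versions
       OR (SysEndTime   >= @LastRun
           AND SysEndTime < '9999-12-31')           -- versions closed by update or delete
    ORDER BY AcctID, SysStartTime;
    ```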

    Josh

    Friday, February 09, 2018 1:08 AM
  • The history data is in a separate table, which you can name if you wish.

    You can work against the history table (almost) as a regular table. That includes SELECT and also CREATE INDEX. You cannot modify the data, though.

    The history table is clustered by default on (sysend, sysstart). But again, you can create other indexes as you wish.

    The history table is also page compressed by default. Again, you can change that if you wish.
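    To make that concrete, a sketch (history table and column names assumed):

    ```sql
    -- A nonclustered (non-unique) index on a column of the history table:
    CREATE NONCLUSTERED INDEX ix_SupplementalHistory_AcctID
    ON dbo.SupplementalHistory (AcctID, SysEndTime);

    -- Turn the default page compression off (or switch to ROW) if desired:
    ALTER TABLE dbo.SupplementalHistory
    REBUILD WITH (DATA_COMPRESSION = NONE);
    ```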


    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by db042190 Tuesday, February 20, 2018 2:57 PM
    Friday, February 09, 2018 9:12 AM
    Moderator
  • Thx Tibor. Interesting: you can create a nonclustered index on the history component, but it cannot be unique. I have to look up what page compression means. I wonder if that offers some speed too.
    Friday, February 09, 2018 12:24 PM
  • Page compression seems to have elements of the VertiPaq engine, which I believe is similar to what sits underneath the full columnstore, so that translates to compression and dictionary algorithms. I see an interesting article on it at https://www.mssqltips.com/sqlservertip/3187/demonstrating-the-effects-of-using-data-compression-in-sql-server/ but this is my first time reading anything by that author.

    The author basically shows that extra CPU is traded for speed in scan-based queries.

    Friday, February 09, 2018 12:37 PM
  • Thx Josh. I would say no; we trust all dates, including the temporal dates. But interpreting exactly when they were set in the transaction involved is probably where they become porous compared to a last-process-run date. There always seems to be a question whether computed items like identities, version #'s, and dates like these are set at commit time or not.
    Friday, February 09, 2018 12:46 PM
  • This story gets more interesting as my peers keep me on my toes. There is an anonymous history table and a user-defined history table. I'm going to check whether the user-defined approach can have its own version #. If yes, I suspect my delete will generate a new "timestamp", which potentially relieves me of one of my concerns.
    Friday, February 09, 2018 2:55 PM
  • Yup, compression tends to cost CPU. ;-)

    Or, to be more serious, there is an overhead in cracking the compression every time a page has to be accessed. But you'll fit more of the data in cache.

    Row compression is an option which costs far less CPU. It all boils down to the compression ratio between the two. And willingness to pay CPU for page, if it has significant extra compression over row.

    Anyhow, it is in the end up to you if you want the history data compressed. It is only compressed by default, change that if you want!
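    One way to compare the two before deciding is the built-in estimator (history table name assumed):

    ```sql
    -- Estimate size with ROW vs PAGE compression for the history table:
    EXEC sp_estimate_data_compression_savings
         @schema_name = 'dbo', @object_name = 'SupplementalHistory',
         @index_id = NULL, @partition_number = NULL,
         @data_compression = 'ROW';

    EXEC sp_estimate_data_compression_savings
         @schema_name = 'dbo', @object_name = 'SupplementalHistory',
         @index_id = NULL, @partition_number = NULL,
         @data_compression = 'PAGE';
    ```

    If PAGE offers little extra savings over ROW, the cheaper ROW option may be the better trade-off.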


    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by db042190 Tuesday, February 20, 2018 2:58 PM
    Friday, February 09, 2018 2:58 PM
    Moderator
  • I was hoping this would work, but no such luck. The number and types of columns in the temporal and temporal history tables must match, and you can't have two timestamps in the same table anyway.

    I was hoping to have the history component create its own timestamp so that upon a hard delete I'd have another way of detecting a change to a table without reading the entire table. What this means to me is that before a hard delete on a temporal table, I'd need a bogus update just to get the timestamp increased.

    -- This attempt fails: a table can have at most one timestamp (rowversion)
    -- column, and the history table's columns must match the current table's.
    CREATE TABLE DepartmentHistory   
    (    
         DeptID int NOT NULL  
       , DeptName varchar(50) NOT NULL  
       , ManagerID INT  NULL  
       , ParentDeptID int NULL  
       , SysStartTime datetime2 NOT NULL  
       , SysEndTime datetime2 NOT NULL 
       , lastchange1 timestamp
       , lastchange2 timestamp  
    );   
    GO   
    --CREATE CLUSTERED COLUMNSTORE INDEX IX_DepartmentHistory   
    --   ON DepartmentHistory;   
    --CREATE NONCLUSTERED INDEX IX_DepartmentHistory_ID_PERIOD_COLUMNS   
    --   ON DepartmentHistory (SysEndTime, SysStartTime, DeptID); 
    create nonclustered index ix_lastchange2 on departmenthistory (lastchange2)  
    GO   
    CREATE TABLE Department   
    (    
         DeptID int NOT NULL PRIMARY KEY CLUSTERED  
       , DeptName varchar(50) NOT NULL  
       , ManagerID INT  NULL  
       , ParentDeptID int NULL  
       , SysStartTime datetime2 GENERATED ALWAYS AS ROW START NOT NULL  
       , SysEndTime datetime2 GENERATED ALWAYS AS ROW END NOT NULL     
       , PERIOD FOR SYSTEM_TIME (SysStartTime,SysEndTime) 
       , lastchange1 timestamp     
    )    
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.DepartmentHistory))   
    ;  

    Monday, February 19, 2018 9:38 PM