SQL Server Developer Center > SQL Server Forums > SQL Server Database Engine > Order of rows in the inserted and deleted psuedo tables

Answered Order of rows in the inserted and deleted psuedo tables

  • Wednesday, August 15, 2007 8:39 PM
     
     

    I have a table that sometimes has modifications to column(s) comprising the primary key [usually "end_date"].  I need to audit changes on this table, and naturally, turned to after triggers.

     

    The problem is that for updates, when the primary key composition changes, I'm not able to relate/join using the primary key - obviously, it no longer matches across INSERTED and DELETED.  Now, for a single row update, it's easy to check for updates on PK columns and then deduce what changes were made...

     

    So the real question is:  are rows in INSERTED and DELETED always in matching order (1st row in INSERTED corresponds to the 1st row in DELETED...)?

     

     

     

    I don't want to put a surrogate key (GUID nor IDENTITY) on the base table if at all possible.  INSERT... SELECT from the inserted/deleted tables into a temp table with identity column is fine, and is what I'm currently doing; I would like MVP or product engineer level confirmation that my ordering assumption is correct.

     

    Testing using an identity surrogate key on base table, and selecting from the Ins/del tables, and the temp tables without an order by clause seems to always return in proper order (proper for my purposes).  I've tested under SQL 2005 RTM, SP1, SP2, and SP2 "3152".

     

    FYI, I've lost the debate that such auditing is better handled by the application, not the database server...

     

    Aside:  why doesn't the ROW_NUMBER() function allow an empty OVER( ORDER BY() ) clause?  Will SQL ever expose an internal row_id, at least in the pseudo tables, so we can work around this situation?

     

    Thanks

    Mike

     

     

Answers

  • Wednesday, August 15, 2007 11:37 PM
    Moderator
     
     Answered
    Well, you should never rely on the order of rows unless you specify the order by clause within the statement.

    If the primary key can be compromised on any way, you should consider introducing another columns which makes it possible to you to identify the appropiate rows.


    Jens K. Suessmeyer

    ---
    http://www.sqlserver2005.de
    ---

  • Thursday, January 15, 2009 2:46 PM
    Moderator
     
     Answered
    I agree with Jens... Updating PK is not  a good idea. In that case why dont you do Move the records to Audit table and delete  from the original table and Insert the new rows.

    Madhu
    MCITP, MCTS, MCDBA,MCP-- Blog : http://experiencing-sql-server-2008.blogspot.com/
  • Saturday, January 17, 2009 4:33 AM
    Moderator
     
     Answered Has Code
    Basically, all I can say is what Jens and Madhu already said.  Both Jens and Madhu are highly respected in the SQL Community; Jens works for Microsoft I believe and Madhu is a SQL Server MVP and has been one of my mentors over the last year before I became an MVP.  Changing the PK value of a row is not something I would do.  The only thing I could think of that you might try would be to use the OUTPUT clause to get your information:

    use tempdb;  
    go  
    create table test1  
    (insertdate datetime primary key default(getdate()),  
     somevalue varchar(100))  
       
    insert into test1 (somevalue) values ('Jon''s test')  
    waitfor delay '00:00:01' 
    insert into test1 (somevalue) values ('Jon''s test2')  
    waitfor delay '00:00:01' 
    insert into test1 (somevalue) values ('Jon''s test3')  
    waitfor delay '00:00:01' 
     
    declare @output table 
    (insertdateorig datetime,  
     somevalueorig varchar(100),  
     insertdatenew datetime,  
     somevaluenew varchar(100))  
       
    update test1  
    set insertdate = dateadd(ss, 1, insertdate)  
    output  deleted.insertdate,  
            deleted.somevalue,  
            inserted.insertdate,  
            inserted.somevalue  
    into @output 
     
    --Display the result set of the table variable.  
    select *  
    from @output 
     
    go  
    drop table test1 

    It works but it is a hack job at best, and I don't recommend it.
    -- Jonathan Kehayias (MCITP) | Please mark answers that solve your problem | http://www.sqlclr.net

All Replies

  • Wednesday, August 15, 2007 11:37 PM
    Moderator
     
     Answered
    Well, you should never rely on the order of rows unless you specify the order by clause within the statement.

    If the primary key can be compromised on any way, you should consider introducing another columns which makes it possible to you to identify the appropiate rows.


    Jens K. Suessmeyer

    ---
    http://www.sqlserver2005.de
    ---

  • Thursday, January 15, 2009 5:20 AM
     
     
    So, this answer means that is not possible to use triggers "after update", unless some serial column serves as a primary key. Point.

    Why? Because any multi-row modification may involve multiple rows, and a primary key can be changed.
    In the absence of row-order correlation between "inserted" and "updated" table, if one "should never rely on the order of rows", it is never possible to imposed any rules by comparing the two tables.
     
    [We are talking about updates and trigger tables, so "order by" clause is unlikely to be relevant]

    I hope, Jens might have misunderstood the question, and these pseudo-tables do have a special treatement; MSDN just states that "the old rows are copied to the deleted table first, and then the new rows are copied to the trigger table and to the inserted table", which is not very encouraging.
  • Thursday, January 15, 2009 2:46 PM
    Moderator
     
     Answered
    I agree with Jens... Updating PK is not  a good idea. In that case why dont you do Move the records to Audit table and delete  from the original table and Insert the new rows.

    Madhu
    MCITP, MCTS, MCDBA,MCP-- Blog : http://experiencing-sql-server-2008.blogspot.com/
  • Saturday, January 17, 2009 4:33 AM
    Moderator
     
     Answered Has Code
    Basically, all I can say is what Jens and Madhu already said.  Both Jens and Madhu are highly respected in the SQL Community; Jens works for Microsoft I believe and Madhu is a SQL Server MVP and has been one of my mentors over the last year before I became an MVP.  Changing the PK value of a row is not something I would do.  The only thing I could think of that you might try would be to use the OUTPUT clause to get your information:

    use tempdb;  
    go  
    create table test1  
    (insertdate datetime primary key default(getdate()),  
     somevalue varchar(100))  
       
    insert into test1 (somevalue) values ('Jon''s test')  
    waitfor delay '00:00:01' 
    insert into test1 (somevalue) values ('Jon''s test2')  
    waitfor delay '00:00:01' 
    insert into test1 (somevalue) values ('Jon''s test3')  
    waitfor delay '00:00:01' 
     
    declare @output table 
    (insertdateorig datetime,  
     somevalueorig varchar(100),  
     insertdatenew datetime,  
     somevaluenew varchar(100))  
       
    update test1  
    set insertdate = dateadd(ss, 1, insertdate)  
    output  deleted.insertdate,  
            deleted.somevalue,  
            inserted.insertdate,  
            inserted.somevalue  
    into @output 
     
    --Display the result set of the table variable.  
    select *  
    from @output 
     
    go  
    drop table test1 

    It works but it is a hack job at best, and I don't recommend it.
    -- Jonathan Kehayias (MCITP) | Please mark answers that solve your problem | http://www.sqlclr.net
  • Friday, May 15, 2009 9:43 PM
     
     
    Would you guys say that this is safe if you have a primary key that doesn't change or is using an order by still pretty much mandatory?
  • Saturday, May 16, 2009 1:27 PM
    Moderator
     
     
    It depends, let me chop your assumptions:

    "that this is safe if you have a primary key that doesn't change"

    Yes, for sure. Normally the primary key make the clustered key also, the clustered key should be due to physical sorted ordering as static as possible to prevent fragmentation (static, narrow, ever increasing). it is also much more easy to indentify the rows that changed by a static key. In the scenario of an insert it is hardly possible to find the row that changed back (e.g. for join reasons if you don not know about the primary key)

    "using an order by still pretty much mandatory"

    I only would call that mandatory if you want predicatable, determinstic results. The row order *seems* sometime to be the order of he physical stored tables, but features like "merry-go-round" scanning can work against that assumption. So if you wnat to rely on the order, do an ORDER.

    -Jens
    Jens K. Suessmeyer http://blogs.msdn.com/Jenss
  • Saturday, May 16, 2009 1:53 PM
     
     
    So, this answer means that is not possible to use triggers "after update", unless some serial column serves as a primary key. Point.

    Why? Because any multi-row modification may involve multiple rows, and a primary key can be changed.
    In the absence of row-order correlation between "inserted" and "updated" table, if one "should never rely on the order of rows", it is never possible to imposed any rules by comparing the two tables.
     
    [We are talking about updates and trigger tables, so "order by" clause is unlikely to be relevant]

    I hope, Jens might have misunderstood the question, and these pseudo-tables do have a special treatement; MSDN just states that "the old rows are copied to the deleted table first, and then the new rows are copied to the trigger table and to the inserted table", which is not very encouraging.
    First I want to agree with the other people who have responded so far.

    The problem you are facing is caused directly by the decision to use a natural key as the primary key rather than a surrogate.  Once of the problems with natural keys is that they can change.  

    Now you say that you don't want to use a surrogate key on the table, but you don't say why.  The four bytes needed for an IDENTITY column more than pay for themselves by solving this, and many other problems.

    The existing compund primary key presumably means that new rows are inserted at varying points in the table.  This is not good as it causes pages to split and the table to become fragemented.  Not a problem for single-row queries using an index, but anything else will find the pages all over the place.  So, instead of realtively fast sequential I/O, you will be generating a good proportion of random I/O.  With a monotonically increasing surrogate key, new data is added at the end of the table so page splitting does not occur.

    An INT surrogate key also means that your foreign keys will be INTs, and joins between related tables will often be on these INT columns.  Not only is this a much more efficient way to JOIN, the database engine has several optimizations which are available only on INT joins.  Your existing joins and searches on the compound primary key will be much less efficient: consider the size of the primary key you have and what comparisons the engine must do to determine if a row is a match.  Compare that to a simple comparison between INTs.

    Also, consider how many bytes each primary key value occupies and how many of those keys will fit in a single 8K page.  Say your primary key is 20 bytes wide for example.  This is five times the size of an INT, so each 8K page that SQL Server reads can contain roughly 400 of your keys, or 2000 INT keys.  A query that needs to touch 400,000 rows would need to read 1,000 8K pages on your key.  If the key were an INT, that would be 200 pages.

    Also, assuming that your primary is also clustered, every row in every nonclustered index you have also includes the full primary key as the pointer to the rest of the row data.  Your indexes will be larger, and less information-dense, than they could be.

    I could go on (and on) but I think that is enough to make the point.  Why put yourself through this unnecessary pain?  Consider adding an invariant lightweight surrogate key to your tables.

    Paul
  • Monday, May 25, 2009 6:35 AM
    Moderator
     
     
    Hi Michael,

    do you have additional questions or was your problem solved ? If yes, please don´t forget to mark the answer to help other people finding the solution at the top of the list :-)

    -Jens
    Jens K. Suessmeyer http://blogs.msdn.com/Jenss
  • Monday, May 25, 2009 6:57 PM
     
     
    I can't mark an answer on this thread because I did not start it.  It was started quite a while ago.  I asked a follow up.