none
IDENTITY BIG JUMP (1000/10000) a Feature?

All replies

  • I cannot think of any use case where this would be a feature. But, as the article points out, there is a startup parameter that can disable this behavior. Thanks for pointing it out.
    Friday, July 11, 2014 4:17 PM
  • Here is the quote from the same blog:

    "Register -t272 to SQL Server Startup Parameter

    Open SQLServer configuration manager from your server. Select SQL Server 2012 instance there right client and select Properties menu. You will find a tabbed dialog window. You select start up parameters tab from there and register -t272. Then restart SQL Server 2012 instance again and see the difference:"

    http://www.codeproject.com/Tips/668042/SQL-Server-Auto-Identity-Column-Value-Jump-Is


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012








    Friday, July 11, 2014 4:24 PM
  • MS has responded to this several times "it is working as designed", because of the sequence caching function added in 2012.  It allocated them, in cache, they just never got assigned to rows in the table.  However, I would also suggest is a bug. 

    My personal concern is you will eventually run out of identity numbers quickly if your server reboots often.  Then you will be in a reseed situation which is never pretty for Primary Keys.

    The trace flag turns off the caching.  But that introduces the same problem which existed before of  many inserts being slowed down waiting on the identity number to be assigned.


    Friday, July 11, 2014 6:41 PM
  • I would say that it is a tradeoff required for performance reasons. Microsoft responded in the connect case:

    "In SQL Server 2012 the implementation of the identity property has been changed to accommodate investments into other features. In previous versions of SQL Server the tracking of identity generation relied on transaction log records for each identity value generated. In SQL Server 2012 we generate identity values in batches and log only the max value of the batch. This reduces the amount and frequency of information written to the transaction log improving insert scalability."

    So once new maximum value is saved in the log, there is no going back, even if numbers are never used. (Unless they will add another log entry to the transaction log on shutdown to return unused numbers).

    Long term solution should be to use SEQUENCE with NO CACHE for small tables instead of identity and benefit from performance improvement for large tables.

    Friday, July 11, 2014 7:02 PM
  • Long term solution should be to use SEQUENCE with NO CACHE for small tables instead of identity and benefit from performance improvement for large tables.

    Microsoft has introduced the SEQUENCE object to be ANSI SQL compliant.

    My understanding is that we should try to transit our database design practices to SEQUENCE object instead of IDENTITY.

    I know that IDENTITY is in our blood due to over 2 decades of usage. Nonetheless the future is SEQUENCE object.


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012








    Friday, July 11, 2014 8:57 PM
  • Even if not visible to the client, it is visible to tech staff and it is confusing, detrimental to programmer's productivity.

    So what is it finally? Bug or feature?  Thanks.

    Feature. IDENTITY is a meaningless number and it is designed to have gaps for better concurrency. You could lose 10000 numbers because of a server restart. Or because a 10000-row insert that is rolled back. Absolutely nothing to lose sleep over.

    Tom was a afraid that you would run out of IDENTITY values more quickly. Hm, let's see. We would need some 21400 server restarts for that to happen. If you are of the funny school that restarts the server every day, it will take you 57 years.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, July 11, 2014 10:15 PM
  •  IDENTITY is a meaningless number

    That is not exactly true. Yes, as far as SURROGATE PRIMARY KEY concerned, it does not matter. One can apply random numbers as a matter of fact.

    However, by virtue of sequential allocation IDENTITY has a meaning to the Human Intelligence scanning the table. It always had. Higher number meant more recent record insertion barring out-of-sequence manual insertions (SET IDENTITY_INSERT, DBCC CHECKIDENT RESEED). It was a kind of quick "audit trail feature" for DBA-s and developers.

    Now, you can see server restarts in the so-called "meaningless" number sequence.  It is rather peculiar. In fact if the table has CreatedDate, one can narrow down the date of server restart.

    For a slow-growing small table of 500 records, the ID numbers may be jumping quickly depending on the frequency of server restarts.  SEQUENCE object is the solution. It does require creating a new object - not as convenient as the CREATE TABLE built-in IDENTITY - but it works.


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012










    Saturday, July 12, 2014 7:23 AM
  • Now, you can see server restarts in the so-called "meaningless" number sequence.  It is rather peculiar.

    Not particularly. You could also see gaps due to failed inserts or whatever. You might see them more often now. That's eludicating.

    For a slow-growing small table of 500 records,

    If you have that volume, and you don't like gaps, roll your own. There is no need to use IDENTITY or sequences in this case. The main reason to use these features is that they permit high-concurrency inserts.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Saturday, July 12, 2014 8:56 AM
  •  The main reason to use these features is that they permit high-concurrency inserts.

    I never had trouble with INSERTs. In fact I was not aware so far that IDENTITY maybe the culprit.

    Are we talking about ultra large volume INSERTs?


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012








    Saturday, July 12, 2014 12:02 PM
  •  The main reason to use these features is that they permit high-concurrency inserts.

    I never had trouble with INSERTs. In fact I was not aware so far that IDENTITY maybe the culprit.

    Are we talking about ultra large volume INSERTs?

    I was talking about high concurrency, not necessarily large volumes per INSERT. Say that you have Orders and OrderDetails. OrderID is not IDENTITY or sequence. To insert an order, you need to get the current max value add one, insert a row into Orders, and then insert rows into OrderDetails maybe more to it. All in a transaction. This means that inserts are serialised, while any able server is perfectly able to process multiple orders at the same time.

    With IDENTITY or sequences you don't have these problems, because the new id value is retrieved outside the transaction, why multiple inserts may be going on in parallel.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Saturday, July 12, 2014 12:16 PM
  • When two sessions want to insert a row with a sequential key you have only two choices.

    1) Block one session until the other commits.

    2) Allow both sessions to generate sequence numbers and insert, but leave a gap if either session rolls back.

    SQL Server IDENTITY columns and SEQUENCES choose option 2), so rollbacks and failed inserts already create gaps.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Saturday, July 12, 2014 12:17 PM
  • Yes David, we got used to occasional small gaps in IDENTITY sequence in the past.

    But when you look at 1000 and 10000 gaps, the natural human reaction is being alarmed.

    Here is the bottom line though: why must server restarts be reflected in the IDENTITY sequence?  Why not just serve random numbers which would hide server restarts?

    I understand the technical explanation, but the entire thing does not appear to me as very logical.

    An alternative is to go back to the MAX + 1 method which was used before IDENTITY.


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012










    Saturday, July 12, 2014 12:26 PM
  • >Why not just serve random numbers which would hide server restarts?

    Random numbers would have to be very large and carefully chosen to avoid collisions.  That is the idea of a GUID or UNIQUEIDENTIFIER, whish is really a 128-bit integer.  But random keys perform poorly, so NEWSEQUENTIALID() was introduced to generate GUIDs sequentially.  But even this resets on server restarts, so reboots are visible.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Saturday, July 12, 2014 12:36 PM
  • According to this article we have to worry about duplicates when using the CACHE option:

    "This functionality may leave gaps, but guarantees that the same value will never be issued two times for a single sequence object unless it is defined as CYCLE or is manually restarted."

    LINK: Sequence Cache management and Internals – Sql Server 2012


    Kalman Toth Database & OLAP Architect SQL Server 2014 Design & Programming
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012








    Saturday, July 12, 2014 5:26 PM
  • But when you look at 1000 and 10000 gaps, the natural human reaction is being alarmed.

    It is also very naturally to be alarmed to see that SQL Server uses all memory there is in the machine, and yet it is doing nothing. It must be a memory leak!

    Again, if you care about the numbers to be contiguous, don't use IDENTITY or sequences. If you don't care, well, you don't care if 10000 numbers goes down the drain.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Saturday, July 12, 2014 5:57 PM