locked
Order of insertion downside? RRS feed

  • Question

  • I am having a table with primary key clustered index (which means a clustered table) on ID column which is an int. The inserts to this table are going to be bulk inserts from application and I am getting a range of values for the ID column through sequences and inserting the rows based of these sequence values. However in the application there are mutiple processes that decide the range of sequence values based on the number of inserts they have to perform.

    Example:

    Application Process1 has to insert 1000 rows

    Application process2 has to insert 5 rows

    Application process3 has to insert 10000 rows

    Assuming the table is empty process1 gets values for the ID from the sequence as 1 to 1000 and while it is building the inserts process 2 gets its next values from 1001 to 1005 and it can go ahead of process 1 and do the insert while process 1 is still building.

    So when we look at the table at that time we will have values from 1001 to 1005 initially

    but later after all the processes do their inserts when we run a select against the table it will be from 1 to 11005. 

    My question is this correct way to do the inserts on a clustered table? Like inserting rows irrespective of order but in the range? I am asking from performance wise as well as functional wise.

    Thanks a ton for your responses




    Tuesday, February 16, 2016 6:54 PM

Answers

All replies

  • SQL Server will have to do some page splits when you insert out-of-order, but it shouldn't be too bad. 

    David


    David http://blogs.msdn.com/b/dbrowne/

    • Proposed as answer by Ice Fan Sunday, March 6, 2016 9:42 AM
    Tuesday, February 16, 2016 6:58 PM
  • SQL Server will have to do some page splits when you insert out-of-order, but it shouldn't be too bad. 

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thanks for your response. I think sql server will do page splits even if it were one process that is trying to do insert but as soon as it reaches the max size of the page. The processes are relying on sequences to decide the range of values for inserting, so if not right away but pretty fast the page will be full and written by the inserts.

    Unless I am missing something will there be any issues from performance or functional wise by doing this type of inserts? Like inserting a range of smaller values first but not in order at the time of insertion? (they will be in order in the end when all application processes finish their inserts)

    Thanks

    Tuesday, February 16, 2016 7:09 PM
  • Also the processes together will be inserting closer to 6 million rows per day.
    Tuesday, February 16, 2016 7:29 PM
  • >will there be any issues from performance or functional wise by doing this type of inserts?

    No Functional issues. 

    Performance shouldn't be too bad, but not as good as inserting the rows from a single client process in order.  But using multiple client processes allows you to use multiple cores.

    You'll just need to test.

    David


    David http://blogs.msdn.com/b/dbrowne/


    Tuesday, February 16, 2016 7:44 PM
  • >>but not as good as inserting the rows from a single client process in order.  

    Can you please emphasize?

    Thanks again

    Tuesday, February 16, 2016 7:51 PM
  • Take a look at

    Good Page Splits and Sequential GUID Key Generation

    It's discussing GUIDs, but covers what happens when a bunch sequential key values are inserted in the middle of a clustered index.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, February 16, 2016 8:35 PM