OLE DB Destination, Rows Per Batch and Max Insert Commit Size

    Question

  • What is the difference between these two?

    The best explanation I've seen is that Max Insert Commit Size is equivalent to the BATCHSIZE argument of the BULK INSERT command. The way I understand this is that if you have 1,000,000 rows in your data source and you set your batch size to 10,000, then you will commit the rows in blocks of 10,000. Am I correct about this? In SSIS, by default, this number is 2147483647. I guess this is in units of rows? So if I have 1,000,000 rows in my source, the data won't get committed until all 1,000,000 rows have been written. Is that correct?
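
    For example, this is how I picture BATCHSIZE working; the table and file names here are just made-up placeholders:

        -- With 1,000,000 rows in source.csv and BATCHSIZE = 10000, I would
        -- expect the load to commit in 100 transactions of 10,000 rows each.
        BULK INSERT dbo.TargetTable
        FROM 'C:\data\source.csv'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', BATCHSIZE = 10000);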

    If so, then why do I see the count tick up as I repeatedly run select count(*) on the target while the data is being written? If my Max Insert Commit Size is larger than the number of rows in my source, shouldn't the data be committed only after all rows have been written? So one big commit at the end.
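
    For reference, this is the kind of check I keep running in a separate session while the package executes (dbo.TargetTable stands in for my actual target):

        -- Under the default READ COMMITTED isolation level, this only counts
        -- rows whose inserting transaction has already committed.
        SELECT COUNT(*) FROM dbo.TargetTable;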

    I could be off in my understanding of the BATCHSIZE option on BULK INSERT, so could someone please explain?

    Lastly, what about Rows Per Batch option?


    Monday, June 11, 2012 8:56 PM

Answers

  • It is in units of rows.

    And your assumption is correct.

    But whether it can commit them all at once is, like I said, another question, though a fair one to ask. E.g. on my laptop it would not happen, but on my 16-core server with 64 GB of RAM it would.

    The database engine may also decide how it takes in the load and split it into smaller chunks.

    "when I saw around 10K rows getting committed at a time, I thought 10K rows for my data must equate to 2,147,xxx,xxx Bytes" -- that would work out to over 200 KB per row, which cannot be true for a row comprised of one short value, right? Nor even for a humongous ERP-like record of 138 columns full of financial data.


    Arthur My Blog

    • Marked as answer by Eileen Zhao Friday, June 29, 2012 6:16 AM
    Friday, June 15, 2012 6:52 PM

All replies

  • RowsPerBatch was well explained already here: http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/86efc54e-45bd-4ccc-baf1-a7b34f80fe32/

    And Max Insert Commit Size is how many rows SSIS takes into its buffer to process, not what you will see committed on the database side, where the counts can be anything. But the SSIS engine will allocate enough memory to process the specified number of rows.

    Lastly, the default of 2,147,483,647 is roughly the equivalent of 2 GB of RAM, which is optimal for most systems, though not exactly what you would find allocated in reality.

    In theory, all of your 1 million rows would commit in one hop, but I guess it depends on how long/wide your rows are.
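
    If it helps to see the two settings side by side, here is a rough BULK INSERT analogue; the mapping is approximate and the table/file names are placeholders:

        -- Rows Per Batch corresponds to ROWS_PER_BATCH, a cardinality hint
        -- for the optimizer that does not affect commit frequency, while
        -- Max Insert Commit Size corresponds to BATCHSIZE, the commit interval.
        BULK INSERT dbo.TargetTable
        FROM 'C:\data\source.csv'
        WITH (
            ROWS_PER_BATCH = 1000000,  -- hint only: total rows expected
            BATCHSIZE = 10000          -- one transaction per 10,000 rows
        );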


    Arthur My Blog


    Monday, June 11, 2012 9:06 PM
  • Ok, so it sounds like that number is in units of Bytes, not rows.

    Monday, June 11, 2012 9:25 PM
  • These are rows indeed. So, like you said, if you have 1,000,000 rows, then, since the default value is 2,147,483,647, the entire load would go through as one hop/batch.

    Now, if you opt for changing it to 0 (yes, zero is supported), it uses the maximum possible size, i.e. however many rows a given machine can handle, committed as a single batch.
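
    In BULK INSERT terms, that is roughly what you get when you omit BATCHSIZE altogether (placeholder names again):

        -- Rough analogue of Max Insert Commit Size = 0: with no BATCHSIZE,
        -- the entire file is loaded and committed as one transaction.
        BULK INSERT dbo.TargetTable
        FROM 'C:\data\source.csv'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');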


    Arthur My Blog

    • Proposed as answer by Eileen Zhao Wednesday, June 13, 2012 8:47 AM
    Tuesday, June 12, 2012 2:02 PM
  • Wait, now I am confused. Did you say that big number 2,147,xxx,xxx is in units of Rows? Not in units of Bytes?

    Ok, so if it is in rows, then I'm telling SSIS to commit 2,147,xxx,xxx rows in one transaction, correct? But that is not the behavior I'm seeing when I load from a source that has around 1M rows. The behavior I'm seeing is about 10K rows getting committed at a time. I am seeing this by just repeatedly running count(*) on the target as the data flow is running.

    I thought 2,147,xxx,xxx was in Bytes, so when I saw around 10K rows getting committed at a time, I figured 10K rows of my data must equate to 2,147,xxx,xxx Bytes.

    Wednesday, June 13, 2012 2:05 PM