Rows per batch in SSIS?

  • Question

  • Hi all,

    I need to insert or update millions of records in a table. The OLE DB destination has a Rows per batch property. Please suggest which value is best for improving performance. Is 50,000 a good choice or not?

    Thanks

    CMK....

    Tuesday, July 13, 2010 1:47 AM

Answers

  • You will want to test a wide range of values with your package.  You should also look at the data flow's DefaultBufferMaxRows and DefaultBufferSize properties, in conjunction with the Rows per batch and Maximum insert commit size settings of the destination.  I would suggest building a test harness to set these values dynamically and then running it to see which combination of values gives you the best results.  Note that this will depend in large part on the resource constraints of the systems you are reading from and writing to, as well as on the record size.

    You might want to use SQL Profiler, or the BufferSizeTuning log event, to see how many records are actually being inserted per transaction, as opposed to the number specified in the destination.  A commit can only include the rows held in a single buffer, so the effective batch size is the lower of the number of rows in the buffer and the maximum insert commit size.

    See the post - Fast Load with Maximum Insert Commit Size - for some more information on this topic.
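
    As a minimal sketch of such a test harness, assuming a package at C:\packages\LoadPackage.dtsx with a data flow task named "Data Flow Task" (those names and the property paths are placeholders; verify the /SET paths against the dtexec documentation for your SSIS version), you could time the package under different buffer settings like this:

        # Hypothetical harness: runs the package with dtexec, overriding the
        # data flow buffer properties for each combination and timing every run.
        import itertools
        import subprocess
        import time

        buffer_rows_options = [10000, 50000, 100000]   # DefaultBufferMaxRows candidates
        buffer_size_options = [10485760, 52428800]     # DefaultBufferSize candidates (bytes)

        for buffer_rows, buffer_size in itertools.product(buffer_rows_options,
                                                          buffer_size_options):
            cmd = [
                "dtexec", "/F", r"C:\packages\LoadPackage.dtsx",
                "/SET", rf"\Package\Data Flow Task.Properties[DefaultBufferMaxRows];{buffer_rows}",
                "/SET", rf"\Package\Data Flow Task.Properties[DefaultBufferSize];{buffer_size}",
            ]
            start = time.time()
            result = subprocess.run(cmd)               # run the package with these overrides
            elapsed = time.time() - start
            print(f"DefaultBufferMaxRows={buffer_rows} DefaultBufferSize={buffer_size} "
                  f"exit={result.returncode} seconds={elapsed:.1f}")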


    Please mark answered posts. Thanks for your time.
    Tuesday, July 13, 2010 2:15 AM

All replies

  • Please stop posting with all caps.
    Todd McDermid's Blog
    Tuesday, July 13, 2010 2:25 AM
  • Suppose you have nearly one crore (10 million) records on the source side and you have to transfer them to the destination in batches, say one lakh (100,000) records per batch.

    In the SSIS OLE DB destination component, set Data access mode = Table or view - fast load. This makes additional options available, including Rows per batch and Maximum insert commit size. Set these to 10,000 to begin with and increase or decrease them according to the speed required and the amount of RAM available.

    Effect of Rows Per Batch and Maximum Insert Commit Size Settings

    Rows per batch – The default value for this setting is -1, which means all incoming rows are treated as a single batch. You can change this default behavior and split the incoming rows into multiple batches; the only allowed value is a positive integer, which specifies the maximum number of rows per batch.

    Maximum insert commit size – The default value for this setting is 2147483647 (the largest value for a 4-byte signed integer), which means all incoming rows are committed in one go on successful completion. You can specify a smaller positive value to make the destination commit after that many records. You might be wondering whether changing the default puts extra overhead on the data flow engine by committing several times. Yes, that is true, but at the same time it relieves the pressure on the transaction log and tempdb, which would otherwise grow tremendously during high-volume data transfers.

    The above two settings are very important for keeping tempdb and the transaction log under control. For example, if you leave Maximum insert commit size at its default, the transaction log and tempdb will keep growing during the load, and with a high volume of data tempdb can run out of space and the load will fail. So it is recommended to tune these settings to optimum values for your environment.
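
    As a rough illustration of the same batching-and-commit trade-off outside SSIS, here is a minimal sketch using Python and pyodbc, assuming a hypothetical dbo.TargetTable with Id and Name columns and a placeholder connection string; committing every batch instead of once at the end is what keeps the transaction log from growing without bound:

        # Hypothetical example: insert rows in fixed-size batches and commit
        # after each batch, analogous to Rows per batch / Maximum insert commit size.
        import pyodbc

        BATCH_SIZE = 10000

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"
            "DATABASE=mydb;Trusted_Connection=yes;",
            autocommit=False,                 # we decide where the commit boundaries fall
        )
        cursor = conn.cursor()
        cursor.fast_executemany = True        # send each batch as one parameter array

        def flush(batch):
            """Insert one batch and commit it so the log space can be reused."""
            cursor.executemany(
                "INSERT INTO dbo.TargetTable (Id, Name) VALUES (?, ?)", batch)
            conn.commit()

        batch = []
        for row in ((i, f"name {i}") for i in range(1_000_000)):   # stand-in source rows
            batch.append(row)
            if len(batch) >= BATCH_SIZE:
                flush(batch)
                batch = []
        if batch:                             # final partial batch
            flush(batch)
        conn.close()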


    Thanks, Shiven :) If the answer is helpful, please vote.

    Wednesday, April 25, 2012 4:52 AM