Is 100'000 x 1 == 1 x 100'000 ?

All replies

  •  TheViewMaster wrote:
    Is processing 100'000 rows x 1 at a time == 1 time x 100'000 rows at once

    No

     

    Wednesday, October 25, 2006 4:35 AM
    Moderator
  • This question really depends on caching and commit levels.
    Wednesday, October 25, 2006 1:26 PM
    Moderator
  •  Jamie Thomson wrote:

    TheViewMaster wrote:
    Is processing 100'000 rows x 1 at a time == 1 time x 100'000 rows at once

    No


    I am asking because it seems to me that processing 100'000 rows x 1 at a time is exactly how the SSIS data flow OLE DB Command update works - and it causes a ***load of problems for us, like deadlocking and web page timeouts.
    So how can I set up a test to prove that 100'000 rows x 1 at a time != 1 time x 100'000 rows?
    Wednesday, October 25, 2006 1:48 PM
  • Yes, that is how it works.

    You should use an Execute SQL Task if you want it to execute efficiently on the SQL Server.

    Wednesday, October 25, 2006 2:05 PM
  •  TheViewMaster wrote:

    I am asking because it seems to me that processing 100'000 rows x 1 at a time is exactly how the SSIS data flow OLE DB Command update works - and it causes a ***load of problems for us, like deadlocking and web page timeouts.
    So how can I set up a test to prove that 100'000 rows x 1 at a time != 1 time x 100'000 rows?

     

    That seems correct.  You are pushing a row into the OLE DB Command where you are performing an update, presumably with parameters, correct?  Can you jump out of the data flow and perform your update there?
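
    For illustration only: the OLE DB Command transformation essentially prepares a parameterized statement and executes it once per pipeline row, something like this (the table and column names here are made up):

        -- Runs 100'000 times, once per row in the pipeline;
        -- each ? is bound to an input column from the data flow.
        UPDATE dbo.MyTable
        SET    SomeColumn = ?
        WHERE  KeyColumn  = ?;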

    Wednesday, October 25, 2006 2:06 PM
    Moderator
  •  Tom Phillips wrote:
    Yes, that is how it works.

    You should use an Execute SQL Task if you want it to execute efficiently on the SQL Server.


    So how would I push 100 000 rows from the data flow to an Execute SQL Task?
    Wednesday, October 25, 2006 2:35 PM
  • I don't know if this is the best way, but I handled a similar situation with set-oriented operations by writing a VB.NET console app that issued a series of UPDATE queries to cover the differing conditions I needed to handle. Then I used the Execute Process task to run the console app. I didn't have to leave my package to do it, and it also let me handle all my if-then-else update conditions elegantly in VB.
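
    As a rough sketch of the kind of statements such an app can issue (the table, columns, and conditions here are invented), each condition becomes one set-based UPDATE instead of per-row logic:

        -- One set-based statement per condition; each updates
        -- a whole group of rows in a single round trip.
        UPDATE dbo.Orders SET Status = 'HIGH'   WHERE Amount > 1000;
        UPDATE dbo.Orders SET Status = 'MEDIUM' WHERE Amount BETWEEN 100 AND 1000;
        UPDATE dbo.Orders SET Status = 'LOW'    WHERE Amount < 100;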

    Hope this helps.

    Steve Wells

    Wednesday, October 25, 2006 2:55 PM
  •  TheViewMaster wrote:
     Tom Phillips wrote:
    Yes, that is how it works.

    You should use an Execute SQL Task if you want it to execute efficiently on the SQL Server.


    So how would I push 100 000 rows from the data flow to an Execute SQL Task?

     

    The Execute SQL Task works on relational data, hence if you want to issue an Execute SQL Task against data in the pipeline you first need to load that data into a relational database.

    -Jamie

     

    Wednesday, October 25, 2006 4:26 PM
    Moderator
  • Without knowing exactly what you are doing, that is very hard to answer.

    If you use a "Data Flow" object, it basically does everything one record at a time: in memory, on the machine running the package, across the network, etc.

    Are you doing a bulk insert, an update, or inserts with transformations?

    If you can use the "Bulk Insert" object in the control flow, or simplify the data flow to just insert the file into the table and then massage it with an "Execute SQL Task", that is much better.

    In some cases, I create a temp table and import the source into it, so I can process it on the server using SQL commands instead of locally. In other cases that is not possible and Data Flow is the only way to do it.
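
    A rough sketch of that staging pattern (the file path and object names are hypothetical):

        -- Land the source file in a staging table on the server...
        CREATE TABLE dbo.Staging_Updates (
            KeyColumn  int         NOT NULL,
            SomeColumn varchar(50) NULL
        );

        BULK INSERT dbo.Staging_Updates
        FROM 'C:\data\updates.txt'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

        -- ...then process it there with set-based SQL
        -- instead of row by row across the network.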
    Wednesday, October 25, 2006 4:55 PM
  •  TheViewMaster wrote:
     Jamie Thomson wrote:

    TheViewMaster wrote:
    Is processing 100'000 rows x 1 at a time == 1 time x 100'000 rows at once

    No


    I am asking because it seems to me that processing 100'000 rows x 1 at a time is exactly how the SSIS data flow OLE DB Command update works - and it causes a ***load of problems for us, like deadlocking and web page timeouts.
    So how can I set up a test to prove that 100'000 rows x 1 at a time != 1 time x 100'000 rows?

    Load the data into a table and do a set-based update using the Execute SQL Task instead.

    It will be orders of magnitude faster than using OLE DB Command.
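
    As a sketch, assuming the data has been landed in a staging table (all object names here are hypothetical), the Execute SQL Task runs one joined update in place of 100'000 single-row ones:

        -- One set-based statement replaces 100'000 parameterized calls.
        UPDATE t
        SET    t.SomeColumn = s.SomeColumn
        FROM   dbo.MyTable AS t
        INNER JOIN dbo.Staging_Updates AS s
                ON s.KeyColumn = t.KeyColumn;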

    -Jamie

     

    Wednesday, October 25, 2006 5:39 PM
    Moderator
  • What I am trying to do is update 100'000 records as fast as possible.
    Currently it is using the OLE DB Command update transformation.

    So for the test case I will create a temp table to hold the update records and then use an Execute SQL Task to perform the update. I will post the test results over the weekend.

    Any other viable test cases?
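
    For timing the set-based half of the test, a minimal sketch (SQL Server 2005-era syntax; the UPDATE itself is whatever the test uses):

        DECLARE @start datetime;
        SET @start = GETDATE();

        -- run the set-based UPDATE here

        SELECT DATEDIFF(ms, @start, GETDATE()) AS ElapsedMs;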
    Wednesday, October 25, 2006 6:10 PM
  •  Jamie Thomson wrote:
     TheViewMaster wrote:
     Jamie Thomson wrote:

    TheViewMaster wrote:
    Is processing 100'000 rows x 1 at a time == 1 time x 100'000 rows at once

    No


    I am asking because it seems to me that processing 100'000 rows x 1 at a time is exactly how the SSIS data flow OLE DB Command update works - and it causes a ***load of problems for us, like deadlocking and web page timeouts.
    So how can I set up a test to prove that 100'000 rows x 1 at a time != 1 time x 100'000 rows?

    Load the data into a table and do a set-based update using the Execute SQL Task instead.

    It will be orders of magnitude faster than using OLE DB Command.

    -Jamie

     

     

    That's how I conduct the updates within SSIS and it works really well; for instance, when updating attributes in a type 2 SCD. The OLE DB Command, on the other hand, is a performance killer if you are dealing with medium to high volumes of data.

     

    Rafael Salas

    Wednesday, October 25, 2006 7:50 PM
    Moderator