locked
execute sql task vs data flow task RRS feed

  • Question

  • My manager has asked me to use linked server to connect to ORACLE database rather then oledb

     

    my source db is ORACLE and my Distination is SQL server 2005 ( in 64 bit  windows server)

    I am planning to write this query in an execute SQL task,

    INSERT  dbo.MV_MAX (PERIOD
                                    ,PRODUCTGROUPID
                                    ,PRODUCTID
                                    ,SUBPRODUCTID
                                    ,HLBER_ID
                                    ,BE_ID
                                    ,ENTITYID
                                    ,REVENUE)
    SELECT *  FROM  openquery (PMAX, ' Select PERIOD
                                                ,PRODUCTGROUPID   
                                                ,PRODUCTID
                                                ,SUBPRODUCTID
                                                ,HLBER_ID
                                                ,BE_ID
                                                ,ENTITYID
                                                ,REVENUE
                                                from MAX.MV_MAX
                                                WHERE BE_ID <> 0
                                                AND ENTITYID <> ''XXX''
                                                AND REVENUE <> 0')

     

    the source table in ORACLE  DB server has arround 11.5 million records to be transfered to destination(SQL SERVER 2005) table.

     

    Can anyone suggest me.. which would be more faster...

    writing the above query in execute SQL task

    or

      using a data flow task and writing the Select query to get data from source in OLEB SQL data source ( using sql server and the query would still be using linked server as above)  and  then a OLEDB SQL destination.

    Thursday, September 2, 2010 1:28 PM

Answers

All replies

  • Could you be more specific about your question.
    Thursday, September 2, 2010 1:38 PM
  • my question was which  way would the data transfer me faster   using the EXECUTE SQL Task  or using the data flow task.

    My source data is in ORACLE DB  and have to transfer the data to a table in SQL server 2005 DB

    I will be having only one connection manager that is the  Distination sql server.  My source database (PMAX) is in ORACLE DB which I will connect via Linked server.

    I have 2 options  to accomplish  1. using execute SQL task and 2. using data flow

    If I use Exeute SQL task  i would bind this to the SQL Connenction manager and would run the above given query.

    If I use Data flow task, I would use OLEDB Data source connected to the SQL Connection manager  then run the query in it:

                                    ,REVENUE)
    SELECT *  FROM  openquery (PMAX, ' Select PERIOD
                                                ,PRODUCTGROUPID   
                                                ,PRODUCTID
                                                ,SUBPRODUCTID
                                                ,HLBER_ID
                                                ,BE_ID
                                                ,ENTITYID
                                                ,REVENUE
                                                from MAX.MV_MAX
                                                WHERE BE_ID <> 0
                                                AND ENTITYID <> ''XXX''
                                                AND REVENUE <> 0')

    Then have a OLEDB distination connected to SQL Connection manager and the destination table.

    both OLEDB SOURCE and Destination  will be connected to only the same connection manager.

     

    Can some one suggest which option would lead to faster data transfer as I am dealing with 12 million records...

     

     

     

    Thursday, September 2, 2010 2:09 PM
  • If you're using a linked server, then it will most likely be faster if you do it all through the Execute SQL Task.

    If you weren't using a linked server, then doing the transfer with an SSIS data flow is usually performs better.

    Thursday, September 2, 2010 2:11 PM
  • Well Linked server is an option but I would prefer doing it in a DFT as it would provide you other benefits like error handling and logging.

    While using Execute SQL task you would have to modify your code to do this. You would be having a transaction and rollback in case of issue.

    In DFT it would be better as the error records could be redirected to some staging area for analysis and reload.


    Sudeep's Domain
    • Proposed as answer by Kalman Toth Thursday, September 2, 2010 3:33 PM
    • Marked as answer by developer_zone1 Thursday, September 2, 2010 7:25 PM
    Thursday, September 2, 2010 2:15 PM
  • Reading 11Mil records you have to be careful, you need to do lots of testing and checking with tools like the profiler to make sure that you are getting the most out of the data transfer.

    My suggestion is using PARTITION TABLES, and it’s kind of a parallel loading

    I know that you are working with SQL2005 but take a look at

    http://henkvandervalk.com/reading-as-fast-as-possible-from-a-table-with-ssis-part-ii

    http://henkvandervalk.com/speeding-up-ssis-bulk-inserts-into-sql-server/comment-page-1#comment-1976

    http://msdn.microsoft.com/en-us/library/dd537533.aspx

    http://msdn.microsoft.com/en-us/library/dd425070.aspx


    Sincerely SH -- MVP, MCITP 2008, MCTS 2008 & 2005 -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
    • Proposed as answer by Kalman Toth Thursday, September 2, 2010 3:36 PM
    • Marked as answer by developer_zone1 Thursday, September 2, 2010 7:25 PM
    Thursday, September 2, 2010 2:34 PM