locked
Basic Upsert Functionality RRS feed

  • Question

  • Hi All

    I am using SSIS 2008 for a data migration project which involves multiple data sources. Therefore I am building a highly normalised enterprise layer where I intend to merge and clean the data. 

    However, when I transfer data from the staging layer into the enterprise layer the SSIS package fails where a record already exists in the destination table . Is this not a very basic fundamental activity that should be supported by any ETL tool ?

    My background is in data warehousing with Cognos and SAP BW, where both these ETL tools have been able to perform such an upsert out of the box for over a decade. There must be a built in way of doing this in SSIS 2008 but I dont seem to be able to find it. The only thing that comes close is the SCD transform, but that is meant for dimensions in the presentation layer (which have smaller row counts). I shouldnt have to write any sql to do this or split up the data into updates and inserts.

    I have come across an upsert transform from the Task Factory. Has anybody got any experience of using it? Is it any good?

    Please give me some suggestions?

    Regards

    Asif 


    Kind regards Asif

    Thursday, May 7, 2015 11:11 AM

Answers

All replies

  • Check this tutorial: Upsert and Merge with SSIS

    web: www.ronnierahman.com

    Thursday, May 7, 2015 11:40 AM
  • Thanx for this Ronnie.

    Is this the best solution that Microsoft can offer for such a common and basic ETL requirement ?

    When you have loads of tables in your D/W is this what you are expected to implement for every table? Surely not?

    I cant believe that Microsoft didnt implement a transform that attempts an insert and if that fails it then performs an update. Not exactly NASA level coding (ie Rocket Science)


    Kind regards Asif

    Thursday, May 7, 2015 12:01 PM
  • Just use SSIS to bulk load staging tables and then use TSQL MERGE to perform the upsert. Performing a row-by-row upsert from an ETL tool is really painfully slow.

    There is also a sample here:

    http://sqlsrvintegrationsrv.codeplex.com/releases/view/19048

    And a third party one here:

    http://pragmaticworks.com/Products/Task%20Factory/Features/UpsertDestination(BatchUpdateOrInsert)

    David


    David http://blogs.msdn.com/b/dbrowne/



    Thursday, May 7, 2015 2:37 PM
  • As a follow on question, whats the best way to perform summary calculations that update the target table.

    This is another very common activity in data warehousing scenarios where aggregations and / or flags are calculated for the entities in the data model. 

    How is that commonly performed in SSIS ?


    Kind regards Asif

    Friday, May 8, 2015 9:31 AM
  • As a follow on question, whats the best way to perform summary calculations that update the target table.

    This is another very common activity in data warehousing scenarios where aggregations and / or flags are calculated for the entities in the data model. 

    How is that commonly performed in SSIS ?


    Kind regards Asif

    You can use 'Aggregate' Data Flow component.

    web: www.ronnierahman.com

    Friday, May 8, 2015 9:34 AM
  • Asking "How do you do X in SSIS?" is often the wrong question.

    SSIS is capable of perform all sorts of transformations and calculations in a data flow.  But it's often better/faster/cheaper to perform the transformations either

    1) as the data is extracted from the source system using a SQL Query

    2) after the data has been landed in the target system using a SQL Query

    And use SSIS to orchestrate the load process and move the data between systems.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Friday, May 8, 2015 1:09 PM
  • Hi Ronnie

    Can the aggregate transform update an existing record or does it have to feed the result into a new record ?


    Kind regards Asif

    Friday, May 8, 2015 1:39 PM
  • Hi Ronnie

    Can the aggregate transform update an existing record or does it have to feed the result into a new record ?


    Kind regards Asif

    Aggregate Transform doesn't update an existing record. Find out more: The Aggregate Transformation

    web: www.ronnierahman.com

    Friday, May 8, 2015 1:41 PM
  • Before you start using these transformations in your enterprise wide ETL system, I want you to become familiar with the concept of blocking vs partial blocking transformations. This may help you to understand the importance of leveraging T-SQL over SSIS transformations.

    http://sqlblog.com/blogs/jorg_klein/archive/2008/02/12/ssis-lookup-transformation-is-case-sensitive.aspx


    Thanks, hsbal

    Friday, May 8, 2015 6:35 PM
  • As suggested by Harry,it is better to gain in-depth knowledge of blocking & partially blocking transformations and then decide whether to use the transformation or to go for a stored proc with the complex logic.

    Regards, Sreelekha

    Sunday, May 10, 2015 9:43 AM