locked
How to merge/replace data in a table when importing new data from csv RRS feed

  • Question

  • I have a late breaking change request from a customer where I created a basic routine to import data from a csv file into a table.  They now want the ability to "update" data via the csv import.  Meaning if the first 2 columns in the record from the csv match a row in the table, delete that row in the table and replace with the row from the csv file.  It is literally a 3 column file/table. 


    CREATE TABLE [dbo].[Profile_Actuals](
    [Date] [datetime] NULL,
    [Profile] [varchar](4) NULL,
    [Value] [real] NULL
    )
    GO


    Wednesday, February 18, 2015 3:43 PM

Answers

  • Hi Jason, 

    An alternative (among many) is to redirect the rows you don't match with your Lookup component to a straightforward insertion to your destination table (because these rows are not yet there) and redirect the ones to a delta table that you truncate before every execution. 

    After this, execute an update statement in a SQL Task joining the rows you have in your delta table with the ones in your destination table (using the primary key). Something like 

    update dest set dest.a = delta.a, dest.b = delta.b from destination_table dest inner join delta_table delta on dest.pk = delta.pk

    • Marked as answer by JasonDWilson77 Friday, February 20, 2015 4:59 PM
    Wednesday, February 18, 2015 7:09 PM
  • I found this example online that seems to do what I am looking for.  This one has a delete involved as it is trying to sync to tables, but otherwise seems a pretty good fit.

      

    INSERT LinkedServer.dbo.Dest (Col1, Col2, Col3)

    SELECT Col4, Col5, Col6

    FROM Source S WITH (TABLOCK, HOLDLOCK)

    WHERE NOT EXISTS (

       SELECT 1

       FROM LinkedServer.dbo.Dest D WITH (TABLOCK, HOLDLOCK)

       WHERE S.Key = D.Key

    )



    DELETE D

    FROM LinkedServer.dbo.Dest D

    WHERE NOT EXISTS (

       SELECT 1

       FROM Source S

       WHERE D.Key = S.Key

    )



    UPDATE D

    SET

       D.Col1 = S.Col4,

       D.Col2 = S.Col5,

       D.Col3 = S.Col6

    FROM

       LinkedServer.dbo.Dest D

       INNER JOIN Source S ON D.Key = S.Key

    WHERE

       D.Col1 <> S.Col4

       OR Coalesce(D.Col1, '!-NULL-!') = Coalesce(S.Col5, '!-NULL-!') -- or some way to handle nulls if they can be present

       OR D.Col3 <> S.Col6

    • Marked as answer by JasonDWilson77 Friday, February 20, 2015 4:59 PM
    Friday, February 20, 2015 4:59 PM

All replies

  • You can do it from SSIS with the Lookup Transformation or if you are running SQL Server 2008 or above with the T-SQL Merge

    Arthur

    MyBlog


    Twitter

    • Proposed as answer by Wendy Fu Thursday, February 19, 2015 9:27 AM
    Wednesday, February 18, 2015 3:54 PM
  • Thanks for the reply!  That is definitely new to me.  I am on SQL 2012.  Seems like doing this via T-SQL would be easier based off of the first tutorial video I just watched (https://www.youtube.com/watch?v=UZH8b5cpgBU).  I believe I understand how the lookup transformation works not sure how to apply it exactly to what I am trying to do without duplicating my insert logic as it splits between the match/no match logic of the lookup.
    Wednesday, February 18, 2015 4:39 PM
  • Hi Jason, 

    An alternative (among many) is to redirect the rows you don't match with your Lookup component to a straightforward insertion to your destination table (because these rows are not yet there) and redirect the ones to a delta table that you truncate before every execution. 

    After this, execute an update statement in a SQL Task joining the rows you have in your delta table with the ones in your destination table (using the primary key). Something like 

    update dest set dest.a = delta.a, dest.b = delta.b from destination_table dest inner join delta_table delta on dest.pk = delta.pk

    • Marked as answer by JasonDWilson77 Friday, February 20, 2015 4:59 PM
    Wednesday, February 18, 2015 7:09 PM
  • I found this example online that seems to do what I am looking for.  This one has a delete involved as it is trying to sync to tables, but otherwise seems a pretty good fit.

      

    INSERT LinkedServer.dbo.Dest (Col1, Col2, Col3)

    SELECT Col4, Col5, Col6

    FROM Source S WITH (TABLOCK, HOLDLOCK)

    WHERE NOT EXISTS (

       SELECT 1

       FROM LinkedServer.dbo.Dest D WITH (TABLOCK, HOLDLOCK)

       WHERE S.Key = D.Key

    )



    DELETE D

    FROM LinkedServer.dbo.Dest D

    WHERE NOT EXISTS (

       SELECT 1

       FROM Source S

       WHERE D.Key = S.Key

    )



    UPDATE D

    SET

       D.Col1 = S.Col4,

       D.Col2 = S.Col5,

       D.Col3 = S.Col6

    FROM

       LinkedServer.dbo.Dest D

       INNER JOIN Source S ON D.Key = S.Key

    WHERE

       D.Col1 <> S.Col4

       OR Coalesce(D.Col1, '!-NULL-!') = Coalesce(S.Col5, '!-NULL-!') -- or some way to handle nulls if they can be present

       OR D.Col3 <> S.Col6

    • Marked as answer by JasonDWilson77 Friday, February 20, 2015 4:59 PM
    Friday, February 20, 2015 4:59 PM