locked
How to track deleted records if ADF Copy pipeline is used? RRS feed

  • Question

  • I have source SQL Server DB and currently copying table data with ADF to staging table of DW.

    How to track what what records have been deleted in source so that EDW can be updated?


    Kenny_I

    Wednesday, June 5, 2019 2:13 PM

All replies

  • Hello Kenny_I and thank you for your inquiry.

    If you are doing full loads (replace all the contents of sink, with source, every time), then you shouldn't need to do anything.

    If you are doing incremental loads, the details may depend upon your paradigm and usage habits.

    Here are two suggestions for the watermark methods:

    Implement soft-deletes:  In your source table, instead of actually deleting records, add a column "is_deleted", which defaults to a False value.  When you want to delete a row, set the value to True and update the watermark.  After a successful update of both databases, you can go back and delete all rows which have "is_deleted" set to True.

    Implement on_delete triggers:  In your source table, set up a trigger, so that when you try to delete a row, it is written to a new table, "rows_to_delete", along with a watermark/timestamp.  After a successful incremental load of new records are written to your destination, do a similar incremental load on "rows_to_delete", except in the destination database, these rows do a stored procedure which finds and deletes the corresponding rows in the destination table.

    There are more methods which pertain to specific use-cases.  Do these ideas help you?

    Thursday, June 6, 2019 12:42 AM