none
ETL - load data for SCDI dimension. RRS feed

  • Question

  • I have a source table: T1, I extracted the data to the staging table S_T1. ANd then I extract the updated rows via a change identifier column col1 from S_T1, now I need to load the data to the table P_T1 (SCDI). Now the data in control flow contains two parts: insert and update.  I have two options but I want to konw which one could be better in your mind:

    1) Use LookUp in data flow to indentify the two parts and then insert&update.

    2) Extract the Incrementa data, load the data to another staging table S_T2 and then delete the data from P_t1 like below and then instert all:

      delete from P_T1  from P_T1 A
      JOIN S_T2 B
      ON A.[businessKey] =B.[businessKey]

    Which one would be better?  Thank you!

    Thursday, March 10, 2011 9:50 AM

Answers

  • The issues with a delete and insert method are many:

    • Higher I/O, as you're deleting and inserting a lot of records, leading to greater fragmentation as well.
    • I would think you have a surrogate key in your dimension table - if you don't, you should!  Deletes and inserts would regenerate those surrogate keys, destroying the link to your fact table.  If you're considering the delete and insert, it indicates you don't have a surrogate key - which is probably a bigger problem.
    • In general, updating a data warehouse should never involve deleting data.  Updates to dimension tables are frequent, but updates to fact tables only in special circumstances.  Inserts all around...

    The biggest reason is probably the second one.  You just can't delete/insert if you've got a surrogate key.


    Todd McDermid's Blog Talk to me now on
    • Marked as answer by SSAS_user Sunday, March 13, 2011 6:26 AM
    Friday, March 11, 2011 6:52 PM
    Moderator

All replies

  • You can use combination of both. Use lookup to get the to-be-updated(TBU) and to-be-inserted(TBI) records and take TBI records to a destination component. Take TBU records to a staging table and do a set based update using staging table and the final table at control flow using an execute sql task.
    Nitesh Rai- Please mark the post as answered if it answers your question
    Thursday, March 10, 2011 11:07 AM
  • You can use combination of both. Use lookup to get the to-be-updated(TBU) and to-be-inserted(TBI) records and take TBI records to a destination component. Take TBU records to a staging table and do a set based update using staging table and the final table at control flow using an execute sql task.

    Thank you Nitesh, but that's the first option I mentioned. The second the option : delete the to-be-updated(TBU)  data from the target table, so we only need to insert without update.

     

    Thursday, March 10, 2011 2:13 PM
  • I tested my ETL package found that the second one could be better in my scenario. But base on my seach results, almost all of you use the first option. So I really want to konw your opinion. Which one could be better in your mind? Thanks!
    Friday, March 11, 2011 2:58 PM
  • The issues with a delete and insert method are many:

    • Higher I/O, as you're deleting and inserting a lot of records, leading to greater fragmentation as well.
    • I would think you have a surrogate key in your dimension table - if you don't, you should!  Deletes and inserts would regenerate those surrogate keys, destroying the link to your fact table.  If you're considering the delete and insert, it indicates you don't have a surrogate key - which is probably a bigger problem.
    • In general, updating a data warehouse should never involve deleting data.  Updates to dimension tables are frequent, but updates to fact tables only in special circumstances.  Inserts all around...

    The biggest reason is probably the second one.  You just can't delete/insert if you've got a surrogate key.


    Todd McDermid's Blog Talk to me now on
    • Marked as answer by SSAS_user Sunday, March 13, 2011 6:26 AM
    Friday, March 11, 2011 6:52 PM
    Moderator
  • Actrually, it's not a dimension. But your explaination makes perfect sence. Thank you!

    Sunday, March 13, 2011 6:29 AM