locked
Avoid Sort Transformations RRS feed

  • Question

  • Hello,

    Need help .

    My do have  my SSIS package having different numbers of controls as well logics build as per teh busineess requirement .

    Recently we are instructed to work on the optimization part of the package .because it was taking long time to process(approx 2.5 hrs).
    While debug of the SSIS package , i found that we are using Sort transformations with remove duplicates for the Fact table loading   in the  FACT DFT. count of these is 4  and this part is taking more time .approx 1 hr and 10 mints .

    Fact tables are having a lot of data approx 20m in each 4 Sort transformations .

    i need to find the alternative of SORT transformation with duplicates remove.

    What will be the best possible way to replace the sort transformation with remove duplicates.

    Any reference links will be very thankfull

    Wednesday, February 3, 2016 12:52 PM

All replies

  • Since it is from database query isnt it possible to do deduplication within query itself? which db are you extracting data from?

    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Wednesday, February 3, 2016 1:21 PM
  • Hello Visakh...

    We are extracting from another SQL Server . and within the DFT there are 2 sources

    1 is the Old SQL DB  and  other is the Navison  SQL Db .

    After this there are some data conversions  and some logics  and finally a UNION ALL Transformations before moving to the sort Transformation and last SQL SQL Destination

    As this data is from Fact table while in SORT Transformation it is taking time .

    Below this i am trying to explain the scenario :-----

    Source 1  (table)   ---------------      Source 2 (Query)

    dataconversions   ---------------     dataconversions

    logics                      -----------

    -------------- Union All Transformation --------

    ------------ Sort Transformation--------  (Time taking)

    ------------SQL Destination  -------------

    Wednesday, February 3, 2016 1:58 PM
  • How many columns are you sorting across?

    Would it be possible to stage the output of the Union All into a staging table and from there in a different dataflow use a T-SQL query using PARTITION BY to filter out the duplicates before inserting into the destination table?

    http://social.technet.microsoft.com/wiki/contents/articles/22706.how-to-remove-duplicates-from-a-table-in-sql-server.aspx

    Wednesday, February 3, 2016 8:24 PM