locked
Alternatives to Sort Transformation RRS feed

  • Question

  • Hi,

    I am running a package having 1 terabyte of data and in the package we are doing various transformations including sorts before merge join transformations. Sort being a asynchronous component it is acting as a major road block and the package spools out a out of memory exception. Can you please let me know how to avoid sort component. I know that on a OLEdb source we can set a property as issorted and order the data by query and this I have implemented, however the problem arises when we are joining this data set downstream with other data sources. SSIS is implicitly asking me to sort the data again, now here I don't have a OLEdb source to set the issorted option. Please help.

     


    Rojit

    Friday, August 5, 2016 2:35 PM

Answers

  • Do not sort any in SSIS, sort all always on the database side.

    To speed up overall, you may want to rather process the data in chunks.


    Arthur

    MyBlog


    Twitter

    • Proposed as answer by Seif Wang Monday, August 8, 2016 9:43 AM
    • Marked as answer by Eric__Zhang Friday, August 19, 2016 12:19 AM
    Sunday, August 7, 2016 3:59 PM

All replies

  • Hi Rojit,

    Switch to OLEDB source where you can do the sort on the database side and mark


    Arthur

    MyBlog


    Twitter


    • Edited by ArthurZ Friday, August 5, 2016 2:54 PM
    Friday, August 5, 2016 2:53 PM
  • ArthurZ,

    The option that you mentioned option works best when I use it against a OLEDB source , but think about a scenario wherein I am going to further merge join the data down stream against another data set, this is when SSIS asks me to again sort the data, now here I don't have the luxury of oledb source since data is coming from upstream after an initial join from two sources.

    Thanks


    Rojit

    Friday, August 5, 2016 3:47 PM
  • Do not " initial join from two sources".

    Have one source and another sorted then merge inner join with the option above.


    Arthur

    MyBlog


    Twitter

    Friday, August 5, 2016 6:48 PM
  • Hi,

    Even if I use one as a data source and Sort the other one even then its is taking too much time, I mean the sort operation and eventually it times out. IS there a way to remove sort completely.

    Thanks


    Rojit


    • Edited by Rojit Sunday, August 7, 2016 11:26 AM Additional comments
    Sunday, August 7, 2016 11:25 AM
  • Do not sort any in SSIS, sort all always on the database side.

    To speed up overall, you may want to rather process the data in chunks.


    Arthur

    MyBlog


    Twitter

    • Proposed as answer by Seif Wang Monday, August 8, 2016 9:43 AM
    • Marked as answer by Eric__Zhang Friday, August 19, 2016 12:19 AM
    Sunday, August 7, 2016 3:59 PM