Note: Forums will be making significant UX changes to address key usability improvements surrounding search, discoverability and navigation. To learn more about these changes please visit the announcement which can be found HERE.
Composite Domains in SSIS with spaces in field

Answered Composite Domains in SSIS with spaces in field

  • Tuesday, August 07, 2012 10:57 PM
     
     

    I'm trying to send in a Composite Domain into SSIS by the method of concatenating two fields to make up the domain:

    KB has 2 domains:

    > Shape (e.g. Triangle)

    > Sides (e.g. 3)

    If I try to map into a Composite Domain that uses Shapes and Sides, by using a Derived Column as [Shapes] + " " + [Sides] it works, unless I have two words, e.g. "Triangle Plus" in the Shapes Domain. Then the Composite Domain in SSIS treats the "Triangle" component as the Shapes domain element, and "Plus" as the Sides domain component and treats it as a New value as "Plus" is not a recognised number of sides.

    Anybody know how to work around this? Are there alternate delimiters other than Space I can use?


    James Beresford @ www.bimonkey.com & @BI_Monkey
    SSIS / MSBI Consultant in Sydney, Australia
    SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex

All Replies

  • Wednesday, August 08, 2012 5:19 AM
     
     Answered

    Hi James,

    You don't have to concatenate 2 fields in the example that you mentioned here. If you map your 2 individual source fields to both the domains in your Composite Domain, your composite domain automatically participates in the cleansing activity. Let me illustrate this with an example:

    Here is the sample source data in 2 fields in an Excel file (I purposely introduced errors in the values in the Sides field):

    Shapes        Sides

    Square           5
    Triangle          7
    Triangle Plus   8
    Rectangle       3
    Circle             10

    To cleanse this data using the DQSCleansing tranformation is SSIS, I do the following:

    Step 1: in DQS, I create a Composite Domain, Shapes&Sides, that contains 2 domains: Shapes and Sides.

    Step 2: Next, I created a bunch of CD rules for the Shapes&Sides composite domain that validate the number of sides based on the shape:

    Step 3: Next, in SSIS, I create a simple data flow task that inputs data from the source Excel file, cleanses it, and outputs to a SQL DB:

    Step 4: IMPORTANT: In the DQS Cleansing transformation, make sure you map your individual source columns to the individual domains within the composite domain. When all the domains within a composite domain are mapped, the composite domain participates in the cleansing activity, and all the CD rules are applied during the cleansing:

    See the next post for the output as there is a limit of 2 images per post. Thanks!

    • Marked As Answer by BI Monkey Wednesday, August 08, 2012 11:01 AM
    •  
  • Wednesday, August 08, 2012 5:20 AM
     
     

    [Continued from the Previous Post]

    When you run the SSIS package, this is the output:

    Hope that helps. Happy cleansing!

    ~ Vivek

  • Wednesday, August 08, 2012 11:02 AM
     
     

    Thanks Kumar - this conflicts with advice i'd had from the SSIS team but does work.

    However I did note that the Status of the Domains impacted by a correction is "New" - I presume this is a bug?


    James Beresford @ www.bimonkey.com & @BI_Monkey
    SSIS / MSBI Consultant in Sydney, Australia
    SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex

  • Thursday, August 09, 2012 3:19 AM
     
     

    Hi James,

    Its good to know that this resolves your issue. About the status of the domains impacted by correction, I will check and get back to you within a week.

    Thanks,
    Vivek

  • Friday, August 10, 2012 3:19 AM
    Moderator
     
     
    To me, it seems like all New value if the values are not in the domain as values. Did you add them as domain values?

    Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance

  • Tuesday, August 14, 2012 4:57 AM
     
     
    The Domain values are valid in isolation - e.g. "Triangle" for Shape and "5" for sides. It's just when corrected by a composite domain rule, they get flagged as "New" instead of "Corrected" (as I would expect) - even the Domain value which wasn't updated by the rule ("Triangle").

    James Beresford @ www.bimonkey.com & @BI_Monkey
    SSIS / MSBI Consultant in Sydney, Australia
    SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex