Composite Domains in SSIS with spaces in field
-
Tuesday, August 07, 2012 10:57 PM
I'm trying to send in a Composite Domain into SSIS by the method of concatenating two fields to make up the domain:
KB has 2 domains:
> Shape (e.g. Triangle)
> Sides (e.g. 3)
If I try to map into a Composite Domain that uses Shapes and Sides, by using a Derived Column as [Shapes] + " " + [Sides] it works, unless I have two words, e.g. "Triangle Plus" in the Shapes Domain. Then the Composite Domain in SSIS treats the "Triangle" component as the Shapes domain element, and "Plus" as the Sides domain component and treats it as a New value as "Plus" is not a recognised number of sides.
Anybody know how to work around this? Are there alternate delimiters other than Space I can use?
James Beresford @ www.bimonkey.com & @BI_Monkey
SSIS / MSBI Consultant in Sydney, Australia
SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex
All Replies
-
Wednesday, August 08, 2012 5:19 AM
Hi James,
You don't have to concatenate 2 fields in the example that you mentioned here. If you map your 2 individual source fields to both the domains in your Composite Domain, your composite domain automatically participates in the cleansing activity. Let me illustrate this with an example:
Here is the sample source data in 2 fields in an Excel file (I purposely introduced errors in the values in the Sides field):
Shapes Sides
Square 5
Triangle 7
Triangle Plus 8
Rectangle 3
Circle 10
To cleanse this data using the DQSCleansing tranformation is SSIS, I do the following:
Step 1: in DQS, I create a Composite Domain, Shapes&Sides, that contains 2 domains: Shapes and Sides.
Step 2: Next, I created a bunch of CD rules for the Shapes&Sides composite domain that validate the number of sides based on the shape:
Step 3: Next, in SSIS, I create a simple data flow task that inputs data from the source Excel file, cleanses it, and outputs to a SQL DB:
Step 4: IMPORTANT: In the DQS Cleansing transformation, make sure you map your individual source columns to the individual domains within the composite domain. When all the domains within a composite domain are mapped, the composite domain participates in the cleansing activity, and all the CD rules are applied during the cleansing:
See the next post for the output as there is a limit of 2 images per post. Thanks!
- Marked As Answer by BI Monkey Wednesday, August 08, 2012 11:01 AM
-
Wednesday, August 08, 2012 5:20 AM
[Continued from the Previous Post]
When you run the SSIS package, this is the output:
Hope that helps. Happy cleansing!
~ Vivek
-
Wednesday, August 08, 2012 11:02 AM
Thanks Kumar - this conflicts with advice i'd had from the SSIS team but does work.
However I did note that the Status of the Domains impacted by a correction is "New" - I presume this is a bug?
James Beresford @ www.bimonkey.com & @BI_Monkey
SSIS / MSBI Consultant in Sydney, Australia
SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex -
Thursday, August 09, 2012 3:19 AM
Hi James,
Its good to know that this resolves your issue. About the status of the domains impacted by correction, I will check and get back to you within a week.
Thanks,
Vivek -
Friday, August 10, 2012 3:19 AMModeratorTo me, it seems like all New value if the values are not in the domain as values. Did you add them as domain values?
Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance
-
Tuesday, August 14, 2012 4:57 AMThe Domain values are valid in isolation - e.g. "Triangle" for Shape and "5" for sides. It's just when corrected by a composite domain rule, they get flagged as "New" instead of "Corrected" (as I would expect) - even the Domain value which wasn't updated by the rule ("Triangle").
James Beresford @ www.bimonkey.com & @BI_Monkey
SSIS / MSBI Consultant in Sydney, Australia
SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex

