Monday, July 09, 2012 9:01 PM
I would like to use the Knowledge Discovery import process to bring in data that is based on a hierarchy. For example, I would like to cleanse data in which a vehicle model belongs to a manufacturer. I have a set of tables that define the correct relationships, so importing that same data through Knowledge Discovery is preferred. Does anyone have any tips or tricks on how to import related data?
My goal is to have data marked as invalid or corrected when a model does not belong to the appropriate manufacturer. I would like this to work in the reverse direction as well (e.g. a manufacturer does not belong with a certain model). In the past, I have been importing models and their manufacturers individually, so if a model is misspelled it is corrected or marked as invalid. The same goes for the manufacturers. However, now I want to extend this so that a model can't be misrepresented under a manufacturer (e.g. Mustang to Chevrolet or Chevrolet to Mustang).
Thanks for any tips or tricks on how to validate data to a given hierarchy.
Wednesday, July 11, 2012 2:00 PM
I have not heard anything for over two days. Can anyone suggest anything?
Sunday, July 15, 2012 3:16 AM Moderator
There isn't a feature for hierarchy validation in DQS.
You could try this: in Domain Management, create a Composite Domain that encapsulates the two simple domains, Manufacturer and Model.
The frequency of related values in the simple domains can help DQS find the outliers in the composite domain. So if you have one Chevrolet Mustang among hundreds of Ford Mustang records, the far more frequent Ford Mustang combination may help you find that one outlier.
Composite Domain help > http://msdn.microsoft.com/en-us/library/hh510414
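To make the frequency idea concrete, here is a minimal Python sketch that runs outside DQS. It is not how DQS works internally; it only illustrates flagging a manufacturer/model pair that is rare compared with the dominant manufacturer for the same model. The function name `find_outlier_pairs`, the `max_share` threshold, and the sample records are all assumptions made up for the illustration.

```python
from collections import Counter, defaultdict

def find_outlier_pairs(records, max_share=0.05):
    """Return (manufacturer, model, dominant_manufacturer) for rare pairs.

    A pair is flagged when its manufacturer is not the most frequent one
    for that model and it accounts for at most max_share of the model's rows.
    The 5% default is an arbitrary illustrative threshold.
    """
    pair_counts = Counter(records)

    # Group the pair counts by model so manufacturers can be compared.
    by_model = defaultdict(Counter)
    for (manufacturer, model), count in pair_counts.items():
        by_model[model][manufacturer] += count

    outliers = []
    for model, makers in by_model.items():
        total = sum(makers.values())
        dominant, _ = makers.most_common(1)[0]
        for manufacturer, count in makers.items():
            if manufacturer != dominant and count / total <= max_share:
                outliers.append((manufacturer, model, dominant))
    return sorted(outliers)

# Hypothetical sample data: one stray Chevrolet Mustang among 200 Ford Mustangs.
records = (
    [("Ford", "Mustang")] * 200
    + [("Chevrolet", "Mustang")]
    + [("Chevrolet", "Camaro")] * 150
)
print(find_outlier_pairs(records))
```

With this sample, the single Chevrolet Mustang row is the only pair reported, together with Ford as the suggested manufacturer. A frequency check like this only helps when the correct pairing clearly dominates the data.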
You can also create a cross-domain rule stating that if the value of Model is Mustang, then the value of Manufacturer must be equal to Ford.
That means typing in a lot of rules, though. I don't think there is a tool yet that can take your hierarchy and make a set of 1,000 rules out of that data set.
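If the hierarchy already lives in reference tables, a small script outside DQS could at least draft the rule text for you, or check records against the hierarchy directly. The Python sketch below is only an illustration: the function names and sample rows are hypothetical, it assumes each model belongs to exactly one manufacturer, and the generated lines are plain-English statements for manual entry, not a format that DQS is known to import.

```python
def build_model_lookup(hierarchy):
    """Map each model to its manufacturer; reject conflicting reference rows."""
    lookup = {}
    for manufacturer, model in hierarchy:
        known = lookup.setdefault(model, manufacturer)
        if known != manufacturer:
            raise ValueError(
                f"{model} is listed under both {known} and {manufacturer}"
            )
    return lookup

def rule_statements(lookup):
    """Draft one plain-English rule per model, sorted by model name."""
    return [
        f"If Model is equal to {model}, then Manufacturer must be equal to {maker}"
        for model, maker in sorted(lookup.items())
    ]

def check_record(manufacturer, model, lookup):
    """Classify a single record against the reference hierarchy."""
    expected = lookup.get(model)
    if expected is None:
        return "unknown model"
    return "valid" if expected == manufacturer else f"invalid, expected {expected}"

# Hypothetical rows standing in for the reference tables.
hierarchy = [("Ford", "Mustang"), ("Ford", "Focus"), ("Chevrolet", "Camaro")]
lookup = build_model_lookup(hierarchy)

for line in rule_statements(lookup):
    print(line)
print(check_record("Chevrolet", "Mustang", lookup))
```

The lookup is keyed by model, so it covers the "model under the wrong manufacturer" case. Checking the reverse direction (a manufacturer paired with a model it does not make) comes down to the same comparison, since both describe a mismatched pair.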
Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance