We want to apply data cleansing to our current CRM system to dedupe data and improve data quality. Our ultimate goal is to build an MDM system. After briefly trying DQS and MDS, I am still confused about some scenarios and need your advice.
1) Can we correct the original (raw, source) CRM (OLTP) data with DQS?
It seems DQS does not cleanse the raw data in the source; is that correct? As I understand it, the KB is essentially a rule base that standardizes data against rules, and that is what "cleansing" means in DQS (e.g., Rule 1: "NY" should be standardized to "New York"; whenever "NY" occurs, DQS corrects it to "New York"). We then have to export the corrected values to Excel or to other SQL tables; they are not written back to the original records. So, can we correct the raw data with DQS alone, or do we need SSIS as well? Are there any design ideas or samples?
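To make the question concrete, here is a minimal Python sketch of what I mean by rule-based cleansing followed by a write-back to the source. It is not DQS or SSIS code: the `customers` table, the `state` column, the `RULES` dictionary, and the function names are all hypothetical, and an in-memory SQLite database stands in for the CRM source.

```python
import sqlite3

# Hypothetical rule base: maps a known variant to its correct (leading) value.
RULES = {"NY": "New York", "N.Y.": "New York"}

def cleanse(value, rules=RULES):
    """Return the corrected value if a rule matches, else the value unchanged."""
    if value is None:
        return None
    return rules.get(value.strip(), value)

def write_back(conn):
    """Apply the rules to every row and UPDATE the source table in place."""
    rows = conn.execute("SELECT id, state FROM customers").fetchall()
    changed = 0
    for row_id, state in rows:
        corrected = cleanse(state)
        if corrected != state:
            conn.execute(
                "UPDATE customers SET state = ? WHERE id = ?",
                (corrected, row_id),
            )
            changed += 1
    conn.commit()
    return changed

def demo():
    """Build a tiny stand-in source table, cleanse it, and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO customers (state) VALUES (?)",
        [("NY",), ("New York",), ("Texas",)],
    )
    changed = write_back(conn)
    states = [r[0] for r in conn.execute("SELECT state FROM customers ORDER BY id")]
    return changed, states
```

The `write_back` step is the part I cannot find in DQS itself, which is why I am asking whether SSIS (or something else) is needed for it.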
2) How can we cleanse values that are hard to enumerate as Domain Values?
The samples in the Knowledge Domain tutorial are domains that can be enumerated, such as Country, City, Zip, etc. What about Company Name or Customer Name (First Name, Last Name, Middle Name)?
If these names all stand for the same person, how can we use DQS to standardize them to
Last Name = Smith and First Name = Bob? There are so many combinations that I think it is hard to enumerate every case and list them all in Domain Values.
By the way, naming rules differ between countries, for example:
Nguyen Kim Thuy
How can I use DQS to split such a name into Last Name and First Name?
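As an illustration of why a single split rule is not enough, here is a small Python sketch (again, not DQS functionality). The `split_name` function and its `family_first` flag are hypothetical; the sketch assumes whitespace-separated tokens and that the naming convention of each record is already known, which is exactly the part I do not know how to handle.

```python
def split_name(full_name, family_first=False):
    """Split a full name into first, middle, and last parts.

    family_first=True assumes the family name comes first and the given
    name last (as in Vietnamese names); otherwise the given name is
    assumed to come first and the family name last.
    """
    tokens = full_name.split()
    if len(tokens) < 2:
        # A single token cannot be split; treat it as the first name.
        return {"first": full_name.strip(), "middle": "", "last": ""}
    if family_first:
        last, first = tokens[0], tokens[-1]
    else:
        first, last = tokens[0], tokens[-1]
    # Everything between the outer tokens is treated as the middle name.
    middle = " ".join(tokens[1:-1])
    return {"first": first, "middle": middle, "last": last}
```

With this sketch, `split_name("Nguyen Kim Thuy", family_first=True)` gives Last Name = Nguyen and First Name = Thuy, while the default order would wrongly give Last Name = Thuy.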
3) How can I use data profiling to get a data quality assessment?
As far as I know, data profiling is used to learn the characteristics of the data, such as the distribution of values, the maximum, the minimum, and outliers that could be abnormal values, and to provide an overview of how good our data quality is. In DQS, the data profiler is part of the KB Matching Policy function and offers only a few measurements. In SSIS, the Data Profiling Task provides many more dimensions for viewing the data. Can you suggest how we can use the Data Profiling Task together with DQS or MDS to assess the improvement in data quality?
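The kind of before/after comparison I have in mind could be sketched as follows. This is plain Python, not the SSIS Data Profiling Task; the `profile` function, the metric names, and the sample values are hypothetical and only illustrate the idea of measuring the same column before and after cleansing.

```python
from collections import Counter

def profile(values):
    """Compute a few simple profile metrics for one column of values."""
    total = len(values)
    # Treat None and empty strings as missing values.
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": total,
        "null_ratio": (total - len(non_null)) / total if total else 0.0,
        "distinct": len(counts),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "top": counts.most_common(3),  # value distribution, most frequent first
    }

# Hypothetical sample column before and after cleansing.
before = ["NY", "New York", "N.Y.", None, "Texas"]
after = ["New York", "New York", "New York", None, "Texas"]

profile_before = profile(before)
profile_after = profile(after)
```

Here the drop in `distinct` between `profile_before` and `profile_after` would indicate that variants were consolidated, while an unchanged `null_ratio` would show that missing values were not addressed.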
Edited by Nick_Go, Thursday, August 15, 2013 2:12 AM (typo)