Proposed A descent SSIS and DQS sample

  • miércoles, 25 de julio de 2012 14:42
     
     

    Hi

    I have spent the last week reading up on SQL2012, DQS and MDS.  I think I am getting there but I was hoping there might be some more sample code out there.  I don't know who is responsible for creating the demos, AdventureWorks etc, but they don't really address a "batch" type requirement. I have watched all the videos but all fail to walk end to end from DQS to MDS.

    I have a problem to solve which I have distilled into the following requirement:

    1) I want to allow people to register lists of Names, Date-of-Birth and Addresses in my database.

    2) I want them to upload either an XLS file or an XML file with the data through a web page

    3) I want to cleanse the data using DQS. I want to check the address in the "cloud". I want my own KB of first and second names.

    4) I want "dodgy" data to be manually checked and approved by my staff. I want them to see the DTQ decision scoring reasoning, like the DQS client

    5) I want to store the "cleaned" data in MDS but identify "matching" data so I don't get duplicates. So "Bob" and "Robert" are matched if same DOB and address

    To me this must be fairly typical of the things people might like to achieve.

    From what I have read I MUST use SSIS for driving the cleansing. This means I must write the file somewhere and get SSIS to pick it up. Correct?

    Then, having created a KB, I run this from SSIS using the DQS component

    Then it gets a bit hazy....

    I must put decisions in my SSIS to detect the "dodgy" records that fall below a threshold. What do I do with them?  Do I write to my own tables? Do I then have to write screens to allow my staff to "approve" them? Can I use the DQS client for this? Can it see my "batch" run? Can I use Excel?

    Where do I put the clean records? Another table? An MDS staging table? 

    How do my "approved" (previously dodgy) records find their way into my "clean" table after approval?

    How do I improve my KB be feeding back from the manual approval process? 

    Given that I now want to "match" my "clean" records with what is already in MDS to remove duplicates, how is that achieved?  Do I bulk upload them from SSIS?

    As you can see, even with all my reading over the last week, joining all the dots is proving a bit of a challenge!!

    So if anyone out there fancies a challenge :) a simple one entity MDS and simple Excel file with 4 columns (fname, sname, dob, address) dropped in a folder...how hard can it be?? :))

    thanks


Todas las respuestas

  • miércoles, 25 de julio de 2012 15:30
    Moderador
     
     

    Just to make sure, did you already see this video?

    Enterprise Information Management (EIM): Bringing Together SSIS, DQS, and MDS http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DBI310

    Thanks, Jason


    Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance

  • miércoles, 25 de julio de 2012 16:51
     
     

    Just to make sure, did you already see this video?

    Enterprise Information Management (EIM): Bringing Together SSIS, DQS, and MDS http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DBI310

    Thanks, Jason


    Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance

    Hi

    Yes, I have watched this.  It covers the process but seems to get a bid hurried at the end around the MDS staging part.

    It leaves me wondering if the 2 SSIS flowcharts he has can (and should) be combined. We have SSIS doing the cleanse but this doesn't put the output into a MDS staging table for example.

    It just seems taking a file/table, cleaning it and populating MDS is an obvious scenario.

    I just think that a simple sample would be very useful :)

  • miércoles, 01 de agosto de 2012 8:17
     
     Respuesta propuesta
  • miércoles, 01 de agosto de 2012 20:57
     
     

    Yes, you need to combine the data flows shown in demo for your scenario. Please see Importing Data (Master Data Services) section of MDS documentation to learn about importing data into MDS via staging tables.

    HTH,

    Sreedhar

  • jueves, 01 de noviembre de 2012 19:12
     
     Respuesta propuesta

    I have written a tutorial and published it to the Download Center couple of days. Please review and see if it helps. The last lesson in this tutorial deals with creating an SSIS package that uses DQS cleansing functionality, Fuzzy Grouping to remove duplicates, and MDS entity-based staging to load the cleansed and matched data to MDS. Thanks.

    http://www.microsoft.com/en-us/download/details.aspx?id=35462

    • Propuesto como respuesta Paras Doshi lunes, 10 de diciembre de 2012 18:57
    •  
  • viernes, 02 de noviembre de 2012 16:48
     
     

    I have written a tutorial and published it to the Download Center couple of days. Please review and see if it helps. The last lesson in this tutorial deals with creating an SSIS package that uses DQS cleansing functionality, Fuzzy Grouping to remove duplicates, and MDS entity-based staging to load the cleansed and matched data to MDS. Thanks.

    http://www.microsoft.com/en-us/download/details.aspx?id=35462


    Great Tutorial, Thanks a lot! I just completed it.

    Paras Doshi (Blog: ParasDoshi.com | Twitter: @Paras_Doshi )