locked
Data Cleansing in SQL Server 2008 R2 Master Data Service RRS feed

  • Question

  • Does Master Data Service has some in built mechanism to deal with cleansing of data? Or, is there some cleansing service available that works with SQL Server 2008 R2 Master Data Services?
    • Changed type HPawar Thursday, November 25, 2010 10:45 AM This question is related to master data services
    • Moved by Tom Li - MSFT Friday, November 26, 2010 8:11 AM (From:Getting started with SQL Server)
    Thursday, November 25, 2010 9:33 AM

Answers

  • Hello,

    Master Data Services has a data cleaing service capable of normalize data formats, replace missing values, standardize values and map attributes. For more information:

    http://download.microsoft.com/download/D/B/D/DBDE7972-1EB9-470A-BA18-58849DB3EB3B/MasterDataManagementfromaTechnicalPerspective.pdf

    http://msdn.microsoft.com/en-us/library/bb190163.aspx

    http://download.microsoft.com/download/D/E/0/DE0C5C1D-E8F3-4663-9720-3108B3A6F059/Pozzera-PowerPivot_ELCA.pptx

    The following forum may be more appropriate for this question:

    http://social.msdn.microsoft.com/Forums/en-US/sqlmds/threads
     

    Hope this helps.   
     
    Regards,

    Alberto Morillo
    SQLCoffee.com

    Thursday, November 25, 2010 1:15 PM
  • The best features that MDS currently has in this regard are business rules (http://msdn.microsoft.com/en-us/library/ff487015.aspx) and the filters provided above the grid in Explorer. In particular the similarity level can be used to find close matches: http://msdn.microsoft.com/en-us/library/ee633833.aspx.


    Suzanne Selhorn [MSFT]
    Tuesday, November 30, 2010 12:33 AM
  • Personally I would handle data cleansing in the ETL process of loading data into MDS, not in MDS itself.  That being said...At PASS they unveiled a new component in SQL 11....Quality Data Services. This will, presumably, provide the functionality that you are looking for.  Unfortunatly until then I think you are stuck with combining SSIS/SQL Server to with MDS Business Rules to achieve your ends.  Use SSIS/SQL Server to make the data consitent before you load it into the staging tables of MDS, then use MDS Business Rules to eliminate duplicates, etc...

    Tuesday, November 30, 2010 6:10 PM
  • Hi HPawar

    Personnally, I use C# code with MDS API to do some cleansing

    you can find some sample code an my free winform app on http://mdsmanager.codeplex.com/

     public void DeleteAllMembersFromSelectedEntity()
        {
          try
          {
            Cursor.Current = Cursors.WaitCursor;
    
            if (ucManageEntities1 != null && ucManageEntities1.lstEntities.SelectedItem != null)
            {
              EntityMembers em = new EntityMembers();
    
              em.EntityId = ((CustomEntity)ucManageEntities1.lstEntities.SelectedItem).entityId;
              em.ModelId = (Identifier)lstModels.SelectedItem;
              em.VersionId = (Identifier)lstVersions.SelectedItem;
              em.MemberType = MemberType.Leaf;
              em.Members = new Collection<Member>();
    
              foreach (CustomMember mnc in ucManageMembers1.lstMembers.Items)
              {
                Member m = new Member();
                m.MemberId = new MemberIdentifier();
                m.MemberId.Name = mnc.Name.Trim();
                m.MemberId.Code = mnc.Code.Trim();
                em.Members.Add(m);
              }
              OperationResult or = new OperationResult();
    
              using (ServiceClient c = MDS_WSConnect.CreateMdsProxy())
              {
                or = c.EntityMembersDelete(new International(), em);
              }
              StringBuilder sb = new StringBuilder();
              if (or.Errors.Count > 0)
              {
                foreach (Error err in or.Errors)
                {
                  sb.AppendLine(em.Members[0].MemberId.Code + " - " + err.Code + " - " + err.Description);
                }
                MessageBox.Show(sb.ToString(), "Errors");
              }
              RefreshLstMembersAttributes((Identifier)lstModels.SelectedItem, (CustomEntity)ucManageEntities1.lstEntities.SelectedItem, (Identifier)lstVersions.SelectedItem);
            }
          }
    
    


    Note: Please vote/mark the post as answered if it answers your question/helps to solve your problem. _________________________________________________________ Regards, Xavier Averbouch, Consultant, Avanade France
    Wednesday, December 8, 2010 4:13 PM

All replies

  • Hello,

    Master Data Services has a data cleaing service capable of normalize data formats, replace missing values, standardize values and map attributes. For more information:

    http://download.microsoft.com/download/D/B/D/DBDE7972-1EB9-470A-BA18-58849DB3EB3B/MasterDataManagementfromaTechnicalPerspective.pdf

    http://msdn.microsoft.com/en-us/library/bb190163.aspx

    http://download.microsoft.com/download/D/E/0/DE0C5C1D-E8F3-4663-9720-3108B3A6F059/Pozzera-PowerPivot_ELCA.pptx

    The following forum may be more appropriate for this question:

    http://social.msdn.microsoft.com/Forums/en-US/sqlmds/threads
     

    Hope this helps.   
     
    Regards,

    Alberto Morillo
    SQLCoffee.com

    Thursday, November 25, 2010 1:15 PM
  • Those powerpoints and pdfs just tell the backgrounds and explain the concept but don't tell anything about the MDS application itself.

    Can someone please show an example how to really implement such features as data cleansing  in the MDS application itself ?

     

    Friday, November 26, 2010 8:28 AM
  • The best features that MDS currently has in this regard are business rules (http://msdn.microsoft.com/en-us/library/ff487015.aspx) and the filters provided above the grid in Explorer. In particular the similarity level can be used to find close matches: http://msdn.microsoft.com/en-us/library/ee633833.aspx.


    Suzanne Selhorn [MSFT]
    Tuesday, November 30, 2010 12:33 AM
  • Personally I would handle data cleansing in the ETL process of loading data into MDS, not in MDS itself.  That being said...At PASS they unveiled a new component in SQL 11....Quality Data Services. This will, presumably, provide the functionality that you are looking for.  Unfortunatly until then I think you are stuck with combining SSIS/SQL Server to with MDS Business Rules to achieve your ends.  Use SSIS/SQL Server to make the data consitent before you load it into the staging tables of MDS, then use MDS Business Rules to eliminate duplicates, etc...

    Tuesday, November 30, 2010 6:10 PM
  • Let me elaborate my question and it is,

    How to implement data cleansing(data scrubbing) with Master Data Hub, either by using matching functions(for eg. mdq.Similarity function etc.) or else I need to know in what other way data cleansing can be done in Master Data Hub, (i) for bulk data that is already loaded in Master Data Hub (ii) for new records inserted/updated as result of the Master Data Synchronization?

    Kindly refer below architecture of my Master Data Management Hub and Subscribing system,

                                                   Master Data
                                                 Synchronization 
                                   |            |---------------->|Master|
    Subscribing <------->| Biztalk  |                      | Data |
      System                  |            |<----------------| Hub   |
                                                    Master Data
                                                     Publication

    Thank you in advance!

    Wednesday, December 1, 2010 9:43 AM
  • Hi HPawar

    Personnally, I use C# code with MDS API to do some cleansing

    you can find some sample code an my free winform app on http://mdsmanager.codeplex.com/

     public void DeleteAllMembersFromSelectedEntity()
        {
          try
          {
            Cursor.Current = Cursors.WaitCursor;
    
            if (ucManageEntities1 != null && ucManageEntities1.lstEntities.SelectedItem != null)
            {
              EntityMembers em = new EntityMembers();
    
              em.EntityId = ((CustomEntity)ucManageEntities1.lstEntities.SelectedItem).entityId;
              em.ModelId = (Identifier)lstModels.SelectedItem;
              em.VersionId = (Identifier)lstVersions.SelectedItem;
              em.MemberType = MemberType.Leaf;
              em.Members = new Collection<Member>();
    
              foreach (CustomMember mnc in ucManageMembers1.lstMembers.Items)
              {
                Member m = new Member();
                m.MemberId = new MemberIdentifier();
                m.MemberId.Name = mnc.Name.Trim();
                m.MemberId.Code = mnc.Code.Trim();
                em.Members.Add(m);
              }
              OperationResult or = new OperationResult();
    
              using (ServiceClient c = MDS_WSConnect.CreateMdsProxy())
              {
                or = c.EntityMembersDelete(new International(), em);
              }
              StringBuilder sb = new StringBuilder();
              if (or.Errors.Count > 0)
              {
                foreach (Error err in or.Errors)
                {
                  sb.AppendLine(em.Members[0].MemberId.Code + " - " + err.Code + " - " + err.Description);
                }
                MessageBox.Show(sb.ToString(), "Errors");
              }
              RefreshLstMembersAttributes((Identifier)lstModels.SelectedItem, (CustomEntity)ucManageEntities1.lstEntities.SelectedItem, (Identifier)lstVersions.SelectedItem);
            }
          }
    
    


    Note: Please vote/mark the post as answered if it answers your question/helps to solve your problem. _________________________________________________________ Regards, Xavier Averbouch, Consultant, Avanade France
    Wednesday, December 8, 2010 4:13 PM