Reference Data - self provided RRS feed

  • Question

  • Hi

    Just starting to work my way through this - is it possible to set up reference data from anywhere other than Azure? Say for example I had an internally maintained list of Product Names I wanted to use as lookup for reference data, how could I reference that (other than by repeatedly importing it from Excel)?

    To add complexity, can SSDQS talk to Master Data Services? (and if not... why not?)

    Cheers, James

    James Beresford @ www.bimonkey.com
    SSIS / MSBI Consultant in Sydney, Australia
    SSIS ETL Execution Control and Management Framework @ SSIS ETL Framework on Codeplex

    Wednesday, July 27, 2011 1:29 AM

All replies

  • Hi James,

    Thank you for your comments and questions.

    1. MDS does "talk" to DQS, via the MDS Excel add-in. The Excel add-in has built-in functionality to de-duplicate its records, and this is done via DQS. You can either use a pre-defined KB on the DQS server, or a default KB that we will create on the fly based on your Excel data.

    2. As for using reference data - there are three ways to go about doing that, only two of which are available now.

      a. Onboard this data into your knowledge base via import or via knowledge discovery. This way the domain in the KB will contain all your reference data, and you can use this for cleansing or matching purposes. This is the easiest, swiftest way of doing that.

      b. Create a service and onboard it to DataMarket. The service itself does not need to be running on Azure - you can host it yourself. This is what was done by our current reference data providers (MelissaData, CDyne, Digital Trowel and Loqate, which you can see under the data quality services category). There is an API for doing that, which will soon be published for general availability. This is especially relevant if you feel that your data may be useful to additional users, and so you may want to onboard it to DataMarket for public consumption.

      c. (To be available in the next update of DQS) Create a "private" reference data service, and hook it directly to DQS (via the "direct" section in the reference data setting tabs in the DQS configuration). Again, this API has not yet been publicly released; we are currently working on finalizing it and will provide updates later on.

    Right now, it seems that the best solution for what you're trying to do would be option a, but we are planning on improving this capability soon.
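To make option a less of a manual chore, the internal list can be cleaned and written to a file by a script before each import. The following is only a minimal sketch, not a DQS API: the function names, the `ProductName` header, and the sample values are all hypothetical, and it assumes the import step will accept a one-column delimited file.

```python
import csv
import io


def build_reference_rows(product_names):
    """Collapse whitespace, drop blanks, and de-duplicate case-insensitively.

    The first spelling seen for each name is the one that is kept.
    """
    seen = set()
    rows = []
    for name in product_names:
        cleaned = " ".join(name.split())
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            rows.append(cleaned)
    return rows


def to_csv(rows, header="ProductName"):
    """Render the cleaned values as one-column CSV text (header is an assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    for value in rows:
        writer.writerow([value])
    return buffer.getvalue()


# Hypothetical sample of an internally maintained product list.
sample = ["Widget  A", "widget a", "", "Gadget B "]
rows = build_reference_rows(sample)
csv_text = to_csv(rows)
```

The resulting text can be saved to disk and used as the source for the import or knowledge discovery step; the import itself still happens in the DQS client.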



    Thursday, July 28, 2011 4:18 AM
  • Elad, thanks for the input. I'll look at MDS / DQS interaction now I know it's there - however I'm not 100% sure we are on the same page, so I'll explain further - it ties in with option "c" above:

    I would expect that DQS - in the context of the "private" reference service you mention above - would be able to use an MDS managed set of Master Data as a reference set. So as you've described it, MDS would have to be able to be made available as a private reference service provider.

    Drilling a little further into what you have in mind for option "c" - is a reference data service going to be some sort of specialised server, or will it just be a viable source of data (e.g. SQL Table, Excel spreadsheet)?

    The reason I go into this is that "a" is a fixed list, which makes sense for more static data (e.g. Gender Codes) that only needs periodic updates, but not for more dynamic data which is constantly changing (e.g. ISBN numbers in a bookseller's database).
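The difference between a fixed list and a live source can be sketched as follows. This is an illustration only and does not use any DQS interface: an in-memory SQLite table stands in for an internal database, and the `isbn` table and both lookup functions are hypothetical names.

```python
import sqlite3

# In-memory SQLite stands in for an internally maintained database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE isbn (code TEXT PRIMARY KEY)")
conn.execute("INSERT INTO isbn VALUES ('9780306406157')")

# Fixed list: a snapshot taken once, at import time.
snapshot = {row[0] for row in conn.execute("SELECT code FROM isbn")}


def in_snapshot(code):
    """Check the value against the imported, fixed list."""
    return code in snapshot


def in_live_source(code):
    """Check the value against the source table at the time of the call."""
    row = conn.execute("SELECT 1 FROM isbn WHERE code = ?", (code,)).fetchone()
    return row is not None


# A new title arrives after the import; only the live lookup sees it.
conn.execute("INSERT INTO isbn VALUES ('9783161484100')")
```

After the second insert, `in_snapshot('9783161484100')` is false while `in_live_source('9783161484100')` is true, which is the gap a periodic re-import has to close.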

    Option "b" may not be pragmatic in an Enterprise (exposing DW infrastructure to internet connections may not be permissible) and setting up your own DataMarket server seems unnecessary given that there's a database server already in place which in theory should meet such needs.

    As an aside, I'm liking DQS - I wish it had been available for my last project - and thanks for doing the Tech Ed session.

    Cheers, James


    Friday, July 29, 2011 3:32 AM
  • Hi Elad,

    When can we expect option c for using 3rd party reference data services? If it is already available, which version do I need to install to get it?

    Thanks in advance.

    Regards, Syam Bandi

    Wednesday, January 22, 2014 9:02 AM