Monday, June 11, 2012 12:57 PM
I am pretty new to MDS and was hoping somebody could put me straight...I write this in as non technical format as I can as I don;t really know what the audience is in this forum.
We have a Parent Table X and a table Y ; There is a reference to a Name Field in X via an integer field in Table Y referencing the Integer in X. We also have the same relation ship of Table Y to a Table Z. For the purposes of clarity and familiarity I shall use the AdventureWorks examples of Category, SubCategory and Product (X,Y,Z)
Now here is the odd bit..we have the possibility of receiving data that will populate AND delta update all 3 tables from several disparate sources; whoch means that a Category from supplier 1 (SUP1) might have a unique integer key of 1 with a name of ABC whilst a Category from supplier 2 (SUP2) might have a unique key of 1 with a name of JKL. Get the idea ?
So, to distinguish the incoming data I have embellished the incoming data with a Supplier ID or Code...NOT a table at this stage...just SUP1 or SUP2. The problem here is that we can go in either 2 ways.
a) We form a relationship vai a composite Key Supplier Code + UniqueID (delivered by the 3rd party supplier) in ALL tables (which doesn't really fit in with MDS)
b) We use a NEW Auto Identity Integer Key (Seed and Increment) which would generally suit us as this data needs to be referenced by transactional data and render the supplier code (apart from maybe the top table) obselete.
So, assuming that the direction should be as in b) above, I can see how we should specify in Excel or through the MDS web interface that on Entity creation the 'Column that contains a unique identifier' should be 'Generate Code automatically' and of course on each supplier's load of data a unique key is generated - this is the real question. Assuming I have to update a new column for the FK relationship in each sub table (so in the first instance I have a column NewCategoryId in the SubCategory table whis is updated with the values from this column in Category - using the composite key of SupplierCode and UniqueID) I should then use one of Attribute Properties, Explicit Hierarchies or Derived Hierarchies. The next problem also comes when I am doing a delta update..the approach should cater for adding new records, updating existing whilst maintaining what we have done.
Has anyone any experienced or idea on this should be best approached.
Hope this was not too long winded...but I would really like to get this correct from the off as we have many relationships like this.
Wednesday, June 13, 2012 2:40 PM
I will take a stab at this. Let me tell you what we are doing. Before the data gets to MDS, we put it into an ODS (Operational Data Store). This is where we manage the codes. Sometimes we use business keys as our 'codes' when it makes sense. Sometimes, we use identity columns in the ODS. For example, in situations where we have compound keys, as you reference, we will use a surrogate key created in our ODS. In situations where there is a natural business key, we use that. We have a 1:1 relationship between tables in our ODS and entities in MDS. We have the luxury of not having the requirement to maintain data in MDS. All (or almost all) of our data is maintained in source systems, and then loaded into MDS, where it is subscribed to by downstream applications. I don't know if there is a 'right' answer to your question. i think its best to start small with your first release. Deliver something quickly, and then modify as you go.
Thursday, June 14, 2012 7:57 AM
Hi there BI Baracus (not sure of your first name so I'll not embarrass myself ! ;-) ),
Many thanks for your reply (much appreciated) and believe it or not that is exactly what I ended up doing over the last few days.
I have initally loaded the Primary Key of the supplier Table (e.g. Category) into a Junction Table with an Auto-Generated Incremental Integer Key (AGIK). This AGIK will now be the primary key for our Destination table which we update via links from the supplier Table and new junction Table and awating load into MDS. The second table (e.g. SubCategory) will be loaded in a similar manner with use of a seperate Junction Table with the added task of updating the new Foreign Key via the First Junction Table (Category).
Now the data is more amenable to MDS. I guess the message here is to look at your data, know the limitations or 'desires' of MDS and adjust your data prior to MDS load accordingly. This of course decouples the Data Preparation from MDS so we can quite easily choose/swap (to) another product than MDS should we desire.
I will take this opportunity to ask my next question...Delta Updates.
Has anybody any views on Delta updates on selected fields - We are in the position that as MDS offeres Data Governance by particular users we do NOT wish to update those fields in the 'Golden Record' that have been changed - BUT we do wish to receive updates on others. I was toying around with the idea of using SQL CHECKSUM using hashed indexes for comparison - seems to be a more structured way than in comparing field by field. The actual dynamic selection of which fields to select for comparison seems to be limited and perhaps THIS Is where MDS's Business Rules and Validation comes in ? I have the feeling that there may be a fine balance between what gets updated in the flow from supplier Table through Junction table through to Destination Table AS AGAINST what we can achieve by accepting/rejecting at the MDS Level.
Many thanks to those of you that took the time to read this...very much appreciated..