Importing data without tblStgMember & tblStgMemberAttribute

  • Question

  • Hello everyone,

    I am wondering whether it would be more performant to import data directly into the final tables instead of going through the staging tables (tblStgMember & tblStgMemberAttribute).

    • Is it feasible?
    • If yes, how can I implement it?
    • If not, is it more performant to create multiple batches with a small number of records each, or one big batch with all the records?
    Thursday, September 30, 2010 1:15 PM

Answers

  • Hello,

    1. It is 100% feasible.
    2. Via the MDS Web Service API (take a look at the documentation).
    3. It is less performant than batch processing via the staging tables. You should balance the number of records per batch in order to:
    - keep performance high, but
    - keep the proper control granularity (with many errors in a single batch, it may be difficult to analyze them all at once); see the staging sketch below.
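
    For reference, a minimal sketch of the staging-table route being discussed. The mdm.tblStgMember / mdm.tblStgMemberAttribute column lists follow the MDS 2008 R2 staging schema, but the model, entity, attribute and source-table names are illustrative assumptions; verify them against your installation:

    -- Illustrative only: the 'Customer' model/entity, the 'City' attribute and
    -- dbo.SourceCustomers are assumptions, not part of any real installation.
    -- 1) Stage the leaf members (MemberType_ID = 1 means leaf member).
    INSERT INTO mdm.tblStgMember
        (ModelName, EntityName, MemberType_ID, MemberName, MemberCode)
    SELECT 'Customer', 'Customer', 1, src.CustomerName, src.CustomerCode
    FROM dbo.SourceCustomers AS src;

    -- 2) Stage one row per member per attribute value.
    INSERT INTO mdm.tblStgMemberAttribute
        (ModelName, EntityName, MemberType_ID, MemberCode, AttributeName, AttributeValue)
    SELECT 'Customer', 'Customer', 1, src.CustomerCode, 'City', src.City
    FROM dbo.SourceCustomers AS src;

    -- 3) Process the staged rows as a batch, either from the Integration
    --    Management page in Master Data Manager or via the staging sweep
    --    procedure (its parameters depend on the MDS version, so check
    --    mdm.udpStagingSweep in your database before calling it).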

    Regards,

    Marius
    Thursday, September 30, 2010 2:48 PM
  • Guillaume,

    As you can see in the MDS stored procedure udpStagingSweep, the staging process is complex.

    You should use staging for bulk data loads via SQL script / SSIS (Val Lovicz: http://social.msdn.microsoft.com/Forums/en-US/sqlmds/thread/df732f65-57c6-4658-99c7-1348b7c1ed50).

    I think it takes about the same processing time whether you use multiple batches or one big batch with the same data, but as Marius says, you get better control over granularity with multiple batches, and error debugging is easier; a sketch of loading the staging tables in chunks follows below.
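
    To illustrate the multiple-smaller-batches idea, here is a rough sketch that stages source rows in fixed-size chunks and processes each chunk as its own staging batch before staging the next one. The source table, the 'Customer' model/entity names and the 50,000-row chunk size are illustrative assumptions:

    -- Rough sketch: stage and process the data in chunks instead of one huge batch.
    DECLARE @ChunkSize int, @LastCode nvarchar(250), @NextCode nvarchar(250);
    SET @ChunkSize = 50000;
    SET @LastCode  = '';                    -- assumes member codes sort above ''

    WHILE 1 = 1
    BEGIN
        -- Find the upper code boundary of the next chunk.
        SELECT @NextCode = MAX(CustomerCode)
        FROM (SELECT TOP (@ChunkSize) CustomerCode
              FROM dbo.SourceCustomers
              WHERE CustomerCode > @LastCode
              ORDER BY CustomerCode) AS chunk;

        IF @NextCode IS NULL BREAK;         -- nothing left to stage

        -- Stage this chunk of members (attributes are staged the same way).
        INSERT INTO mdm.tblStgMember
            (ModelName, EntityName, MemberType_ID, MemberName, MemberCode)
        SELECT 'Customer', 'Customer', 1, CustomerName, CustomerCode
        FROM dbo.SourceCustomers
        WHERE CustomerCode > @LastCode AND CustomerCode <= @NextCode;

        SET @LastCode = @NextCode;

        -- Process this chunk as its own staging batch (from Integration
        -- Management or via the staging sweep procedure) before staging the
        -- next one, so errors only have to be analyzed chunk by chunk.
    END;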

    If you want to import via the API, you can see my example code for using the API on CodePlex:

    http://mdsmanager.codeplex.com/ 

     

    Regards,

    Xavier Averbouch
    Thursday, September 30, 2010 2:51 PM

All replies

    This is what we are planning. We are going to read data from the source systems and convert it into a canonical message (enterprise business object) that all source systems will understand. The message will then be placed onto a queue in an enterprise service bus. The message will be picked up by some .NET code, and that code will convert the message into the various API calls to get the data into MDS. We are doing this because we want near real-time data changes. What we are still figuring out is how to get the data back out and distributed to the source systems, basically notification of any data cleansing. The source systems may ignore these messages, but we want to give them the option. We have a proof of concept working for the SOA integration using the ESB for getting data in, and it is working really well. Has anyone looked at something like an outbound workflow or distribution of changed records?
    Wednesday, November 17, 2010 4:48 AM