locked
Approach to propagate/refresh a set of tables from one SQL Server 2005 DB to another RRS feed

  • Question

  • Hi,

    I've 2 MS SQL Server 2005 Databases.

    First one is existing one and contains hundreds of tables and is being used for high OLTP application (CRM).

    I want to copy some set of tables (~50 tables) to another newly created SQL Server 2005 DB for reporting purpose daily at a specific time when the 1st DB is low on OLTP.

     

    I don't want to copy the entire database or all tables. What approach should I take to copy my tables and keep them in sync with a difference of 1 day?

    Also what are the best practices you PROs follow for keeping OLTP & OLAP databases sync?

     

    Thanks,

    Manoj

    Tuesday, June 29, 2010 11:57 AM

Answers

  • SSIS package  is  your friend

    www.sqlis.com


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
    Tuesday, June 29, 2010 12:01 PM
  • Kind of depends on your database design and business requirements.

    If the load window is wide and the data set is not too cumbersome, then you can use SSIS to just truncate and reload each table. I suggest one SSIS package per table, and a Parent package to control the order in which they are loaded.

    But if your load window is small, and your data volumn is very large, you may need to look into grabbing only NEW and CHANGED records since the last load. Such processing is not difficult but it does take a little more thought.

    Also, your OLAP database, if it will be used for SSAS cubes, may need to take on a different structure than just simply copying the source tables.

    Look into the following books:

    Applied Microsoft Analysis Services 2005 by Teo Lachev and The Microsoft Data Warehouse Toolkit by Mundy, Thornthwaite and Kimball.


    Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
    Tuesday, June 29, 2010 1:12 PM
  • Hi Manu,

     

    When you say 50 table sync, I have a question in my mind,

     

    1. Are all the tables are independent or they required some kind of sequences to load the data.

       I mean Parent- Child relationship or not?

     

    "Recently we had a similar kind of challenge where exactly, data needs to sync from "PROGRESS DB" to "SQL SERVER 2008 R2" server every day basis."

     

    Previously the “PROGRESS DB” Data sync was taking 8 Hrs. - 9 Hrs . in “ORACLE” DB to sync the data. But we have achieved this task within 1Hrs 15 min. in “SQL Server 2008 R2“

     

    Initially we tried using only one “SSIS” with parallel processing method but it took min 4- 5 Hrs. but we have applied “Multithreading Architecture” to make it faster process by using few SSIS packages and Process oriented SP’s. (This is taking 1 Hr. 15 Mins. now). By which now our Partners are much more confident and very much happy with "MS SQL Server Technology" which is Magic for them in real.

     

    So there are few ways to achieve this but you need a suitable plan to achieve this for production time limitation.

     

    Data loading option from source to destination:

     

    1st: Delete the old table and create the New One every day basis.

    2nd: Insert/Update the table on condition basis.

     

    You can use SSIS (parallel processing) which is best for data loading but you need to prepare a plan how to load as I specified above.

     

    For your 2nd question: You can use SSIS package to process the cubes to make sync OLTP to OLAP DB.

     

    Thanks,

    Sandeep

    Tuesday, June 29, 2010 1:18 PM
  • This may be old school (cuz' that's what I am) but we used to use two schemas to do what you are talking about, an A and a B schema. The synchronization routines got really complex, but it worked. We would have schema A live while we did ODS and other system data loads on schema B. We would keep those routines and at midnight switch to Schema B as the live schema. Then run the data loads we did on B the day before on A and then ran the new ODS data loads to populate Schema A. This was one day behind, which was fine for a CRM system. The complexity came on some of the more transactional/operational aspects of the CRM like campaign management or other aspects that could not be "missing" for a day. Anything that could be orphaned from the data for a period of time like campaign management tracking, we had in separate schemas that were not part of the A/B data load routines.

    We used different tools for anything immediate inside the ODS.

    Hope this helps.

     

    Jay

    Tuesday, June 29, 2010 2:13 PM

All replies

  • SSIS package  is  your friend

    www.sqlis.com


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
    Tuesday, June 29, 2010 12:01 PM
  • Kind of depends on your database design and business requirements.

    If the load window is wide and the data set is not too cumbersome, then you can use SSIS to just truncate and reload each table. I suggest one SSIS package per table, and a Parent package to control the order in which they are loaded.

    But if your load window is small, and your data volumn is very large, you may need to look into grabbing only NEW and CHANGED records since the last load. Such processing is not difficult but it does take a little more thought.

    Also, your OLAP database, if it will be used for SSAS cubes, may need to take on a different structure than just simply copying the source tables.

    Look into the following books:

    Applied Microsoft Analysis Services 2005 by Teo Lachev and The Microsoft Data Warehouse Toolkit by Mundy, Thornthwaite and Kimball.


    Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
    Tuesday, June 29, 2010 1:12 PM
  • Hi Manu,

     

    When you say 50 table sync, I have a question in my mind,

     

    1. Are all the tables are independent or they required some kind of sequences to load the data.

       I mean Parent- Child relationship or not?

     

    "Recently we had a similar kind of challenge where exactly, data needs to sync from "PROGRESS DB" to "SQL SERVER 2008 R2" server every day basis."

     

    Previously the “PROGRESS DB” Data sync was taking 8 Hrs. - 9 Hrs . in “ORACLE” DB to sync the data. But we have achieved this task within 1Hrs 15 min. in “SQL Server 2008 R2“

     

    Initially we tried using only one “SSIS” with parallel processing method but it took min 4- 5 Hrs. but we have applied “Multithreading Architecture” to make it faster process by using few SSIS packages and Process oriented SP’s. (This is taking 1 Hr. 15 Mins. now). By which now our Partners are much more confident and very much happy with "MS SQL Server Technology" which is Magic for them in real.

     

    So there are few ways to achieve this but you need a suitable plan to achieve this for production time limitation.

     

    Data loading option from source to destination:

     

    1st: Delete the old table and create the New One every day basis.

    2nd: Insert/Update the table on condition basis.

     

    You can use SSIS (parallel processing) which is best for data loading but you need to prepare a plan how to load as I specified above.

     

    For your 2nd question: You can use SSIS package to process the cubes to make sync OLTP to OLAP DB.

     

    Thanks,

    Sandeep

    Tuesday, June 29, 2010 1:18 PM
  • This may be old school (cuz' that's what I am) but we used to use two schemas to do what you are talking about, an A and a B schema. The synchronization routines got really complex, but it worked. We would have schema A live while we did ODS and other system data loads on schema B. We would keep those routines and at midnight switch to Schema B as the live schema. Then run the data loads we did on B the day before on A and then ran the new ODS data loads to populate Schema A. This was one day behind, which was fine for a CRM system. The complexity came on some of the more transactional/operational aspects of the CRM like campaign management or other aspects that could not be "missing" for a day. Anything that could be orphaned from the data for a period of time like campaign management tracking, we had in separate schemas that were not part of the A/B data load routines.

    We used different tools for anything immediate inside the ODS.

    Hope this helps.

     

    Jay

    Tuesday, June 29, 2010 2:13 PM