Answered Best Practice to Move my DataSource

  • Tuesday, August 11, 2009 1:46 PM
     
     
    Our current setup is the following:
    1. Production Transactional Database
    2. Each night, this data replicates to our Reporting Database on another server.

    Our Warehouse will be based on the Reporting Database, so...

    1. We'd like to have a fresh copy of the Reporting Database on our Warehouse server to do the daily/monthly data manipulations. What is the best/fastest way to move a fresh copy of the Reporting Database to our Warehouse server?...or, is there a better practice for this situation?

    Thanks!!

All Replies

  • Friday, August 14, 2009 6:44 AM
    Moderator
     
     

    Hi Dustin,

     

    If I understand correctly, you can take a look at the SQL Server Replication. It is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases.

     

    Hope this helps.

    Raymond

  • Friday, August 14, 2009 11:22 AM
     
     
    I am familiar w/ Replication. We're actually replicating from our Production DB to the Reporting one. Can we also replicate from Reporting to the WareHouse?
  • Friday, August 14, 2009 11:40 AM
    Moderator
     
     

    Hi Dustin,

     

    What is the meaning of “reporting database”? There two database generated by Reporting Services –repertserver and reportserver_tempDB, or you mean the datasource database for the report? I think it’s the second one. Anyway, Replication is often used in data warehousing and reporting applications, you can check the below document to see if it meets your requirement:

    http://msdn.microsoft.com/en-us/library/ms151784.aspx

    http://msdn.microsoft.com/en-us/library/aa237172(SQL.80).aspx

     

    Hope this helps.

    Raymond

  • Tuesday, August 18, 2009 2:09 AM
    Moderator
     
     
    I really don't understand what you are asking.  You wouldn't want to replicate data from your transactional system to your data warehouse.  The reason is really simple, the OLTP database is going to have a dramatically different structure from any data warehouse.  What you should be doing is building ETL routines that move the recently modified data through a set of transforms such that it is in a proper structure to be loaded to a data warehouse.

    Mike Hotek BlowFrog Software, Inc. http://www.BlowFrogSoftware.com Affordable database tools for SQL Server professionals
  • Tuesday, August 18, 2009 2:25 AM
     
     
    Well we currently have all the stored procedures prepared to convert the data into our warehouse. We just aren't sure yet the performance over a linked server.
  • Tuesday, August 18, 2009 3:05 AM
    Moderator
     
     Answered
    That's an easy one.  Performance and linked server don't belong in the same sentence.  If everything you are running allows you to send the query across to the linked server, execute the query entirely on that side, and then return a result set, you will get sort of decent performance.  If you are doing something like joining tables across a linked server, be prepared for absolutely abysmal performance.  This is how most people try to use linked servers and it was not what they were designed for.  I find linked servers so universally misused that it is one of the first things that I work on eliminating at a customer site when I start a project.

    I would strongly suggest looking into SSIS.  SSIS was designed to do everything that you are trying to accomplish and it will run circles around stored procedure based ETL routines based on linked servers.

    Mike Hotek BlowFrog Software, Inc. http://www.BlowFrogSoftware.com Affordable database tools for SQL Server professionals
  • Wednesday, August 19, 2009 2:13 AM
     
     Answered
    If the stored procedures are in the reporting server then you are ok.  If the procedures are in your warehouse then they may not be following the best execution plan because they have to reach across to another server that they are not familiar with.  As Michael said, do not perform any joins.  At most you can only use an exists via a linked server without severely degradating performance.

    Aside from SSIS, If you put the procedures in the reporting server then call the procedures via openrowset, that would likely be the fastest way.

    When inserting the data make sure you follow the prerequisites for minimal logging to get optimal performance.

    Derek SQLServerPlanet.com