locked
Transfering data using an unstable internet connection RRS feed

  • Question

  • Hi,

    On a regular basis - let's say one time pr. hour, I need to replicate data from three tables in a database to a custom database located somewhere else.

    The main database server, that contains all the live production data is located in europe and data from the three tables need to be transfered to a server in Thailand. The internet connection to Thailand is considered very unstable and I need to find a way to accomplish the datatransfer and more importantly to handle any errors caused by the poor connection.

    Normally I would set up logshipping, but:

    • It's not the entire database I need to transfer only data/transactions from three tables?
    • What about the poor connection - how do I handle any fallouts on the connection?

    A workaround that I thought of was to:

    • set up some (insert, update and delete) triggers on the three tables that comes into play
    • each time the triggers fire, call a .net managed code assembly from within SQL server
    • from the managed code, update the database in Thailand and implement error handling due to poor connectivity

    What would you recommend to do to accomplish the data transfer. Any suggestions will be very much appreciated.
    Thank you.

    Regards,
    Torben

    Monday, February 28, 2011 2:38 PM

Answers

  • We can get Change Data from two built in table-valued functions created during CDC setup

    For all changes (meaning a row is returned for each DML)
    use cdc.fn_cdc_get_all_changes_<instance>

    For the net changes (meaning one row returned for each source row modified among one or more DMLs)
    use cdc.fn_cdc_get_net_changes_<instance>

    Here is sample http://weblogs.sqlteam.com/derekc/archive/2008/01/28/60469.aspx

    Monday, February 28, 2011 4:03 PM
  • Hi Torben,

    Don't worry, transactional replication will make sure transaction commands are propagated to all Subscriber, which means the replication will resume if a connection is dropped and can work over low bandwidth connections. However, please note that transaction commands are stored in the distribution database until they are propagated to all Subscribers or until the maximum distribution retention period has been reached.

    Hope this helps.


    Best Regards,
    Chunsong Feng

    Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
    Wednesday, March 2, 2011 3:07 AM

All replies

  • I would suggest logshipping could be the best option.

    Alternatively you can think about configuring some SSIS packages and scheudle them to pull the database hourly basis with enable CDC on production for pulling only modified data.

    CDC example here

    http://www.databasejournal.com/features/mssql/article.php/3720361/Microsoft-SQL-Server-2008----Change-Data-Capture--Part-I.htm

    Please beaware impliment triggers might hurt the database performance.

    Monday, February 28, 2011 2:53 PM
  • Merge Replication + WebSync ( FTPS, HTTPS)  should be able to fulfill your request.


    If you think my suggestion is useful, please rate it as helpful.
    If it has helped you to resolve the problem, please Mark it as Answer.

    Sevengiants.com
    Monday, February 28, 2011 3:36 PM
  • Thank you for pointing me in the direction of CDC - it sounds very promising.

    Enabling CDC enables me two track changes in a certain table and it actually records those changes and stores them in a CDC created table - very nice.

    Now I would be able to, on a regular basis, query the table for changes and transfer those changes to another database - also very nice. Now that I have processed one of the changes I need it do disappear from the CDC table - or do I? How do I prevent me from fetching the exact same record the next time I query for changes? If a record is updated two times - how do I know which update should be processed before the other - there is no timestamp in the CDC table? 

    Monday, February 28, 2011 3:44 PM
  • We can get Change Data from two built in table-valued functions created during CDC setup

    For all changes (meaning a row is returned for each DML)
    use cdc.fn_cdc_get_all_changes_<instance>

    For the net changes (meaning one row returned for each source row modified among one or more DMLs)
    use cdc.fn_cdc_get_net_changes_<instance>

    Here is sample http://weblogs.sqlteam.com/derekc/archive/2008/01/28/60469.aspx

    Monday, February 28, 2011 4:03 PM
  • Hi,

    Based on your requrements (only replicate those 3 tables), replication would be a proper solution to you. You can either use Snapshot Replication or Transactional Replication, you can also use Merge Replication if you want to replicate data in bi-directional way. For more information, you can refer to http://msdn.microsoft.com/en-us/library/ms152531.aspx.

    Hope this helps. Please let us know if you need more help.


    Best Regards,
    Chunsong Feng

    Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
    Tuesday, March 1, 2011 5:34 AM
  • Hi,

    Thank You for your feedback.

    Having the poor internet connection to Thailand in mind, I fell concerned about how it will effect the transactional replication. How will Transactional Replication work if the connection is lost during replication? Is there any built in failsafe or other mechanism that makes the Publisher know which transactions were sent or not. Put in another way, using Transactional Replication, will there be any risk of dataloss at the subscriber due to the poor internet connection?

    Let's say that the connection to Thailand is lost during replication - will that scenario have any negative performance impact on the Publisher database?

    Thanks to all of you, for taking you time to help me out.
    Regards, Torben 

    Tuesday, March 1, 2011 7:17 AM
  • Hi Torben,

    Don't worry, transactional replication will make sure transaction commands are propagated to all Subscriber, which means the replication will resume if a connection is dropped and can work over low bandwidth connections. However, please note that transaction commands are stored in the distribution database until they are propagated to all Subscribers or until the maximum distribution retention period has been reached.

    Hope this helps.


    Best Regards,
    Chunsong Feng

    Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
    Wednesday, March 2, 2011 3:07 AM