locked
Extracting history from OLTP - recommended methods RRS feed

  • Question

  • Hi there

    Simple and conventional OLTP database, we need to capture all changes for insert into a DW via staging / ods etc.

    Is there a recommended approach for this? Obviously it has to be real time as there might be multiple updates for a time period. I'm thinking of triggers on OLTP tables (bad for performance as it's synchronous), or change data capture, service broker as asynchronous methods.

    Any preferences? If so, why please?

    Much appreciated.

    Tuesday, November 3, 2015 10:55 AM

Answers

  • That's right, yes...I was on about storing history - i.e. ultimately for slowly changing dimensions.

    Just lastly (sorry to keep dragging on), but not everyone has enterprise edition so what's the most common approach then?

    I know there's loads of ways of processing deltas; ssis, merge, I,U,D etc ... but I'm curious as to the added complication of it having to be real time from the oltp db, i.e. the above methods might miss some changes if it's not event driven. I guess it has to be triggers correct? No other ways?

    As for the transformations thing, yes SSIS after the data has been extracted.

    • Marked as answer by JonnyStr Tuesday, November 3, 2015 4:51 PM
    Tuesday, November 3, 2015 2:18 PM

All replies

  • There are various methods like

    1. CDC 

    http://www.databasejournal.com/features/mssql/implementing-sql-servers-change-data-capture.html

    But this would require additional schemas, objects etc which system creates

    2. Transactional Replication (you will get close to real time sync)


    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Tuesday, November 3, 2015 11:44 AM
  • Do you wan to capture the changes for one or all tables in the database? It is going to be pretty costly operation, I hope you are aware of. 

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Tuesday, November 3, 2015 11:58 AM
  • Cheers. I initially thought that tx replication was just keeping another db in sync with the publisher but I've just read it can capture history:

    "

    Transactional replication is typically used in server-to-server environments and is appropriate in each of the following cases:

    The application requires access to intermediate data states. For example, if a row changes five times, transactional replication allows an application to respond to each change (such as firing a trigger), not simply the net data change to the row."

    Is this just a trigger on the subscriber tables where you'd want to capture history? If so, is there any difference to creating the triggers on the source OLTP (ignoring performance / blocking issues for the sake of argument). Or have I misunderstood something?

    Thanks.

    Tuesday, November 3, 2015 12:04 PM
  • Not sure yet...requirements yet to be defined. But most likely just dimensional data so not a lot.

    I think this is a normal DW scenario isn't it? Basically if you want to store dimensional history, then what do developers generally do / use?

    • Edited by JonnyStr Tuesday, November 3, 2015 12:11 PM
    Tuesday, November 3, 2015 12:09 PM
  • Cheers. I initially thought that tx replication was just keeping another db in sync with the publisher but I've just read it can capture history:

    "

    Transactional replication is typically used in server-to-server environments and is appropriate in each of the following cases:

    The application requires access to intermediate data states. For example, if a row changes five times, transactional replication allows an application to respond to each change (such as firing a trigger), not simply the net data change to the row."

    Is this just a trigger on the subscriber tables where you'd want to capture history? If so, is there any difference to creating the triggers on the source OLTP (ignoring performance / blocking issues for the sake of argument). Or have I misunderstood something?

    Thanks.

    Nope its not a trigger but its like a job which keeps on checking the data changes (DML) happening on the publisher and publishes it to each of the identified subscribers. We can define the tables, objects that need to published in the article and it will only monitor and apply changes happening to them

    More details

    http://www.databasejournal.com/features/mssql/article.php/1438201/Setting-Up-Transactional-Replication-A-Step-by-step-Guide.htm


    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Tuesday, November 3, 2015 12:19 PM
  • Sorry...with that link (plus I've tried to find others), I can't see an explanation on how to configure the subscriber to accept history. For instance, how to configure the subscriber to not overwrite existing records...or...how to configure a row version or modified by date to be implicitly inserted with the new (updated) row?

    Disclaimer: admittedly I am quite bored and i've been staring at my screen for far too long so might have missed a bit.

    Tuesday, November 3, 2015 12:53 PM
  • Sorry...with that link (plus I've tried to find others), I can't see an explanation on how to configure the subscriber to accept history. For instance, how to configure the subscriber to not overwrite existing records...or...how to configure a row version or modified by date to be implicitly inserted with the new (updated) row?

    Disclaimer: admittedly I am quite bored and i've been staring at my screen for far too long so might have missed a bit.

    Nope if you want capture history you need to go for Change Data Capture itself. And I hope you don't have need to apply any transformations in between else you need to consider ETL tool like SSIS


    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Tuesday, November 3, 2015 1:01 PM
  • That's right, yes...I was on about storing history - i.e. ultimately for slowly changing dimensions.

    Just lastly (sorry to keep dragging on), but not everyone has enterprise edition so what's the most common approach then?

    I know there's loads of ways of processing deltas; ssis, merge, I,U,D etc ... but I'm curious as to the added complication of it having to be real time from the oltp db, i.e. the above methods might miss some changes if it's not event driven. I guess it has to be triggers correct? No other ways?

    As for the transformations thing, yes SSIS after the data has been extracted.

    • Marked as answer by JonnyStr Tuesday, November 3, 2015 4:51 PM
    Tuesday, November 3, 2015 2:18 PM