Tuesday, March 27, 2012 1:14 AM
Can anyone tell me what sys.sp_cdc_dbsnapshot is suppossed to return. I had assumed it was the max LSN in the associated snap shot.
I expected the below two functions to return the same result - they didn't:
Function 1. select fm75821_dbss.sys.fn_cdc_get_max_lsn() as MaxFromSnapShot Function 2. declare @lastLSN binary (10), @lastLSN_str nvarchar(42) .exec sys.sp_cdc_dbsnapshotLSN 'fm75821_dbss', @lastLSN output, @lastLSN_str output select @lastLSN
I ran the sys.fn_cdc_get_max_lsn function against my snapshot to return:
The sys.sp_cdc_dbsnapshotLSN returns:
Given the lack of documentation for these functions, I'm at a loss to know what is going on.
Any assistance greatly appreciated. Thanks in advance.
Monday, April 09, 2012 3:16 AMModerator
The stored procedure sys.sp_cdc_dbsnapshotLSN is used to obtain an appropriate low end-point that will synchronzie the first extraction interval with the initial load. The function sys.fn_cdc_get max_lsn is used to determine the high end-point for the first interval. So, they return different LSN. To learn more information, please download the whitepaper SQL Server 2008 CHANGE DATA CAPTURE SINCE LAST REQUEST PACKAGE SAMPLE, where the system stored mentioned.
- Marked As Answer by clay123123123 Tuesday, April 10, 2012 11:16 PM
Tuesday, April 10, 2012 11:16 PM
Thanks Jerry, still not clear what sys.sp_cdc_dbsnapshotLSN returns exactly.
But it seems to be working when I:
1. take a snap shot of my source
2. use my snap shot for my initial load (i.e. load all records from snap shot - I do not need to use any CDC functions here)
3. I then point my source components in my SSIS packages to my actual source rather than the snap shot.
a) I use the sys.sp_cdc_dbsnapshotLSN & increment for my low end point
b) the high end point is obtained by the sys.fn_cdc_get max_lsn
c) if ETL is succesful I persist the high end LSN in one of my data warehouse audit tables - subsequent ETL's will obtain their low end from the persisted LSN
Now for next hurdle (this is quite a learning curve!) i.e. with CDC implemented I have removed the logic to check if a record in the ETL already exists in my data warehouse - therefore, if my packages crash half way through the ETL I cannot simply rerun the ETL after fixing the issue - as it will attempt to load duplicate records. I suspect transactions will be the way to go & while my Proof of Concept worked nicely - I can't get to it working with my production packages that have many connections within each package (ADO.NET and OLE DB). I will leave that for another thread....