How much rolls back when applying a snapshot

  • Question

  • I created a new transactional publication and created a snapshot.  When the snapshot was applied to the subscriber it filled up the log drive.  This caused the snapshot to fail to apply and roll back.

    My problem is:  I have BcpBatchSize set to 10,000 in the profile, so I expected it to commit every 10,000 rows and roll back a maximum of 10,000 rows.  Instead it appears to have rolled back an entire file (number 7 of the 30).  That kind of makes sense, because it applies the bcp files one at a time and picks back up at the file that failed.

    Can anyone confirm this is how it works when snapshot fails to apply?

    My problem in this case is the table is huge and the 30 files each have 20+ billion rows.  This caused the rollback to take 16 hours.   If this is how it works, I will need to increase the number of files in the snapshot profile to allow for quicker recovery.
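
    For reference, both BcpBatchSize (a distribution agent parameter) and the snapshot file-count options live in replication agent profiles.  A minimal T-SQL sketch of inspecting and changing a profile parameter follows; the distribution database name and the profile_id of 2 are assumptions, and only a user-defined profile can be modified:

        -- Run at the Distributor, in the distribution database (name assumed).
        USE distribution;
        GO
        -- List the distribution agent profiles (agent_type 3 = Distribution Agent).
        EXEC sp_help_agent_profile @agent_type = 3;
        -- Show the parameters of the profile the subscription actually uses.
        EXEC sp_help_agent_parameter @profile_id = 2;
        -- Set BcpBatchSize on that (user-defined) profile; use sp_add_agent_parameter
        -- instead if the parameter is not yet listed in the profile.
        EXEC sp_change_agent_parameter
            @profile_id      = 2,
            @parameter_name  = N'BcpBatchSize',
            @parameter_value = N'10000';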



    Wednesday, August 22, 2012 3:38 PM

Answers

  • To close the loop on this thread, this is what I found.

    SQL 2005+ must roll back an entire bcp file.   On restart it skips any previously delivered bcp files.  This directly affects the size of the log file of the target database.

    In my case, the snapshot generated 30 bcp files; 29 were equal in size, and the 30th file was huge (10 times the size of the others).  This filled my log drive and forced a rollback of the entire huge file.

    The number of files in the snapshot is affected by the snapshot agent option MaxBCPThreads (default 15): 2 × MaxBCPThreads = number of files.  Changing this option caused more files to be generated, but the last file was still huge.

    The snapshot agent apparently depends on the statistics of the primary key to determine how to divide the article into multiple files.  Updating stats on the PK with a 20% sample did not change my results; updating stats on the PK with FULLSCAN produced equal-size files.

    So, moral of the story is, keep your stats updated. 
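
    A minimal sketch of that fix, assuming a published table dbo.BigTable with PK statistics named PK_BigTable (both names are placeholders).  MaxBCPThreads is changed with the same sp_change_agent_parameter call sketched under the question above, just against the snapshot agent profile (agent_type 1):

        -- Run at the Publisher, in the published database.
        -- Rebuild the PK statistics with a full scan so the snapshot agent
        -- can split the article into evenly sized bcp files.
        UPDATE STATISTICS dbo.BigTable PK_BigTable WITH FULLSCAN;

        -- Check the statistics header the snapshot agent will see.
        DBCC SHOW_STATISTICS ('dbo.BigTable', PK_BigTable) WITH STAT_HEADER;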


    • Edited by Tom Phillips Monday, August 27, 2012 2:51 PM
    • Marked as answer by Tom Phillips Monday, August 27, 2012 2:51 PM
    Monday, August 27, 2012 2:50 PM

All replies

  • It whacks the entire table and starts from scratch again.


    Wednesday, August 22, 2012 4:41 PM
    Answerer
  • Actually it does not appear to whack the table and start over (at least in 2008 R2).  The table ended up with 40+ billion rows, and the replication log printed messages like:

    Skipping file '{table}#x' because it has already been delivered for a previous article or by a previous interrupted snapshot.

    So it must be restarting at the file level.

    Wednesday, August 22, 2012 4:49 PM
  • I had to repro this and watch the messages. Midway through the snapshot application I stopped the distribution agent and then restarted it. I saw that it restarted at the beginning.

    I did this with a table of 1 million rows and it had multiple files associated with that table.

    I am wondering what is different between our publications. I only had one table in my publication and will try it with several tables.  This was on SQL 2012.



    Wednesday, August 22, 2012 9:51 PM
    Answerer
  • That is strange. 

    I actually found the "real" reason the log filled up.  The snapshot agent created 30 files; 29 are about the same size, and #30 is HUGE.  The log fills up trying to apply the entire #30 file.  It looks like it created 29 files of about 1 billion rows each, and #30 with the rest of the table, about 100 BILLION.

    I am trying to track down why the snapshot agent did not create 30 equal-size files.  Is this due to stats, an index, or something else?  Does it use the statistics to estimate a row count per file, and are my stats out of date?

    Any ideas?
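
    One way to check whether the PK statistics are the culprit before regenerating the snapshot is to look at when they were last updated; dbo.BigTable below is a placeholder table name:

        -- Run in the published database: list every statistics object on the
        -- table and when it was last updated.
        SELECT  s.name                               AS stats_name,
                STATS_DATE(s.object_id, s.stats_id)  AS last_updated
        FROM    sys.stats AS s
        WHERE   s.object_id = OBJECT_ID(N'dbo.BigTable');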


    Thursday, August 23, 2012 1:47 PM
  • How many articles are in your publication?  Are there multiple bcp files for the same article?



    Thursday, August 23, 2012 4:41 PM
  • In my case I made a new publication with just 1 article to test.  It created 30 bcp output files for that one table, named tablename#{filenumber}.
    Thursday, August 23, 2012 5:10 PM