Wednesday, November 28, 2012 6:10 PM
I'm running SQL 2005 Developer SP4 on 2 test servers with Windows 2003 SP2. I have set-up log shipping of 3 databases from Server A to Server B using the Transaction Log Shipping wizard to create the transaction log backup, copy, restore, and alert jobs which run every 1 minute, staggering the copy jobs to run on the 00:00:30 times. I am using Idera SQLsafe to perform the full and differential backups. All jobs and log-shipping shares on both servers have been set-up to run with the SQL Agent Service Account, which is both a domain admin, and local administrator on both servers, and has full control permissions on all shares.
I am experiencing weird behaviour when I try to initialize log-shipping using my full and differential backups. All 3 databases are backed up using the same jobs for full and differential backups, to different sub-directories under the same common share/file directory in Server A. All 3 databases have their t-logs copied to different sub-directories under the same common share/file directory in Server B. I have verified all jobs to have been created successfully and working. I have also made sure no other backup jobs run outside the full, differential, and log-ship backup jobs I have created, and even taken the extra step to disable all maintenance plan and log-ship backup jobs and wait for them to stop executing before proceeding with the full and differential backups to ensure proper log chain.
However, only 1 of the 3 databases was successfully restored on Server B after initialization. The second database only started restoring weirdly after 45 minutes (and hence 45 additional t-log backups) from a t-log created and copied over at that time, instead of from the first t-log after the differential backup was restored. The third database was did not restore until I performed a transaction on that database on the primary server to cause there to be something to restore. In the meantime I was getting alerts on it since the restore job was reporting that it was skipping all t-logs and finally the message "Could not find a log backup file that could be applied to secondary database". I suspect that the second database must not have had an active transaction until 45 minutes after the initial restore, and that the first database must have luckily contained one in the first set of t-logs.
This behaviour is random across my dbs if I initialize from new full and/or differential backups, but when the restore does not occur, I can consistently get it to start working if I fake a transaction.
What am I missing? In a production scenario, I do not want to have to fake a transaction (as everything is audited) in order for log-shipping to catch-up!
Any help on this would be most appreciated!
- Edited by Diane Sithoo Wednesday, November 28, 2012 6:54 PM
Wednesday, November 28, 2012 8:13 PM
There will be some delay in the log shipping method, but what you described here sounds different. There are couple of places you can check. Saying that keep in mind that if your secondary database is in use or any user connected to it log restore fails. Make sure no users are connected to secondary.
1) Go to primary server agent right click and see the history. It should have a history for every minute as you scheduled for every minute. (If pass next step)
2) Go to secondary in the agent job find the job that copies the files across. It should have history for the scheduled time frame (if pass next step)
3) Find the job that stores the log to secondary database. It must have history for every minute.
This information tells you everything is correct or something wrong, if so where.
Wednesday, November 28, 2012 10:00 PM
Thanks for responding. I have checked the job history for all jobs, and everything is working correctly, with the exception that the Restore jobs on the secondary read but skip every t-log file in the directory until an active transaction occurs.
Thursday, January 17, 2013 4:37 PM
Well, it happened again. We were testing our disaster recovery procedures, and afterwards I restored log-shipping, and NOT to my surprise, all but 1 of the databases started restoring. The one that didn't had no active transactions since the restore point.
This time, I ran UPDATE STATISTICS dbo.my_table on a random small table in the database, which generated a write to the log file, and low and behold, it caught up immediately!
Very very weird! At least I have found a work-around.
Thursday, January 17, 2013 8:42 PMModeratorQuery the msdb.dbo.backupset table looking at first_lsn, last_lsn, and checkpoint_lsn, ordered by backup_start_date. I suspect you are creating empty log files, which have metadata but no transactions since the source database has no changes and thus no new log entries to back up. Restoring an empty log file does nothing, since there are no log records to apply.
Geoff N. Hiten Principal Consultant Microsoft SQL Server MVP
Thursday, January 17, 2013 11:10 PM
i did not see any thing strange or challenging in your scenario. Please contact your DBA i am sure they will fix your issue.
As you said you were testing disaster recovery: do you have test case for that. please review or get it reviewed.
when disaster happens you have to take tail of the transaction log back up from primary. then you need to find all the transaction logs that have not applied to your secondary. And then apply all transaction logs with no recovery in order to your secondary. Finally you have to apply tail of the log back up with recovery to bring secondary as primary.
If you want to restore your primary from backups procedure is same.
for your information: please read more about log shipping how it works. there is plenty on information on internet. cheers kumar
Friday, February 22, 2013 7:17 PM
That's exactly what's happening, but then does that mean I have to generate a transaction on a system which has no further transactions since the last full backup?
Friday, February 22, 2013 7:21 PM
Thanks for the advice. I am the DBA, and have been for several years now. It's not my first time setting up and taking down log-shipping, and have been able to successfully recover from disaster a few times now in production.
All the steps you mentioned had been performed, so if you re-read my posts, you will see that that was not my problem. Even having restored from the full backup and all t-logs since made no impact until a transaction actually occured on the db in question.
I have a work-around anyways so I'm good. I was only wondering if there was something missing from the full backup.