Monday, December 17, 2012

A Reboot Ate My Log Shipping

I was recently asked to troubleshoot some Log Shipping issues that arose after SAN and server maintenance. The Server and SAN teams had been working on the server's MPIO drivers. After the work was finished and the server was rebooted, log shipping on this server started to fall out of sync.

First, Get a High-Level View, Then Dive Into the Details

The best way to get a high-level view of Log Shipping is to use the built-in reports in SQL Server. In SSMS, right-click the server node, select Reports, then Standard Reports, and choose the Transaction Log Shipping Status report. Do this on both the primary and the secondary server in your log shipping configuration.

Log Shipping Report - Primary Server

The Secondary Server's report is a bit more informative in my opinion...

Log Shipping Report - Secondary Server

From the report, I could see at a glance that the log backups were running fine, but the copy job was having trouble. Next, I looked at the copy job history and saw the following error message:
  • The network name cannot be found.
The name of the share can be found in two places: the error log or the Log Shipping Setup Properties. In the error log, look for this text in one of the substeps:
  • Backup Source Directory
  • Backup Destination Directory

Log Shipping Errors

Or, in the Log Shipping Setup Properties look for this:

Log Shipping Backup Share

From there, I logged on to the primary server and verified that the share was indeed missing. I also verified that the LOG backups were being written properly on the primary server, but were missing from the secondary server.

To fix the situation, all that was necessary was to re-create the share on the primary server and grant read-only permissions to the account that runs the copy job on the secondary server, which is typically the secondary's SQL Server Agent service account.

After that, I manually ran the copy job to start the log backups flowing to the secondary server once again. Then, I ran the restore job and watched the secondary server get back in sync.
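To see how far behind the copy job is before and after a fix like this, one option is to compare the file names in the backup folder with those in the copy destination. The following is only a minimal sketch, not part of Log Shipping itself. The function name is hypothetical, and it assumes the log backups use the `.trn` extension and that both folders are reachable from wherever the script runs.

```python
from pathlib import Path


def find_uncopied_backups(source_dir, dest_dir, pattern="*.trn"):
    """Return names of log backups present in source_dir but not in dest_dir.

    source_dir: the backup folder or share on the primary (assumed path).
    dest_dir:   the copy destination folder on the secondary (assumed path).
    """
    source = Path(source_dir)
    dest = Path(dest_dir)

    # A missing share shows up here, much like the copy job's
    # "network name cannot be found" error.
    for folder in (source, dest):
        if not folder.is_dir():
            raise FileNotFoundError(f"Folder or share not reachable: {folder}")

    source_names = {p.name for p in source.glob(pattern)}
    dest_names = {p.name for p in dest.glob(pattern)}
    return sorted(source_names - dest_names)
```

Called with something like `find_uncopied_backups(r"\\PRIMARY\LogShip", r"D:\LogShipCopy")` (both paths are made up for illustration), an empty list means the copy job has caught up.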

But Don't Rush In

Often, when confronted with a Log Shipping sync problem, a person's first reaction is to remove Log Shipping, and then reconfigure it. In this case, nothing would have been solved since the LOG share would have still been missing.

Instead, take the time to troubleshoot the problem and try to find the root cause. Log Shipping is a very simple and stable technology and not much will break it. I have never had to remove and reconfigure Log Shipping to get it to work again.

Give Yourself Some Cushion

Another key to success with Log Shipping is to keep the LOGs around long enough so that you don't break the LSN chain in case of a problem. You want to make sure you keep enough LOGs on hand to cover a long weekend or holiday break in case your DBAs or other staff are not available to respond immediately.

Log Shipping Retention
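To illustrate what an unbroken LSN chain means, here is a small sketch. It assumes you have already pulled the `first_lsn` and `last_lsn` values for each log backup (for example from `msdb.dbo.backupset`) into a list of tuples. The function name and the sample numbers are made up. The rule it checks is that each log backup must begin at the LSN where the previous one ended.

```python
def find_chain_break(backups):
    """Return the name of the first backup that does not continue the chain.

    backups: iterable of (name, first_lsn, last_lsn) tuples with integer LSNs.
    Returns None when every backup starts where the previous one ended.
    """
    # Order by starting LSN so the input does not have to be pre-sorted.
    ordered = sorted(backups, key=lambda b: b[1])
    for previous, current in zip(ordered, ordered[1:]):
        if current[1] != previous[2]:
            return current[0]
    return None


# Hypothetical example: log_2 was deleted before the secondary restored it.
backups = [
    ("log_1.trn", 100, 200),
    ("log_3.trn", 300, 400),
]
print(find_chain_break(backups))  # prints log_3.trn
```

If a file in the middle has aged out of the retention window, the restore job on the secondary cannot move past that gap, and this is exactly the situation a longer retention period protects against.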

My preference is to keep seven days' worth of LOGs available. If you can't get approval for that much space, walk it back a bit, but hold firm at three days' worth. Think about the weekend/holiday factor. If management pushes back, remind them how long it took to initialize the secondary server with the FULL backup. Ask them which is worse: a little extra space, or a long delay while you get a 1 TB FULL backup copied from one coast to another.
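If it helps to put numbers in front of management, the trade-off can be sketched with some back-of-the-envelope arithmetic. Everything here is an assumption for illustration: the function names, the average log backup size, the backup interval, the link speed, and the efficiency factor are not taken from any real server.

```python
def log_retention_gb(avg_backup_mb, backup_interval_minutes, retention_days):
    """Estimate disk space (GB) needed to keep retention_days of log backups."""
    backups_per_day = 24 * 60 / backup_interval_minutes
    return avg_backup_mb * backups_per_day * retention_days / 1024


def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to copy size_gb over a link of link_mbps megabits/s.

    efficiency is an assumed fraction of the link that is actually usable.
    """
    size_megabits = size_gb * 1024 * 8
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical inputs: 50 MB log backups every 15 minutes, kept for 7 days,
# versus re-copying a 1 TB FULL backup over a 100 Mbps WAN link.
print(round(log_retention_gb(50, 15, 7), 1))   # GB of log backups on disk
print(round(transfer_hours(1024, 100), 1))     # hours to re-copy the FULL
```

With these made-up inputs, a week of logs comes to roughly 33 GB, while re-copying the FULL backup takes more than a day. Plug in your own sizes and link speed before quoting anything.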

But What About the Root Cause?

Oh yeah, so why did the share disappear to begin with? This server had been migrated from one SAN to another and ended up with two different vendors' MPIO drivers present, so the old one was slated to be removed. After the work was completed and the server was rebooted, all of the shares went missing.

It turns out this is a known issue with the iSCSI Initiator if it is not configured correctly. Basically, the Server Service needs to have a dependency set on the iSCSI Initiator Service, so that the iSCSI volumes are available before the shares are created at startup. Take a look at KB870964 for more details on how to prevent this from happening to you.
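As a rough illustration of what setting that dependency looks like, the sketch below only builds the `sc.exe config ... depend=` command line and does not run anything. Treat the details as assumptions to verify against the KB article: the helper name is hypothetical, `LanmanServer` and `MSiSCSI` are the service names I would expect for the Server and iSCSI Initiator services, and the existing dependency list must come from your own server (for example from `sc qc LanmanServer`), because `depend=` replaces the whole list rather than adding to it.

```python
def build_sc_depend_command(service, existing_dependencies, new_dependency):
    """Build an sc.exe argument list that adds new_dependency to a service.

    existing_dependencies must be the service's current dependencies,
    because 'sc config ... depend=' overwrites the full list.
    """
    dependencies = list(existing_dependencies)
    already_present = {d.lower() for d in dependencies}
    if new_dependency.lower() not in already_present:
        dependencies.append(new_dependency)
    # sc.exe expects 'depend=' and the slash-separated list as separate tokens.
    return ["sc.exe", "config", service, "depend=", "/".join(dependencies)]


# Hypothetical existing dependencies; read the real ones from your server.
command = build_sc_depend_command("LanmanServer", ["SamSS", "Srv"], "MSiSCSI")
print(" ".join(command))
```

Review the printed command against the KB article and test it on a non-production server before running it anywhere that matters.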


  1. Thank you very much for the excellent blog. Please let me know if there are other good blogs like this about high availability solution errors and their solutions. Thanks again.

  2. I'm glad you like the blog. The best reward is helping others. Pay it forward...