Monday, December 17, 2012

A Reboot Ate My Log Shipping

I was recently contacted to troubleshoot some Log Shipping issues that arose after some SAN and server maintenance. The Server and SAN teams had been working on the server, performing maintenance on the MPIO drivers. After the work was finished and the server was rebooted, Log Shipping on this server started to fall out of sync.

First, Get a High-Level View, Then Dive Into the Details

The best place to get a high-level view of Log Shipping is the built-in reports in SQL Server. In SSMS, right-click the server node, select Reports, then Standard Reports, and choose the Transaction Log Shipping Status report. Do this on both the primary and the secondary server in your log shipping configuration.

Log Shipping Report - Primary Server

The Secondary Server's report is a bit more informative in my opinion...

Log Shipping Report - Secondary Server
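If you would rather query than click, the same information behind these reports lives in the log shipping monitor tables in msdb. Here is a minimal sketch; run the first query on the primary and the second on the secondary (or both on the monitor server, if you have one):

[sql]

-- last LOG backup recorded for each log shipped primary database
select primary_server, primary_database, last_backup_file, last_backup_date
from msdb.dbo.log_shipping_monitor_primary;

-- last copied and last restored files for each secondary database
select secondary_server, secondary_database,
       last_copied_file, last_copied_date,
       last_restored_file, last_restored_date
from msdb.dbo.log_shipping_monitor_secondary;

[/sql]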

By viewing the report, I could see at a glance that the log backups were running fine, but the copy job was having trouble. Next I looked at the copy job history and saw the following error message:
  • The network name cannot be found.
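If you want to pull that job history with T-SQL instead of scrolling through the Job Activity Monitor, something along these lines works. The LIKE pattern assumes the default LSCopy job naming; adjust it to whatever your copy job is actually called:

[sql]

-- recent history for the log shipping copy job
select j.name as [Job Name],
       h.run_date, h.run_time,
       h.run_status,        -- 0 = failed, 1 = succeeded
       h.message
from msdb.dbo.sysjobs j
join msdb.dbo.sysjobhistory h
  on j.job_id = h.job_id
where j.name like N'LSCopy%'   -- assumed job name pattern
order by h.run_date desc, h.run_time desc;

[/sql]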
The name of the share can be found in two places: the error log or the Log Shipping Setup Properties. In the error log, look for these entries in one of the substeps:
  • Backup Source Directory
  • Backup Destination Directory

Log Shipping Errors

Or, in the Log Shipping Setup Properties look for this:

Log Shipping Backup Share

From there, I logged onto the primary server and verified that the share was indeed missing. I also verified that the LOG backups were being written properly on the primary server but were missing from the secondary server.
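To confirm the LOG backups really were landing on the primary, you can also check the backup history in msdb. A quick sketch; the database name is a placeholder:

[sql]

-- most recent LOG backups for the log shipped database
select top (20)
       bs.database_name,
       bs.backup_finish_date,
       bmf.physical_device_name
from msdb.dbo.backupset bs
join msdb.dbo.backupmediafamily bmf
  on bs.media_set_id = bmf.media_set_id
where bs.database_name = N'MyDatabase'   -- placeholder database name
  and bs.type = 'L'                      -- LOG backups only
order by bs.backup_finish_date desc;

[/sql]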

To fix the situation, all that was necessary was to re-create the share on the primary server and grant read-only permissions to the secondary server's service account.

After that, I manually ran the copy job to start the log backups flowing to the secondary server once again. Then, I ran the restore job and watched the secondary server get back in sync.
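If you prefer to kick the jobs off from a query window on the secondary, sp_start_job does the trick. The job names below are placeholders for whatever your LSCopy and LSRestore jobs are actually called:

[sql]

-- start the copy job on the secondary (sp_start_job returns immediately)
exec msdb.dbo.sp_start_job @job_name = N'LSCopy_MyPrimaryServer_MyDatabase';

-- once the copy has finished, start the restore job
exec msdb.dbo.sp_start_job @job_name = N'LSRestore_MyPrimaryServer_MyDatabase';

[/sql]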

But Don't Rush In

Often, when confronted with a Log Shipping sync problem, a person's first reaction is to remove Log Shipping and then reconfigure it. In this case, that would have solved nothing, since the LOG share would still have been missing.

Instead, take the time to troubleshoot the problem and try to find the root cause. Log Shipping is a very simple and stable technology and not much will break it. I have never had to remove and reconfigure Log Shipping to get it to work again.

Give Yourself Some Cushion

Another key to success with Log Shipping is to keep the LOGs around long enough so that you don't break the LSN chain in case of a problem. You want to make sure you keep enough LOGs on hand to cover a long weekend or holiday break in case your DBAs or other staff are not available to respond immediately.

Log Shipping Retention

My preference is to keep seven days' worth of LOGs available. If you can't get approval for that much space, walk it back a bit, but hold firm at three days' worth. Think about the weekend/holiday factor. If management pushes back, remind them how long it took to initialize the secondary server with the FULL backup. Ask them which is worse: a little extra space, or a long delay while you copy a 1TB FULL backup from one coast to the other.
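Retention can be set through the Log Shipping properties dialog shown above, or with the log shipping stored procedures. This is a rough sketch for a seven-day (10,080 minute) retention; the server and database names are placeholders, and the procedures should be run from the master database:

[sql]

-- on the primary: keep 7 days (10080 minutes) of LOG backups in the backup share
exec sp_change_log_shipping_primary_database
     @database = N'MyDatabase',              -- placeholder
     @backup_retention_period = 10080;

-- on the secondary: keep 7 days of copied LOG files in the copy destination
exec sp_change_log_shipping_secondary_primary
     @primary_server = N'MyPrimaryServer',   -- placeholder
     @primary_database = N'MyDatabase',      -- placeholder
     @file_retention_period = 10080;

[/sql]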

But What About the Root Cause?

Oh yeah, so why did the share disappear to begin with? This server had been migrated from one SAN to another and ended up with two different vendors' MPIO drivers present, so the old one was slated to be removed. After the work was completed and the server rebooted, all of the shares went missing.

It turns out this is a known issue with the iSCSI Initiator if it is not configured correctly. Basically, the Server Service needs to have a dependency set on the iSCSI Initiator Service. Take a look at kb870964 for some more details on how to prevent this from happening to you.

Monday, December 10, 2012

Where are your indexes located?

When taking over a database or looking at it for the first time, one area I like to focus on is the indexes. There are the standard things, such as index fragmentation, usage, etc. However, one area that is often overlooked is where the indexes are.

But aren't they in the database?

Yes, but where in the database? If your database only has one filegroup, PRIMARY, then that is where your indexes are. However, some databases have multiple filegroups. They could be for archiving purposes, security segregation, performance, or any number of reasons.
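A quick way to see which filegroups a database actually has is to query sys.filegroups:

[sql]

-- filegroups in the current database
select name, type_desc, is_default
from sys.filegroups
order by name;

[/sql]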

A fairly common setup to encounter is to have a separate filegroup for non-clustered indexes. This is not as common a technique as it used to be, and there is some debate as to whether this helps performance. The point here is not to debate the wisdom of index filegroups, but rather to get a sense of how your database is laid out.

The following script will give you a high-level breakdown of your database and show you which filegroup your clustered indexes, non-clustered indexes, and heaps are living in.

[sql]

-- index location aggregations
select ds.name as [FG Name],
       i.type_desc as [Index Type],
       count(*) as [Index Count]
from sys.objects o
join sys.indexes i
  on o.object_id = i.object_id
join sys.data_spaces ds
  on i.data_space_id = ds.data_space_id
where o.type_desc = 'USER_TABLE'
-- and i.type_desc = 'CLUSTERED'
-- and i.type_desc = 'NONCLUSTERED'
-- and i.type_desc = 'HEAP'
-- and ds.name = 'PRIMARY'
-- and ds.name = 'MyDataFG'
-- and ds.name = 'MyIndexFG'
group by ds.name, i.type_desc
order by [FG Name], [Index Type]
go

[/sql]

Index Filegroup Aggregations

What catches my eye in this example, aside from all the heaps, is any clustered index that is not in the data filegroup and any non-clustered index that is not in the index filegroup. I might flag those indexes to be moved to the appropriate filegroup during a future maintenance window.
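Moving a non-clustered index is just a rebuild onto the target filegroup using DROP_EXISTING. Here is a sketch with placeholder table, column, index, and filegroup names; the key and included columns must match the existing index definition exactly:

[sql]

-- rebuild an existing non-clustered index onto the index filegroup
create nonclustered index IX_MyTable_MyColumn   -- placeholder index name
    on dbo.MyTable (MyColumn)                   -- placeholder table and key
    with (drop_existing = on)
    on [MyIndexFG];                              -- placeholder target filegroup

[/sql]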

The next part of the script will give you a detailed list of each index or heap and where it lives.

[sql]

-- index location details
select ds.name as [FG Name],
       object_name(i.object_id) as [Table Name],
       i.name as [Index Name],
       i.type_desc as [Index Type]
-- , o.*
-- , i.*
from sys.objects o
join sys.indexes i
  on o.object_id = i.object_id
join sys.data_spaces ds
  on i.data_space_id = ds.data_space_id
where o.type_desc = 'USER_TABLE'
-- and i.type_desc = 'CLUSTERED'
-- and i.type_desc = 'NONCLUSTERED'
-- and i.type_desc = 'HEAP'
-- and ds.name = 'PRIMARY'
-- and ds.name = 'MyDataFG'
-- and ds.name = 'MyIndexFG'
order by [FG Name], [Table Name], [Index Name], [Index Type]
go

[/sql]

Index Filegroup Details

You can modify the WHERE clause (the commented-out predicates are there as a starting point) to filter the results by index type or by the specific filegroup you are interested in.

Go forth and analyze...