SQL Server is typically an IO-bound application, which means IO is almost always your constraining factor. Whenever I am troubleshooting a system, IO is one of the areas I always take a look at.
Enabling compression reduces the amount of IO SQL Server needs to satisfy SELECT queries. Since more data is stored on each page, it takes fewer pages to complete a query.
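You can see the page-count effect directly. Here's a minimal sketch using sys.dm_db_partition_stats; run it before and after enabling compression and compare the counts (the table name is just the demo table used below):

-- page counts for a table; compare before and after compression
use AdventureWorksDW2012;

select OBJECT_NAME(object_id) as TableName,
       index_id,
       in_row_data_page_count,
       used_page_count
from sys.dm_db_partition_stats
where object_id = OBJECT_ID('dbo.FactInternetSales');
go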
A Quick Demo
We'll use the following simple query as our baseline. Run it and take a look at the number of IOs by clicking the Messages tab after the query completes. Then rebuild the table with row compression, re-run the baseline query, and repeat with page compression, comparing the logical reads each time.
-- example query
use AdventureWorksDW2012;

set statistics io on;

select ProductKey,
       count(*) as QuantitySold,
       sum(SalesAmount) as TotalSales
from dbo.FactInternetSales
group by ProductKey
order by ProductKey;
go
-- turn on row compression
use AdventureWorksDW2012;

alter table dbo.FactInternetSales
    rebuild partition = all
    with (data_compression = row);
go
-- turn on page compression
use AdventureWorksDW2012;

alter table dbo.FactInternetSales
    rebuild partition = all
    with (data_compression = page);
go
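After each rebuild, you can confirm which setting actually took effect. A quick check against sys.partitions (re-running the baseline query at this point should also show fewer logical reads on the Messages tab):

-- confirm the compression setting on the demo table
use AdventureWorksDW2012;

select OBJECT_NAME(p.object_id) as TableName,
       p.index_id,
       p.partition_number,
       p.data_compression_desc
from sys.partitions p
where p.object_id = OBJECT_ID('dbo.FactInternetSales');
go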
There's No Such Thing as a Free Lunch
One thing to keep in mind is that compression will increase your CPU usage. In practice, I have usually found this to be in the range of one to three percent. That said, if you are currently experiencing CPU issues with your server, it would behoove you to address that first. Oddly enough, quite often I find that CPU problems are being driven by IO problems. So be sure to tune your queries and check your indexes.
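If you're not sure which queries are burning the most CPU, sys.dm_exec_query_stats is a reasonable starting point. A minimal sketch; the TOP 10 cutoff is arbitrary:

-- top 10 statements by average CPU time (total_worker_time is in microseconds)
select top 10
       qs.total_worker_time / qs.execution_count as AvgCPUMicroseconds,
       qs.execution_count,
       substring(st.text, (qs.statement_start_offset / 2) + 1,
           ((case qs.statement_end_offset
                 when -1 then datalength(st.text)
                 else qs.statement_end_offset
             end - qs.statement_start_offset) / 2) + 1) as StatementText
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
order by AvgCPUMicroseconds desc;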
Row Compression versus Page Compression
There are two types of compression available in SQL Server: Row Compression and Page Compression.
Row Compression stores fixed-length data types in a variable-length format. Page Compression builds on that by adding Prefix and Dictionary Compression to the mix. Page Compression works very well when you have lots of repeating values in your tables. Like a Data Warehouse...
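Keep in mind that compression is set per table or per index, so the two types can be mixed on the same table. A sketch of enabling it on a single index; the index name here is hypothetical:

-- compression can also be applied to an individual index
-- (IX_FactInternetSales_ProductKey is a hypothetical index name)
use AdventureWorksDW2012;

alter index IX_FactInternetSales_ProductKey
    on dbo.FactInternetSales
    rebuild partition = all
    with (data_compression = page);
go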
Generally speaking, I recommend using Row Compression with OLTP databases, and using Page Compression with Data Warehouses.
Now this doesn't mean you should blindly enable compression for all tables and indexes on all of your databases. Do a little analysis first and start small.
Focus on your largest tables first: the ones that are causing you pain. Run some checks and see whether those tables would benefit from having compression enabled. Pick your top ten.
The best candidates for compression are tables that are not being updated frequently. So if you have a table that is getting 25% of its rows updated every day, that may not be the best table to compress. As always, you will need to test your servers and workload to see what combination works best for your environment.
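One way to get a rough feel for how update-heavy a table is, is to look at sys.dm_db_index_operational_stats. A sketch, assuming the counters have accumulated long enough to be representative (they are not persisted across restarts):

-- leaf-level write activity per index; heavy updaters are weaker compression candidates
select OBJECT_NAME(ios.object_id) as TableName,
       ios.index_id,
       ios.leaf_update_count,
       ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count as LeafWrites
from sys.dm_db_index_operational_stats(DB_ID(), null, null, null) ios
where OBJECTPROPERTY(ios.object_id, 'IsUserTable') = 1
order by ios.leaf_update_count desc;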
Show me the T-SQL
The script below will check all of your tables and indexes. It reports the current size, the compression method currently in use, and an estimate of the space savings you can achieve with either Row Compression or Page Compression. It runs on SQL 2008, SQL 2012, and SQL 2014 CTP1.
-- Steven Ormrod
-- 7/11/13
-- estimates row and page compression savings on all tables in a database
-- Version: 2.1
-- Source: http://stevenormrod.com/
-- License: Creative Commons. Attribution-NonCommercial CC BY-NC
-- http://creativecommons.org/licenses/by-nc/3.0/

set nocount on;
set transaction isolation level read uncommitted;

declare @currenttable numeric(10, 2);
declare @tablecount numeric(10, 2);

-- vars to hold cursor values
declare @table varchar(255);
declare @schema varchar(255);
declare @sql nvarchar(255);

-- temp table for row compression information
create table #rowcompression (
    TableName varchar(255),
    SchemaName varchar(255),
    IndexID int,
    PartitionNumber int,
    CurrentSizeKB bigint,
    RequestedSizeKB bigint,
    CurrentSampleKB bigint,
    RequestedSampleKB bigint
);

-- temp table for page compression information
create table #pagecompression (
    TableName varchar(255),
    SchemaName varchar(255),
    IndexID int,
    PartitionNumber int,
    CurrentSizeKB bigint,
    RequestedSizeKB bigint,
    CurrentSampleKB bigint,
    RequestedSampleKB bigint
);

-- current compression information
select s.name as SchemaName,
       t.name as TableName,
       i.name as IndexName,
       i.index_id,
       p.data_compression_desc as CompressionType
into #currentcompression
from sys.schemas s
join sys.tables t on s.schema_id = t.schema_id
join sys.indexes i on t.object_id = i.object_id
join sys.partitions p on i.object_id = p.object_id and i.index_id = p.index_id
order by s.name, t.name, i.name;

select @tablecount = count(*) from sys.tables;
set @currenttable = 0;

-- declare variable for the cursor
declare curTables cursor for
    -- sql statement for the cursor
    select t.name as TableName, SCHEMA_NAME(t.schema_id) as SchemaName
    from sys.tables t
    order by SchemaName, TableName;

-- open 'er up
open curTables;

-- load the first row
fetch next from curTables into @table, @schema;

-- loop through the cursor
while @@fetch_status = 0
begin

    -- do some work
    print 'Estimating row and page compression for: ' + @schema + '.' + @table + '...';

    -- estimate row compression
    set @sql = 'exec sp_estimate_data_compression_savings '''
        + @schema + ''', ''' + @table + ''', null, null, row';
    -- print @sql

    insert into #rowcompression
    execute sp_executesql @sql;

    -- estimate page compression
    set @sql = 'exec sp_estimate_data_compression_savings '''
        + @schema + ''', ''' + @table + ''', null, null, page';
    -- print @sql

    insert into #pagecompression
    execute sp_executesql @sql;

    -- executive update
    set @currenttable = @currenttable + 1;
    print char(9) + 'Percent Complete: '
        + cast(cast(@currenttable / @tablecount * 100.0 as numeric(10, 2)) as varchar(255));

    -- advance to the next row
    fetch next from curTables into @table, @schema;

end;

-- build the executive summary
select r.SchemaName,
       r.TableName,
       -- r.IndexID,
       i.name as IndexName,
       r.CurrentSizeKB,
       c.CompressionType as CurrentCompression,
       r.RequestedSizeKB as RowCompressionKB,
       case r.CurrentSizeKB
           when 0 then 0
           else cast((1.0 - r.RequestedSizeKB / (r.CurrentSizeKB * 1.0)) * 100 as numeric(10, 2))
       end as RowDecreasePercent,
       p.RequestedSizeKB as PageCompressionKB,
       case r.CurrentSizeKB
           when 0 then 0
           else cast((1.0 - p.RequestedSizeKB / (r.CurrentSizeKB * 1.0)) * 100 as numeric(10, 2))
       end as PageDecreasePercent
into #executivesummary
from #currentcompression c
join #rowcompression r
    on c.SchemaName = r.SchemaName
    and c.TableName = r.TableName
    and c.index_id = r.IndexID
join #pagecompression p
    on r.TableName = p.TableName
    and r.SchemaName = p.SchemaName
    and r.IndexID = p.IndexID
join sys.indexes i
    on r.TableName = OBJECT_NAME(i.object_id)
    and r.IndexID = i.index_id
order by r.SchemaName, r.TableName, IndexName;

-- show everything
select * from #executivesummary;

/*
-- if you want to change how reporting is done
-- experiment with the following queries
-- be sure to comment out the original one, above

-- focus on the largest tables
select top 10                                           -- focus on the biggest bang for the buck
    *
from #executivesummary
where RowDecreasePercent + PageDecreasePercent > 50     -- peel off the bottom layer
order by CurrentSizeKB desc;                            -- focus on the larger tables and indexes

-- focus on the largest compression
select top 10                                           -- focus on the biggest bang for the buck
    *
from #executivesummary
where CurrentSizeKB > 1024
order by RowDecreasePercent + PageDecreasePercent desc;
*/

-- doing the dishes
close curTables;
deallocate curTables;

drop table #currentcompression;
drop table #rowcompression;
drop table #pagecompression;
drop table #executivesummary;
go
[Figure: Compression Estimates Percent Complete]