Monday, May 30, 2011

Memorial Day 2011

Today is Memorial Day.  The purpose of this holiday is to commemorate the soldiers who have lost their lives while serving their country.  Please take a moment to consider and be grateful for the things you have in your life, and have a little thought in your heart for those who have been lost.

I would like to thank the Marine Corps for helping to fund my way through college and giving me the drive to achieve the things I have in my life.

Memorial Day is also used to mark the start of the Summer vacation season.  So it's time to get the Family Truckster headed to the beach.  I hope everyone is enjoying the day off with family and friends, and of course, digging into some good BBQ!

Wednesday, May 25, 2011

SQL Saturday #63 Dallas

SQL Saturday #63 Dallas Schwag

This was my second SQL Saturday after attending the one in Houston earlier in the year.  I was curious to see if this one would be as awesome as my first.  Now, I am excited about the prospect of a Texas Trifecta in Austin later in the year.

SQL Saturday is an almost-free, all-day event dedicated to all things SQL Server.  I say almost-free since you will still have travel costs: hotel, gas, rental car, meals, etc.  However, I think this is the best bang for the buck that you will find anywhere.

Location

Dallas' SQL Saturday was held on April 2nd at the Region 10 Education Service Center.  All of the conference rooms were very nice and had adequate audiovisual equipment.  However, in between the sessions, the hallways and common areas were a little cramped.  I think it would have been better if a few more conference rooms were available.  This way, the vendor and networking areas could have been hosted there, thus alleviating some of the congestion.

Networking

Friday, before the event, Idera hosted a casual meet and greet at Wizard's Sports Cafe.  This was an excellent opportunity to meet your fellow attendees, have a cold beer, and display your lack of pool skills.

SQL Saturday itself is a great way to meet other DBAs from across the state and the country.  You can talk shop, trade horror stories, and even get feedback on your resume.

Saturday, the official networking event was held at Dave & Buster's.  There was free food to munch on, other attendees to trade notes with, and presenters to pepper with more questions.

Sessions

SQL Saturday #63 Schedule
There were a total of forty-two sessions covering seven different tracks, so there was a wide variety of topics to choose from.  Some sessions were of the same quality you would find at SQL PASS, while others were from beginning speakers looking to polish their craft.  I think it is wonderful that there is a hospitable venue for the aspiring speaker.

Highlights

I began my morning with a deep dive on Solid State Storage with Wes Brown (blog | twitter).  He took us through the differences amongst the various vendors.  Additionally, we learned about some of the gotchas that come along with flash storage, and the differences between enterprise and consumer grade drives.

Next, I took in Kevin Kline's (blog | twitter) presentation on SQL Internals and Architecture.  This was a lively session, and it was standing room only.  Afterwards, Kevin was very gracious and took the time for some one-on-one questions from me.

For the afternoon, I checked out Kevin Boles' (blog | twitter) session on Parallel Query Execution.  He gave us the basics and showed us what the options available with sp_configure mean.  In addition to sharing his technical skills with us, he also displayed his sense of humor while dealing with a heckler.

To close out the day, I saw Suresh Kandoth (blog | twitter) give a lesson on Database Availability and Recovery.  This is an area that we all have to deal with from time to time.  Having some knowledge of the fundamentals will help make things easier when the time comes.

Deep Ellum Arts Festival

Deep Ellum Music
As luck would have it, the Deep Ellum Arts Festival was being held on the same weekend as SQL Saturday.  Before checking into the hotel, I decided to explore a little bit.  Several streets in Deep Ellum were blocked off to traffic and pedestrianized.  There were multiple stages, and several local bands were playing a variety of styles.  Additionally, local artists and vendors had booths set up to showcase their wares.  While the event itself was free, you did have to pay for the concessions.  Wandering the streets, listening to some local bands, drinking a cold brew...what a great way to get ready for SQL Saturday.

Conclusion

If you have never been to a SQL Saturday, then you are missing out.  It is a great way to get some nice training, make some new friends, and re-energize yourself about SQL Server.

Tuesday, May 24, 2011

Heaps of Heaps

Recently, I was looking into a stored procedure that was running poorly.  As part of my analysis, I examined the underlying tables and their indexes.  By using Kimberly Tripp's (blog | twitter) rewrite of  sp_help, I was able to quickly determine that one of the tables was a heap.

Heaps


A Heap is a table without a Clustered Index.  This is bad for a few reasons:

  • When you update a row in a Heap, the row can be moved, leaving behind a forwarded record that adds an extra lookup to every read

  • Every nonclustered index on a Heap has to use the 8-byte RID (file:page:slot) as its row locator

  • Overall query performance can suffer, since there is no clustering key to organize the data
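
If you want to see whether an existing heap is already suffering from forwarded records, you can ask sys.dm_db_index_physical_stats.  Here is a minimal sketch; dbo.YourHeapTable is a placeholder name, and the DETAILED scan mode is what populates the forwarded_record_count column.

[sql]

-- minimal sketch: check a heap for forwarded records
-- dbo.YourHeapTable is a placeholder, replace it with your own table
select object_name(ips.object_id) as 'Table Name',
ips.forwarded_record_count,
ips.page_count
from sys.dm_db_index_physical_stats(db_id(), object_id('dbo.YourHeapTable'), 0, null, 'DETAILED') as ips
where ips.index_id = 0; -- index_id 0 is the heap itself
go

[/sql]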

How Many Heaps in a Pile

Like most things in nature, when there is one, there is another, and another, and another...

Knowing this, I set about to discover how many other tables in this database were heaps.

First, let's look at our overall count of heaps in relation to the total table count.

[sql]

-- percentage of heaps
declare @heapCount numeric;
declare @tableCount numeric;

select @heapCount = count(t.object_id)
from sys.tables as t
join sys.indexes as i
on t.object_id = i.object_id
where i.type_desc = 'HEAP';

select @tableCount = COUNT(*) from sys.tables;

select @heapCount as 'Heap Count',
@tableCount as 'Table Count',
CAST((@heapCount / @tableCount * 100) as numeric(5, 2)) as 'Percentage of Heaps';
go

[/sql]

Next, let's enumerate all the tables that are heaps.

[sql]

-- list all the heaps in the currently selected database
select OBJECT_NAME(t.object_id) as 'Table Name', i.type_desc
from sys.tables as t
join sys.indexes as i
on t.object_id = i.object_id
where i.type_desc = 'HEAP'
order by OBJECT_NAME(t.object_id);
go

[/sql]

Time to Make the Clusters

What I found was startling to say the least.  On this particular server, all three databases contain quite a few heaps.

Heaps of Heaps

Obviously, this is going to be where I start my tuning efforts.  After seeing how many heaps existed on this one server, I think it would be wise to make this a standard check when first looking at any unfamiliar database.
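
For reference, converting a heap into a clustered table is a single statement.  This is only a sketch with made-up names; choosing the right clustering key (narrow, static, ever-increasing where possible) deserves its own analysis before you run anything like this.

[sql]

-- hypothetical example: add a clustered index to a heap
-- the table and column names are placeholders
create clustered index CIX_YourHeapTable_ID
on dbo.YourHeapTable (ID);
go

[/sql]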

Monday, May 23, 2011

Clearing the Plan Cache

While at SQL Saturday #63 in Dallas, I got some questions about clearing the plan cache.  This evolved into an email discussion, which resulted in me writing up some example scripts.  Since I received some good feedback, I thought this would be a good opportunity to turn them into a blog post.

Warning: These scripts should almost never be run on a production system.  These are meant for debugging and testing on DEV.  If you do run these on PROD, be sure to understand the consequences.

I could see running these on PROD in a few limited cases.  For example, if you have a query that is stuck with a bad plan in cache, you can use these scripts to remove that individual plan.  Another possibility would be to clear out the adhoc plans, perhaps just after you have enabled the optimize for adhoc workloads setting.

Examining the Plan Cache

The Plan Cache is where SQL Server stores the Execution Plans for any queries or stored procedures it has run.  The purpose of this is to enable SQL Server to reuse Execution Plans for each subsequent run of the query or stored procedure.  This allows SQL Server to scale up as more requests begin to pile on.

To examine the Plan Cache you can start by querying the sys.dm_exec_cached_plans DMV.

[sql]

-- look at the plan cache, list all plans
select * from sys.dm_exec_cached_plans;
go

[/sql]

This is not very useful by itself.  In order to make this query a bit more useful, we can link in either the Query Text or the Query Plan by using two other DMVs: sys.dm_exec_sql_text and sys.dm_exec_query_plan respectively.

[sql]

-- link in the sql text
select decp.*, dest.text
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_sql_text(decp.plan_handle) dest;
go

-- link in the query plan
select decp.*, deqp.query_plan
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_query_plan(decp.plan_handle) deqp;
go

[/sql]

And if you wish to see both the Query Text and the Query Plan together, you can link multiple DMVs in the same query.

[sql]

-- link in both
select decp.*, dest.text, deqp.query_plan
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_sql_text(decp.plan_handle) dest
cross apply sys.dm_exec_query_plan(decp.plan_handle) deqp;
go

[/sql]

Clearing the Plan Cache

As I work through clearing the Plan Cache, I'm going to start with the broadest example, and then get more granular as we go along.  If you need to clear the entire Plan Cache, you can do so with a DBCC FREEPROCCACHE command.

I feel the need to issue another warning here, and reiterate that this is almost always a bad idea on a production system.

[sql]

-- clears everything in the plan cache
dbcc freeproccache;
go

[/sql]

Clearing by Database

If clearing the entire Plan Cache seems a bit excessive, as well it should, then you can get a wee bit more focused by only clearing the plans for a particular database.  This can be achieved by using the DBCC FLUSHPROCINDB command.  This is an undocumented command in SQL 2008; however, you can read about it in the SQL 2005 documentation.

As with clearing the entire cache, this is probably not something you want to do with a production system unless you have a good reason.  You may have one database that is a special case, which needs to be tamed.

[sql]

-- clear the plan cache for a particular database
declare @v_DBID smallint;  -- dbid is a smallint, so a tinyint could overflow

select @v_DBID = dbid from sys.sysdatabases where name = 'AdventureWorks2008R2';

dbcc flushprocindb (@v_DBID);
go

[/sql]

Clearing by Object Type

Next, let's look at clearing the Plan Cache for a particular type of object.  By using DBCC FREESYSTEMCACHE we can choose to clear out the Adhoc Plans, Stored Procedure Plans, Views, or Extended Stored Procedures.

Most likely, the ones you will be interested in are Adhoc Plans and Stored Procedures.

[sql]

-- clear plans for a particular type of object

-- clear sql plans: adhoc sql, prepared statements
dbcc freesystemcache('SQL Plans');
go

-- clear object plans: stored procedures, functions, triggers
dbcc freesystemcache('Object Plans');
go

-- clear bound trees: views, constraints, defaults
dbcc freesystemcache('Bound Trees');
go

-- clear extended stored procedures
dbcc freesystemcache('Extended Stored Procedures');
go

[/sql]

Like a broken record, I gotta remind you that it is a bad idea to do this on a production system.

Optimize for Adhoc Workloads

The primary reason you may wish to clear out the Adhoc Plans from your cache, is if you have just enabled the SQL Server setting 'optimize for adhoc workloads'.  This setting tells SQL Server not to cache the Execution Plans for Adhoc queries until they have been run more than once.  This helps keep Plan Cache Bloat under control.

You can enable this setting by using sp_configure.

[sql]

-- show advanced options - enable all options to be seen / set
use [master]
go
exec sp_configure 'show advanced options', 1
go
reconfigure
go
reconfigure with override
go

-- optimize for ad hoc workloads
sp_configure 'optimize for ad hoc workloads', 1
go
reconfigure
go
reconfigure with override
go

-- verify change
sp_configure 'optimize for ad hoc workloads'
go

[/sql]

Once you have enabled this setting, the Single-Use Adhoc Plans will remain in cache until either the SQL Service has been restarted, or you have explicitly cleared the Plan Cache of Adhoc Plans by using the code example above.
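
If you are curious how many of those Single-Use Adhoc Plans are still hanging around, a quick count against sys.dm_exec_cached_plans will tell you.  This is just a small sanity-check sketch of my own, not part of the original scripts.

[sql]

-- quick check: how many single-use adhoc plans remain in cache
select count(*) as 'Single-Use Adhoc Plans',
sum(cast(size_in_bytes as bigint)) / 1024 as 'Total KB'
from sys.dm_exec_cached_plans
where objtype = 'Adhoc'
and usecounts = 1;
go

[/sql]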

Resource Governor

If you are using Resource Governor, then you have the option of clearing the Plan Cache for a particular Resource Pool.  Resource Governor allows you to create various Resource Pools and then designate a certain percentage of CPU or Memory to that pool.

In order to get a list of the Resource Pools that are defined on your system, you can query the sys.dm_resource_governor_resource_pools DMV.

[sql]

-- list the resource governor pools
select distinct name as 'Resource Governor Pools'
from sys.dm_resource_governor_resource_pools;
go

[/sql]

Once you have identified the Resource Pool that needs to have its associated Cached Plans cleared, you can clear the plans by using the DBCC FREEPROCCACHE command.

[sql]

-- clear the default resource governor pool
dbcc freeproccache ('default');
go

-- clear the internal resource governor pool
dbcc freeproccache ('internal');
go

-- clear a user-defined resource governor pool
dbcc freeproccache ('YOURRESOURCEGOVERNORPOOL');
go

[/sql]

Again, please do not do this on a production system unless you are certain this is what you need.

Clearing a Single Plan

If you have a particular query which is causing you problems, for example, it got compiled with a bad plan, then, one option is to remove the offending plan from the cache, which will cause a recompile.  There is no guarantee, however, that the new plan will be any better than the old plan.

You may wish to capture the existing plan so that you can compare it with the new plan.  This way you can explore what the differences are, and focus your tuning efforts there.

To find the query plan for a particular query we will need a search string: the SQL from your query.  We will place your query inside of mine, so we can search the Plan Cache for it.  It is not necessary to use the entire text of the query; that would be a bit cumbersome.  All we need is enough to make it unique, or unique enough that we can find it in the results.

[sql]

-- or it may be even better to search by the sql text
-- so that you only see the ones that interest you
-- also link in the query plan so that you can analyze it
select decp.usecounts, decp.size_in_bytes, dest.text, decp.plan_handle, deqp.query_plan
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_sql_text(decp.plan_handle) dest
cross apply sys.dm_exec_query_plan(decp.plan_handle) deqp
where dest.text like '%PLACE YOUR SQL CODE SNIPPET HERE%';
go

[/sql]
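
As mentioned above, you may want to hang on to a copy of the plan before you clear it.  The snippet below is a minimal sketch; dbo.SavedQueryPlans is a table name I made up for illustration, and the plan_handle shown is only an example value.

[sql]

-- sketch: save a copy of the plan before removing it
-- dbo.SavedQueryPlans is a made-up table name; substitute your own plan_handle
select decp.plan_handle, deqp.query_plan, getdate() as 'Captured At'
into dbo.SavedQueryPlans
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_query_plan(decp.plan_handle) deqp
where decp.plan_handle = 0x0600010010450926B8E00C07000000000000000000000000;
go

[/sql]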

Once you have identified the offending plan, it is a simple matter to remove it from the cache.  Use the plan_handle with the DBCC FREEPROCCACHE command to clear it from the Plan Cache.

[sql]

-- once you find the plan you wish to prune
-- place the plan_handle in the following statement
dbcc freeproccache (0x0600010010450926B8E00C07000000000000000000000000);
go

[/sql]

So this is the only one of the options where I can see a valid case for doing it on Production.  All the same, be careful and understand what you are doing.  It is important to clear only the Query Plan that is needed, and no more.

Conclusion

It is important to keep tabs on what is going on in your Plan Cache.  Being able to clear out all or part of your Plan Cache can be a powerful debugging technique.  Please bear in mind that this is not something that should be done on a Production server without careful consideration of the side effects.

However, playing with this in DEV can help you understand more of what is happening under the covers of SQL Server.

Thursday, May 19, 2011

Back to the Drawing Board

Like many DBAs, I have a variety of scripts I use to check on the health of my servers.  Some are borrowed, others stolen, and a few I have written myself.  A common theme amongst them is the need to filter either user or system processes depending on the situation.

Filtering User Processes

Typically I have done this by adding a where clause which filters the results on session_id.

[sql]

-- incorrect method to filter for user sessions
select *
from sys.dm_exec_sessions
where session_id > 50

-- incorrect method to filter for user requests
select *
from sys.dm_exec_requests
where session_id > 50

[/sql]

You see, I have always been taught that system processes will always have a session_id of one through fifty, while user processes will always be greater than fifty.

Recently, I had the opportunity to work with a Microsoft engineer while troubleshooting some systems.  As we were running through my scripts he mentioned that this was no longer the case; system SPIDs can exist above fifty.  When I asked him how I should go about filtering user and system processes, he said he did not know of a way.

System Processes Have SPIDs Above 50

I was crestfallen to say the least.  Here I was, using an outdated technique and thinking I was cool. However, I was sure there must be a way to achieve the results I was looking for.

As luck would have it, the following week I was taking a SQLskills class with Paul Randal (b | t). During one of the breaks, I asked him about this and if he knew of a new method for filtering user and system processes.  True to form, he did some research, reached out to one of his contacts at Microsoft and had an answer for me within the hour.

New way?  What's this about a new way?

Apparently, this behavior has existed since SQL 2005 but is not widely known.  The correct way to filter user processes is to query sys.dm_exec_sessions and filter on the is_user_process column.

[sql]

-- list user processes
select *
from sys.dm_exec_sessions
where is_user_process = 1

-- list system processes
select *
from sys.dm_exec_sessions
where is_user_process = 0

[/sql]

So, in order to leverage this in other DMVs, you will need to use a join clause.  My updated code looks like this...

[sql]

-- correct method to filter user requests
select der.*
from sys.dm_exec_requests der
join sys.dm_exec_sessions des
on der.session_id = des.session_id
where des.is_user_process = 1

[/sql]

An Exercise for the Writer

It always amazes me how long you can use a product and still stumble upon 'new' features at any moment.  So now, I will be reviewing my existing scripts and making updates where appropriate.

Wednesday, May 18, 2011

CACTUSS May 2011

This month Tim Costello (blog | twitter) gave a presentation on Rapid Fire Business Intelligence using Tableau.  Tim works for InterWorks, where he specializes in ETL and Data Analytics.  When you visit his profile and look at his picture, he wants you to keep in mind he was participating in No-Shave November at the time.

Tableau

Tableau is not SQL Server Reporting Services; it is an analytics tool.  The strength of SSRS lies in tabular reports, using string data.  You use it to answer a question that you already know how to ask.  Contrast that with Tableau, whose strength is in helping you identify a question you did not know about, but should be asking.

The Big Picture
Most things are done at the aggregate level to enable you to see the big picture visually.  Machines are able to recognize patterns in text very easily, but for us mere humans, it is easier to do this visually.  This is what Tableau focuses on doing.  Furthermore, Tableau hired a professional color designer, who designed some custom color palettes for users who are colorblind.  You can see a nice example of this that uses good ole AdventureWorks for the data source.

The internals of how you setup Tableau are similar to SQL Server Analysis Services.  You will see all the familiar dimension and fact tables.  However, the setup is more visual and user friendly.  It is intended for users, well...power users, instead of designers and engineers.  Tableau is cross platform and you can point it to SQL Server, SSAS, Oracle, Firebird, MySQL, or many other data sources.  Similar to Adobe Reader, Tableau is able to use disconnected data so that you may view and analyze it offline.

After you have connected to your data source, you are presented with a nice, easy to use WYSIWYG designer to set up your Data Analysis.  At this point, it is helpful to run through the Categorical Analysis.  This is a basic statistical breakdown of your data; how many rows, how your values are spread out, etc.  When setting up your design, consider one of the Principles of Data Visualization: "don't overwhelm your customers with data."  You can embed SQL queries directly into the connections, however, the best practice is to create views on the database server so that you can reuse code.

As you begin to analyze your data, it pays to be cognizant of how standard deviation plays into your data.  Roughly sixty-eight percent of your population should be contained within one standard deviation, and about ninety-five percent of your population should be contained within two standard deviations, assuming a roughly normal distribution.  So you want to keep this in mind while looking for outliers in your data set.

While checking out his example, we did see one such outlier; Null Island.  Null Island is conveniently located at the intersection of the Prime Meridian and the Equator.  For my next vacation, I think I may go visit.

Visual Resume
There is a public version of Tableau available.  When using this version, you are limited to 500,000 rows and your data must be uploaded to their servers.  Obviously, this is not a viable option for most production environments.  However, this could serve well for a proof of concept or test environment.  Tim has a demo of this where he has a visual version of his resume posted.  By leveraging the visual tools in Tableau, you are able to see a graphic breakdown of Tim's skill set and experience.  Not that he's looking...  ;-)

In conclusion, please keep in mind that analytics is not reporting, and reporting is not analytics.

User Group Business

We are trying to rearrange the San Antonio SQL Server Users Group to meet on Mondays.  Doing so would bring us closer to the goal of having each major metropolitan area hold their respective meeting on a different day of the week.  Having the days aligned in this manner will facilitate having out of state or "big name" speakers make the Texas Tour.

Austin is looking to experiment with adding a second SQL Server Users Group that is located in the downtown area.  This will complement the existing group which meets in North Austin.

And finally, the news that you have all been waiting for...

We are getting closer to hosting a SQL Saturday in Austin.  The tentative date is September 10th, so mark your calendars.  Let's all think happy thoughts to help make this a reality.

Monday, May 16, 2011

Week One Fun

Performance and Internals Curriculum

Recently, I participated in the SQLskills Immersion Event for Internals and Performance hosted by Kimberly Tripp (b | t) and Paul Randal (b | t).  My classmates consisted of Senior DBAs from a variety of countries and industries.  We had people from coast to coast and a couple from Europe, as well.

I say participated because this truly is an interactive experience.  Consider a typical class where you sit back and passively listen to an instructor who clicks through the installer wizard and PowerPoint slides.  In this class, you have to actively participate in order to get the most from it.


College Daze Revisited
There is so much information thrown at you that if you do not actively try to assimilate it and ask deep, focused questions, then you do not stand much of a chance of retaining any of it.  Think about it like drinking water from a fire hose.  You will get wet, but your thirst will not have been quenched.  By the end of the week, I had taken seventy-seven pages of notes!


The Class

While the class followed a formal curriculum, you are encouraged to ask questions that deviate from the planned outline.  Paul and Kimberly each have the requisite knowledge to answer any question to any level of depth.  Internals and architecture are their respective sweet spots.

They really work hard to foster an open environment where the participants feel comfortable with one another and are willing to open up and share their experience and knowledge.  There will be times when the answers and learning come from the other participants.  When this happens, Paul and Kimberly transcend the role of instructors and become facilitators; guiding us all on a journey of SQL Immersion.

During the breaks, we were able to ask follow-up questions or to dive deeper into an impromptu architecture session.  On multiple occasions, I was able to get advice specific to my environment from both Kimberly and Paul.

The Food

Ask any teacher and they will tell you the importance of a good breakfast for learning.  If you have ever been to training in a strange city, you know how difficult it can be trying to get a good breakfast.  Between getting from the hotel to the training center and finding a place to eat; either you end up late to class, or you skip breakfast altogether.  In the end, your learning suffers.  Then comes lunch; you have to go find a place to eat, rush through, and still come back late.  Lather, rinse, repeat.

By providing a quality breakfast and lunch on site, all of those problems are removed.  Plus you get extra time to bond with your classmates and talk some shop.  These truly became valuable times where we were able to discuss stuff from class, trade stories, or share some techniques with one another.

The Camaraderie

Goose Island Honker's Ale
After each day's formal class ended, there was an informal session where the learning continued.  The first night, we had an ice breaker session.  We all met at the hotel bar to have a drink, loosen up, and really get to know each other.  This had the effect of making everyone comfortable with one another.  As the week progressed, we deepened our trust with one another, and were able to ask riskier questions and even take a stab at answering a few ourselves.

After our socializing, one of the locals organized a Portillo's run.  Portillo's is a local chain which serves Chicago style hotdogs and Italian beef sandwiches.  I gotta say, these were some of the best dogs I've ever had; I am a convert.

The second night, the Chicago SQL Server User Group meeting was hosted on site.  First, Kimberly ran a question and answer session with the audience.  The coolest part for me was the realization that I could answer many of the questions myself after being in her class for only two days.  Afterwards, we were treated to a demo session of SQL Sentry's Plan Explorer.  This is a free tool which gives you a much more useful view of Query Execution Plans than SSMS.  It renders the execution plan in a much better fashion, and allows you to sort the various pieces by relative cost, IOs, etc.  This allows you to drill down to the problem area very quickly.

On the third night, they took a group to the local Microsoft office to see a presentation by Andy Leonard on SSIS.  I must admit I opted out of this excursion.  By midweek, my batteries needed a little recharging, plus I had already seen his presentation during SQL PASS.

Thursday Free4All

For the fourth night, Kimberly and Paul opened up the floor for the students to have a go at presenting some mini-sessions.  We got to learn about survival skills for being a DBA, gained some insight from the perspective of a software vendor, and saw how to use SSDs in an office with a limited budget.  I think it is a wonderful opportunity to get some experience with presentations and to get some feedback from the masters.

Friday, we had some goodbye socializing.  It was a sad day, saying goodbye to the friends you have made during the week.  I think it speaks volumes when you consider how bummed we all were to have to leave and return to the real world.

The Location

Chicago
The Performance and Internals Immersion Event was held at the Oakbrook Terrace Hilton Suites just outside of Chicago from May 2nd through the 5th.  The hotel and rooms were very nice, and we received a great room rate courtesy of SQLskills, which included free Wi-Fi access.  The hotel bar had some excellent local brew on tap, Goose Island's Honker's Ale.  There was also a nice workout room, pool, and hot tub which allowed you to work off any stress and relax.  The hotel staff was very nice and ready to help with any requests.  If you wanted to explore the surrounding area, a shuttle was available that could take you to some local shopping or the commuter train station in Elmhurst.  From there, you could ride into downtown Chicago; $4 one way, $7 round trip.

Rick Bayless' Xoco
Chicago is a wonderful city with many attractions and friendly people.  Some fun excursions include the Navy Pier, Canal Boat Architecture Tour, Lakeshore Drive, and Steve's Segway Tours.  There are tons of great restaurants, as well.  One of my highlights was dining at Rick Bayless' Xoco.  I'm a huge fan of his Mexico - One Plate at a Time series on PBS.  Thank you to Brent Ozar (b | t) for recommending the restaurant to me.

Depending on the time of year, be sure to bring a sweater or jacket as Chicago can get a little cool. Especially if you're coming from Texas.  Most days were very pleasant with clear skies.

The Cost

The quality of education you receive from SQLskills is far and away superior to any you will receive from the standard Microsoft classes.  What is interesting is that they both cost the same.  A one week Microsoft class will set you back three large, assuming you don't have any vouchers; SQLskills charges the same price.  On top of that, they feed you a very high quality breakfast and lunch.  You get extra night-class sessions if you so desire.  You get to network with a variety of Senior DBAs from across the industry and planet.  And, you can even get some one-on-one advice from Kimberly Tripp and Paul Randal.

The Executive Summary

This is not a class for the beginner Database Administrator.  However, if you are mid to senior level, then this is where you want to be.  You will be in a class taught by the best, surrounded by the best.

Tuesday, May 10, 2011

T-SQL Tuesday #18

T-SQL Tuesday #18

T-SQL Tuesday is a rotating blog series where a topic is chosen by the host, and all the participants write an article on that topic.  This month, T-SQL Tuesday is being hosted by Bob Pusateri (blog | twitter) and the topic is Common Table Expressions.

Common Table Expressions

I must admit that I am a fairly recent convert to using CTEs, or Common Table Expressions.  I learned about CTEs while preparing for Exam 70-433, SQL Server 2008 Database Development.

In fact, when I first studied the topic, it went right over my head.  Looking back on it, I think they were so subtle that I didn't get it at first.  The way I look at it, a CTE is basically a mini-view that you can embed directly in your queries.
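
To make the mini-view idea concrete, here is a trivial example of my own; the CTE acts like an inline view over sys.databases that the outer query can immediately reuse.

[sql]

-- trivial made-up example: a CTE as a mini-view over sys.databases
with db as (
select name, database_id, recovery_model_desc
from sys.databases
)
select name, recovery_model_desc
from db
where database_id > 4; -- skip the system databases
go

[/sql]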

Once I grokked what CTEs were, I began applying them in numerous situations.  After I worked out a few simple examples to learn the ins and outs of CTEs, I dove right in and developed one that I use on an almost daily basis.

Buffer Pool

As a production DBA, I am constantly keeping tabs on the internals of my SQL Servers.  One area that I am particularly interested in is the Buffer Pool.  How big is it?  How much is each database using?  How many Dirty Pages do I have?  You get the idea.

I have seen various scripts from numerous sources that broke down the Buffer Pool by Database, Clean Pages, and Dirty Pages; but always as separate queries.  It bugged me to have to run multiple scripts to get my answers, so I set about to see if I could solve my problem using CTEs.  I figured if I could write a useful script using CTEs, then I should be able to answer any questions on the exam regarding this topic.

I must admit, the inspiration for the base SELECT statements for this script came from my readings of Professional SQL Server 2008 Internals and Troubleshooting.

So, without further ado, here is my script...

[sql]

-- T-SQL Tuesday #18
-- CTEs - Common Table Expressions

-- define the data cache CTE
with dc as (
select case database_id
when 32767 then 'ResourceDB'
else db_name(database_id)
end as 'Database',
count(*) * 8 / 1024 as 'Cached Size (MB)'
from sys.dm_os_buffer_descriptors
group by db_name(database_id), database_id
),

-- define the dirty pages CTE
dp as (
select db_name(database_id) as 'Database',
count(page_id) * 8 /1024 as 'Dirty Pages (MB)'
from sys.dm_os_buffer_descriptors
where is_modified = 1
group by DB_NAME(database_id)
)

-- link the data cache and dirty pages CTEs together
select dc.[Database],
dc.[Cached Size (MB)],
coalesce(dp.[Dirty Pages (MB)], 0) as 'Dirty Pages (MB)'
from dc left join dp
on dc.[Database] = dp.[Database]
order by dc.[Cached Size (MB)] desc;
go

[/sql]

Nothing too fancy here, just a basic everyday script that lets me know how much buffer pool each database is taking up, and how many pages are dirty.  I realize it would be much cooler if I were calculating the coefficient of thermal expansion with my CTE, but since I wrote this script on Valentine's Day, I saved the thermal expansion for later in the day.  :-)

Plan Cache

After my initial success, I went on to write several more CTE queries for various administrative tasks.  Additionally, I started using CTEs as a way to stitch together queries in a horizontal fashion, instead of vertically by using UNIONs.

Along with the Buffer Pool, another area that I am constantly examining is the Plan Cache.  One particular slice that I like to analyze is how much memory is being wasted on Single-Use Adhoc Query Plans.

One way I have done this is to run several queries and UNION them together.  This way I can glue my intermediate results together into one results set.  From here, I often import the results into Excel for Executive Formatting.

[sql]

-- use UNIONs to query one server
-- vertical formatting

-- all plans
select 'All Plans' as 'Data Point',
COUNT(*) as 'Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'MB'
from sys.dm_exec_cached_plans

union

-- adhoc plans
select 'Adhoc Plans' as 'Data Point',
COUNT(*) as 'Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'MB'
from sys.dm_exec_cached_plans
where objtype = 'Adhoc'

union

-- single-use adhoc plans
select 'Single-Use Adhoc Plans' as 'Data Point',
COUNT(*) as 'Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'MB'
from sys.dm_exec_cached_plans
where usecounts = 1
and objtype = 'Adhoc';
go

[/sql]

What I did next was to rewrite the queries using CTEs, which allows me to format the data horizontally instead of vertically.  Also, it made it easier for me to query across multiple servers and have the results formatted much closer to my final Excel format.

You may see my CTE version of this code by viewing yesterday's blog post.

Conclusion

So there you have it.  Using Common Table Expressions to stitch and glue T-SQL together.  Now head on down to Hobby Lobby and ask them where the CTEs are.

Monday, May 9, 2011

Filing for a Cache Refund

My original plan was to publish this article on April 18th in celebration of Tax Filing Day.  However, life has a way of happening, so I filed IRSQL Form 1433 and applied for an extension.

Say Brother, can you spare me some Cache?

Have you used up all of your Plan Cache?  Would you like a refund?  Are ad-hoc queries bloating your Plan Cache?  How can you tell?  What are ad-hoc queries?  What’s a Plan Cache?

Ad-hoc Queries

Ad-hoc queries are TSQL queries that are issued directly against the database.  Typically, the query text is created directly in the application, then sent directly to the database for the data.  This is very common in web-based applications such as ASP or PHP.  However, you also see this with traditional desktop applications written in C/C++ or Java.

Contrast this with Stored Procedures.  Similar to procedures in other programming languages, Stored Procedures can be thought of as functions, methods, or procedures that are internal to SQL Server.  They are written in TSQL but stored within SQL Server.  The first time a Stored Procedure is executed, SQL Server parses and compiles it, and stores the resultant Execution Plan in the Plan Cache.  The Execution Plan tells SQL Server how it is going to get the data to satisfy the request.

The reason you want to cache Query Plans is that if you run a query once, chances are you will run it again.  Some queries may run thousands of times during the course of a day.  It all depends upon the application.

Plan Cache

The Plan Cache is the area of memory where SQL Server stores the Execution Plans for any queries or stored procedures.  Compiling and caching the execution plan does take a few extra CPU cycles away from the server, so why would you do it?  The next time that query or stored procedure is run, SQL Server already knows how to get the data in an efficient manner.  As your application scales, and your query or stored procedure is being run hundreds or thousands of times, this helps the execution be as fast as possible.
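
To see this caching in action, you can create a throw-away procedure, run it, and then look for its plan in the cache.  The procedure name below is made up purely for illustration.

[sql]

-- illustrative only: a throw-away procedure (made-up name) and its cached plan
create procedure dbo.usp_GetDatabases
as
select name from sys.databases;
go

exec dbo.usp_GetDatabases;
go

-- the compiled plan now shows up with objtype = 'Proc'
-- (the search query itself may also appear in the results as an adhoc plan)
select decp.objtype, decp.usecounts, dest.text
from sys.dm_exec_cached_plans decp
cross apply sys.dm_exec_sql_text(decp.plan_handle) dest
where dest.text like '%usp_GetDatabases%';
go

[/sql]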

Single-Use Plans

The goal with caching Query Plans is for the execution count to be as high as possible.  This means we are getting the most bang for our buck.  However, if you have lots of single-use plans, then you are wasting CPU cycles calculating the plans, and you are wasting memory by storing them.

Every CPU cycle that is wasted compiling and caching a single-use Query Plan is a CPU cycle that could have been doing some work for your application.  Every byte of memory that is wasted storing a single-use Query Plan is memory that could have been used somewhere else.

If you have a high number of single-use plans in your cache, this could be contributing to high CPU utilization, excessive IO, or low memory on your server.

How to Find Ad-hoc Queries and Single-Use Plans

One way to find out if you have ad-hoc queries hitting your database is to ask the development teams if they have any ad-hoc queries in their code.  There are a couple of problems with this approach.

For one, they may not be aware if they are using ad-hoc queries.  If you have a large application, each developer may only be familiar with a small part of it.  Also, if you have inherited an existing application, it may be difficult or impossible to wrap your head around the entire code base.

Another scenario is that the developers may be aware of some of their ad-hoc queries, but may underestimate the number of times they are being called, or they may not realize the impact they have upon the system.

A better approach is to ask SQL Server to tell you what sort of plans it has cached.

sys.dm_exec_cached_plans

SQL Server provides us with a Dynamic Management View, sys.dm_exec_cached_plans, which lists out every Execution Plan that is currently stored in the Plan Cache.  At the most basic level, you can query this DMV to see the details on every Query Plan in your system.

[sql]

-- basic query
select * from sys.dm_exec_cached_plans

[/sql]

By itself, this is not very useful. Of course, like all SQL queries you can add various WHERE clauses or aggregates in order to make it more meaningful for you.
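
For instance, one refinement I find handy (my own example, not from the original scripts) is to summarize the cache by object type, which gives a quick feel for where the memory is going.

[sql]

-- example refinement: summarize the plan cache by object type
select objtype,
count(*) as 'Plan Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'MB'
from sys.dm_exec_cached_plans
group by objtype
order by [MB] desc;
go

[/sql]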

So, how much are you wasting?

The way I approached this problem was to calculate how much memory the entire Plan Cache was using, then compare that to how much the Single-Use Adhoc Plans were using.  Once I had those numbers, I did some quick math to figure out what percentage of my Plan Cache was going to Single-Use Adhoc Plans.

[sql]

-- check percentage of the plan cache that is wasted on single-use adhoc plans

-- use a CTE to check across multiple servers
-- horizontal formatting
-- all plans
;with all_plans as (
select @@SERVERNAME as 'ServerName',
COUNT(*) as 'All Plans Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'Total MB',
avg(cast(size_in_bytes as bigint)) / 1024 as 'Average KB'
from sys.dm_exec_cached_plans
),

-- adhoc plans
adhoc_plans as (
select @@SERVERNAME as 'ServerName',
COUNT(*) as 'Adhoc Plans Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'Total MB',
avg(cast(size_in_bytes as bigint)) / 1024 as 'Average KB'
from sys.dm_exec_cached_plans
where objtype = 'Adhoc'
),

-- single-use adhoc plans
single_adhoc as (
select @@SERVERNAME as 'ServerName',
COUNT(*) as 'Single-Use Adhoc Plans Count',
sum(cast(size_in_bytes as bigint)) / 1024 / 1024 as 'Total MB',
avg(cast(size_in_bytes as bigint)) / 1024 as 'Average KB'
from sys.dm_exec_cached_plans
where usecounts = 1
and objtype = 'Adhoc'
)

-- query the CTEs

-- geek version, show all the data
select ap.[All Plans Count], ap.[Total MB], ap.[Average KB],
ah.[Adhoc Plans Count], ah.[Total MB], ah.[Average KB],
sa.[Single-Use Adhoc Plans Count], sa.[Total MB], sa.[Average KB],
cast(cast(sa.[Total MB] as numeric) / cast(ap.[Total MB] as numeric) * 100 as numeric(5,2)) as 'Percent Wasted on Single-Use Adhoc Plans'
from all_plans ap
join adhoc_plans ah
on ap.ServerName = ah.ServerName
join single_adhoc sa
on ah.ServerName = sa.ServerName

[/sql]

I also have an Executive Version, which shows the Percentage Wasted without adding in all those confusing details.  Replace the Geek Version with this one when you query the CTEs.

[sql]

-- executive summary, show how much memory is being wasted on single-use adhoc plans
select ap.[Total MB] as 'All Plans MB',
sa.[Total MB] as 'Single-Use Adhoc MB',
cast(cast(sa.[Total MB] as numeric) / cast(ap.[Total MB] as numeric) * 100 as numeric(5,2)) as 'Percent Wasted on Single-Use Adhoc Plans'
from all_plans ap
join adhoc_plans ah
on ap.ServerName = ah.ServerName
join single_adhoc sa
on ah.ServerName = sa.ServerName

[/sql]

Getting Buy In

If you encounter resistance to changing this setting, run this query against multiple servers and save the results in Excel.  Bosses love Excel; it helps them think.  :-)

Cache Refund

Even better, write up a simple Word Document or PowerPoint Presentation and tie it back to the business or the budget.  That will make an impression.

How much did you spend on memory when you built the server?  Do you see evidence of memory pressure on your server?  Are some of your key business processes running slow because they are not able to get enough memory?

One technique I have used is to put your conclusion first.  Then, lead them along the path that led you to your conclusion.  Put in a few key graphs and screenshots, and voila!  Many managers will not read beyond the first slide or paragraph so you need to work on getting your core message conveyed in a very brief space.

How to fix it

The fix is quite simple: optimize for adhoc workloads.

[sql]

-- show advanced options - enables all options to be seen / set
use [master]
go
exec sp_configure 'show advanced options', 1
go
reconfigure
go

-- check to see if it is turned on
sp_configure 'optimize for ad hoc workloads'
go

-- optimize for ad hoc workloads
sp_configure 'optimize for ad hoc workloads', 1
go
reconfigure
go

-- verify that it has been set correctly
sp_configure 'optimize for ad hoc workloads'
go

[/sql]

This is an Instance Level setting that tells SQL Server not to compile and cache the Query Plan the first time it encounters an Adhoc Query.  Instead, it compiles a Plan Stub.

The Plan Stub is a small identifier that allows SQL Server to recognize that a particular plan has been looked at before.  This way, if it does end up getting multiple uses down the road, SQL Server is able to compile and cache the plan as normal.

For many of my servers, the Plan Stub was an order of magnitude smaller than the size of the full Query Plan; 320 bytes versus 25,000-35,000 bytes for the average Single-Use Adhoc Query Plan.  Of course, your mileage may vary.
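
If you want to see the stubs for yourself after enabling the setting, sys.dm_exec_cached_plans exposes them with a cacheobjtype of 'Compiled Plan Stub'.  Here is a small sketch of how I would check.

[sql]

-- sketch: look at the plan stubs that replace single-use adhoc plans
select cacheobjtype, objtype,
count(*) as 'Stub Count',
avg(cast(size_in_bytes as bigint)) as 'Average Bytes'
from sys.dm_exec_cached_plans
where cacheobjtype = 'Compiled Plan Stub'
group by cacheobjtype, objtype;
go

[/sql]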

Make it so, Number One!

Something to keep in mind, is that you will not be able to predict or control what SQL Server will decide to use its newfound memory on.  Kimberly Tripp (blog | twitter) has an excellent article which discusses some of the behaviors you may observe.

Conclusion

Taking a look inside your Plan Cache can be an illuminating experience.  While you may not get any money back, it is worth looking into.