Attribution
SQL Server doesn’t do much to help you figure out what’s wrong with it. Sure, it makes a lot of information available, but logging, decoding, and assembling all that information is hard work.
And it sure doesn’t help that a lot of it is stored in XML. Even when there are graphical options for displaying it, they end up hiding a lot of the really useful information.
It also has a lot of optional features that you have to turn on yourself, and even after you do, monitoring and alerting on them is still your job.
In this post, I’m going to talk about why you should turn on Query Store, the Blocked Process Report, and capture deadlocks, and link you to free tools to analyze them.
This is all stuff that I help clients with as well, along with actually helping them fix the problems, but if you’re feeling keen on getting to it yourself, here’s what you should do.
Query Store
This thing is great for most SQL Servers out there. It’s like the plan cache, except it won’t keep disappearing on you.
To get it up and running, just run this command:
ALTER DATABASE
    [YourDatabase]
SET QUERY_STORE
(
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 1024,
    QUERY_CAPTURE_MODE = AUTO
);
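To make sure it actually kicked in, you can check the database-level options view. Run this in the context of the database you just changed:

SELECT
    dqso.actual_state_desc,
    dqso.desired_state_desc,
    dqso.current_storage_size_mb,
    dqso.max_storage_size_mb
FROM sys.database_query_store_options AS dqso;

If actual_state_desc comes back as READ_WRITE, you're in business.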
After it’s turned on, you can check in on it in two ways:
- Through the GUI in SSMS
- With my stored procedure, sp_QuickieStore
Whichever one you use, you’re going to be in much better shape than hoping what you need is in the plan cache after something went bump in the night.
EXEC sp_QuickieStore @database_name = 'your awesome database';
Just running this will go back up to seven days and give you the top 10 queries by average CPU time. In other words, it’s the stuff that uses the most CPU whenever it runs.
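If you want a different time window or a different metric, there are parameters for those too. Something like this will do it (parameter names can shift between releases, so check the documentation for the version you've got):

EXEC sp_QuickieStore
    @database_name = 'your awesome database',
    @sort_order = 'logical reads',
    @top = 20,
    @start_date = '20240101',
    @end_date = '20240108';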
If you find stuff you don’t care about, just copy and paste the contents of the query_id column, and do this to ignore them:
EXEC sp_QuickieStore @database_name = 'your awesome database', @ignore_query_ids = '1, 2, 3, 4, 5, 6, 7, 8';
Blocked Processes
The next thing you’ll wanna do is start logging which queries are blocking each other. You never know when this will ratchet up and become a real big problem.
The number of times I hear “we had a bad blocking problem” but no one even has a screenshot of it hanging out would boggle the very depths of your mind.
To avoid that, you need to:
- Enable the blocked process report
- Enable something to collect it
First, do this:
EXEC sys.sp_configure
    N'show advanced options', 1;
RECONFIGURE;
GO

EXEC sys.sp_configure
    N'blocked process threshold', 5; --Seconds
RECONFIGURE;
GO
This will get SQL Server capturing any blocking events that go on longer than five seconds. This is about the level where people start hitting problems, but you can set it to a higher number if you feel like it.
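If you want to make sure the setting actually stuck, sys.configurations will show you the value in use:

SELECT
    c.name,
    c.value,
    c.value_in_use
FROM sys.configurations AS c
WHERE c.name = N'blocked process threshold (s)';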
Next, you’ll want to set up an Extended Event to house the data from it:
CREATE EVENT SESSION
    blocked_process_report
ON SERVER
ADD EVENT
    sqlserver.blocked_process_report
ADD TARGET
    package0.event_file
    (
        SET filename = N'bpr'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    blocked_process_report
ON SERVER
STATE = START;
GO
If you’re using Azure SQL DB, you’ll need to tweak this to use it successfully. Event files are a lot harder to set up and access there, and the ring buffer is far less useful because it doesn’t hold onto data for very long.
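If you do go down that road, the rough shape of it is a database-scoped session with the ring buffer target. This is only a sketch, and it assumes the blocked process report event is actually available in your service tier:

CREATE EVENT SESSION
    blocked_process_report
ON DATABASE
ADD EVENT
    sqlserver.blocked_process_report
ADD TARGET
    package0.ring_buffer;
GO

ALTER EVENT SESSION
    blocked_process_report
ON DATABASE
STATE = START;
GO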
The easiest way to get at the data in there is to use my stored procedure sp_HumanEventsBlockViewer.
It’s as simple as pointing it at the event session you created, but there are a whole bunch of other parameters to filter data in or out.
EXEC dbo.sp_HumanEventsBlockViewer @session_name = N'blocked_process_report';
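For example, to narrow things down to one database over a specific window, something like this should do it (check the documentation for the exact parameter names):

EXEC dbo.sp_HumanEventsBlockViewer
    @session_name = N'blocked_process_report',
    @database_name = N'your awesome database',
    @start_date = '20240101',
    @end_date = '20240108';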
Deadlocks
The final thing you should do is set up something to collect deadlocks in the long term.
To do that, Extended Events is your friend all over again.
CREATE EVENT SESSION
    deadlock
ON SERVER
ADD EVENT
    sqlserver.xml_deadlock_report
ADD TARGET
    package0.event_file
    (
        SET filename = N'deadlock'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    deadlock
ON SERVER
STATE = START;
GO
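If you ever want to poke at the raw data without any helper procedures, you can read the event files directly. The file name pattern below is a guess; point it at wherever the .xel files actually landed:

SELECT
    deadlock_xml =
        CAST(fx.event_data AS xml)
FROM sys.fn_xe_file_target_read_file
(
    N'deadlock*.xel', /* use the full path to your .xel files if they're not in the default log directory */
    NULL,
    NULL,
    NULL
) AS fx;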
The easiest way to parse everything out is with sp_BlitzLock. While that stored procedure isn’t in my GitHub repo, it’s one that I do a heck of a lot of work on.
It’s as simple as pointing it at your extended event session.
EXEC dbo.sp_BlitzLock @EventSessionName = N'deadlock';
Nice To Have
Having these three things set up and running will make your life a lot easier the next time you face a performance emergency with your SQL Server.
Sure, there are a lot of other things that could go wrong, but these are the most common areas I see clients having problems in.
If you need to do a post mortem on an issue, you might want to look at:
- sp_LogHunter: Dig through your error logs for high severity issues
- sp_HealthParser: Dig through the system health extended event for a wide variety of issues
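Neither one needs any parameters to do a useful first pass:

EXEC dbo.sp_LogHunter;

EXEC dbo.sp_HealthParser;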
And as always, if you need help working through these kinds of problems, hit the link below to get some consulting help from me.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.
Comments
Outside of an Elastic Pool, individual Azure SQL Databases have a limit of 256 MB each for the Premium DTU model and 128 MB for all others. This is generally not a problem. However, it is important to note that there is a limit to the amount of memory available for Extended Events sessions at the Elastic Pool level, for all sessions in all databases within the pool **combined** (512 MB).
So, if you have many databases in an Elastic Pool, you can easily run into that limit, especially if you have a monitoring tool that creates its own Extended Events sessions. That could limit the number of databases you can keep in one pool, and therefore increase your Azure costs by requiring additional pools. I have run into this myself.
Reference: https://techcommunity.microsoft.com/t5/azure-database-support-blog/lesson-learned-426-identifying-and-mitigating-query-regressions/ba-p/3924281
Yeah, I’ve run into that working with some clients using Azure SQL DB. It’s quite frustrating, especially because of how difficult it is to work with files in Extended Events there.