Blog

My Precon At PASS Data Summit: The Foundations Of SQL Server Performance Tuning @PASSDataSummit #PASSDataSummit

The Foundations Of SQL Server Performance Tuning


This year at the PASS Data Summit, I’m presenting my precon The Foundations Of SQL Server Performance Tuning:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.
Goals:

  • Understand which metrics matter for performance

  • Using Extended Events and Query Store to find problems

  • Solving locking, deadlocking, and general query performance issues

What-For


A lot of the time when I teach performance tuning, I like to show attendees a specific query with some specific problem(s), and how I go about fixing those in different scenarios.

But a lot of the time when I talk to clients and other folks in the SQL Server community, they have a hard time even knowing where to start and what to look for.

In this session, I want to show you how I figure out what kind of shape a server is in, performance-wise, and how I use different clues to figure out which queries to go after.

Anyone can find a slow query; I want to find the queries that users complain about when their loading screens sit there spinning for a millennia. And that’s what I want to show you how to do.

Think of yourself like a conductor in front of an orchestra of totally untuned instruments. You may find some things:

  • It’s hard to figure out which one to tune first
  • Tuning one thing in isolation might not translate to a big change overall
  • You might be tuning something that plays an insignificant role in the symphony

Bribery


All attendees will get a coupon code to my full training catalog.

Along with that, everyone loves these t-shirts, so I’ll be bringing them back.

I’m also going to be making a new one that is TOP secret, along with some spankin’ new stickers for you to slather your laptops with.

Only precon attendees will get one of these.

Thanks for reading, and see you there!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Perils Of Change: Max Degree Of Parallelism

Downtune


Like in yesterday’s post about Cost Threshold For Parallelism, changing MAXDOP settings will have a universal effect on the workload.

This is true whether you change it at the server level for all databases, or at the database level using a database scoped configuration for a single database.

It is a guardrail to prevent unwanted conditions as a whole, like excessive concurrent parallel queries causing worker thread starvation (THREADPOOL waits), or just pushing CPU to 100% for extended periods of time.

You may also find times when parallel threads within a single query are more prone to involvement in blocking or deadlocking.

The problem is similar to yesterday’s post, and requires some level of attention to detail at the query level after you change MAXDOP.

Don’t just look at all the various CX waits, pronounce that there are fewer of them, and call the job done.

You should examine parallel query metrics to make sure that runtimes are still adquate.

Real World


If we take a single query and run it at reduced DOPs, you can see the pattern I’m talking about.

Yesterday, we looked at DOP 8 vs. DOP 1, and I’m including a DOP 1 run here too. It’s just not as important, unless you’re the kind of wackadoo SharePoint admin who changes MAXDOP to 1.

(I’m going to be honest with you here, it’s been YEARS since I’ve seen a SharePoint database, or looked at the SharePoint documentation to see if the MAXDOP 1 requirement still exists).

quality sleep

When you change DOP, it’s up to you to monitor (you can use Query Store for this) execution and CPU time to look for big regressions.

Keeping the above picture in mind:

  • Is it okay if this query runs for ~300ms longer at DOP 4?
  • Is it okay if this query runs for ~1.3 seconds longer at DOP 2?

Taking a full three seconds longer at DOP 1 is probably out the door for queries that users care about, but that’s more likely from changing Cost Threshold For Parallelism.

A diligent caretaker for SQL Server may even look for critical queries that currently use parallel execution plans, and time them with a lower DOP to make sure there’s no substantial regression before making the change.

If there’s a big increase in duration, add the higher DOP hint to queries where it makes sense.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Perils Of Change: Cost Threshold For Parallelism

Non-Default


First, I’m not suggesting that anyone should be using the default value for Cost Threshold For Parallelism. It’s old and moldy and not a good fit for most workloads functioning on modern hardware.

My apologies to Azure SQLDB users who can’t change this setting and leave it up to Microsoft to maybe manage it for them based on ???

Some people out there really like fiddling with settings in a usually ill-informed reaction to Some Script They Found On The Internet, without reading the fine print.

Constantly pushing Cost Threshold For Parallelism up higher because there’s “too much parallelism” is usually a bad strategy, for several reasons.

You should be looking at:

  • Which queries are going parallel
  • Why parallel plans are being chosen
  • The duration vs. CPU of parallel queries

Just to put some easy numbers out there, let’s say you have a query that runs at DOP 8 and:

  • Runs for 8 seconds
  • Uses 64 seconds of CPU

If you push Cost Threshold For Parallelism up to the point where that query no longer qualifies for a parallel plan and runs at DOP 1, it will:

  • Run for 64 seconds
  • Use 64 seconds of CPU time

See why this isn’t a bold and exciting strategy embraced by experts around the world?

Scaling


Of course, in the real world, parallelism rarely scales perfectly linearly. That’s okay, as long as you’re seeing a reasonable reduction in wall clock time.

In this picture, I’ve captured the same query running at DOP 8 and 1.

scaling

What I said above holds close enough to true.

  • At DOP 8, we used ~6.5 seconds of CPU time in ~800ms of wall clock time
  • At DOP 1, we used ~3.9 seconds of CPU time in ~3.9 seconds of wall clock time

While we used more CPU time at DOP 8, we only used about 2.5 seconds more of it, and we used it for about 3 fewer seconds overall.

This is the efficiency tradeoff parallelism offers, and this is especially desirable for queries that have to process a lot of data.

Sticking with DOP 8, I’d rather process 10 million rows across 8 threads than 10 million rows on a single thread (especially in Row Mode, but it would be still be preferable in Batch Mode).

Anyway, if you’ve got Cost Threshold For Parallelism set to the default, you should probably bump it up. I still think 50 is a good starting place to get you to a spot where you can make sane decisions.

Remember, global settings are the guard rails for your workload. They are the general rules you want queries to use.

After you set up the guard rails, it’s up to you to define the exceptions.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The SQLBits 2023 Call For Speakers Is Open! @SQLBits #SQLBits

What A Great Time To Be Alive


I absolutely adore the SQLBits conference, and it was the Biggest Bummer In Bummsville last year when I had family travel plans that overlapped with the conference dates.

So this year, I’m back with a mach 5 vengeance, and ready to get back to the most highingest-flyingest conference in the world.

If you’re interested, and you wanna throw your hat in the ring, here are the links for general and full training day submissions:

To learn more about speaking at the event, click here.

To learn more about attending the event, click here.

I’ll have all my fingers and toes crossed until November 14th when the training day schedule is announced.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About The CXSYNC_PORT Wait In SQL Server 2022

A Little About The CXSYNC_PORT Wait In SQL Server 2022


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Performance Regression With Batch Mode Sorts When Spilling To Disk

Gress and Regress


To vote for this issue, please click here.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Waiting Life Of A SQL Server Query, Revisited

Often Asked


Sometimes people will ask me penetrating questions like “why does SQL Server choose a bad execution plan?” or “why is this query sometimes slow?”

Like many things in databases, it’s an endless spiral of multiverses (and turtles) in which many choose your own adventure games are played and, well, sometimes you get eaten by a Grue.

In this post, I’m going to talk at a high level about potential reasons for both.

I can’t possibly go in depth on both in a single blog post, but I thought it would be fun to talk about all the things that could possibly cause your Unfortunate Circumstances™️.

At any rate, I hope you enjoy it. And maybe learn something. And maybe feel bad about blaming SQL Server for things that are all your fault.

Getting There


A lot of assumptions are gonna get made here, because a lot of things can stand between you and your query getting to SQL Server.

We’re talking networks, security, Active Directory, the cloud. Who knows? Not me. I stick to what happens inside the server.

Just know that there are a lot of moving parts between your application server and your database server.

I’ve seen logins take 30+ seconds to process. Not on purpose, mind you. I stick to what happens inside the server.

But hey, maybe we can just say this all worked quickly, and now we can talk about the stuff I stick to.

Inside the server.

Upon Arrival


There are even more things that have to happen now, even after you log in.

Your connection has to pass some checks and do a bit of routing, and then at a high level, your query will either:

  • Re-use a cached plan if one exists
  • Start building a query plan

This is when SQL Server’s query processor comes along and goes through the various phases of figuring things out:

  • Parse
  • Bind
  • Expand views
  • OPTIMIZE!
  • Come up with an executable plan
  • Apply memory grant and degree of parallelism

There’s one step in that process that seem important. The actual optimization portion. In there, you have:

  • Simplification
  • Cardinality estimation
  • Join ordering (depends a bit on which search phase you make it to: 0, 1, 2
  • Plan exploration phases

Rules, Smoky


During plan exploration is when the optimizer starts applying various rules to your query.

If you want to see the names of all the crazy rules that get may get applied, look in here:

SELECT
    deqts.*
FROM sys.dm_exec_query_transformation_stats AS deqts
ORDER BY
    deqts.name;

In SQL Server 2022, there are about 440 of them. It’s a wonder anything gets done around here.

This process has to happen quickly. So when you throw your crazy-town queries at SQL Server, it may not figure out the single best possible plan ever.

If your queries are parameterized, this brand spankin’ new plan will be created for those parameters.

At this point, it would be good to remind you that your query hasn’t even started executing yet.

It’s just been waiting to get going.

Once it has an executable plan, that happens.

Inside the server.

Ready Set Go


Once your query gets on its horse, you’re subject to many potential holdups, both physical and logical. It may even have to wait for memory to compile an execution plan at times.

  • Memory to compile a query plan (RESOURCE_SEMAPHORE_QUERY_COMPILE)
  • Read pages from disk into memory (PAGEIOLATCH_SH, PAGEIOLATCH_EX)
  • Writing to the transaction log (WRITELOG)
  • Memory for a memory grant (RESOURCE_SEMAPHORE)
  • CPU threads to run on (THREADPOOL)
  • CPU attention while other queries are busy (SOS_SCHEDULER_YIELD)
  • Spilling to disk (IO_COMPLETION, SLEEP_TASK)
  • Building spools (EXECSYNC — parallel plans only)
  • Blocking (LCK_X)
  • Latching (LATCH_XX and PAGELATCH_XX)
  • Parallel threads to process (All those CX waits)

Not enough memory? Wait. Not enough CPU? Wait. Need to read data? Wait. Need to write data? Wait. Blocked? Wait.

And that doesn’t even count all the work your query might have to do while it’s running and not waiting on anything.

The next time you have a gripe about a slow query, try to keep all this stuff in mind.

There’s a lot going on.

Inside the server.

Since You Want To Know


A great way to see all the stuff your queries are waiting on all at once is to run sp_PressureDetector.

Inside the server.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

Three Free Things You Should Do Before You Have Your Next SQL Server Performance Problem

Attribution


SQL Server doesn’t do much to help you figure out what’s wrong with it. Sure, it makes a lot of information available, but logging, decoding, and assembling all that information is hard work.

And it sure doesn’t help that a lot of it is stored in XML. Even when there are graphical options for displaying it, they end up hiding a lot of the really useful information.

It also has a lot of optional things that you have to turn on, and all of these things require you to monitor and alert on them.

In this post, I’m going to talk about why you should turn on Query Store, the Blocked Process Report, and capture deadlocks, and link you to free tools to analyze them.

This is all stuff that I help clients with as well, along with actually helping them fix the problems, but if you’re feeling keen on getting to it yourself, here’s what you should do.

Query Store


This thing is great for most SQL Servers out there. It’s like the plan cache, except it won’t keep disappearing on you.

To get it up and running, just run this command:

ALTER DATABASE 
    [YourDatabase]
SET QUERY_STORE 
(
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 1024,
    QUERY_CAPTURE_MODE = AUTO
);

After it’s turned on, you can check in on it in two ways:

Whichever one you use, you’re going to be in much better shape than hoping what you need is in the plan cache after something went bump in the night.

EXEC sp_QuickieStore
    @database_name = 'your awesome database';

Just running this will go back up to seven days and give you the top 10 queries by average CPU time. In other words, it’s the stuff that uses the most CPU whenever it runs.

If you find stuff you don’t care about, just copy and paste the contents of the query_id column, and do this to ignore them:

EXEC sp_QuickieStore
    @database_name = 'your awesome database',
    @ignore_query_ids = '1, 2, 3, 4, 5, 6, 7, 8';

Blocked Processes


The next thing you’ll wanna do is start logging which queries are blocking each other. You never know when this will ratchet up and become a real big problem.

The number of times I hear “we had a bad blocking problem” but no one even has a screenshot of it hanging out would boggle the very depths of your mind.

To avoid that, you need to:

  • Enable the blocked process report
  • Enable something to collect it

First, do this:

EXEC sys.sp_configure
    N'show advanced options',
    1;
RECONFIGURE;
GO
EXEC sys.sp_configure
    N'blocked process threshold',
    5; --Seconds
RECONFIGURE;
GO

This will get SQL Server capturing any blocking events that go on longer than five seconds. This is about the level where people start hitting problems, but you can set it to a higher number if you feel like it.

Next, you’ll want to set up an Extended Event to house the data from it:

CREATE EVENT SESSION 
    blocked_process_report
ON SERVER
    ADD EVENT 
        sqlserver.blocked_process_report
    ADD TARGET 
        package0.event_file
    (
        SET filename = N'bpr'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    blocked_process_report
ON SERVER 
    STATE = START;
GO

If you’re using Azure SQL DB, you’ll need to tweak this to use it successfully. It’s a lot harder to successfully set up and access event files there, and the ring buffer is far less valuable because it doesn’t hold onto data very long.

The easiest way to get at the data in there is to use my stored procedure sp_HumanEventsBlockViewer.

It’s as simple as pointing it at the event session you created, but there are a whole bunch of other parameters to filter data in or out.

EXEC dbo.sp_HumanEventsBlockViewer
    @session_name = N'blocked_process_report';

 

Deadlocks


The final thing you should do is set up something to collect deadlocks in the long term.

To do that, Extended Events is your friend all over again.

CREATE EVENT SESSION 
    deadlock
ON SERVER
    ADD EVENT 
        sqlserver.xml_deadlock_report
    ADD TARGET 
        package0.event_file
    (
        SET filename = N'deadlock'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    deadlock
ON SERVER 
    STATE = START;
GO

While this stored procedure isn’t in my GitHub repo, it’s one that I do a heck of a lot of work on. The easiest way to parse everything out is with sp_BlitzLock.

It’s as simple as pointing it at your extended event session.

EXEC dbo.sp_BlitzLock
    @EventSessionName = N'deadlock';

Nice To Have


Having these three things set up and running will make your life a lot easier the next time you face a performance emergency with your SQL Server.

Sure, there are a lot of other things that could go wrong, but these are the most common areas I see clients having problems in.

If you need to do a post mortem on an issue, you might want to look at:

  • sp_LogHunter: Dig through your error logs for high severity issues
  • sp_HealthParser: Dig through the system health extended event for a wide variety of issues

And as always, if you need help working through these kinds of problems, hit the link below to get some consulting help from me.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About Select Queries That Cause Writes In SQL Server

A Little About Select Queries That Cause Writes In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Why I Love NOLOCK Hints In SQL Server (Video Edition)

Oh, Just Do it.


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.