SQL Server Needs Built-In Performance Views

No, Not More DMVs


Though I would be cool with new ones, as long as they’re not memes.

When you’re trying to gauge some high level performance metrics on a new server, you probably have your favorite scripts.

They could be various Blitzes, they could be some stuff you’ve had sitting in your script folder, maybe they’re from the Tiger Toolbox.

Whatever.

The point is that you, dear reader, are smart and engaged enough to know about and use these things.

A lot of people aren’t.

I’m not talking about another thing to go find and install. I mean these should come with the product.

Perf Schema


It would be really cool if SQL Server had a system schema called perf. In there you could have views to all sorts of neat things.

It would exist in every database, and it would have views in it to fully assemble the mess of DMVs that accompany:

  • Query Store
  • Plan Cache
  • Index Usage
  • Missing Indexes
  • File Stats
  • Wait Stats
  • Locking
  • Deadlocks

Assembling all those views is painful for beginners (heck Query Store is painful for everyone). Worse, they may find scripts on the internet that are wrong or outdated (meaning they may not have new columns, or they may give outdated advice on things).

What would make these particularly helpful is that they could aggregate metrics at the database level. Server-wide counters are cool until your server is really wide, and it’s impossible to tell where stuff like wait stats are coming from. This wouldn’t be too difficult to implement, since Azure SQLDB already has to have a bunch of self-contained stuff, due to the lack of cross-database queries.

Best of all, Microsoft can keep them up to date based on which version and edition of SQL Server you’re on, and if certain changes get back ported.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Getting Smart About Fixing Key Lookups In SQL Server Query Plans To Fix Performance Problems

Unavoidable


Having some key lookups in your query plans is generally unavoidable.

You’ll wanna select more columns than you wanna put in a nonclustered index, or ones with large data types that you don’t wanna bloat them with.

Enter the key lookup.

They’re one of those things — I’d say even the most common thing — that makes parameterized code sensitive to the bad kind of parameter sniffing, so they get a lot of attention.

The thing is, most of the attention that they get is just for columns you’re selecting, and most of the advice you get is to “create covering indexes”.

That’s not always possible, and that’s why I did this session a while back on a different way to rewrite queries to sometimes make them more efficient. Especially since key lookups may cause blocking issues.

Milk and Cookies


At some point, everyone will come across a key lookup in a query plan, and they’ll wonder if tuning it will fix performance.

There are three things to pay attention to when you look at a key lookup:

SQL Server Query Plan Tool Tip
I know what to do
  1. Number of executions: This is usually more helpful in an actual plan
  2. If there are any Predicates involved: That means there are parts of your where clause not in your nonclustered index
  3. If there’s an Output List involved: That means you’re selecting columns not in your nonclustered index

For number of executions, generally higher numbers are worse. This can be misleading if you’re looking at a cached plan because… You’re going to see the cached number, not the runtime number. They can be way different.

Notice I’m not worried about the Seek Predicates here — that just tells us how the clustered index got joined to the nonclustered index. In other words, it’s the clustered index key column(s).

Figure It Out


Here’s our situation: we’re working on a new stored procedure.

CREATE PROCEDURE dbo.predicate_felon (@Score INT, @CreationDate DATETIME)
AS
BEGIN

    SELECT *
    FROM dbo.Comments AS c
    WHERE c.Score = @Score
    AND   c.CreationDate >= @CreationDate
    ORDER BY c.CreationDate DESC;

END;

Right now, aside from the clustered index, we only have this nonclustered index. It’s great for some other query, or something.

CREATE INDEX ix_whatever 
ON dbo.Comments (Score, UserId, PostId)
GO

When we run the stored procedure like this, it’s fast.

EXEC dbo.predicate_felon @Score = 6, --Sixer
                         @CreationDate = '2013-12-31';
SQL Server Query Plan
SEND IT TO PRESS

SQL Server wants an index — a fully covering index — but if we create it, we end up a 7.8GB index that has every column in the Comments table in it. That includes the Text column, which is an NVARCHAR(700). Sure, it fixes the key lookup, but golly and gosh, that’s a crappy index to have hanging around.

Bad Problems On The Rise


The issue turns up when we run the procedure like this:

EXEC dbo.predicate_felon @Score = 0, --El Zero
                         @CreationDate = '2013-12-31';
SQL Server Query Plan
Not so much.

This happens because there are a lot more 0 scores than 6 scores.

SQL Server Query Results
Quiet time

Smarty Pants


Eagle eyed readers will notice that the second query only returns ~18k rows, but it takes ~18 seconds to do it.

The problem is how much time we spend locating those rows. Sure, we can Seek into the nonclustered index to find all the 0s, but there are 20.5 million of them.

Looking at the actual plan, we can spot a few things.

SQL Server Query Plan
Hunger Management
SQL Server Query Plan
Hangman

The 18k rows we end up with are only filtered to with they key lookup, but it has to execute 20.5 million times to evaluate that extra predicate.

If we just index the key columns, the key lookup to get the other columns (PostId, Text, UserId) will only execute ~18k times. That’s not a big deal at all.

CREATE NONCLUSTERED INDEX keys_only
    ON dbo.Comments ( Score, CreationDate );

This index is only ~500MB, which is a heck of a lot better than nearly 8GB covering the entire thing.

With that in place, both the score 6 and score 0 plans are fast.

SQL Server Query Plan
rq

Why This Is Effective, and When It Might Not Be


This works here because the date filter is restrictive.

When we can eliminate more rows via the index seek, the key lookup is less of a big deal.

If the date predicate were much less restrictive, say going back to 2011, boy oh boy, things get ugly for the 0 query again.

EXEC dbo.predicate_felon @Score = 6,
                         @CreationDate = '2011-12-31';

EXEC dbo.predicate_felon @Score = 0,
                         @CreationDate = '2011-12-31';
SQL Server Query Plan
Typical

Of course, returning that many rows will suck no matter what, so this is where other techniques come in like Paging, or charging users by the row come into play.

What? Why are you looking at me like that?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Going To SQL Saturday Portland? Come To My Precon!

Training Daze


I’ve delighted to announce that I’ve been selected to present a full day session for SQL Saturday Portland.

The Oregon one, not the Maine one.

I’ll be delivering my Total Server Tuning session, where you’ll learn all sorts of horrible things about SQL Server.

I’m going to be talking about how queries interact with hardware, wait stats that matter, and query tuning.

Seats are limited, so hurry on up and get yourself one before you get FOMO in your MOJO.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Unused Indexes Hurt: Lock Escalation

Facts


Yesterday we looked at how unused indexes can muck up the buffer pool, because writes to them will bring pages in.

Today we’re going to look at how unused indexes add to locking problems.

Hold My Liquor


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Unused Indexes Hurt SQL Server Performance: Buffer Pool Space

Lost In The Woods


When you find unused indexes, whether using Some Script From The Internet™, sp_BlitzIndex, or Database Telepathy, the first thing most people think of is “wasted space”.

Sure, okay, yeah. That’s valid. They’re in backups, restores, they get hit by CHECKDB. You probably rebuild them if there’s a whisper of fragmentation.

But it’s not the end of the story.

Not by a long shot.

Today we’re going to look at how redundant indexes can clog the buffer pool up.

Holla Back


If you want to see the definitions for the views I’m using, head to this post and scroll down.

Heck, stick around and watch the video too.

LIKE AND SUBSCRIBE.

Now, sp_BlitzIndex has two warnings to catch these “bad” indexes:

  • Unused Indexes With High Writes
  • NC Indexes With High Write:Read Ratio

Unused are just what they sound like: they’re not helping queries read data at all. Of course, if you’ve rebooted recently, or rebuilt indexes on buggy versions of SQL Server, you might get this warning on indexes that will get used. I can’t fix that, but I can tell you it’s your job to keep an eye on usage over time.

Indexes with a high write to read ratio are also pretty self-explanatory. They’re sometimes used, but they’re written to a whole lot more. Again, you should keep an eye on this over time, and try to understand both how important they might be to your workload, or how much they might be hurting your workload.

I’m not going to set up a fake workload to generate those warnings, but I am going to create some overlapping indexes that might be good candidates for you to de-clutter.

Index Entrance


The Votes table is pretty narrow, but it’s also pretty big — 53 million rows or so as of Stack 2013.

Here are my indexes:

CREATE INDEX who ON dbo.Votes(PostId, UserId) INCLUDE(BountyAmount); 
CREATE INDEX what ON dbo.Votes(UserId, PostId) INCLUDE(BountyAmount); 
CREATE INDEX [where] ON dbo.Votes(CreationDate, UserId) INCLUDE(BountyAmount); 
CREATE INDEX [when] ON dbo.Votes(BountyAmount, UserId) INCLUDE(CreationDate); 
CREATE INDEX why ON dbo.Votes(PostId, CreationDate) INCLUDE(BountyAmount); 
CREATE INDEX how ON dbo.Votes(VoteTypeId, BountyAmount) INCLUDE(UserId);

First, I’m gonna make sure there’s nothing in memory:

CHECKPOINT;
GO 2
DBCC DROPCLEANBUFFERS;
GO 

Don’t run that in production. It’s stupid if you run that in production.

Now when I go to look at what’s in memory, nothing will be there:

SELECT *
FROM dbo.WhatsUpMemory AS wum
WHERE wum.object_name = 'Votes'

I’m probably not going to show you the results of an empty query set. It’s not too illustrative.

I am going to show you the index sizes on disk:

SELECT *
FROM dbo.WhatsUpIndexes AS wui
WHERE wui.table_name = 'Votes';
SQL Server Query Results
Size Mutters

And I am going to show you this update:

UPDATE v
SET v.BountyAmount = 2147483647
FROM dbo.Votes AS v
WHERE v.BountyAmount IS NULL
AND   v.CreationDate >= '20131231'
AND v.VoteTypeId > 2;

After The Update


This is when things get more interesting for the memory query.

SQL Server Query Results
Life Of A Moran

We’re updating the column BountyAmount, which is present in all of the indexes I created. This is almost certainly an anti-pattern, but it’s good to illustrate the problem.

Pieces of every index end up in memory. That’s because all data needs to end up in memory before SQL Server will work with it.

It doesn’t need the entirety of any of these indexes in memory — we’re lucky enough to have indexes to help us find the 10k or so rows we’re updating. I’m also lucky enough to have 64GB of memory dedicated to this instance, which can easily hold the full database.

But still, if you’re not lucky enough to be able to fit your whole database in memory, wasting space in the buffer pool for unused (AND OH GODD PROBABLY FRAGMENTED) indexes just to write to them is a pretty bad idea.

After all, it’s not just the buffer pool that needs memory.

You also need memory for memory grants (shocking huh?), and other caches and activities (like the plan cache, and compressed backups).

Cleaning up those low-utilization indexes can help you make better use of the memory that you have.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Alternatives To Debugging SQL Server: PerfView

Buggery


If you seriously want to debug, you’ve gotta use something like WinDbg. It’s tough, but powerful.

It’s also… Where do you even start? The most common thing to do is set a Break Point.

Figuring out where you wanna set it can be challenging.

  • Sometimes you can capture a thread and step through the code.
  • Sometimes you can find the call stack you want by searching through symbols

The problem with these is that stepping through code that runs in milliseconds has taken me over an hour at times, and call stacks often have cryptic names.

This is where tools like PerfView can help.

Not A Developer


I’m not a developer, or particularly smart. I wish particularly smart developers would blog about this in depth.

In the meantime, you’ll have to get the intro material from me. I recently used it to help answer a tough question.

There are times when traditional troubleshooting methods won’t help you.

Instructional


PerfView is just an executable. I like to stick it in its own folder; it keeps the files that get created together.

When you open it up, you’re gonna wanna hit the Collect menu.

Greetings

Hit collect (or Alt+C, now that you peeked, cheater).

Jane’s Collection

I usually skip zipping, unless I know I’m going to send these to someone else. You can always zip them later.

I like to grab thread time. I dunno why. I’ve always done it, and I’m superstitious.

I also listen to Planet Earth every time I’m on a plane taking off because it assures a safe trip. Right?

Right.

Choices, Choices

Next, when you’re ready to get stuff, hit Start Collection. This button will change to Stop Collection after you click it.

You’ll wanna stop the trace when the thing you wanna profile is done running, or when you think you’ve got sufficient sample.

Starter

Now, I have to warn you here. You don’t wanna take a long collection. These files get big.

You also have to factor in that profiling code can slow it down.

CPU Stacks

CPU stacks are where I go next. When you double click on that, you’ll get a pop up to pick which application you wanna dig into.

PerfView collects system wide. So really, please, please, please don’t run it for a very long time.

sqlservr

Double click into sqlservr, and you’ll get a list of stuff.

Greatest

It won’t look like this for you at first. There’ll be a lot of jibberish and question marks. You need to get symbols for them to make sense.

Symbology

Hit ctrl+a to highlight the whole list of lines, then either hit alt+s or right click and select Lookup Symbols.

If it’s your first time doing this, it might take a while to download them, but you really need to do this part to make any sense of what’s going on.

Do Not Group Or Fold

The next thing you wanna do is clear out GroupPats — just highlight, hit delete, and then enter.

Next, change Fold% to 0 and hit enter.

This’ll make sure you get the most detail from your sample.

Fortunate

You can click around here to get other views of the data.

Mean to it.

How Does This Relate To Debugging?


Well, see those list of call stacks up there? That’s what you use to set Break Points in a debugger.

You might see something interesting there that either doesn’t have the full detail you’re after, or you want to follow more closely.

Either way, this is a good example of how to take a closer look at what SQL Server is up to under the hood without getting lost in a debugger.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

When Read Queries Block Write Queries In SQL Server

Gritty


You may occasionally see in your favorite monitoring tool, or Most Fantastic And Glorious Script For Determining Whom Activity Belongs To™ that a read query, perhaps even a long running one, is blocking modification queries.

Under most circumstances, this won’t happen. Most.

Shared locks taken by read queries will let go pretty quickly.

Under most circumstances.

Out, Liars


To hold onto Shared locks, you’d need to use an isolation level escalation hint, like REPEATABLE READ.

I could do that here if I were a lazy cheater.

Instead, I’m going to show you a more common and interesting scenario.

You see, like a lot of important specks of knowledge, this one comes from Craig Freedman:

Note the “WITH UNORDERED PREFETCH” keywords on the nested loops join.

I am not going to demontrate it , but when SQL Server executes this query, it holds S locks on the rows returned by the index seek until the query finishes executing.

And to generalize a bit from the source of the rest of the important knowledge in the world: These types of key lookups are common in plans with Star Join optimizations.

On To The Text


I’m going to be a little bit of a lazy cheater here, and rather than show you where this can happen with parameter sniffing or some other weird optimizer choice, I’m going to use an index hint to use this index:

CREATE INDEX whatever ON dbo.Votes(CreationDate, VoteTypeId);

Then I’m going to run this query, which’ll take about 10 seconds:

DECLARE @i INT
SELECT   @i = v.PostId
FROM     dbo.Votes AS v WITH (INDEX = whatever)
WHERE    v.CreationDate >= '20080101'
AND      v.VoteTypeId > 5
GROUP BY v.PostId
ORDER BY v.PostId;

Here’s what the query plan looks like:

SQL Server Query Plan
Your My Best Wishes

And when we get the properties of the Nested Loops Join, we’ll see the Unordered Prefetch property set to true.

SQL Server Query Plan
Steve Prefetchtaine

If I kick that query off and look at the results of sp_WhoIsActive @get_locks = 1, I’ll see this:

    <Object name="Votes" schema_name="dbo">
      <Locks>
        <Lock resource_type="OBJECT" request_mode="IS" request_status="GRANT" request_count="1" />
        <Lock resource_type="OBJECT" request_mode="S" request_status="GRANT" request_count="4" />
      </Locks>

Which is exactly what I want — a Shared object lock on Votes that has been GRANTed. That’ll get held onto for the duration of the query.

Now when I try to run this update, it’ll get blocked:

BEGIN TRAN
UPDATE dbo.Votes SET UserId = 2147483647 
ROLLBACK

Note that I’m only wrapping it in a transaction here so it’ll roll back. It will still get blocked without that, but then I’d have to reverse the update on my own.

See, everyone’s kind of a lazy cheater.

The locks that the update wants look like this:

    <Object name="Votes" schema_name="dbo">
      <Locks>
        <Lock resource_type="OBJECT" request_mode="IX" request_status="WAIT" request_count="1" />
      </Locks>

We see the IX lock has a request status of WAIT.

sp_WhoIsActive
They make me suspicious

And Who Is Active is showing us that the read query has been blocking the write query for around 4 seconds.

Capper


If you’re out there in the world and you see a read query that blocked a write query, take a close look at the query plan for a Nested Loops Join with the Unordered Prefetch property set to true.

I bet you’ll find one. And I’ll bet your query wasn’t fast.

Fixing the Key Lookup may not make your query faster, but it should alleviate the blocking because of long-held shared locks.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Function Plan Caching In SQL Server: Inlined Functions

Led Better


In yesterday’s post, we looked at MSTVFs. Today, we’re going to look at inlined functions.

And I know, that sounds weird. But it’s because we’re going to circle back to scalar valued functions, and how they’re inlined in SQL Server 2019 as well.

There’s stuff in here you’ll have to look out for when you move to SQL Server 2019.

Side By Side


The first thing we need is the function itself, which is just an inlined version of the others.

CREATE OR ALTER FUNCTION dbo.CommentsAreHorribleInline(@Id INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
SELECT (SELECT SUM(Score) FROM dbo.Posts AS p WHERE p.OwnerUserId <= @Id)
	   -
	   (SELECT COUNT_BIG(*) FROM dbo.Comments AS c WHERE c.UserId <= @Id) AS Tally
GO

Where these differ from both kinds of functions, is that you can’t find them ANYWHERE on their own.

They don’t appear in any DMVs, or in the plan cache as isolated code. Since they’re inlined, it’s just a part of whatever query references it.

Let’s start simple, though:

SELECT * FROM dbo.CommentsAreHorribleInline(22656);

SELECT * FROM dbo.CommentsAreHorribleInline(138);

Like in the other posts, the chief difference between these two plans is the index access choice.

The 22656 plan scans the clustered index, and the 138 plan does a nonclustered index seek with a key lookup.

Check the other posts in the series for pictures of that if you’re interested.

SQL Server Query Plan
Tops and Bottoms

As in the other examples, the clustered index scan is considerably slower than the nonclustered index seek plan.

But that’s not where things are interesting. Where they’re interesting is when we call the function in a “larger” query.

SELECT TOP (5) u.DisplayName,
       (SELECT * FROM dbo.CommentsAreHorribleInline(u.Id))
FROM dbo.Users AS u
GO

Spool And Spool Alike


This query runs for a full minute.

SQL Server Query Plan
Adios

We see why early on — an eager index spool.

SQL Server Query Plan
You again.

This appears to happen because the optimizer is drunk as hell doesn’t like the idea of scanning the clustered index, or recycling the seek + lookup 5 times.

The index reflects what a two column clustered index keyed on OwnerUserId would look like.

Think about it like if you selected OwnerUserId and Score into a #temp table and put a clustered index on the table with the key on OwnerUserId.

SQL Server Query Plan
Damaja

Remember that part of the reason for this plan change is that we’re no longer just subtracting one aggregate from another like when we call the function on its own, there’s the additional “join” to the Users table. Correlated subqueries are just that — joins.

SQL Server Query Plan
Solitaire

Futurist


I’d love to have more to say about how this changes in SQL Server 2019, but an identical pattern occurs, similar to what I blogged about earlier in the summer.

I caught some private flack about how the TOP (1) pattern in that post could easily be replace with an aggregate like MIN.

While that’s totally true, there’s no similar replacement for this pattern. We could expand the index to cover the Score column to get rid of the spool, which goes back to another post I wrote about 2019 optimizer features, where you’ll need wider (covering) indexes for them to be used.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Function Plan Caching In SQL Server: Multi Statement Table Valued Functions

Too Clean


In yesterday’s post, we looked at we looked at simple scalar function plan caching.

Today, we’ll look at MSTVFs. If you’re not sure what that means, look at the title of the post real quick.

Yeah, up there.

On we go.

Dirty Version


The function will do the same thing as before, just rewritten to be a MSVTF.

CREATE OR ALTER FUNCTION dbo.CommentsAreHorribleMulti(@Id INT)
RETURNS @Tally TABLE(Tally BIGINT)
WITH SCHEMABINDING
AS
BEGIN
INSERT @Tally ( Tally )
SELECT (SELECT SUM(Score) FROM dbo.Posts AS p WHERE p.OwnerUserId <= @Id)
	   -
	   (SELECT COUNT_BIG(*) FROM dbo.Comments AS c WHERE c.UserId <= @Id)
RETURN
END
GO

Now, where these differ immediately from SVFs (scalar valued functions), is that they don’t show up in the plan cache by name.

Wrong Hook

Note that these are both “statements”.

Also, unlike SVFs, they don’t show up in dm_exec_function_stats. This is documented behavior, but whatever.

And even though they’re called a “Proc” in dm_exec_cached_plans, they only show up in dm_exec_query_stats, not dm_exec_procedure_stats (which is why BlitzCache calls them a Statement).

Different Sniffint


Unlike SVFs, which don’t have a restriction on the function body using parallelism, all table variable modifications are forced to run serially (unless you’re sneaky).

That means both insert queries will be serialized, with the main difference being index access.

SQL Server Query Plan
arrowed

Like before, if we cache either plan, it will get reused. And just like before, the clustered index scan plan is significantly slower.

SELECT TOP (5) u.DisplayName,
       (SELECT * FROM dbo.CommentsAreHorribleMulti(u.Id))
FROM dbo.Users AS u
SQL Server Query Plan
Old Dress

Plan Noir


Just like scalar functions, these can have different plans cached and reused, and may fall victim to parameter sniffing.

Again, this depends a lot on how the function is called and used. It’s just something to be aware of when tuning queries that call functions.

Execution times may vary greatly depending on… well…

Parameters.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.