Execution Plans Archives | Page 21 of 31

Let’s Design A SQL Server Index Together Part 1

Posted on January 9, 2020May 14, 2022 by Erik Darling

Just One Index

I want both of these queries to be fast.

SELECT TOP (5000)
       p.LastActivityDate,
       p.PostTypeId,
       p.Score,
       p.ViewCount
FROM dbo.Posts AS p
WHERE p.PostTypeId = 4
AND   p.LastActivityDate >= '20120101'
ORDER BY p.Score DESC;


SELECT TOP (5000)
       p.LastActivityDate,
       p.PostTypeId,
       p.Score,
       p.ViewCount
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
AND   p.LastActivityDate >= '20110101'
ORDER BY p.Score DESC;

Get to work.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Table Variables Prevent SQL Server From Using A Parallel Query Plan

Posted on January 7, 2020May 16, 2022 by Erik Darling

Well, huh

Going Further

Where Bitmaps Dare In SQL Server Query Plans

Posted on January 3, 2020May 14, 2022 by Erik Darling

I AIN’T NO

Going Further

Eager Index Spools From Nonclustered Indexes In SQL Server

Posted on January 2, 2020May 14, 2022 by Erik Darling

Dangarang

Going Further

Dates Aren’t Strings

Posted on December 31, 2019May 14, 2022 by Erik Darling

Seriously.

Going Further

Implicit Conversion Is A SARGability Problem, That’s Why It’s A Performance Problem In SQL Server

Posted on December 27, 2019May 16, 2022 by Erik Darling

Concerns

If you compare the things that non-SARGable queries cause issues with alongside the things that bad implicit conversions cause issues with, it’s an identical list.

Increased CPU
Inefficient use of indexes
Poor cardinality estimation
Maybe a bad memory grant based on that
Some “row by row” event

Though we often bucket the problems separately, they’re really the same thing.

That’s because, under the covers, something similar happens.

If you replace “CONVERT_IMPLICIT” with any other function, like ISNULL, COALESCE, DATEADD, DATEDIFF, etc. you may see the same performance degradation.

Probably not the most thought provoking thing you’ve ever heard, but if you understand why one is bad and not the other, this may help you.

Thanks for reading!

Going Further

Useful vs Useless Bitmaps In SQL Server Query Plans

Posted on December 24, 2019May 14, 2022 by Erik Darling

It’s Hot Out There

Paul white (b|t) did what I think he does best: make a casual, off-hand remark about something mystifying with such absolute certainty that it makes your brain halt. At least that’s what happens to me.

It all started with:

“The Bitmap is hopeless.”

Earth Shattering Kaboom

I’d never considered Bitmaps in excruciating detail.

Sorta Bloom Filter-y. They show up in some parallel Hash and Merge Join plans (simplifying a bit because they’re hidden in serial Hash Join plans).

They seem nice. Early row reductions. Never thought of one as “hopeless”.

But then!

Hope Bloats

Let’s take a look at a plan with a hopeless Bitmap.

In the outer (top) branch of the plan, a Bitmap is created. It gets applied at the Scan of the Users table.

SQL Server Query Plan Tool Tip — Louder than Bits

What makes it hopeless?

The Users table has 2,465,710 rows in it
Despite the Bitmap, we read 2,465,593 rows and
We pass 2,465,590 of those rows along to the Repartition Streams

In other words, the Bitmap barely filtered out any rows whatsoever. Did it hurt performance? Am I mad at Bitmaps? No and no.

At least not here.

Beware Bitmap Placement

In some query plans, the Bitmap may not make it all the way down to the Scan operator.

If there’s a Partial Aggregate after the Scan, you may find the Bitmap applied at the Repartition Streams.

Better late than never, I suppose.

Helpful Bitmaps

In a helpful Bitmap plan, the details look much different.

Visually, this plan looks much different than the Hopeless Bitmap plan.

The number of rows (39,789) read from the scan is much lower than the table cardinality (2,465,710).

The details of the scan are also interesting.

We did not have to read all 2,465,710 rows
We only had to read 83,144 of them
We were able to Bitmap out ~50% of them, down to 39,789

That’s a Bangin’ Bitmap.

Thanks for reading!

Going Further

Is RECOMPILE Really That Bad For SQL Server Query Performance?

Posted on December 23, 2019May 16, 2022 by Erik Darling

To Re Or Not To Re

It’s been a while since SQL Server has had a real RECOMPILE problem. And if you put it up against the performance problems that you can hit with parameter sniffing, I’d have a hard time telling someone strapped for time and knowledge that it’s the worst idea for them.

Obviously, you can run into problems if you (“you” includes Entity Framework, AKA the Database Demolisher) author the kind of queries that take a very long time to compile. But as I list them out, I’m kinda shrugging.

Here are some problems you can hit with recompile. Not necessarily caused by recompile, but by not re-using plans.

Long compile times: Admittedly pretty rare, and plan guides or forced plans are likely a better option.
No plan history in the cache (only the most recent plan): Sucks if you’re looking at the plan cache. Sucks less if you have a monitoring tool or Query Store.
CPU spikes for high-frequency execution queries: Maybe time for caching some stuff, or getting away from the kind of code that executes like this (scalar functions, cursors, etc.)

But for everything in the middle: a little RECOMPILE probably won’t hurt that bad.

Thinking of the problems it can solve:

Parameter sniffing
Parameter embedding (lack of)
Local variable estimates
Catch all queries

Those are very real problems that I see on client systems pretty frequently. And yeah, sometimes there’s a good tuning option for these, like changing or adding an index, moving parts of the query around, sticking part of the query in a temp table, etc.

But all that assumes that those options are immediately available. For third party vendors who have somehow developed software that uses SQL Server for decades without running into a single best practice even by accident, it’s often harder to get those changes through.

There’s More Than One Way To Recompile

Sure, you might be able to sneak a recompile hint somewhere in the mix even if it’d make the vendor upset. You can always yoink it out later.

But you have alternatives, too.

DBCC FREEPROCCACHE: No, not the whole cache. You can single out troublesome queries to remove specific plans.
Plan Guides: An often overlooked detail of plan guides is that you can attach hints to them, including recompile.

Using a plan guide doesn’t interfere with that precious vendor IP that makes SQL Server unresponsive every 15 minutes. Or whatever. I’m not mad.

And yeah, there’s advances in SQL Server 2017 and 2019 that start to address some issues here, but they’re still imperfect.

I like’em, but you know. They’re not quite there yet.

Thanks for reading!

Going Further

SQL Server’s Partial Aggregate Query Plan Operators CAN Quit You

Posted on December 20, 2019May 14, 2022 by Erik Darling

This Machine Makes Seltzer

Let’s say you have a parallel query running at DOP 4. The final logic of the query is some aggregate: COUNT, SUM, MIN, MAX, whatever.

Sure, the optimizer could gather all the streams, and then calculate one of those for all four of them, but why do that?

We have a Partial Aggregate operator that allows an aggregate per thread to be locally aggregated, then a final global aggregate can be more quickly calculated from the four locally aggregated values.

There are a couple odd things about Partial Aggregates though:

They ask for a fixed amount of memory, which is usually quite small
When they run out of memory, they don’t spill, they just stop aggregating

Which is why for identical executions of identical queries, you may see different numbers of rows come out of them.

We start with the same number of rows coming out of the Hash Join, which is expected.

We ran the same query.

However, the Partial Aggregate emits different numbers of rows.

It doesn’t matter much, because the global aggregate later in the plan will still be able to figure things out, albeit slightly less efficiently.

If we look at the spills in the Hash Match Aggregates from both of the above plans, the warnings are slightly different.

Hardly anything to worry about here, of course. But definitely something to be aware of.

No, SQL Server isn’t leaking memory, or full of bugs. It’s just sensitive.

Thanks for reading!

Going Further

SQL Server’s Batch Mode On Row Store Is Fickle

Posted on December 19, 2019May 14, 2022 by Erik Darling

Thanks Though

I’m excited about this feature. I’m not being negative, here. I just want you, dear reader, to have reasonable expectations about it.

This isn’t a post about it making a query slower, but I do have some demos of that happening. I want to show you an example of it not kicking in when it probably should. I’m going to use an Extended Events session that I first read about on Dmitry Pilugin’s blog here. It’ll look something like this.

CREATE EVENT SESSION heristix
    ON SERVER
    ADD EVENT sqlserver.batch_mode_heuristics
        ( ACTION( sqlserver.sql_text ))
    ADD TARGET package0.event_file
        ( SET filename = N'c:\temp\heristix' )
    WITH
        ( MAX_MEMORY = 4096KB,
          EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
          MAX_DISPATCH_LATENCY = 1 SECONDS,
          MAX_EVENT_SIZE = 0KB,
          MEMORY_PARTITION_MODE = NONE,
          TRACK_CAUSALITY = OFF,
          STARTUP_STATE = ON );
GO

The Setup

Let’s start with some familiar indexes and a familiar query from other posts the last couple weeks.

CREATE INDEX something ON dbo.Posts(PostTypeId, Id, CreationDate);
CREATE INDEX stuffy ON dbo.Comments(PostId, Score, CreationDate);

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0;

The query plan is unimportant. It just doesn’t use any Batch Mode, and takes right about 2 seconds.

If we look at the entry for this query in our XE session, we can see that the optimizer considered the heck out of Batch Mode, but decided against it.

SQL Server Extended Events — All The Heck

Curiouser

If we add a hash join hint to the query, it finishes in about 800ms.

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0
OPTION(HASH JOIN);

All the operators in this plan except Gather Streams are run in Batch Mode. Clearly it was helpful.

And according to the XE session, we can see that decision in writing. Yay.

Alt Roq

If we modify our indexes slightly, we can get an Adaptive Join plan.

CREATE INDEX something_alt ON dbo.Posts(PostTypeId, CreationDate, Id);
CREATE INDEX stuffy_alt ON dbo.Comments(Score, CreationDate, PostId);

And, yes, this is about twice as fast now (compared to the last Batch Mode query), mostly because of the better indexing.

Is There A Moral So Far?

Yes, don’t count on Batch Mode to kick in for every query where it would be helpful.

If you want queries to consistently use Batch Mode, you’ll need to do something like this.

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
LEFT JOIN dbo.t ON 1 = 0
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0;

But you have to be careful there too.

You might lose your nice parallel plan and end up with a slower query.

Huh.

Thanks for reading!