SQL Server 2019: When Batch Mode On Rowstore Isn’t Better For Performance

Um Hello


WHAT DO YOU MEAN YOU’RE NOT ON SQL SERVER 2019 YET.

Oh. Right.

That.

Regressed


Look, whenever you make changes to the optimizer, you’re gonna hit some regressions.

And it’s not just upgrading versions, either. You can have regressions from rebuilding or restarting or recompiling or a long list of things.

Databases are terribly fragile places. You have to be nuts to work with them.

I’m not mad at 2019 or Batch Mode On Rowstore (BMOR) or anything.

But if I’m gonna get into it, I’m gonna document issues I run into so that hopefully they help you out, too.

One thing I ran into recently was a query where BMOR kicked in and slowed it down.

Repro


Here’s my index:

CREATE INDEX mailbag ON dbo.Posts(PostTypeId, OwnerUserId) WITH(DATA_COMPRESSION = ROW);

And here’s my query:

SELECT u.Id, u.DisplayName, u.Reputation,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pq WHERE pq.OwnerUserId = u.Id AND pq.PostTypeId = 1) AS q_count,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pa WHERE pa.OwnerUserId = u.Id AND pa.PostTypeId = 2) AS a_count
FROM dbo.Users AS u
WHERE u.Reputation >= 25000
ORDER BY u.Id;

It’s simplified a bit from what I ran into, but it does the job.

Batchy

This is the batch mode query plan. It runs for about 2.6 seconds.

SQL Server Query Plan
who would complain?

Rowy

And here’s the row mode query plan. It runs for about 1.3 seconds.

SQL Server Query Plan
oh that’s why.

What Happened?


Just when you think the future is always faster, life comes at you like this.

So why is the old Row Mode query more than 2x faster than the new hot Batch Mode query?

There’s a reason, and it’s not very sexy.

Batch Like That

First, the hash joins produce Bitmaps.

SQL Server Query Plan Tool Tip
bitted

You don’t see Bitmaps in Batch Mode plans as operators like you’re used to in Row Mode plans. You have to look at the properties (not the tool tip) of the Hash Join operator.

Even though both plans seek into the index on Posts, the Batch Mode plan only uses PostTypeId as a seek predicate.

It would be boring to show you both, so I’m just going to use the details from the branch where we find PostTypeId = 2.

SQL Server Query Plan Tool Tip
buck fifty

Remember this pattern: we seek to all the values where PostTypeId = 2, and then apply the Bitmap as a residual predicate.

You can pretty easily mentally picture that.
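If it helps to picture that in T-SQL terms, the access pattern is roughly like this sketch (my own illustration of what the plan does, not something the engine literally executes): seek on PostTypeId alone, and only toss out rows for users we don’t care about afterward.

SELECT COUNT_BIG(*)
FROM dbo.Posts AS pa
WHERE pa.PostTypeId = 2     /* the only seek predicate */
AND   pa.OwnerUserId IN     /* filtered later, like the Bitmap residual */
      (
          SELECT u.Id
          FROM dbo.Users AS u
          WHERE u.Reputation >= 25000
      );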

Rowbot

In the row mode plan, the Nested Loops Joins are transformed to Apply Nested Loops:

SQL Server Query Plan Tool Tip
applys and oranges

Which means on the inner side of the join, both the PostTypeId and the OwnerUserId qualify as seek predicates:

SQL Server Query Plan Tool Tip
oh yeah that

Reading Rainbow


The better performance comes from doing fewer reads when indexes are accessed.

SQL Server Query Plan Tool Tip
psychic tv

Though both produce the same number of rows, the Hash Join plan in Batch Mode reads 28 million rows, or about 21 million more rows than the Nested Loop Join plan in row mode. In this case, the double seek does far fewer reads, and even Batch Mode can’t cover that up.

Part of the problem is that the optimizer isn’t psychic.

Fixing It


There are two ways I found to get the Nested Loop Join plan back.

The boring one, using a compat level hint:

SELECT u.Id, u.DisplayName, u.Reputation,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pq WHERE pq.OwnerUserId = u.Id AND pq.PostTypeId = 1) AS q_count,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pa WHERE pa.OwnerUserId = u.Id AND pa.PostTypeId = 2) AS a_count
FROM dbo.Users AS u
WHERE u.Reputation >= 25000
ORDER BY u.Id
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));
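If you’re curious which USE HINT options your build supports, there’s a DMV that lists them (it’s been around since SQL Server 2016 SP1):

SELECT duh.name
FROM sys.dm_exec_valid_use_hints AS duh;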

And the more fun one, rewriting the correlated subqueries as outer apply:

SELECT u.Id, u.DisplayName, u.Reputation, q_count, a_count
FROM dbo.Users AS u
    OUTER APPLY(SELECT COUNT_BIG(*) AS q_count FROM dbo.Posts AS pq WHERE pq.OwnerUserId = u.Id AND pq.PostTypeId = 1) AS q_count
    OUTER APPLY(SELECT COUNT_BIG(*) AS a_count FROM dbo.Posts AS pa WHERE pa.OwnerUserId = u.Id AND pa.PostTypeId = 2) AS a_count
WHERE u.Reputation >= 25000
ORDER BY u.Id;
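There’s a third lever worth knowing about on 2019, too, though I haven’t checked whether it lands on exactly the same Nested Loops plan here: the DISALLOW_BATCH_MODE hint, which turns Batch Mode off for a single query.

SELECT u.Id, u.DisplayName, u.Reputation,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pq WHERE pq.OwnerUserId = u.Id AND pq.PostTypeId = 1) AS q_count,
       (SELECT COUNT_BIG(*) FROM dbo.Posts AS pa WHERE pa.OwnerUserId = u.Id AND pa.PostTypeId = 2) AS a_count
FROM dbo.Users AS u
WHERE u.Reputation >= 25000
ORDER BY u.Id
OPTION(USE HINT('DISALLOW_BATCH_MODE'));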

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common Table Expression Myths In SQL Server: Materialization

I Got Five On It


I wish I had a dollar for every wrong thing I’ve heard about CTEs in my life. I’d buy a really nice cigar and light it with fire made by the gods themselves.

Or, you know, something like that.

One common myth is that the queries contained inside of them get persisted in some form, whether in memory, in tempdb, or somewhere else.

I honestly don’t know where these things begin. They’re not even close to reality.

Getting It Right


Let’s take this query as an example:

SELECT u.Id, 
       u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation * 2 = 22;

If you’ve been tuning queries for longer than a day, you can probably spot the issue here.

Applying expressions to columns in the where clause (or join clauses) keeps the optimizer from seeking on those columns, and doesn’t do cardinality estimation any favors, either. Unfortunately, you can also run into the exact same issues doing this:

WITH cte AS 
(
SELECT u.Id, 
       u.Reputation, 
       (u.Reputation * 2) AS ReputationDoubler
FROM dbo.Users AS u
)
SELECT c.Id,
       c.Reputation
FROM cte AS c
WHERE c.ReputationDoubler = 22;

To be explicit: both of these queries have the same problem.

Erik D Is President


Starting with this index:

CREATE INDEX toodles ON dbo.Users(Reputation);

Both queries have the same execution plan characteristics:

SQL Server Query Plan
come clean

I understand why you think a mature database product might be able to deal with this better:

  • Locate values in the index with a value of 11
  • Divide the literal value by 2 instead

But SQL Server doesn’t have anything like that, and neither do CTEs. In both queries, the index gets scanned in its entirety to retrieve qualifying rows, with the unseekable expression applied as a residual predicate:

SQL Server Query Plan Tool Tip
Day Planner

Gopherville


To be clear, and hopefully to persuade you to write clear predicates, this is the end result that we’re after:

SELECT u.Id, 
       u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation = 11;
SQL Server Query Plan
roll for int

While this is of course intuitive when writing simple queries, the point of this post is to show that expressions in CTEs don’t offer any advantage.

This goes for any flavor of derivation, too, whether it’s wrapping columns in built-in or user-defined functions, combining columns with each other, or combining columns with values.
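A few quick sketches along those lines, with the same index from above still in place. These are just illustrations of predicates that end up in the same residual boat, CTE or no CTE:

SELECT u.Id, u.Reputation
FROM dbo.Users AS u
WHERE ABS(u.Reputation) = 11;           /* built-in function wrapped around the column */

SELECT u.Id, u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation + u.UpVotes = 11;    /* combining columns */

SELECT u.Id, u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation - 11 = 0;            /* combining a column with a value */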

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Local Variables Also Cause Performance Problems In The ORDER BY Clause In SQL Server

Local Disasters


I see this kind of pattern a lot in paging queries where people are doing everything in their power to avoid writing dynamic SQL for some reason.

It’s almost as if an entire internet worth of SQL Server knowledge and advice doesn’t exist when they’re writing these queries.

Quite something. Quite something indeed.

First, let’s get what doesn’t work out of the way.

DECLARE @order_by INT = 3

SELECT p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ORDER BY CASE WHEN @order_by = 1 THEN p.Score
              WHEN @order_by = 2 THEN p.CreationDate 
              WHEN @order_by = 3 THEN p.Id 
              ELSE NULL 
         END;
GO

You can’t write this as a single case expression with mismatched data types.

It’ll work for the first two options, but not the third. We’ll get this error, even with a recompile hint:

Msg 8115, Level 16, State 2, Line 46
Arithmetic overflow error converting expression to data type datetime.

What Works But Still Stinks


Is when you break the options out into separate case expressions, like so:

DECLARE @order_by INT = 1

SELECT p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ORDER BY CASE WHEN @order_by = 1 THEN p.Score ELSE NULL END,
         CASE WHEN @order_by = 2 THEN p.CreationDate ELSE NULL END,
         CASE WHEN @order_by = 3 THEN p.Id ELSE NULL END;
GO

This will work no matter which option we choose, but something rather disappointing happens when we choose option three.

Here’s the query plan. Before you read below, take a second to try to guess what it is.

SQL Server Query Plan
Sorta Kinda

What Stinks Even Though It Works


My issue with this plan is that we end up with a sort operator, even though we’re ordering by Id, which is the primary key and clustered index key, and we use that very same index. We technically have the data in order, but the index scan has False for the Ordered attribute, and the Sort operator shows a series of expressions.

SQL Server Query Plan
stunk

The Sort of course goes away if we add a recompile hint, and the Scan now has True for the Ordered attribute.

DECLARE @order_by INT = 3

SELECT p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ORDER BY CASE WHEN @order_by = 1 THEN p.Score ELSE NULL END,
         CASE WHEN @order_by = 2 THEN p.CreationDate ELSE NULL END,
         CASE WHEN @order_by = 3 THEN p.Id ELSE NULL END
OPTION(RECOMPILE);
GO 
SQL Server Query Plan
no worse

You Shouldn’t Do This


Unless you’re fine with recompile hints, which I don’t blame you if you are.

SQL Server seems to get a whole lot more right when you use one, anyway.

My point though, is that adding uncertainty like this to your queries is more often than not harmful in the long term. Though this post is about local variables, the same thing would happen with parameters, for example:

DECLARE @order_by INT = 3
DECLARE @sql NVARCHAR(MAX) = N'
SELECT p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ORDER BY CASE WHEN @order_by = 1 THEN p.Score ELSE NULL END,
         CASE WHEN @order_by = 2 THEN p.CreationDate ELSE NULL END,
         CASE WHEN @order_by = 3 THEN p.Id ELSE NULL END;
';

EXEC sys.sp_executesql @sql, N'@order_by INT', 1;
EXEC sys.sp_executesql @sql, N'@order_by INT', 3;
GO

The way to address it would be something like this:

DECLARE @order_by INT = 3
DECLARE @sql NVARCHAR(MAX) = N'
SELECT p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ORDER BY ';

SET @sql += 
CASE WHEN @order_by = 1 THEN N'p.Score'
     WHEN @order_by = 2 THEN N'p.CreationDate'
     WHEN @order_by = 3 THEN N'p.Id' 
     ELSE N'' 
END;

EXEC sys.sp_executesql @sql
GO
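One small tweak I’d consider here (my addition, not part of the original pattern): give the ELSE branch a real column, so an unexpected @order_by value doesn’t leave you with a dangling ORDER BY and a syntax error.

SET @sql += 
CASE WHEN @order_by = 1 THEN N'p.Score'
     WHEN @order_by = 2 THEN N'p.CreationDate'
     WHEN @order_by = 3 THEN N'p.Id' 
     ELSE N'p.Id' /* fall back to something valid instead of an empty string */
END;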

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Local Variables Cause Performance Problems With TOP In SQL Server

Number One


In case you missed it for some reason, check out this post of mine about local variables. Though it’s hard to imagine how you missed it, since it’s the single most important blog post ever written, even outside of SQL Server. It might even be more important than SQL Server. Time will tell.

While live streaming recently about paging queries, I thought that it might make an interesting post to see what happens when you use variables in places other than the where clause.

After several seconds of thinking about it, I decided that TOP would be a good enough place to muck around.

Unvariables


Let’s say you’ve got these two queries.

DECLARE @pagesize INT = 10000;
SELECT TOP (@pagesize) p.Id
FROM dbo.Posts AS p
ORDER BY p.Id;
GO 

DECLARE @pagesize INT = 10000;
SELECT TOP (@pagesize) p.Id
FROM dbo.Posts AS p
ORDER BY p.Id
OPTION(RECOMPILE);
GO

Without a RECOMPILE hint, you get a 100 row estimate for the local variable in a TOP.

SQL Server Query Plan
???

Which, of course, might lead to some problems if you were selecting way more than 100 rows via TOP.

Not Unexpectedly


You can manipulate what the optimizer thinks it’ll get with OPTIMIZE FOR hints:

DECLARE @pagesize INT = 10000;
SELECT TOP (@pagesize) p.Id
FROM dbo.Posts AS p
ORDER BY p.Id
OPTION(OPTIMIZE FOR(@pagesize = 1));
GO
SQL Server Query Plan
the chump is here

And of course, when the value is passed as an actual parameter, it can be sniffed.

DECLARE @pagesize INT = 10000;
DECLARE @sql NVARCHAR(1000) = 
N'
  SELECT TOP (@pagesize) p.Id
  FROM dbo.Posts AS p
  ORDER BY p.Id;
'

EXEC sys.sp_executesql @sql, N'@pagesize INT', 1;
EXEC sys.sp_executesql @sql, N'@pagesize INT', 10000;
GO 
SQL Server Query Plan
boogers

Got More?


In tomorrow’s post, I’ll look at how local variables can be weird in ORDER BY. If you’ve got other ideas, feel free to leave them here.

There’s not much more to say about WHERE or JOIN; I’m looking for more creative applications.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

In SQL Server, Lookups Are Just Nested Loops Joins With No Choices

Such Optimize


At this point in your life, you’ve probably seen, and perhaps even struggled with how to fix, a key lookup that was causing you some grief.

This post isn’t going to go terribly deep into anything, but I do want to make a few things about them more clear, because I don’t usually see them mentioned anywhere.

  1. Lookups are joins between two indexes on the same table
  2. Lookups can only be done via nested loops joins
  3. Lookups can’t be moved around in the execution plan

I don’t want you to think that every lookup is bad and needs to be fixed, but I do want you to understand some of the limitations around optimizing them.

The Join


When you see a lookup in an execution plan, it’s natural to focus on just what the lookup is doing.

But there’s something else lurking in here, too.

SQL Server Query Plan
but you say he’s just a join

That nested loops join is what’s bringing the data from a nonclustered index to the data in a clustered index (or heap, but whatever).

For every row that comes out of the index seek on the nonclustered index, we go back to the clustered index to find whatever data is missing from the nonclustered index. That could be columns in the select list, the where clause, or both.

Much like index union or index intersection, but much more common. For a table with a clustered index, the join condition will be on the clustered index key column(s), because in SQL Server, nonclustered indexes inherit clustered index key columns. For heaps, it’ll be on the row identifier (RID).

You can most often see that by looking at the tool tip for the Lookup, under Seek Predicates.
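If you want a quick repro to poke at (a sketch against the same Stack Overflow-style Users table as the rest of these posts, with an index name I just made up), any index that doesn’t cover everything the query needs will do:

CREATE INDEX shorty ON dbo.Users(Reputation);

SELECT u.Id, u.Reputation, u.DisplayName /* DisplayName isn't in the index */
FROM dbo.Users AS u
WHERE u.Reputation = 11;

Assuming the optimizer picks the seek plus lookup plan, the Key Lookup’s Seek Predicates will show the join on Id, the clustered primary key that the nonclustered index inherits.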

The Loop


At this point, SQL Server’s optimizer can’t use merge or hash joins to implement a lookup.

It can only use nested loops joins.

That’s a pretty big part of why they can be so tricky in plans with parameter sniffing issues. At some point, the number of lookups you end up doing is far more work than just scanning the clustered index all in one shot.

There’s also no “adaptive join” component to them, where SQL Server can bail on a loop join after so many executions and use a scan instead. Maybe someday, but for now this isn’t anything that intelligent query processing touches.

They can look especially off in Star Join plans sometimes, where it’s difficult to figure out why the optimizer went with the lookup for many more rows than what people often call the “tipping point” between lookups and clustered index scans.

The Glue


Another pesky issue with lookups is that the optimizer doesn’t currently support moving the join between the two indexes around at all.

You can get this behavior on your own by rewriting the lookup as a self join (which is all a lookup really is anyway — a self join that the optimizer chose for you).

For instance, here are two query plans. The first one is where the optimizer chose a lookup plan. The second is one where I wrote the query to self join the Users table to itself.

SQL Server Query Plan
A-B-C
SQL Server Query Plan
1-2-3

The thing to understand here is that when there’s a lookup in a query plan, the join between the two indexes is inseparably coupled: the optimizer can’t move it around or implement it differently.

When you write queries as self joins, the optimizer has many more choices available to it as far as join order, join type, and all the other usual steps that it can take during optimization. A simplified example of doing that (not related to the query plans above), would look like this:

CREATE INDEX joan_jett
ON dbo.Posts
(
    PostTypeId, Score
);

                                 /* Not In The Index */
SELECT p.Id, p.PostTypeId, p.Score, p.CreationDate
FROM dbo.Posts AS p
WHERE p.PostTypeId = 7
AND   p.Score = 0
AND   p.OwnerUserId = -1;
       /* Not In The Index*/



                                    /* From p2 */
SELECT p.Id, p.PostTypeId, p.Score, p2.CreationDate
FROM dbo.Posts AS p
JOIN dbo.Posts AS p2 --Self join
    ON p2.Id = p.Id
WHERE p.PostTypeId = 7
AND   p.Score = 0
AND   p2.OwnerUserId = -1;
      /* From p2 */

The index is only on PostTypeId and Score, which means the CreationDate and OwnerUserId columns need to come from somewhere.

Probably more interesting is the second query. The Posts table is joined to itself on the Id column, which is the primary key and clustered index (for style points, I suppose), and the columns not present in the nonclustered index are selected from the “p2” alias of the Posts table.

AND BASICALLY


Sometimes I take these things for granted, because I learned them a long time ago. Or at least what seems like a long time ago.

But they’re things I end up talking with clients about frequently, and sometimes even though they’re not optimizer oddities they’re good posts to write.

Hopefully they’re also good posts for reading, too.

Thanks for doing that.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server CTEs Don’t Always Execute In The Order You Write Them

Mythology


I’ve heard it said many times over the years, incorrectly, that CTEs somehow materialize data.

But a new one to me was that CTEs execute procedurally, and you could use that to influence plan shapes by always doing certain things first.

Unfortunately, that’s not true of them either, even when you use TOP.

Meaningful Life


Here’s the first example. Take some note of the order the CTEs are written and joined in, and the tables they touch.

Outside of the CTEs, there’s a join to a table not even in a CTE here.

WITH cte_1 AS 
(
SELECT u.Id
FROM dbo.Users AS u
WHERE u.Reputation = 1
),
     cte_2 AS
(
SELECT p.OwnerUserId, p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
),   cte_3 AS 
(
SELECT v.PostId
FROM dbo.Votes AS v --WITH(INDEX = three)
WHERE v.VoteTypeId = 4
)
SELECT COUNT(*)
FROM cte_1
JOIN cte_2
    ON cte_2.OwnerUserId = cte_1.Id
JOIN cte_3 
    ON cte_3.PostId = cte_2.Id
JOIN dbo.Comments AS c
    ON c.UserId = cte_1.Id;

The plan for it looks like this:

SQL Server Query Plan
OOW

Not even close to happening in the order we wrote things in.

Darn that optimizer.

Machanically Speaking


If we use a TOP in each CTE, that doesn’t help us either.

WITH cte_1 AS 
(
SELECT TOP (2147483647) u.Id
FROM dbo.Users AS u
WHERE u.Reputation = 1
),
     cte_2 AS
(
SELECT TOP (2147483647) p.OwnerUserId, p.Id
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
),   cte_3 AS 
(
SELECT TOP (2147483647) v.PostId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 4
)
SELECT COUNT(*)
FROM cte_1
JOIN cte_2
    ON cte_2.OwnerUserId = cte_1.Id
JOIN cte_3 
    ON cte_3.PostId = cte_2.Id
JOIN dbo.Comments AS c
    ON c.UserId = cte_1.Id;

Tables get touched in the same order, but the plan takes an ugly turn:

SQL Server Query Plan
bidtime

Dis-spells


CTEs have no magic powers. They don’t boss the optimizer around, they don’t materialize, and they don’t fence optimization.
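If you genuinely need something staged and evaluated first, that’s what #temp tables are for. A minimal sketch, using the first CTE from above:

SELECT u.Id
INTO #cte_1
FROM dbo.Users AS u
WHERE u.Reputation = 1;

The main query would then join to #cte_1 instead of cte_1.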

If you’re gonna start stacking these things together, make sure you’re doing it for a good reason.

And if you tell me it’s to make code more readable, I know you’re messing with me.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Parameter Sniffing Impacts Locking In SQL Server

Make It Out


I’m going to show you something bizarre. I’m going to show it to you because I care for your well-being and long term mental health.

Someday you’ll run into this and be thoroughly baffled, and I want to be here for you. Waiting, watching, lubricating.

I have a stored procedure. It’s a wonderful stored procedure.

But something funny happens when a parameter gets sniffed.

Wrote A Little Song About It


It’s not the usual parameter sniffing thing, where you get different plans and blah blah blah. That’s dull.

This is even more infuriating. Here’s the part we care about, where we read data to insert into the #temp table.

SQL Server Query Plan
something new

This is the “small” version of the plan. It only moves about 8200 rows.

Now here’s the “big” version of the plan.

SQL Server Query Plan
practice makes

We move way more rows out of the seek (9.8 million), but doesn’t it seem weird that a seek would take 5.6 seconds?

I think so.

Pay special attention here, because both queries aggregate the result down to one row, and the insert to the #temp table is instant both times.

Wanna Hear It?


Let’s do what most good parameter sniffing problem solvers do, and re-run the procedure after recompiling for the “big” value.

SQL Server Query Plan
escapism

It’s the exact same darn plan.

Normally, when you’re dealing with parameter sniffing, and you recompile a procedure, you get a different plan for different values.

Not here though. Yes, it’s faster, but it’s the same operators. Seek, Compute, Stream, Stream, Compute, Insert 1 row.

Important to note here is that the two stream aggregates take around the same amount of time as before, too.

The real speed up was in the Seek.

How do you make a Seek faster?

YOU NEEK UP ON IT.

Three Days Later


I just woke up from beating myself with a hammer. Sorry about what I wrote before. That wasn’t funny.

But okay, really, what happened? Why is one Seek 4 seconds faster than another seek?

Locking.

All queries do it, and we can prove that’s what’s going on here by adding a locking hint to our select query.

Now, I understand why NOLOCK would set your DBA in-crowd friends off, and how TABLOCK would be an affront to all sense and reason for a select.

So how about a PAGLOCK hint? That’s somewhere in the middle.
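The procedure’s query isn’t reprinted here, but the shape of the hint is just a table hint on the select that feeds the #temp table. This is a sketch with stand-in table and column names, not the actual proc:

SELECT v.PostId
INTO #v
FROM dbo.Votes AS v WITH (PAGLOCK) /* ask for page locks up front instead of millions of row locks */
WHERE v.VoteTypeId = 4;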

SQL Server Query Plan
what we got here

The Seek that took 5.6 seconds is down to 2.2 seconds.

And all this time people told you hints were bad and evil, eh?

YTHO?


It’s pretty simple, once you talk it out.

All queries take locks (even NOLOCK/READ UNCOMMITTED queries).

Lock escalation doesn’t usually happen with them though, because locks don’t accumulate with read queries the way they do with modification queries. They grab on real quick and then let go (except when…).

For the “small” plan, we start taking row locks, and keep taking row locks. The optimizer has informed the storage engine that ain’t much ado about whatnot here, because the estimate (which is correct) is only for 8,190 rows.

That estimate is preserved for the “big” plan that has to go and get a lot more rows. Taking all those additional row locks really slows things down.

No Accumulation, No Escalation.

We just keep taking 9.8 million row locks instead of escalating up to page or object locks.

When we request page locks from the get-go, we incur less overhead.
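If you want to eyeball the lock counts yourself, one way (assuming VIEW SERVER STATE permissions, and using dbo.Posts as a stand-in for whichever table your seek touches) is to compare these counters before and after each run:

SELECT OBJECT_NAME(ios.object_id) AS table_name,
       ios.index_id,
       ios.row_lock_count,
       ios.page_lock_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID(N'dbo.Posts'), NULL, NULL) AS ios;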

For the record:

  • PAGLOCK: 2.4 seconds
  • TABLOCK: 2.4 seconds
  • NOLOCK: 2.4 seconds

Nothing seems to go quite as fast as when we start with the “big” parameter, but there’s another reason for that.

When we use the “big” parameter, we get batch mode on the Seek.

SQL Server Query Plan Tool Tip
A FULL SECOND

Welcome to 2019, pal.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

What’s The Point Of Estimated Plans In SQL Server, Anyway?

Further From Truth


Estimated plans can be a bit confounding, huh?

Whether you see one by hitting CTRL+L, looking in the plan cache, or looking in Query Store, the song remains the same.

The query is slow: sometimes? always? usually? rarely?

But where? But why?

Estimated plans can hide a lot of work from you. They’re sort of like government contracts, where the optimizer is the government: it picks the cheapest plan, but then…

(Okay, maybe it picks a plan because a friend or relative or big donor was behind it. Sometimes it seems that way, with the nonsense you end up with.)

Of course once execution is under way, you run into all sorts of unexpected costs: Everything is much more expensive, and takes a whole lot longer than projected.

Glad Ways


From a practical perspective, estimated plans can be useful to get a general idea of the plan shape that the optimizer is coming up with.

Despite all the lying estimated plans can do, you may be able to spot some obvious anti-patterns.
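If you’d rather grab one in a script than with CTRL+L, the old SHOWPLAN settings still do the trick. Just remember that no statements actually execute while they’re on, and to turn them back off when you’re done:

SET SHOWPLAN_XML ON;
GO

SELECT u.Id
FROM dbo.Users AS u
WHERE u.Reputation >= 25000;
GO

SET SHOWPLAN_XML OFF;
GO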

They’re more useful when the query is particularly long running, or when it modifies underlying data.

When I’m tuning modification queries, there’s usually some pain involved when you need to reset things.

For example:

  • Waiting 5-10 minutes just to see if a plan changed after making changes
  • Deleting data that doesn’t exist
  • Updates that don’t change anything
  • Not being able to insert duplicate data a second time

…Are all hindrances to getting things done.

And so you need to restore or roll things back another way (and boy howdy has SQL Server 2019 spoiled me for rollbacks).

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 5: I’m Not Going Back

Butt Out Bag


There was one thing that I didn’t talk about earlier in the week.

You see, there’s a mystery plan.

It only shows up once in a while, like Planet X. And when it does, we get bombarded by asteroids.

Just like when Planet X shows up.

I wouldn’t call it a good all-around plan, but it does something that we would want to happen when we run this proc for VoteTypeId 5.

Let’s go look!

The Optimizer Discovers Aggregates, Sort Of


This isn’t a good “general” plan. In fact, for any of the previously fast values, it sucks.

It sucks because just like the “optimize for unknown” plan, it has a bunch of startup costs, does a lot of scanning, and is generally a bad choice for VoteTypeIds that produce a small number of values.

SQL Server Query Plan
Ghost Town

Johnny Four


If you look carefully, you can see what the problem is.

For VoteTypeIds that filter out a lot of rows (which is most of them), that predicate doesn’t get applied until after Posts and Badges have been joined.

In other words, you fully join those tables, and then the result of that join is joined to the predicate-filtered result of Votes.

For this execution, the plan was compiled initially for VoteTypeId 2. It has 130 million entries in Votes. It’s the only VoteTypeId that produces this plan naturally.

The plan you’re looking at above was re-executed with VoteTypeId 4, which has… 8,190 rows in Votes.

I can’t stress enough how difficult it would be to figure out why this is bad just looking at estimated plans.

Though one clue would be the clustered index scan + predicate, if we knew that we had a suitable index.

SQL Server Query Plan Tool Tip
2legit

This kind of detail with row discrepancies only surfaces with actual plans.

But there is one thing here that wasn’t showing up in other plans, when we wanted it to: The optimizer decides to aggregate OwnerUserId coming from the Posts table prior to joining to Votes.

Johnny Five


If you recall the previously used plan, one complaint was that the result of joining Posts and Badges then joined to Votes had to probe 932 million rows.

You can sort of see that here, where the Adaptive Join prior to the highlighted Hash Match Aggregate produces >100 million rows. It’s more here because we don’t have Bitmaps against both Posts and Badges, but… We’re going off track a bit with that.

That could have been avoided if the optimizer had decided to aggregate OwnerUserId, like it does in this plan.

To compare:

SQL Server Query Plan
gag order

The top plan has a handy green square to show you a helpful pre-join aggregation.

The bottom plan has no handy green squares because there is no helpful pre-join aggregation.

The product of the aggregation is 3.2 million rows, which is exactly what we got as a distinct count when we began experimenting with temp tables:

SELECT COUNT_BIG(DISTINCT p.OwnerUserId) AS records --3,236,013 
FROM dbo.Posts AS p 
JOIN dbo.Badges AS b 
    ON b.UserId = p.OwnerUserId 
WHERE p.PostTypeId = 1;
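For reference, a hand-rolled version of that pre-join aggregation looks something like this (a sketch built off the distinct count above, not the exact query from this series), where OwnerUserId gets de-duplicated before the join to Votes ever happens:

SELECT COUNT_BIG(*) AS records
FROM
(
    SELECT DISTINCT p.OwnerUserId
    FROM dbo.Posts AS p
    JOIN dbo.Badges AS b
        ON b.UserId = p.OwnerUserId
    WHERE p.PostTypeId = 1
) AS pb
JOIN dbo.Votes AS v
    ON v.UserId = pb.OwnerUserId;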

Outhouse


If the optimizer had chosen to aggregate OwnerUserId prior to the join to Votes, we all could have gone home early on Friday and enjoyed the weekend.

Funny, that.

Speaking of which, it’s Friday. Go enjoy the weekend.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 4: Long Live The Query Tuner

Rumors Of My Demise


Let’s talk about some common hints that people use to fix parameter sniffing:

  • RECOMPILE: Won’t work here to get us a better plan for VoteTypeId 5, because even when the optimizer knows what’s coming, the plan it picks sucks
  • OPTIMIZE FOR UNKNOWN: Works like once every 5 years, but people still bring it up, and really sucks here (picture below)
  • OPTIMIZE FOR (VALUE): Plan sharing doesn’t work great generally — if we were gonna do this, it’d have to be dynamic

This is what happens when we optimize for unknown. The density vector guess is 13,049,400.

SQL Server Query Plan
Stop it with this.

That guess for Vote Types with very few rows ends up with a plan that has very high startup costs.

This version of the query will run for 13-17 seconds for any given parameter. That sucks in zero gravity.

Pictured above is the plan for VoteTypeId 4, which previously finished sub-second using Plan 1 and Plan 2.

With those out of the way, how can we fix this thing?

The Mint


In some circumstances, a #temp table would help if we pre-staged rows from Votes.

The problem is that for many calls, we’d be putting between 7 and 130 MILLION rows into a temp table.

Not my idea of a good time.

SQL Server Query Results
RAMDISKS NINETY NINE CENTS

But what about the other part of the query?

If we count up distinct OwnerUserIds, we get about 3.2 million.

Better, we can reduce the rows further in the procedure with an EXISTS to Votes (I’ll show you that in a minute).

SELECT COUNT_BIG(DISTINCT p.OwnerUserId) AS records --3,236,013
FROM dbo.Posts AS p
JOIN dbo.Badges AS b 
    ON b.UserId = p.OwnerUserId 
WHERE  p.PostTypeId = 1 

That’s not too bad, depending on:

  • How frequently it runs
  • How concurrently it runs
  • How overwhelmed tempdb is
  • Your Mom

The Product


That gives us:

CREATE OR ALTER PROCEDURE dbo.VoteSniffing ( @VoteTypeId INT )
AS
BEGIN
SET XACT_ABORT, NOCOUNT ON;

SELECT DISTINCT p.OwnerUserId
INTO #p
FROM dbo.Posts AS p
JOIN dbo.Badges AS b
    ON b.UserId = p.OwnerUserId
WHERE p.PostTypeId = 1
AND EXISTS
(
    SELECT 1/0
    FROM dbo.Votes AS v
    WHERE v.UserId = p.OwnerUserId
    AND   v.VoteTypeId = @VoteTypeId
);

SELECT   ISNULL(v.UserId, 0) AS UserId,
         SUM(CASE WHEN v.CreationDate >= '20190101'
                  AND  v.CreationDate < '20200101'
                  THEN 1
                  ELSE 0
             END) AS Votes2019,
         SUM(CASE WHEN v.BountyAmount IS NULL
                  THEN 0
                  ELSE 1
             END) AS TotalBounty,
         COUNT(DISTINCT v.PostId) AS PostCount,
         @VoteTypeId AS VoteTypeId
FROM     dbo.Votes AS v WITH(FORCESEEK)
WHERE    v.VoteTypeId = @VoteTypeId
AND      NOT EXISTS
        (   
            SELECT 1/0
            FROM #p AS p
            WHERE  p.OwnerUserId = v.UserId
        )
GROUP BY v.UserId;

END;
GO

Which works pretty well across all calls, and avoids the primary issue with VoteTypeId 5.
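For testing, the calls across parameters look like this (these VoteTypeId values are the ones discussed in this series):

EXEC dbo.VoteSniffing @VoteTypeId = 2;
EXEC dbo.VoteSniffing @VoteTypeId = 4;
EXEC dbo.VoteSniffing @VoteTypeId = 5;
EXEC dbo.VoteSniffing @VoteTypeId = 8;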

SQL Server Query Execution Times
Navy Blue

I’m generally happy with this, with the slight exception of VoteTypeId 8. Yeah, it beats the pants off of when we sniff Plan 2, but it’s about 7 seconds slower than when we get Plan 1.

I pulled the 17 minute execution from this graph for Plan 2/VoteTypeId 5, too, because it’s so distracting. Not having to worry about that thing is a trade off I’m willing to make for Plan 3 being about a second slower than Plan 1.

Not bad for a lazy Sunday afternoon of blogging, though.

Save One For Friday


Query tuning in SQL Server 2019 isn’t always a whole lot different from performance tuning other versions of SQL Server.

You have some more help from optimizer features (especially if you’re on Enterprise Edition), but they don’t solve every problem, and you can run into some very common problems that you’re already used to solving.

You may even be able to use some very familiar techniques to fix things.

In tomorrow’s post, I want to look at a quirk that would have thrown us way off course to explore on our way here.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.