One line I see over and over again — I’ve probably said it too when I was young and needed words to fill space — is that CTEs make queries more readable.
Personally, I don’t think they make queries any more readable than derived tables, but whatever. No one cares what I think, anyway.
Working with clients I see a variety of query formatting styles, ranging from quite nice ones that have influenced the way I format things, to completely unformed primordial blobs. Sticking the latter into a CTE does nothing for readability even if it’s commented to heck and back.
nope nope nope
There are a number of options for formatting code:
Formatting your code nicely doesn’t just help others read it, it can also help people understand how it works.
Take this example from sp_QuickieStore that uses the STUFF function to build a comma separated list the crappy way.
If STRING_AGG were available in SQL Server 2016, I’d just use that. Darn legacy software.
parens
The text I added probably made things less readable, but formatting the code this way helps me make sure I have everything right.
The opening and closing parens for the STUFF function
The first input to the function is the XML generating nonsense
The last three inputs to the STUFF function that identify the start, length, and replacement text
I’ve seen and used this specific code a million times, but it wasn’t until I formatted it this way that I understood how all the pieces lined up.
Compare that with another time I used the same code fragment in sp_BlitzCache. I wish I had formatted a lot of the stuff I wrote in there better.
carry the eleventy
With things written this way, it’s really hard to understand where things begin and end and that arguments belong to which part of the code.
Maybe someday I’ll open an issue to reformat all the FRK code ?
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
There was a three-part series of posts where I talked about a weird performance issue you can hit with parameterized top. While doing some query tuning for a client recently, I ran across a funny scenario where they were using TOP PERCENT to control the number of rows coming back from queries.
With a parameter.
So uh. Let’s talk about that.
Setup Time
Let’s start with a great index. Possibly the greatest index ever created.
CREATE INDEX whatever
ON dbo.Votes
(VoteTypeId, CreationDate DESC)
WITH
(
MAXDOP = 8,
SORT_IN_TEMPDB = ON
);
GO
Now let me show you this stored procedure. Hold on tight!
CREATE OR ALTER PROCEDURE dbo.top_percent_sniffer
(
@top bigint,
@vtid int
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN
SELECT TOP (@top) PERCENT
v.*
FROM dbo.Votes AS v
WHERE v.VoteTypeId = @vtid
ORDER BY v.CreationDate DESC;
END;
Cool. Great.
Spool Hardy
When we execute the query, the plan is stupid.
EXEC dbo.top_percent_sniffer
@top = 1,
@vtid = 6;
GO
the louis vuitton of awful
We don’t use our excellent index, and the optimizer uses an eager table spool to hold rows and pass the count to the TOP operator until we hit the correct percentage.
This is the least ideal situation we could possibly imagine.
Boot and Rally
A while back I posted some strange looking code on Twitter, and this is what it ended up being used for (among other things).
CREATE OR ALTER PROCEDURE dbo.top_percent_sniffer
(
@top bigint,
@vtid int
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN;
WITH pct AS
(
SELECT
records =
CONVERT(bigint,
CEILING(((@top * COUNT_BIG(*)) / 100.)))
FROM dbo.Votes AS v
WHERE v.VoteTypeId = @vtid
)
SELECT
v.*
FROM pct
CROSS APPLY
(
SELECT TOP (pct.records)
v.*
FROM dbo.Votes AS v
WHERE v.VoteTypeId = @vtid
ORDER BY v.CreationDate DESC
) AS v;
END;
GO
better butter
Soul Bowl
This definitely has drawbacks, since the expression in the TOP always gives a 100 row estimate. For large numbers of rows, this plan could be a bad choice and we might need to do some additional tuning to get rid of that lookup.
There might also be occasions when using a column store index to generate the count would be benefit, and the nice thing here is that since we’re accessing the table in two different ways, we could use two different indexes.
But for reliably small numbers of rows, this is a pretty good solution.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In the year 950 B.C., Craig Freedman write a post about subqueries in CASE expressions. It’s amazing how relevant so much of this stuff stays.
In today’s post, we’re going to look at a slightly different example than the one given, and how you can avoid performance problems with them by using APPLY.
Like most query tuning tricks, this isn’t something you always need to employ, and it’s not a best practice. It’s just something you can use when a scalar subquery doesn’t perform as you’d like it to.
How Much Wood
Our starting query looks like this. The point of it is to determine the percentage of answered questions per month.
SELECT
x.YearPeriod,
MonthPeriod =
RIGHT('00' + RTRIM(x.MonthPeriod), 2),
PercentAnswered =
CONVERT(DECIMAL(18, 2),
(SUM(x.AnsweredQuestion * 1.) /
(COUNT_BIG(*) * 1.)) * 100.)
FROM
(
SELECT
YearPeriod = YEAR(p.CreationDate),
MonthPeriod = MONTH(p.CreationDate),
CASE
WHEN EXISTS
(
SELECT
1/0
FROM dbo.Votes AS v
WHERE v.PostId = p.AcceptedAnswerId
AND v.VoteTypeId = 1
)
THEN 1
ELSE 0
END AS AnsweredQuestion
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
) AS x
GROUP BY
x.YearPeriod,
x.MonthPeriod
ORDER BY
x.YearPeriod ASC,
x.MonthPeriod ASC;
Smack in the middle of it, we have a case expression that goes looking for rows in the Votes table where a question has an answer that’s been voted as the answer.
Amazing.
To start with, we’re going to give it this index.
CREATE INDEX p
ON dbo.Posts(PostTypeId, AcceptedAnswerId)
INCLUDE(CreationDate)
WITH(MAXDOP = 8, SORT_IN_TEMPDB = ON);
Planpains
In all, this query will run for about 18 seconds. The majority of it is spent in a bad neighborhood.
but first
Why does this suck? Boy oh boy. Where do we start?
Sorting the Votes table to support a Merge Join?
Choosing Parallel Merge Joins ever?
Choosing a Many To Many Merge Join ever?
All of the above?
Bout It
If we change the way the query is structured to use OUTER APPLY instead, we can get much better performance in this case.
SELECT
x.YearPeriod,
MonthPeriod =
RIGHT('00' + RTRIM(x.MonthPeriod), 2),
PercentAnswered =
CONVERT(DECIMAL(18, 2),
(SUM(x.AnsweredQuestion * 1.) /
(COUNT_BIG(*) * 1.)) * 100.)
FROM
(
SELECT
YearPeriod = YEAR(p.CreationDate),
MonthPeriod = MONTH(p.CreationDate),
oa.AnsweredQuestion
FROM dbo.Posts AS p
OUTER APPLY
(
SELECT
AnsweredQuestion =
CASE
WHEN v.Id IS NOT NULL
THEN 1
ELSE 0
END
FROM dbo.Votes AS v
WHERE v.PostId = p.AcceptedAnswerId
AND v.VoteTypeId = 1
) oa
WHERE p.PostTypeId = 1
) AS x
GROUP BY
x.YearPeriod,
x.MonthPeriod
ORDER BY
x.YearPeriod ASC,
x.MonthPeriod ASC;
This changes the type of join chosen, and runs for about 3 seconds total.
buttercup
We avoid all of the problems that the parallel many-to-many Merge Join brought us.
Thanks, Hash Join.
It’s also worth noting that the OUTER APPLY plan asks for an index that would help us a bit, though like most missing index requests it’s a bit half-baked.
USE [StackOverflow2013]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Votes] ([VoteTypeId])
INCLUDE ([PostId])
GO
Index Plus
Any human could look at this query and realize that having the PostId in the key of the index would be helpful, since we’d have it in secondary order to the VoteTypeId column
CREATE INDEX v
ON dbo.Votes(VoteTypeId, PostId)
WITH(MAXDOP = 8, SORT_IN_TEMPDB = ON);
If we add that index, we can make the subquery fairly competitive, at about 4.5 seconds total.
bloop join
But the issue here is now rather than poorly choosing a Sort > Merge Join, we go into a Nested Loops join for ~6 million rows. That’s probably not a good idea.
This index doesn’t leave as profound a mark on the APPLY version of the query. It does improve overall runtime by about half a second, but I don’t think I’d create an index just to get a half second better.
astro weiner
But hey, who knows? Maybe it’d help some other queries, too.
Indexes are cool like that.
Back On The Map
If you’ve got subqueries in your select list that lead to poor plan choices, you do have options. Making sure you have the right indexes in place can go a long way.
You may be able to get competitive performance gains by rewriting them as OUTER APPLY. You really do need to use OUTER here though, because it won’t restrict rows and matches the logic of the subquery. CROSS APPLY would act like an inner join and remove any rows that don’t have a match. That would break the results.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Well over 500 years ago, Paul White wrote an article about distinct aggregates. Considering how often I see it while working with clients, and that Microsoft created column store indexes and batch mode rather than allow for hash join hints on CLR UDFs, the topic feels largely ignored.
But speaking of all that stuff, let’s look at how Batch Mode fixes multiple distinct aggregates.
Jumbo Size
A first consideration is around parallelism, since you don’t pay attention or click links, here’s a quote you won’t read from Paul’s article above:
Another limitation is that this spool does not support parallel scan for reading, so the optimizer is very unlikely to restart parallelism after the spool (or any of its replay streams).
In queries that operate on large data sets, the parallelism implications of the spool plan can be the most important cause of poor performance.
What does that mean for us? Let’s go look. For this demo, I’m using SQL Server 2019 with the compatibility level set to 140.
SELECT
COUNT_BIG(DISTINCT v.PostId) AS PostId,
COUNT_BIG(DISTINCT v.UserId) AS UserId,
COUNT_BIG(DISTINCT v.BountyAmount) AS BountyAmount,
COUNT_BIG(DISTINCT v.VoteTypeId) AS VoteTypeId,
COUNT_BIG(DISTINCT v.CreationDate) AS CreationDate
FROM dbo.Votes AS v;
In the plan for this query, we scan the clustered index of the Votes table five times, or once per distinct aggregate.
skim scan
In case you’re wondering, this results in one intent shared object lock on the Votes table.
This query runs for 38.5 seconds, as the crow flies.
push the thing
A Join Appears
Let’s join Votes to Posts for no apparent reason.
SELECT
COUNT_BIG(DISTINCT v.PostId) AS PostId,
COUNT_BIG(DISTINCT v.UserId) AS UserId,
COUNT_BIG(DISTINCT v.BountyAmount) AS BountyAmount,
COUNT_BIG(DISTINCT v.VoteTypeId) AS VoteTypeId,
COUNT_BIG(DISTINCT v.CreationDate) AS CreationDate
FROM dbo.Votes AS v
JOIN dbo.Posts AS p
ON p.Id = v.PostId;
The query plan now has two very distinct (ho ho ho) parts.
problemium
This is part 1. Part 1 is a spoiler. Ignoring that Repartition Streams is bizarre and Spools are indefensible blights, as we meander across the execution plan we find ourselves at a stream aggregate whose child operators have executed for 8 minutes, and then a nested loops join whose child operators have run for 20 minutes and 39 seconds. Let’s go look at that part of the plan.
downstream
Each branch here represents reading from the same spool. We can tell this because the Spool operators do not have any child operators. They are starting points for the flow of data. One thing to note here is that there are four spools instead of five, and that’s because one of the five aggregates was processed in the first part of the query plan we looked at.
The highlighted branch is the one that accounts for the majority of the execution time, at 19 minutes, 8 seconds. This branch is responsible for aggregating the PostId column. Apparently a lack of distinct values is hard to process.
But why is this so much slower? The answer is parallelism, or a lack thereof. So, serialism. Remember the 500 year old quote from above?
Another limitation is that this spool does not support parallel scan for reading, so the optimizer is very unlikely to restart parallelism after the spool (or any of its replay streams).
In queries that operate on large data sets, the parallelism implications of the spool plan can be the most important cause of poor performance.
Processing that many rows on a single thread is painful across all of the operators.
Flounder Edition
With SQL Server 2019, we get Batch Mode On Row store when compatibility level gets bumped up to 150.
The result is just swell.
yes you can
The second query with the join still runs for nearly a minute, but 42 seconds of the process is scanning that big ol’ Posts table.
Grumpy face.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I dive into a real-world scenario where I used my SSMS tuning setup to optimize a query for one of my help reviews. The query in question was particularly slow and caused significant delays during live demos due to its execution time. After analyzing the execution plan, which took over 2 minutes to run, I decided to tackle the issue by rewriting the query to eliminate unnecessary join conditions and reduce redundant operations. This approach not only sped up the query significantly but also provided a clearer, more efficient execution path for SQL Server to follow. By sharing this experience, I hope to offer some practical insights into optimizing complex queries for smoother performance in real-world scenarios.
Full Transcript
Oh boy, oh my goodness. This is, this is quite a day, quite a Friday. Uh, Erik Darling here with Erik Darling Data. It would be Darling Data, but someone else, someone else got there first. I mean, for the domain. I mean, I still have the company. Whatever. I mean, no one else got here first. That would be awkward. Like, hostile takeover of my own company. Anyway. Uh, today we are drinking something red that my wife walked in and handed me, so I have no details. Uh, it smells nice though. And, uh, this video is, uh, going to build, oh geez, we’re building things, we’re building on things. There’s a pattern forming here. Uh, where we’re going to, uh, build a little bit on a video that I recorded, I think, last week. I don’t remember.
The weeks are still, time is still very strange to me. Uh, talking about, you know, how I set up SSMS to tune queries. And here’s an example of actually when I used it to do that same, to do that very thing. Same very thing. Uh, recently. To, uh, one of my help reviews. Uh, what’s up memory? And, uh, I don’t know why I named it that. I was, I guess because who is memory sounded stupid. Uh, so, yeah. Uh, so this, this, this help review will tell you, uh, what’s in, what’s in your buffer pool currently. And that’s helpful, I guess, because everyone wants to know what’s in their buffer pool constantly all the time.
Uh, I use it for demos that I do for things that, uh, I find helpful. So, I don’t know. Maybe you will too. I don’t know. If you don’t, go do something else with your life. I don’t care that much. Um, but, what I found is every time I wanted to use this thing on a, uh, on an instance with a good chunk of memory, it would be, it was slow. Terribly, dreadfully, painfully slow. If we look at the execution plan, uh, we see that it took about 2 minutes and 44 seconds to run.
And, uh, who could wait that long for anything? Uh, and it made, uh, certain demos very difficult to do live or even to, like, sort of do off-the-cuff recording. I would have to, like, pre-stage everything so that it would be set up here. So, we were just, like, like, sitting around waiting for this and awkwardly staring at the camera.
Or, like, doing a cut scene or a montage, lifting weights in my short shorts, running on a beach, karate-kicking seagulls, whatever. Uh, but, I mean, just, you know, kind of, like, digging through the execution plan. Again, very helpfully looking at the operator times under these things, right?
So, we no longer have to care about percentages because, I mean, not that they meant anything anyway. Shut up, car. But, it was helpful, it’s much more helpful to sort of, like, follow the yellow brick road of operator times to when things sort of drop off and pick up.
So, kind of framing this a little bit, we have this nested loops join that hits 2.15. And we have this concatenation that is at 0.001. And down here, we have this table spool, which takes a minute and 36 seconds.
Prior to that table spool, we do some nonsense where we hit this view, sysalloc units. And, uh, I’m mispronouncing that would have been deadly. Uh, and this one down here, sys.buffer descriptors.
And we sort of join those together. And then we spend a bunch of time in the spool and a bunch of time in this nested loops join. And this, and the spool, of course, is trying to save us some trouble with the nested loops join.
It’s like, oh, I don’t know how repetitive this stuff is going to be. I don’t know if I want to do everything down here over and over again. I’m going to use a spool to cache some information and reuse it if I can.
Uh, you know, I don’t know how helpful this spool actually was. We look at the properties of the spool. And we look at the, uh, the rebinds and the rewinds.
Well, you know, I guess, I guess it’s okay. Deal with it. But, uh, I don’t like this spool. And if you look at the nested loops join, and kind of zoom in on the tool tip, it’s all happening because of this very, very difficult predicate.
Uh, there’s a lot of and and or logic in this join condition. Let’s go zoom in on what the join condition looked like beforehand. Uh, and this is, you know, sort of many times looking at, uh, queries and looking at, uh, you know, things that people wrote up in the real world.
This is like the enemy of performance. This is not a good thing to do if you want queries to run quickly. Uh, especially, you know, bigger queries, more rows, things you have to worry about, performance. Ah, terrifying.
So, uh, you know, my initial reaction. And I guess we can just come over here. My initial reaction was just to try a query hint on here for no performance spools. Zero performance spools.
Which would get rid of, uh, actually gets rid of a few different kinds of spools in query plans. And get rid of row count spools. Uh, lazy table spools. And, uh, lazy index spools. It does not get rid of eager index spools.
Uh, the only thing that does that is a, uh, uh, turning off certain query rules via other things that we’re not talking about here. But, um, no, with the no performance spool hit, if we look at the query plan, we get a faster one. Still faster.
56 seconds. So we did better. It still takes a full minute. It’s still a lot of time to kill. We don’t feel like talking, really. It’s, uh, you know, kind of dragging along the query plan. You know, it’s kind of the same, same set of yuckiness.
Like, why does it take five seconds to seek into an index? Like, what, what are you doing? You know, who designs these things? Add an index.
Add a good index. Why is this? This is silly. Uh, and then this one, or three seconds. And, you know, like, like focusing in on the operator times. There’s a thing that’s like, we have big jumps, right? Not like, we don’t want like incremental ones. Incremental ones are boring.
What if that big jump to where we got up there? And, you know, I guess like a six second jump is pretty big, right? 5.7 seconds up to there. Okay. Well, you know, something to think about. Uh, but, you know, kind of, again, following this road, right?
So now we have that same join, I think. Uh, but we have, uh, like a bunch of stuff down here that just sort of takes a lot of, without the, without the spool, we have to do a lot of, the optimizer chooses a much different plan.
Whereas like a bunch of different stuff happens. Right. And, uh, I don’t know. I’m kind of got lost with what I was going to say about this one, but you know, it’s Friday and you’re not paying attention anyway.
So it doesn’t matter all that much. Does this do? No. Outer references. So this time we get a part, we get an apply nested loops where, uh, rather than the regular nested loops where everything happened at the join, where it was with the spool. Now we have, we get an apply nested loops where it takes, uh, the type and the, uh, owner ID and it pushes things down.
If I’m, if I’m looking at the right join, even at this point, I’m, I forget. Anyway, uh, that’s how I tune queries. I forget things. Uh, but yeah. So, uh, what I, what I, what I thought looking at it was, geez, this is a performance problem that I’ve solved for other people.
A lot of them, a whole bunch of times. Why don’t I solve it for myself for once? And so I rewrote the query and I’ll show you what I did in a minute. But now the execution plan, I, this finishes in about seven seconds.
So now everything that’s slow in here is purely Microsoft’s fault. There’s no like really, really big jumps in the query plan. Uh, you know, we still get, have no buffer descriptors was kind of annoying, but not the biggest deal in the world.
But anyway, let’s get onto the rewrite. Uh, all the stuff up here is superfluous. I mean, that’s just like display level stuff. Uh, what really ended up making a difference was, so the first time I was thinking about rewriting this, I thought that I would just do this thing in here and separate out each of the join conditions.
So in this one, I have one explicit join condition with no, uh, this space there. Ugh, I stink. Uh, and then another one with this explicit join condition. So no like and or in crap in here.
And, you know, looking there could cool. And at first I thought that I would do this and I would, uh, just preserve the original join logic, which up here is, uh, from buffer descriptors joined over to allocation units.
But, uh, I found that hitting that buffer descriptors view three times was painful. So I ended up pulling the buffer descriptors part out and just doing one join to it at the end once I had everything else, uh, done out here.
So now the query is just grabbing all this stuff, right? Having fun doing things, pulling out information. And then, uh, when we get outside of that sort of inner query where we, after we just, I just called that X because, uh, I don’t know, Mr. X.
And then, uh, we do our grouping and our ordering out here. And we do all the summing in whatnot up here with our case expressions here where they’re far less dangerous.
And of course, uh, repeating this query with three explicit joins is a lot faster. Now it’s a lot faster for reasons that I’ve talked about in other posts where, you know, if we’re writing a single purpose queries, it’s a lot easier for the optimizer to go, uh, to deal with that and to make an efficient query plan.
Um, I’ll, I’ll probably blog more about that in the future, but for now, that’s what you get because we’re about at the 10 minute mark and I’m about to get my Friday evening started. And, uh, I hope, well, I mean, by the time you see this, it’ll be Tuesday, but that’s your fault.
Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
sp_PressureDetector now includes statement start and end offsets, in case you want to build plan guides for queries
WhatsUpMemory got a huge performance tuning, which I’ll talk about in tomorrow’s post!
Nothing else here. Happy downloading and analyzing.
Or analysing, if you’re from a refined country.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into an intriguing aspect of SQL Server query optimization by demonstrating how different database compatibility levels can influence execution plans and performance. I illustrate this concept using two databases: the master database set to compatibility level 150 (SQL Server 2019) and the stack overflow database at compatibility level 140. By running a specific query in each context, I show that the same operation yields different execution plans due to batch mode on rowstore being enabled only in the higher compatibility level. This example highlights how changing the database context can lead to more efficient query execution without altering the actual code or adding hints. I hope this video provides you with valuable insights into leveraging database compatibility levels for optimizing your queries, especially when dealing with mixed workloads across different databases.
Full Transcript
That champagne still smells wonderful. Legally wonderful. Anyway, I want to show you something kind of interesting. And you can interpret this in your own way, and you can implement this in your own way when it might suit you. And it may suit you someday. If you have certain groups of queries that work really well, when they do one thing but not really the same. really well when they do another thing. I’m going to show you exactly what I mean by that. Now, let’s start off by making very, very sure that we are all clear about what context this database, what context, what database context, that champagne may have smelt a little too good, what database context this query is taking place in. Let’s also be quite sure that we understand which compatibility levels of data. these two queries are taking place in across these two databases. The master database is in compatibility level 150 and the stack overflow database is in compatibility level 140. So master is in 2019 where we get all sorts of fancy things if we are on the most enterprising edition possible like batch mode on rowstore and well no, because scalar UDF inlining is both standard and enterprise.
So, I don’t know, there’s some other stuff in 2019 that I suppose is okay too. I’m not sure what though. Every time I install it, my computer just blue screens. I’m kidding. I’m kidding. It’s fine. It’s production ready. Go use it. Go crazy. Go crazy. And just to make extra sure, right? Like I’m not kidding with you. We are in the master database.
And yet, when I run this query to select a count of records from the post table in the stack overflow 2013 database, we get a very particular query plan. Now, if you’ve watched other videos of mine, you would know that this hash match aggregate to implement a global aggregate could only be done via stream aggregate in prior versions of SQL Server. So, with the proliferation of batch mode, this can now be a hash aggregate. We no longer have to use a stream aggregate.
So, this hash match aggregate is taking place in batch mode. That should be a vocal warm-up exercise. I’m going to tell my vocal coach about that. Hash match aggregate. So, this is a batch aggregate. As well as this clustered index scan are taking place in batch mode. So, the hash match aggregate is batched. I said that fast, didn’t I? And that’s interesting because the stack overflow 2013 database is in 140 compat level where batch mode on rowstore should not be possible.
This is, if you look at the storage. Shut up. Emergency. If you look at the storage, this is rowstore. And we’re not doing any tricky stuff like joining to another table with a clustered columnstore index it on it or something to get batch mode happening. This is a natural occurrence within the query. Right? It just happens. Right? It’s nice. It’s cool. So, great. We have this thing happening.
Now, if you come over here and we very, very clearly use the stack overflow database and we reiterate the fact that the stack overflow database is in 140 compat level and we look at masters and we only run the query in the context of the stack overflow database. And we actually get the query plan. Good job, me. We have a different execution plan. Don’t we?
We see that stream aggregate that was only, that had to be used in prior versions or, not in prior versions of SQL Server, just in the context of a rowstore only query. So, that’s that. And you may find this to be an attractive option if you have a group of, let’s say, reporting queries that you can execute from another database context that’s in compatibility level 150 against another database. Maybe that’s turned more transactional in nature in compatibility level 140 so that we don’t have to worry about hinting and changing all sorts of stuff.
We can just change the, we can just execute from a slightly different context and still get all the benefits of the optimization, the compatibility level and the optimizer abilities of the database where the query originates. So, take that as you will, implement it as you must. I hope you learned something. I hope you enjoyed this thankfully much shorter video so my champagne doesn’t get warm. Well, I don’t like the smell of warm champagne, especially on camera.
Thanks for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
It’s likely also obvious that your join clauses should also be SARGable. Doing something like this is surely just covering up for some daft data quality issues.
SELECT
COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
ON ISNULL(p.OwnerUserId, 0) = u.Id;
If 0 has any real meaning here, replace the NULLs with zeroes already. Doing it at runtime is a chore for everyone.
But other things can be thought of as “SARGable” too. But perhaps we need a better word for it.
I don’t have one, but let’s define it as the ability for a query to take advantage of index ordering.
World War Three
There are no Search ARGuments here. There’s no argument at all.
But we can plainly see queries invoking functions on columns going all off the rails.
Here’s an index. Please enjoy.
CREATE INDEX c ON dbo.Comments(Score);
Now, let’s write a query. Once well, once poorly. Second verse, same as the first.
SELECT TOP(1)
c.*
FROM dbo.Comments AS c
ORDER BY
c.Score DESC;
SELECT TOP(1)
c.*
FROM dbo.Comments AS c
ORDER BY
ISNULL(c.Score, 0) DESC;
The plan for the first one! Yay!
inky
Look at those goose eggs. Goose Gossage. Nolan Ryan.
The plan for the second one is far less successful.
trashy vampire
We’ve done our query a great disservice.
Not Okay
Grouping queries, depending on scope, can also suffer from this. This example isn’t as drastic, but it’s a simple query that still exhibits as decent comparative difference.
SELECT
c.Score
FROM dbo.Comments AS c
GROUP BY
c.Score
HAVING
COUNT_BIG(*) < 0;
SELECT
ISNULL(c.Score, 0) AS Score
FROM dbo.Comments AS c
GROUP BY
ISNULL(c.Score, 0)
HAVING
COUNT_BIG(*) < 0;
To get you back to drinking, here’s both plans.
the opposite of fur
We have, once again, created more work for ourselves. Purely out of vanity.
Indexable
Put yourself in SQL Server’s place here. Maybe the optimizer, maybe the storage engine. Whatever.
If you had to do this work, how would you prefer to do it? Even though I think ISNULL should have better support, it applies to every other function too.
Would you rather:
Process data in the order an index presents it and group/order it
Process data by applying some additional calculation to it and then grouping/ordering
That’s what I thought.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
My dear friend Kendra asked… Okay, look, I might have dreamed this. But I maybe dreamed that she asked what people’s Cost Threshold For Blogging™ is. Meaning, how many times do you have to get asked a question before you write about it.
I have now heard people talking and asking about in-memory table variables half a dozen times, so I guess here we are.
Talking about table variables.
In memory.
Yes, Have Some
First, yes, they do help relieve tempdb contention if you have code that executes under both high concurrency and frequency. And by high, I mean REALLY HIGH.
Like, Snoop Dogg high.
Because you can’t get rid of in memory stuff, I’m creating a separate database to test in.
Here’s how I’m doing it!
CREATE DATABASE trash;
ALTER DATABASE trash
ADD FILEGROUP trashy
CONTAINS MEMORY_OPTIMIZED_DATA ;
ALTER DATABASE trash
ADD FILE
(
NAME=trashcan,
FILENAME='D:\SQL2019\maggots'
)
TO FILEGROUP trashy;
USE trash;
CREATE TYPE PostThing
AS TABLE
(
OwnerUserId int,
Score int,
INDEX o HASH(OwnerUserId)
WITH(BUCKET_COUNT = 100)
) WITH
(
MEMORY_OPTIMIZED = ON
);
GO
Here’s how I’m testing things:
CREATE OR ALTER PROCEDURE dbo.TableVariableTest(@Id INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
DECLARE @t AS PostThing;
DECLARE @i INT;
INSERT @t
( OwnerUserId, Score )
SELECT
p.OwnerUserId,
p.Score
FROM Crap.dbo.Posts AS p
WHERE p.OwnerUserId = @Id;
SELECT
@i = SUM(t.Score)
FROM @t AS t
WHERE t.OwnerUserId = 22656
GROUP BY t.OwnerUserId;
SELECT
@i = SUM(t.Score)
FROM @t AS t
GROUP BY t.OwnerUserId;
END;
GO
If we flip database compatibility levels to 150, deferred compilation kicks in. Great. Are you on SQL Server 2019? Are you using compatibility level 150?
Don’t get too excited.
Let’s give this a test run in compat level 140:
DECLARE @i INT = 22656;
EXEC dbo.TableVariableTest @Id = @i;
everything counts in large amounts
Switching over to compat level 150:
yeaaahhhhh
Candy Girl
So what do memory optimized table variables solve?
Not the problem that table variables in general cause.
They do help you avoid tempdb contention, but you trade that off for them taking up space in memory.
Precious memory.
Do you have enough memory?
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
I got a mailbag question recently about some advice that floats freely around the internet regarding indexing for windowing functions.
But even after following all the best advice that Google could find, their query was still behaving poorly.
Why, why why?
Ten Toes Going Up
Let’s say we have a query that looks something like this:
SELECT
u.DisplayName,
u.Reputation,
p.Score,
p.PostTypeId
FROM dbo.Users AS u
JOIN
(
SELECT
p.Id,
p.OwnerUserId,
p.Score,
p.PostTypeId,
ROW_NUMBER() OVER
(
PARTITION BY
p.OwnerUserId,
p.PostTypeId
ORDER BY
p.Score DESC
) AS n
FROM dbo.Posts AS p
) AS p
ON p.OwnerUserId = u.Id
AND p.n = 1
WHERE u.Reputation >= 500000
ORDER BY u.Reputation DESC,
p.Score DESC;
Without an index, this’ll drag on forever. Or about a minute.
But with a magical index that we heard about, we can fix everything!
Ten Toes Going Down
And so we create this mythical, magical index.
CREATE INDEX bubble_hard_in_the_double_r
ON dbo.Posts
(
OwnerUserId ASC,
PostTypeId ASC,
Score ASC
);
But there’s still something odd in our query plan. Our Sort operator is… Well, it’s still there.
grinch
Oddly, we need to sort all three columns involved in our Windowing Function, even though the first two of them are in proper index order.
OwnerUserId and PostTypeId are both in ascending order. The only one that we didn’t stick to the script on is Score, which is asked for in descending order.
Dram Team
This is a somewhat foolish situation, all around. One column being out of order causing a three column sort is… eh.
We really need this index, instead:
CREATE INDEX bubble_hard_in_the_double_r
ON dbo.Posts
(
OwnerUserId ASC,
PostTypeId ASC,
Score DESC
);
mama mia
Granted, I don’t know that I like this plan at all without parallelism and batch mode, but we’ve been there before.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.