A Little About Adaptive Joins In SQL Server
Thanks for watching!
Video Summary
In this video, I dive into a scenario where my client upgraded from SQL Server 2014 to 2022, taking advantage of the new adaptive join features. I explain how these intelligent query processing capabilities were previously hindered due to insufficiently wide indexes and the additional overhead of potential lookups or sorts. By creating more comprehensive indexes that cover all necessary columns, we successfully enabled adaptive joins for certain queries, demonstrating their importance in optimizing performance. This video is packed with practical examples and insights into what’s required for SQL Server 2022 to leverage these advanced features effectively.
Full Transcript
Erik Darling here with Darling Data. Today’s video, we’re going to talk about a situation where a client of mine, you know, the nice people who pay me, you could be one of those too if you paid me, they had switched, upgraded to SQL Server, well actually to 2022. So they went from like 2014 to 2022. So it was a pretty big jump for them. And part of the reason why they wanted to go there was because they wanted all of these crazy, fancy, new, intelligent query processing features. One of them was adaptive joins. And the reason for that is because there were a number of queries that we found where adding a join hint, like either hash join or loop join, or sometimes a loop join, or hash join, or just to exclude merge joins from the picture, or just to exclude merge joins from the picture, because screw a merge join, were like helping query performance. And one of the problems that we found once we got to SQL Server 2022 and up the compatibility level, this is Enterprise Edition, of course, because Microsoft only gives the good stuff to people who spend money. I give the good stuff to everyone for free, even though even though I was recently accused of having useless drivel in my videos. The problem is that the nice folks at Beer Gut Magazine, they pay me by the minute to do these, so I do try to stretch them out a little bit. Maybe there is some useless drivel in here, I don’t know.
If you want some real useless drivel, I’ll tell you that this spot on my shirt is from my morning protein shake, because I can’t figure out how to pop the lid, shake up the thing, and then you get protein stuff on the flappy lid, and then drink from the flappy lid, or drink from the shaker, but the flappy lid drips on me. So this is my fitness credibility badge right here, this little protein shake spot on my shirt. Alright, so that’s all the useless drivel in this video. The useless drivel about a protein drivel, right there.
So one of the things that we found was that SQL Server was not choosing adaptive join plans, and one of the reasons why it was not choosing adaptive join plans is because we did not have sufficiently wide indexes. So, to give you a little background on adaptive joins, they were introduced in SQL Server 2017, and they require some sort of batchy mode-y thing to happen, either because you have a columnstore index on your table, or you’re using the fancy pants enterprise edition of SQL Server, and you’re getting batch mode on rowstore. Or maybe you create a temporary table, or just like a helper table in your database that has no rows in it, but has a clustered columnstore index on it, and you can do a fake left join on 1 equals 0 to that thing to get some of the batchy mode-y intelligent query processing features to kick in for you.
So, that wasn’t happening. An extra sort of level of costing that makes the join ineligible for an adaptive join. SQL Server doesn’t want to choose between a hash join and a nested loops join, where on the inner side of the join, you might have to do a lookup to get rows out.
At least, I’ve never seen it choose an adaptive join where there was a lookup involved. So, you know, maybe you can find an example of it and prove me entirely wrong, but, you know, kind of doubt it. Anyway, let’s look at a couple examples of this, right?
All batch mode adaptive joins start out as hash joins. And there’s this, like, threshold for the join, where if it passes that threshold, it will remain a hash join. But if it doesn’t pass that row threshold on the outer part of the join, then it will switch to nested loops.
All right? So, we have, I think I already created this index. Let’s make sure.
I’m not a fool. And let’s, actually, you know what? We’re going to do a little prep work because there’s another index down here. I’m just going to make sure that that index, the second index definitely isn’t there, but the first index definitely is there.
And that will make the demo go a lot smoother. All right? That will at least prolong my SQL Server career.
That will keep me out of the fitness industry for, like, another week or so, I think. All right? So, we’ve got this index.
Now, this is a simple example with just a single key column index. In real life, I know your indexes are probably a bit wider, and your queries are probably even wider than that. So, this is kind of a common thing that can happen.
So, what we got here is a query that does a couple left joins, and we’re going to force optimizer compatibility level 160 right here so that we are totally and completely eligible for batch mode on rowstore. And if we run this query with query plans turned on, it’s not terribly slow, but the point of this demo isn’t to show you a slow query and then a dramatic improvement with adaptive joins.
The point of this demo is to show you what your sort of requirements are for adaptive joins kicking in. All right? So, in this query plan, we get a nested loops join, which was not inappropriate, but this plan also features a key lookup.
And so, the adaptive join is not a thing here. All right? So, we do get a batch mode hash join later on, but SQL Server does not make an adaptive join choice here.
Okay? And that’s all because SQL Server doesn’t want to have to make that choice with the additional overhead of a potential lookup involved. So, let’s create an index that covers everything in the query.
Remember, we do a lookup down here, and what we’re doing in the lookup is we have a predicate on the score column, and we output the score column. So, if we have the score column in the index, we’re going to create a brand new one here, right, that has owner, user ID, and score on it. We’re going to make a brilliant indexing choice, maybe, and we rerun this query.
Now, we get an adaptive join between the users and the post table. We still don’t get one for the comments table. Now, you can have more than one adaptive join in a plan.
SQL Server just doesn’t go for it here. SQL Server doesn’t think it’s worthwhile to invest in an adaptive join here. You can totally have multiple adaptive joins in a single execution plan. But now we get the adaptive join here between users and posts that we didn’t get before because the post table did not have an index that adequately covered all the columns that we needed for it.
Now, I did tell you earlier that merge joins were not part of this consideration. And the reason why merge joins aren’t part of the consideration is because merge joins expect sorted input. And when we need to sort input, like, say, for this query, or rather for this join, SQL Server doesn’t want to think about, oh, we needed to sort that.
I don’t want to, like, in the same way that, like, having to do a lookup is an additional costing, like, perspective for the query, having to sort data to employ a merge join is also an additional costing thing. So merge joins are not part of the picture.
And, you know, of course, because I hinted, even though we don’t need to sort for this part between users and posts, I just have a merge join hint for the whole thing, so it shows a merge join hint here. I just wanted to show you the merge join plan to show you that, you know, if you need to, like, this is why merge joins aren’t a consideration for the adaptive join.
Now, what’s interesting is that neither merge joins nor nested loops joins can operate in row mode. I’m sorry, in batch mode. They can only operate in row mode.
So only, like, if you get a hash join for an adaptive join, that will be in batch mode. But if you get an adaptive join with nested loops, that’ll be in row mode, right? Okay, so that’s all good there.
And I do want to show you one kind of funny thing about this query. Let’s see if I can get a somewhat better execution plan for it. It looks like last night it was a little bit better.
But this query plan features some very odd operator timings. Very odd indeed. And if we look at what happens in here, right? Where’s the funny part?
Where’s the funny bones? It’s kind of right at the end here. If we look at the very end of this query, we get 565 milliseconds and then 1.1 seconds here. But then if we go and look at the query time stats, the CPU time and the elapsed time agree with the operator before the gather streams.
So that’s amusing to me. Anyway, apparently the operator time code could use some work. I would imagine with the influx of summer interns, now that we’re entering June, July, and August, Microsoft will have some top interns on top of this operator timing code.
Any day now, it’ll be all fixed. Anyway, that’s all I had to say for this one. Thank you for watching.
I hope you enjoyed my useless dribble. I hope you learned something, even from the useless dribble. I hope you enjoyed yourselves. If you like this video, as usual, thumbs up and nice comments. Even if comments supportive of my useless dribble are always appreciated.
And if you like this sort of SQL Server content, you can join the… Hold on. I want a freshly up-to-date… I want a read-committed up-to-date number here.
You can join nearly 3,719 other SQL Server professionals or some sort of… I don’t know. I actually don’t know the makeup of the audience.
I say SQL Server professionals. It could be just derelicts off the street who like to watch my videos. Maybe they’re like, I’m going to get my hands on his Adidas t-shirt someday. I don’t know.
But yeah, you could join nearly 3,700 and… Yeah, it hasn’t changed. 19 people who subscribe to this channel and get helpful little bonks on the head. Every time I publish one of these…
Well, I guess you could interpret them as either… Beautiful Gems of Wisdom. Or… Beautiful Gems of Wisdom Drizzled in Useless Dribble. Drift…
Protein Shake. I don’t know. However you want to call it. Anyway, thank you very much for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Hi Mr Drivel,
I must say I like you’re drivel, specially when it’s mixed up with useful information.
So thanks for spreading SQL knowledge, much appreciated!
Aw, thank you!