Learn T-SQL With Erik: EXISTS, NOT EXISTS, and Row Goals
Video Summary
In this video, I delve into the nuances of using `NOT EXISTS` in T-SQL queries, particularly focusing on why many data scientists and developers often misuse the `EXISTS` pattern to find rows that don’t exist in another table. I explain the inefficiencies of the traditional approach and how Microsoft has introduced optimizer rules like “Left Outer Join to Left Anti-Semi Join” to optimize these queries. Additionally, I demonstrate practical examples using SQL Server 2025’s compatibility levels and query optimizer hot fixes, showing how `NOT EXISTS` can significantly improve performance compared to left joins. The video also touches on the complexities of row goals in the query optimizer and provides a workaround for scenarios where index modifications are not feasible.
Full Transcript
Erik Darling here with Darling Data, and you wouldn’t believe what we’re going to talk about today. More T-SQL stuff. Why? Because I need to promote two things. One, the T-SQL course that you can buy at the presale price down in the video description. And two, the pre-cons that Kendra Little and I are doing in Seattle this November. The material that you will see here and that you can buy at the link down there is, they’re related. They’re like cousins of some variety. And today we’re going to talk about, well, we’re going to talk specifically about not exists for the most part today. There is, of course, this is, of course, the beginner material. This is, of course, just a small preview of what’s in the full course material. Because if I just put all the course material here, what would be the point of buying? Be quite silly. All right. Anyway, let’s talk about this. So every, almost every time I see a, like, learn T-SQL, with smart data scientist person, they screw up. And they say, this is how you find rows in one table that don’t exist in another table. And they even use the words don’t exist. And for some reason, they use the words don’t exist, but they completely leave the SQL, like, like, pattern not exists out of out of anything they say. And it makes me want to hit them with hammers.
They’re the, why? What is the point of view? So let’s talk a little bit about why this query pattern doesn’t often work out. And then something that Microsoft actually did to try to help idiots like that who write queries like this. And then how, of course, how to write this query with not exist to avoid the issue. So good, good, good. All things that we can look forward to. So let’s run this query. So let’s run this query. And let’s get a result. And let’s take careful note of this result here. 103.0987. Cool. Here’s the query plan. And here is why I generally don’t like this query pattern. So SQL Server was kind enough to execute this query pretty without reordering joins or anything. So let’s look at this. And we’ll see that we, zoom it, we have a clustered index scan on the user’s table. And we take all 2.4-ish million rows out of this thing.
And we have an index scan on the post table. And we take all 17 million-ish rows out of this thing. And then we fully join all of those rows together, right? We join everything from the user’s table. We join everything from the post table. And we do a hash join here. SQL Server chooses not to pre-aggregate any of the owner user IDs coming out of the post table, even though owner user ID has a lot of dupes in it. And then over here, we have a filter operator. And look how many rows leave this filter operator prior to aggregating them down to do the count.
1, 0, 3, 0, 9, 8, 7. That is exactly how many rows we got for our query result. Now, the reason why this is terribly inefficient is because SQL Server joins every single row together. And then only later removes rows that don’t match with this, right? So where p.id is null, right? So that is what we’re, these are the rows that we’re moving in the filter.
This query pattern is so prolific and so profoundly used that Microsoft actually went and added a couple new optimizer rules called Lodge to last-je and Rodge to last-je. And what do those mean? Left outer join to left anti-semi join and right outer join to left anti-semi join. And what do these do?
Well, under, if you’re on SQL Server 2025, which recently became public domain, right? So that’s where we are now. Those rules exist.
If you’re on compat level 170, which is the new highest compat level for SQL Server for 2025, or you enable query optimizer hot fixes, that’s, you know, this is a use hint, but you can also, there are, there’s also database level settings that control this. But let’s look at the differences in these three execution plans.
I’ve just put a little bit of staging data on my SQL Server 2025 instance. I haven’t, I haven’t fully moved or fully restored Stack Overflow to it and started doing anything with it there. I just want to show you these couple things.
So using compatibility level 160, which is the previous high for SQL Server 2022, we have a fairly similar plan to what we had in, well, when we were actually just running this on SQL Server 2022, where we have a join here that fully joins both tables together. And then we have a filter that brings all the rows out.
And we have the same number 1030987 here. But for the two queries below, in the first one where I have the enable optimizer hot fixes, and then in the second one where I have compat level 170 in use, these two query plans do what it would actually, like, this is what it looks like when you write the query as not exists for the most part.
Now, these did get moved around a little bit, right? Like, SQL Server reordered our joins because the post table is on the outer side of the join for both of these. And the user’s table is on the inner side of the join for both of these.
And now, SQL Server has actually been kind enough to pre-aggregate some of those duplicate owner user IDs from the post table for us. But notice that in both of these queries, there is no longer a filter operator. And that’s because SQL Server has replaced that left join with a right anti-semi join, right?
So we have taken, we have removed that, like, weird, like, full join null check with a filter and just changed it to an anti-semi join, which is about the equivalent query pattern that you would get if you wrote your query using not exists to begin with, right? So that’s this pattern, right?
So if we run this query, we’ll see something fairly similar, not precisely similar, but fairly similar, where, you know, SQL Server has, like, here it left the join order intact where users is on the outer side and post is on the inner side. But here, SQL Server has chosen to pre-aggregate some of the owner user IDs.
And then we have just about the same pattern as before, where there’s an anti-semi join, where the non-matching rows are removed at the join rather than in a filter operator later, right before getting counted. And you’ll notice that at the join, we get down to R1030987 rather than having to fully join all the rows together.
I forget what the full number was for this one. Let’s take a quick look. Yeah, so 17945522, that is an eight-digit number of rows that come out of there.
And so that’s no fun, right? Now, so when you’re writing queries that need to find rows that don’t exist, if you’re living in, you know, a compat-level 170 land or you enable optimizer hot fixes in, like, a cloud product, like Azure SQL database or managed instance where a compat-level 170 new stuff is available, you may not have to worry so much about, like, how you write the query because SQL Server will kind of fix it for you.
But if you’re not living in that world, you should always test the not exist pattern first. Generally, not exist works out faster than the left join. So coming back to the execution plans, and the difference here, I agree, is not dramatic, but, you know, the drama will increase or decrease depending on the state of your server.
This takes about 1.2 seconds, and the not exist pattern down here takes about 600 or so milliseconds. So this is twice as fast locally, though the twice as fast difference isn’t huge here, right? It’s about, like, 1.2 seconds versus 600 milliseconds.
Now, there are some cases where you’ll use exists or not exists or something like that, and sort of like a check in your store procedures or whatever query you’re writing. It could be an ad hoc batch, too, where it’ll not go so well.
And the reason why it won’t go so well is going to have something to do with another optimizer thing called row goals. Row goals are sort of like the optimizer saying, I bet I can find all the rows that I need for this really quickly, and I’m going to use a slightly different execution plan because I bet I can find these rows quick, which, when it works out, is great.
When it doesn’t work out, it’s terrible. So here’s an example of a row goal not working out well using if exists, right? So if you have a store procedure or some other ad hoc batch and you’re like, why the hell is this so slow?
Well, one thing that you should look at is the query plan because the query plan will tell you most everything you need to know. So if we look at what happens here, notice how terrible this estimate is. SQL Server thinks that it’s only going to get three rows out of this, and it’s going to be able to figure out everything that it needs with those three rows, but it doesn’t.
This actually gets 3, 7, 5, 6, 5, 9, 7. That is 3.7 million rows here, right? And down here, well, we go into a nested loops join 3.7 million times, and we find nothing here.
This takes about five seconds, and this takes about five and a half seconds, which means this query as a whole takes just about 12 seconds in total to finish, 11.6 seconds. Now, you might be wondering why this thing has such a terrible cardinality estimate, and it’s not necessarily just a terrible cardinality estimate, but it is related to SQL Server making this guess on row goals.
So SQL Server bets that it can find, right? It estimates that it will take 2.8 rows, which gets rounded to three, but notice this estimated rows without row goal here, right?
That’s where things get messed up. This is where things get messed up. Introducing the row goal means that this thing, SQL Server is like, I bet I can do it in three rows, but it didn’t do it in three rows.
It took all the rows, right? It took all the row goals to do it. So these things can certainly backfire and cause problems. Now, if you wanted to fix this without creating indexes, because sometimes you’re just in a situation where you can’t just add, either add new indexes immediately on the fly, or like you’re just not allowed to add new indexes, period, you can change the query so that in force SQL Server to do some more work, which, and like, say, for instance, counting all the rows, and this will give you the ability to not have those row goals introduced, because SQL Server is like, crap, I’ve got to count all these rows.
I can’t just bail out and think that I can get what I need with three of them. So what I’m doing here is a little sloppy, I admit, because this will return a decimal 38, 0, but this is the big int max for SQL Server.
So saying select power 2 dot to the 63rd power minus 1 gets you to that big number, but when we do that, and we say, if exist select here, where the count is between 1 and that big int max, then SQL Server can’t choose that row goal plan, because now it has to count all those rows, and instead of taking 12 seconds, this takes about 1.5 seconds, and notice that now SQL Server is like, oh yeah, 3.7 million.
That sounds more realistic to me anyway. All right, cool. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in the next video, where we will talk about equally interesting and exciting things for another 7 to 12 minutes.
All right, cool. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.