A SQL Server Query Plan Answer

A SQL Server Query Plan Answer


Video Summary

In this video, I dive into a fascinating riddle involving query plans and SQL Server performance. After posing the question in yesterday’s video, I explore the mystery of why removing a column from the SELECT list causes an error when using a hash join hint. I walk you through comparing estimated plans with and without the hint, highlighting the differences between nested loops joins and hash joins. By experimenting with scalar subqueries and correlated columns, we uncover some peculiar behavior related to implied predicates and query optimization. This video is packed with insights for SQL Server enthusiasts, whether you’re a seasoned professional or just starting your journey. If you enjoyed this deep dive into query plan mysteries, consider supporting my channel by becoming a member or simply liking, commenting, and subscribing—every bit helps!

Full Transcript

Erik Darling here with Darling Data. If you watched yesterday’s video, you know that I asked you a riddle. Since I’m recording this way ahead of time, like unbelievably far ahead of time, I have no idea if anyone answered, came up with anything good. If you did, good job. If you didn’t, shame on you. Anyway, let’s talk a little bit about channel stuff and lifestyle. just get it all out of our system. And then I’m gonna tell you what happened with that query plan. So if you love answers or riddles, you can sign up for a membership and you can support all my efforts to bring you interesting SQL Server content. Whatever. There’s a link down in the video description for that. If you don’t have money for whatever reason, or you don’t feel like spending it, I don’t know, whatever Scrooge. You can like, you can comment, you can subscribe. And if you want to ask me questions for my Office Hours episodes, you can click on that link, which will be also down in the video description, right where it belongs. I do consulting. I am a consultant from the Latin word console, which means a tiny lap infant or something. But that’s what I do for money, mostly because YouTube, doesn’t doesn’t pay much of a bill. If you need help with your SQL Server, perhaps you need answers to some riddles. There’s something weird on my forehead. I can never tell. I am available for hire you. You hire me with money, health checks, performance analysis, hands on tuning, dealing with performance emergencies, and of course, keeping your putting your developers in line so that you don’t have performance emergencies anymore. That’s the bulk of my activities. But if you need something else, well, we can talk about it.

And gosh darn it, my rates are reasonable. If you would like some training on SQL Server, I get that too. Boy, do I have it. All the important stuff, all the best stuff. I don’t leave out all the fluff that a lot of other people put in. You get all 24 hours of it for about 150 US dollars. And that is good for life. Por vida, as they say, in various places around the world. I’ve been to some of them, I think. Upcoming events, sequels, Saturday, New York City, ever closer, closer by the day, isn’t it? Taking place on May the 10th of 2025 with a performance tuning pre-con by Andreas Valter on May the 9th. That is the day before, that is Friday. I highly suggest tuning into both of those. The nice folks at Red Gate are taking pass on tour. They are coming to my fine city in August, August 18th to 20th. Someone else’s fine city, Dallas, September 15th to 17th.

And yet another person’s fine city, Amsterdam, August 1st to 3rd. And then Pass Data Community Summit will be taking place in Seattle, November 17th to 21st. So if you live in Seattle, it’ll be your fine city. If not, no, you should show up anyway, maybe. A lot of people just make Seattle their own, from what I hear. But with that out of the way, let’s talk about these strange and dramatic planes.

So the riddle from yesterday, of course, is that when we run this query, as is, it takes about four or five seconds. The performance of it really isn’t the point. That’s not really the riddle here. But the riddle is that it runs and completes successfully, and we get an answer back with a query plan and all that good stuff.

When we try to run that same query with owner user ID removed from the select list and do this, we get an error that the query processor could not produce a query plan because of the hint supplied to the query. So what are we looking for here?

Well, we’ve got a hash join hint on the query plan. All right. Well, that’s, that’s, that’s, that much is obvious. So what, if, if we were to start comparing things here, what we might want to do is get estimated plans for both of these without, without the hash join hint, obviously, because that prevents us from getting a query plan.

Ah, oh, that hurts. Hurts something fierce. So let’s, let’s highlight these.

That took a long time for some reason. That scroll was like ages. So let’s, let’s get estimated plans for these and let’s see what happens. So, um, obviously, this little grabby doohickey will cooperate. Usually it’s Zoomit.

Zoomit is about to not cooperate. Now, before it was the drivey doohickey. The plans are, of course, different. If we sort of frame this up about here, that’s good enough for now. Uh, this, this plan, of course, because of the hash join hint, uses the hash join here.

Uh, and it uses, well, you can see, you can see the, what is another hash join, like, here. But, you know, um, it’s a little hidden from us. I didn’t frame this perfectly.

Uh, and then down here, we have a nested loops join. And, oh, screw it. Let’s just go look. Uh, so here is the second hash join, the first plan, uh, users table. But we have another nested loops join here.

All right. So, there’s, like, obviously the join types are different. But why? And why can’t we use a hash join in this plan when we can in this one? Well, if you were to experiment with these queries, with this query a little bit, you know, we don’t need to experiment with both of them, just this one.

You know, um, like, obviously we could see if the problem is with the comments table, but then we’d have to do a lot more work. So, let’s, let’s, let’s start with the path of least resistance. And let’s quote out this, this, this here scalar subquery.

And we’re not going to use block quotes here because we just need to do this temporarily. Block quotes are forever. The double, double line quotes are for temporary, uh, temporary measures. And now let’s try to get the estimated plan with the hash join.

Well, that, that worked, didn’t it? So, the problem isn’t with this hash join here. The problem is with, uh, the, the, the, the join to the users table. Oops.

That was supposed to be control Z, not capital Z. Uh, join this. The problem is the join to the users table that, uh, gets the display name. But why would that be still? Why?

What, what happens? What is different between these? Well, uh, we have to dig in a little bit here. And we’re going to, I’m going to show you why. And then I’m going to show you some, some, that there is some background knowledge on this, uh, out there on the internet.

If you know, if you, if you know exactly where to look and Microsoft hasn’t ruined it by archiving the, the content yet. But, um, let’s examine the, the, the, the top plan first.

The one that works with the hash joins. Uh, if we look at this, uh, the, the clustered index scan of the post table, we looked at this a little bit in the first, in the first video, but you’ll see that we have a predicate that finds where owner user ID equals 22656.

And then we have an output list down here where owner user ID is emitted, not omitted, but emitted. It is omitted in the second one.

It is emitted in the first one from the scan of the post table, right? And, uh, it is emitted from the scan of the post table because it is in the select list in this one.

If we compare that to the clustered index scan of the post table for the second query that was, we got the estimated plan for without the hash join hint, we will see that owner user ID is not emitted from this one.

It is omitted from this one, right? So owner user ID is no longer emitted. It is omitted.

Exciting stuff. But because of that, SQL Server, uh, can’t, uh, has to use what, uses what’s called implied predicates.

And it uses that in the first plan too. Like, like if you look at the, what happens in the user’s table, we still have the literal value for 22656 here. The problem with this one is more over here. So let’s follow owner user ID throughout the plan.

So owner user ID is emitted here, right? We checked that box. Owner user ID is, um, passed through the hash join here. Owner user ID is grouped here, right?

You can see owner user ID. Uh, and then owner user ID is in the output list of here. But it does not get emitted here.

It gets omitted here. And then because it is not emitted here, that’s why we get the nested loops joined with no join predicate here.

But then here we have a nested loops join where the, that is an apply nested loops, right? But SQL, the, the, the, the value that is getting applied down here is still a literal value. So the, the, the, the online content that I wanted to show you, um, and before I jump into that, uh, it’s, it’s, this is, you know, foundational stuff.

Um, when you have a, uh, or rather if you want a merge join or a hash join, then your join clause has to have at least one equality predicate. If you do not have an equality predicate, SQL Server must use a nested loops join.

For some reason, this one down here, this thing is not our friend when owner user ID is not emitted from the post table. When it is omitted, this thing has a hard, has a real hard time.

So, uh, way back when, uh, I, so you can see the difference here. It is almost, wow. Uh, it is, it is 10 years, almost exactly, uh, to the day when, um, well, I mean, March 23rd, 2019, uh, 10 years to the day when Microsoft, geez, what ha, what happened to my brain?

It is not 2019, it is 2025. So it is like almost 15 years since Craig wrote this post. Ah, my brain’s on fire.

Uh, and this is about implied predicates and query hints. And Craig talks about, uh, another situation where, uh, forcing a hash join, uh, yields an error with one query when it doesn’t with another.

So like changing the predicate from B equals zero on this one to A equals zero. And this one gets us the exact same message that we saw, right? And it’s sort of the same thing happening here, just, you know, in a slightly different manifestation.

So, um, the only thing that seems to make SQL Server hang on to the owner user ID column is to do something goofy to the owner user ID column. Like, um, say P dot owner user ID mod zero plus P dot, uh, P dot owner.

Why can’t I type owner? Wow. And IntelliSense, neither IntelliSense nor Redgate is helping me on this one. So I’m just going to copy and paste that.

But if we do something like this, all of a sudden SQL Server can figure out the query plan again, because we have, we have, we have mangled, we have done enough weird stuff with owner user ID that it now decides to emit it from the post table rather than omit it from the post table.

So if you’re working with a query like this, and you really want to hash join, and you use a hash join hint, and SQL Server’s like, nah, can’t, can’t, can’t do it. Um, part of the problem might be, um, that you have a scalar subquery and your select, select clause, select query clause.

And that scalar subquery, um, references a column that is, uh, well, obviously it’s probably going to reference a correlate, right? That’s what we need to do here.

We correlate. If we correlate on the owner user ID column, but we do not emit, we omit the owner user ID column, then SQL Server will not be able to come up with a hash join plan. Um, that is, that is the best I got on this.

Um, it’s a weird one. It’s a very, very strange one. I admit it. Uh, but that’s why it was kind of a fun riddle. And that’s why we, we thank, uh, any, any higher power that we might believe in, whether it’s a bicycle or a couch or, uh, I don’t know, that’s Maru, the PEZ dispenser, that Craig Friedman existed and blogged as much as he did back when he did.

So, uh, I hope that this answers your riddle to your satisfaction. But if it doesn’t, um, you’re going to have, you’re going to have to go further because I have, I have expended all of my brain, uh, on, on this one.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And, uh, I will see you in the next video where, uh, we will talk about so many fun and interesting things.

You will, you will forget that this ever happened. So, anyway, thank you for watching. Goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.