Hellish
Video Summary
In this video, I delve into the intricacies of loop joins and their behavior when parameter sensitivity or sniffing comes into play. I share a scenario where a query runs exceptionally fast but then encounters issues when using different parameters, leading to a poorly optimized plan that takes much longer to execute. The video explores how SQL Server caches plans based on initial execution parameters and the challenges this poses for maintaining optimal performance across various input scenarios. Through detailed analysis of query plans and execution statistics, I highlight why recompiling queries might be worth considering in certain situations, especially when dealing with parameter sniffing issues that can significantly impact performance.
Full Transcript
Yay! We’re getting closer to done. So, I must confide in you, nice people, that I have now taken off another piece of clothing during the recording of these videos. And I’ll give you a hint. You can decide which piece of clothing it is. The hint is: it’s two feet. The mind wanders. Anyway, what we’re going to talk about in this edition of Plan Cache Liars is when loop joins are not quite giving you the information that you need about them. One of the reasons this is a parameter-related Plan Cache Liar is because the number of loops that you have to do depends on the number of rows. And of course, SQL Server, if you have enough rows coming back, will probably not choose a loop join. However, in the case of parameter sensitivity, parameter sniffing, we might have a query like this that runs very quickly. My gosh, that’s fast. If you look at the query plan, this took just about 22 milliseconds to run. Impressive, yeah? Good. 22 milliseconds. You are fast. And keep in mind these numbers down here: 518. 391 to 518. That’s okay. We’ve still managed to do well here, down to where we got 10 of 10 for our top 10. Right there. Look at that. That’s a nice, good plan. When we run it for parent ID equals zero, it is an un-good plan. This is the saved version of the plan that I’ve got.
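For readers following along without the video: a setup like the one in the demo can be sketched roughly like this. The procedure and table names here are my hypothetical stand-ins, not the exact ones on screen; the demo appears to use a Stack Overflow-style `Posts` table where `ParentId` is heavily skewed.

```sql
-- Hypothetical repro of a parameter-sensitive procedure.
-- ParentId is skewed: most values match a handful of rows,
-- but ParentId = 0 matches a huge number of rows.
CREATE OR ALTER PROCEDURE dbo.TopScoredPosts
    @ParentId integer
AS
BEGIN
    SELECT TOP (10)
        p.Id, p.Score, p.Title
    FROM dbo.Posts AS p
    WHERE p.ParentId = @ParentId
    ORDER BY p.Score DESC;
END;
GO

/* Fast: few matching rows, so a nested loops plan
   gets compiled and cached. */
EXEC dbo.TopScoredPosts @ParentId = 11227809;

/* Slow: reuses the cached loop-join plan against
   millions of rows. */
EXEC dbo.TopScoredPosts @ParentId = 0;
```

Whichever parameter runs first after a compile decides the plan that everyone else inherits.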
Where this takes two minutes and 28 seconds to run in its entirety. Some stuff to keep in mind here. Looking at the fast plan, we had a whole bunch of nice, tiny, narrow little lines where nested loops made sense. Over here, we have big thick lines with nested loops. Again, not scientific, but usually a pretty good indicator that maybe we made a bad choice. And we have a bad choice. The operator costs are all inherited from the previously compiled plan: the cost of 0%, the estimate of 518 rows, which turns out to be drastically wrong across this entire thing. At least until we get to 10 of 10 over here, which is great. But what we see in the actual plan is things like the number of executions here. We did a lot of those, and that should match the number that we see there. On the inner side of the nested loops, we’ll see another really high number of executions.
But this stuff is all missing from the estimated plans, or rather the cached plans, because the cached plans will only give us these lines. Say we estimated 517.998 executions. That was dismally, dreadfully wrong, wasn’t it? Not anywhere near reality. If we look at the cached plan, we’re going to come upon a couple of frustrating things. One, again, the very low cost. Two, if we go over here, I only have the plan in cache this time around for the small one.
But if we look at, I mean, that’s the only one that’s in the cache. That’s the one I ran. The other query would reuse this plan, like we just saw. But what we would see in the plan cache is just the crappy, tiny little arrows. We would see the estimated number of executions. And it might be difficult to figure out why this is slow, especially because when SQL Server caches a plan like this, the execution parameters that get cached along with it are for the first execution.
Now, something else: something that I added to sp_BlitzCache a while back was getting cached execution parameters from the plan XML. This can get a little bit faulty, especially if you’re declaring variables inside of a stored procedure or something. Then I can’t get those, for some reason. They don’t show up in the same way. The other thing that can be a little odd is if you have multiple copies of the same plan in cache, you’ll see duplicate parameters.
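Outside of sp_BlitzCache, you can sketch the same idea yourself by shredding the cached plan XML. This is a minimal version, assuming you only want the compiled parameter values; the showplan XML stores them in a `ParameterList` element with `ParameterCompiledValue` attributes.

```sql
-- Minimal sketch: pull cached plans and the parameter values
-- each plan was compiled with, from the plan cache DMVs.
WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    query_text      = t.text,
    executions      = qs.execution_count,
    parameter_name  = pl.pr.value('@Column', 'nvarchar(128)'),
    compiled_value  = pl.pr.value('@ParameterCompiledValue', 'nvarchar(256)'),
    p.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS p
OUTER APPLY p.query_plan.nodes
    ('//ParameterList/ColumnReference') AS pl(pr);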
I stuck that in there because I thought it might be helpful for parameter sniffing, stuff like that. But if we go look at how SQL Server executed that query, we will see that first parameter. Why this is unhelpful is because when we have that parameter, all we’re able to generate is this fast plan.
I can run this a million times, one million times, and it will always be fast. Always. It’ll never be slow. Sad, right?
And the cached plan will never give us any more detail. However, if we get very lucky and we find the parameter that causes things to really fly off the hinges, we’re only lucky if we run this query first, right?
Because if we free the proc cache and we just look at an estimated plan for this, we’ll see that this query plan looks a little bit different. We go parallel, at least in the estimated plan.
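Clearing the cache for a demo like this looks something like the following; on a real server you’d want to evict just the one plan rather than flush everything. The `LIKE` filter below is a hypothetical way to find the demo query, not something from the video.

```sql
-- Nuke the whole plan cache (fine on a demo box, rude in production).
DBCC FREEPROCCACHE;

-- Gentler alternative: evict a single plan by its plan_handle,
-- which you can find in sys.dm_exec_query_stats.
DECLARE @handle varbinary(64);

SELECT TOP (1)
    @handle = qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
WHERE t.text LIKE N'%ParentId%';

DBCC FREEPROCCACHE(@handle);
```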
Well, I don’t know if this will happen in real life. Who knows? But now this one runs quickly. And if we look at the execution plan for it, we can see why. All right.
We went parallel. Things got sorted around a little bit. There are some differences between this and the other plan; I should just have that open. All right. Differences between these two plans.
Look at where that sort operator is. We do the sort really early on over here. In this plan, we do the sort at the end. There’s just a whole lot of stuff that ends up differently, so we need to process way more rows throughout all of this plan. Whereas in this plan, we process a lot of rows early on, weed those out, and then go about our work.
And we have a fairly quick plan. What sucks here is that if we look in the plan cache now, for this iteration, for just parent ID zero, we’ll see a higher cost, of course. We went parallel and all that stuff.
And if we go over here, we’ll see that we used a little extra CPU in order to reduce wall clock time, right? We went parallel. But the cached execution parameters here will just be for zero. If we run this with the zero plan, this will be quick too.
So really, the only way that this gives up the ghost completely, or in a meaningful way, is if we get this parameter, run it, cache the plan for it, and then run it for parent ID zero. This is why parameter sniffing is so tough: not only do you need the parameter that gets the good plan, you also need the parameter that, when it reuses that plan, makes things go to hell.
And then you need to run them in the right order to reproduce it. So this is why parameter sniffing is one of those very, very tough things. And this is one of those reasons why I’m like, maybe recompile isn’t the worst thing in the world.
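As a sketch, the recompile idea looks like this, using my hypothetical procedure from earlier rather than the one in the video. It trades a compile on every execution for a plan that always fits the parameter it was sniffed with.

```sql
-- Hypothetical procedure: OPTION (RECOMPILE) means each execution
-- gets a plan built for its own @ParentId value, at the cost of
-- compiling the statement every time it runs.
CREATE OR ALTER PROCEDURE dbo.TopScoredPosts
    @ParentId integer
AS
BEGIN
    SELECT TOP (10)
        p.Id, p.Score, p.Title
    FROM dbo.Posts AS p
    WHERE p.ParentId = @ParentId
    ORDER BY p.Score DESC
    OPTION (RECOMPILE);
END;
```

One side effect worth knowing: plans for recompiled statements generally don’t stick around in the plan cache, so tooling that reads cached plans will see even less about them.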
Like, maybe it’s not your enemy. Maybe we should have a second look at recompile. Anyway, that’s it for this video. In the next video, we will talk about spills.
Ooh, spills, chills, thrills. I’m almost 40, so I need to write a will. Sad.
Anyway, I will see you in the next video. Thanks for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.