bit Obscene Episode 3: The Habits Of Highly Successful Performance Tuners

bit Obscene Episode 3: The Habits Of Highly Successful Performance Tuners



Thanks for watching!

Video Summary

In this video, I dive into the habits of highly successful query tuners, focusing on what sets them apart from those who just dabble in tuning. I share my own experiences and insights, emphasizing the importance of humility, continuous learning, and not being overly dogmatic about specific techniques or paradigms. I also discuss the three-step process I follow when tackling performance issues: identifying why a query is slow, understanding why SQL Server made certain decisions that led to poor performance, and finding a solution that ensures the problem doesn’t recur in the future. Throughout the video, I highlight the value of asking questions and seeking expert advice, even if it means stepping out of your comfort zone or investing some time. By sharing these perspectives, I aim to provide practical insights for anyone looking to improve their query tuning skills and approach problems with a more analytical mindset.

Full Transcript

All right. I overplayed my hand on that one. Yeah. Welcome to, I believe this is episode three of the Bit Obscene podcast. In this episode, we’re going to be talking about the habits of highly successful query tuners.

Not low level query tuners, not amateur query tuners, but highly successful query tuners. So I hope everyone’s buckled up for this ride. Joe, who is just fresh back from the golf course, why don’t you introduce yourself to everyone out there in our massive audience?

I wanted to match you with Erik’s pink, green background color, and I looked through my wardrobe, but, you know, I am not a fancy, worldly man like Erik, so my wardrobe is kind of limited, so this is the best match that I have. I hope the audience can forgive me. The Everlast still hasn’t sent you a tank top.

No, they don’t respond to my emails at all. That’s shocking. Shocking. I think we need to get more viewers. Yeah, I don’t know.

Either that or tell them that you punched a hole in the boxing bag and like really like blow their minds. Speaking of that point, last time Erik was like, I want to get to our bread and butter, I want to talk about query tuning. Yeah.

Because people think they want to hear about query tuning. They think they do. They think they do. I think, I think, well, I mean, they might want to hear about it, but really I think they just want someone to do it for them. Which is usually, usually my experience with clients.

So, so speaking of that topic. Yeah. The very first thing I have to say about, you know, claiming to be a highly successful query tuner myself. Yeah.

Let’s share. You are, you are one of the most highly successful. Yeah. Well, I mean, depending on how you determine success is like, you determine the success by like big houses and cars and jewelry. You are not highly successful.

What if you, if you, if you. Gold watches. Yeah. Maybe that. Yeah. Maybe, maybe, maybe just one, but if, or if you, if you determine highly successful query tuner by being able to successfully tune queries, Joe, you are, you are a rich man. You are incredibly wealthy man.

You would not pass through the eye of a needle on a camel or whatever. So, so basically the way that no one else would define success in society. Well, I mean, there are different, there are different metrics for everybody.

Right. Okay. Am I sharing anything? No, you have nothing on the screen. This is always. Not that anyone listening to this radio show would be able to see what you’re sharing. Yeah.

We should just stream these live. This is, this is a, this is really important too. It seems important. Is it working?

Yeah, that’s working. Yep. So what do you, what do you think of this, this classic artwork? It should basically be in a museum, right? It really should.

Uh, that, that, that belongs in, if not, if not a museum, then at least a blog post. So the important lesson here, or, you know, if you allow me to interpret this artwork, to offer one interpretation.

Please. You did go to college. There’s always someone better than you, which is also true of when, when, when it comes to court. Yeah. And to the, uh, gentleman from New Zealand, who is, who is dunking on Eric and myself is, you know, better than us.

I would say. Yeah. Uh, so keeping with that sports analogy. Oh boy. I’ve heard. Hey, you’ve heard of sports.

Yeah. I’ve heard of sports. Uh, there are a lot of database people who do things like run marathons for some reason. I don’t really get it myself. Or at least five K’s.

I don’t know. Yeah. But a five K finisher t-shirts when you go to conferences. Hmm. Good. That’s, it’s been a lot for me. Um, you know, it’s been said that like some of the very best basketball players, you know, you think, okay, you know, they have played their seasons, they retire, maybe they could be like a, a really good culture of the players, but that isn’t always true.

Hmm. And my understanding is like some of them are just so talented and they’re so good at what they can do that it’s difficult for them to explain it to other people.

Like it just comes naturally to them. Hmm. And, you know, I don’t know if I’d go as far as to say that Eric is the Michael Jordan of query tuning, but definitely not.

I, I do think that even if someone is good at query tuning, if they share their own process, it might not be a good fit for you. So for example, if you were able to hunt down the person in, in that basketball artwork and offer him a very large sum of money and say, I would like you to teach me the exact process you used for query tuning.

He could try to teach you, but would you actually end up doing it his way yourself? Probably not. Well, you, I would say, I think you might try to imitate it, but the thing, the thing with the thing with that, and I’ll, I will let you continue, but this is a salient point.

The thing with that is that the, the process is only as successful as the knowledge that backs it. So like, you know, you like, he might go, like, he might have a very good process that works for him and the set of knowledge that he has.

And the, the, the wisdom that he has acquired, uh, over his, I think 5,000 years on earth. But, uh, like, unless you have that, you know, aiding your, your, like aiding your execution of his process, you know, you, you might be in real trouble trying to, trying to recreate it.

Like, like someone said, like, look at the query plan and like, what do you look at in the query plan? Well, might not be what he looks at in the query plan. Like, like there’s a whole lot of this, there’s like, like years of like mastery that goes into something like this.

Drawing a query plan in whiteboard, just for memory in a room full of Microsoft people. They all have to like, I was some like crazy person or like, or like genius, depending on, you know, which one you looked at.

But, but, but for me, it was, it was, it was just another Thursday. Yeah. You know what I mean? Um, totally. I think it’s important to be humble though. As someone who thinks his thoughts are important enough to record and broadcast, maybe that’s not, maybe it’s not credible coming from me.

I think, I think it’s important to be humble and it’s important to keep learning. And to be selective in higher learning and, you know, to be mindful of what you just said, which is, it’s good learning from other people, but even if they do their absolute best in teaching you, if you’re going to train and reading a blog post or whatever, you know, there might be more to it or one person’s process might not be a good fit for you.

Yeah. You know, cause I mean, you know, it’s, I mean, it’s kind of like, you know, like once you’ve done this for too long, you’re kind of like, you’re so far removed from like what the true, you know, experience is for someone who’s just starting out.

Yeah. Or someone who only has a year of experience. And as we all know, many people are hesitant to ask questions and training and presentations, especially Europeans.

I hear not that. Yeah. Well, no, Europeans like to save up all their venom and then approach you after you’re done speaking and let you know everything they disagree with vehemently. Venomously.

So I think it was, it was back in 2017. I went to training. They’re running Europeans, fortunately. And just six top privilege of attending a secret Saturday pre-con by a great and knowledgeable presenter.

It’s, it’s a not you, by the way, it was Adam Kek. This, this was before he like disappeared from the internet or whatever, and ended up in the Postgres jail, which I think is where he still is. Yeah.

Right. Postgres correctional facility. And I viewed it as not like, so his, his, his, the content was great. Don’t get me wrong.

But my goal was to ask as many questions as I could without getting thrown out. And I’m, and I think I walked that line pretty finely, but you know, but like I was the only one doing that.

And it was like 40 people. Yeah. So it’s like, you know, like here’s someone who can probably answer any question you have. Like, you know, like I think people should be more willing to ask questions, especially when they’re paying money or the very least time to learn from an expert.

Right. And I guess I’m just speaking generally, but especially when, when it comes to query tuning, just because, you know, there’s so many questions you can have, unique scenarios, you know, that you’re aware of and you’re on problems or whatever.

Like I think that people should sell their pride or whatever the problem is. Well, I think, I think part of that, part of the, part of the, yeah. But like part of the hesitancy is, you know, like specifically with something you brought up where, where like, you know, you like, you have a very unique scenario and you want to like figure out an answer to it or get someone’s opinion on it.

And, and like, it might, it might not be easy for you to convey that scenario just with language. Right. It might not be able for you to like set up a good word problem to hand that person to do that.

And you might not be able to do it fast enough where you’re able to pass that question on, get an answer. And then like, or like maybe get an answer without taking a lot of time away from what everyone else is learning who might not have that specific scenario.

If in cases like that, like I always tell people like, you know, you, you showed up to my pre-con and like, like if you have a problem that is too hard for me to like answer directly, well, you’re just telling me about it here.

Like email me, like sit, like I’ll, you know, get the right level of detail for like, you know, like tables, indexes, query plans, query, all that stuff. And like, I’ll, I’ll do my best to answer you offline. Like, you know, there’s, there’s, I don’t, I don’t think there’s any shame in that.

But I, I did a pre-con, I guess, two weeks ago now, just about two weeks ago down in Nashville for a conference called Data Tune. And there were a lot of questions.

There were so many questions that we actually went a little bit over time because like the way that I, like the way that I pace things, I don’t anticipate a lot of hands going up. Cause not a lot of people are like that eager to like, you know, like put themselves on the spot and ask something either for fear of seeming stupid or like, I don’t know, whatever it is. But like, yeah, I don’t know.

Maybe, maybe I’m, you know, just so such an intimidating presence in the room that they’re like, they cower before me. But it was like, there were, there were so many questions that like the end of it, I was just like, I gotta hurry up. Like we’re gonna like, we’re gonna go over here.

So that was nice. But in general, I think you’re right. There’s not enough questions. And usually like the, the, if you’re like on a stage of the microphone, it’s really hard to get questions in that, that situation. Unless you have someone like with a microphone, like walking around the room so that other people can get on the boom box and, and make themselves known.

But what I’m looking for is that everyone else in the room, this person is either here to teach a square tuning or they’re here to run a lot of somebody, I’m not sure which until you started talking. Oh, I’m only here to steal your heart. And your money.

Nah, oh, money. Give your money willingly. The heart, the heart yearns to be captured. At least that’s what I, at least that’s what I saw in an anime. So, so I spent a lot of time talking about how other people’s process is going to be good for you.

Yep. But I want to share my own process. Cause I think my process is the best and everyone can use it. Yes.

You know, blah, blah, blah. Is there anything you want to add on this topic before I get into that? Uh, I think the most problem, the lesson I’ve learned the hard way a lot, a lot quite often. And I think it’s a problem that is, um, prominent and consistent, especially with database people is that they are overly dogmatic about things.

Um, like, you know, they’ll see like a cursor and just be like cursors, or they’ll see a heap and be like heaps. And like, it’s like, there are a lot of, there are a lot of, there are a lot of paradigms like that, where like, they’ll just see something in a query and be like, oh, like idiot. How, how could you do that?

The problem staring you right in the face. When like, like that, that’s not the issue. Like there, there are like five other things going wrong, but the thing that you’re focusing on is just like the thing that, you know, you’ve been, you’ve been trained to bark at. And, uh, you know, so like, don’t like, like not being overly dogmatic is a good one.

And like, be like less dogmatic and more analytical. Like someone, someone might be bringing you a query with a legitimate problem that has nothing to do with what, you know, like whatever bell you think is wrong, whatever dog whistle you think has blown for you. So that’s, that’s what I would, that’s, that would be my contribution before you, you read us your haiku.

I think that’s a really good point. And I’m going to build on what you just said with a few examples. That’s amazing.

Cause I, I didn’t know what you were going to say. It’s true. Eric comes in this totally unprepared. I, I like scribble some things down, like right before. And that’s our, that’s the entirety of our preparation.

I did come prepared. I came mentally prepared like Jay Z. I don’t, I don’t write things down. Okay. I don’t know if our audience is going to get that reference, but they might, they might not.

Um, so for me, if. Very tuning is often a three step process. First step is identify why the query is slow.

Why performance is bad. And that’s like, that’s a very specific. Like detail or kind of question. Like, like which operator is slow in the plan.

If it’s that kind of issue. So that’s the first thing. I don’t have a wide slow. Second thing is. As much as I can, whether that’s limited by time or knowledge or willingness, try to figure out why did SQL Server do something that ended up being slow? Because it’s supposed to make good decisions, right?

The correct. The correct. The correct. Is supposed to. Choose fast, efficient plans, which we don’t have to worry about. Why am I paying all this money when SQL Server is making decisions. This autonomous tuning database.

That cause problems for me. Yeah. SQL Server should be serving me. I shouldn’t be serving. Okay. Okay. And then, and then the third thing is.

How can. The performance problem be solved in a way that I never have to worry about this procedure or code or query again. Cause.

You know, like for me. I, I hate to see a procedure and then tune it. And a couple of months later, it’s the same procedure again. And so on.

Now for consultants, maybe it’s totally different. You know, maybe you’ve got your like money making procedures that you tune over and over. Depends on how you make your money consulting. Uh, for me that, that wouldn’t be, uh, that wouldn’t be a money maker for me.

Okay. All right. Well, there you have it. But, but yeah, like, you know, for me, it’s, it’s, it’s those three steps. Yeah.

Identify why the query is slow. Try to figure out why SQL Server is doing something slow. Mm. And then try to find a fix which solves the problem forever. And I never have to worry about that, that, that procedure. And.

It’s good. I like it. Thank you. Um, I don’t know why I started doing this. Cause it, you know, it’s, it’s not like something you, it’s not like something that you like, you know, decide like beforehand.

Oh, I’m going to do this. Right. It’s something that kind of naturally happens over time. Um, so now I’m going to get into details. So speaking of being dogmatic, I don’t know if I should say this, but. Oh, that means you should say it.

Yeah. I mean, like the whole reason we have this is to, you know, finally. Yeah. Yeah. Um, kind of feel like the general, like base SQL Server query tuning, like knowledge level is apparently stuck at a SQL Server 2008 R2 level, which we all know is, was like the very best release of all time. Yeah.

Clearly. Well, like, you know, like. It was my favorite because it didn’t have availability groups. As soon as those came along, everything went south. Yeah.

The reason I say that it is, you know, like admittedly, I was still in college when it, when it was released, but it feels like there’s a lot of content created. Mm-hmm . You know, there are some good features added.

People wrote books. Mm-hmm . Uh, that the MCM program was, was winding down. Mm-hmm . Uh, and. I don’t know. Like, it feels like there’s been like a lot of copying and pasting.

Like, I feel like we haven’t really like at a base level evolved from that era, you know, and as, and as Microsoft add features to make things easier, like, you know, some people would talk about them, but like in a lot of cases, people aren’t even aware, you know, like. Yep. And, you know, like, um, and I think to give what I think is the most important improvement Microsoft has done, that would be the operator level statistics.

Or you get an actual plan. Yep. That shows you at the operator level, IO, CPU time, elapsed time.

Yep. Like you can actually look, figure out, okay, like this query was slow for this specific reason, you know, like, for example, the query took 15 seconds, 14 seconds were spent building. This, this index pool.

Mm-hmm . So therefore, you make the query faster and you dig around this index pool. Like, like that’s there. Yep. Like, like SQL Server tells you why it’s slow. It’s telling you, it, it, it, it wants you to know. It wants you to know.

Why the query is slow. But I actually, I actually have a good, good question about that for you, because it’s, it’s even one that at times I struggle with because you like, you might find. Okay.

Let’s, let’s just like put some, put some numbers on it. Let’s just say you have a query that runs for five seconds. Right. But then you look at the query plan and there are like 500 operators in it. Right.

It’s a big query plan. And no specific operator is doing anything slowly. The query plan is just the sum of its parts. Right. Like, like there’s like, you know, maybe a hundred milliseconds, 200 milliseconds, like, you know, 50 milliseconds. And they all add up to five seconds, but no particular like thing.

There’s not like a big, like, like, like, like usually with the query plan, like you’re talking about where it’s slow. You’re like, look at the query plan. And it’ll be like, you see like the end time all the way over here.

And like, assuming it’s role mode and everything’s cumulative, you see like the end time over here. And you sort of like follow the query tree until there’s like, like, like 15 seconds. And then like zero seconds after it.

And you follow that 15 second down. And like, you find like the part in the branch was like, oh, like there’s like two things that take seven seconds. Right. So like you can figure those two, seven second things out. But when you have a bunch of stuff that, you know, is individually very fast, but cumulatively, it adds up to very slow.

What’s your approach there? Sure is. I think the first step would be, you know, you said like five and operators, right?

Well, I mean, I was just, I was just trying to, I said five seconds. I was like, you know, you can make it like a hundred operators. Yeah. All right. A hundred operators. I would start with why is Kirkman a hundred operators?

Yeah. Yeah. If you’re lucky, it’ll be a case where like, it’s a query that can be simplified by like, you know, moving repeated sub queries and you apply. Yeah. And then you can just, you can just get rid of them.

You can reduce the number of operators. Or maybe there’s a logical timetable you can have or something. I mean, that would be my first step. Yeah. You know, like when it’s a very complex plan and it seems like there’s no clear operator that makes it slow. I mean, it’s basically the most difficult case.

Yeah. Yeah. No, it’s, it’s, it’s hard as hell. Like you look at it and you know, it’s like, there’s not like a CTE that got re-referenced and there’s not like, you know, any obvious thing that you could like quickly, like add an index or, you know, tweak to make better. Like, it’s one of those things where you just like, you sit there and you stare at and you’re like, crap.

Like, like, like, like you, like, like you thinking like, even if I like split this thing up and broken into 10 tables, like, like sure. I might end up with like, like individually faster queries, but like, it still might add up to that five seconds by the time you get done, like 10 tabling. And then like, like, like doing the final result from the 10 tables.

Well, I mean, this is a little cop out, but you know, sometimes queries are slow because they’re actually doing the work that then that needs to be done. Yeah, yeah, totally. Like, like, I’ve definitely had procedures where I wanted to speed them up and it, it like wasn’t a hard operator plan.

It was only a few joints, but the nature of the work was just not great from a relational database performance point of view. It’s like, okay, well, you know, like, like, this is going to take two seconds. That’s just how it is.

Yeah. And, you know, in that case, it was fine. If you’re in a case where SQL Server is doing a good job with the query plan, there are any obvious things to make better, you know, it feels like CPU time it’s using is well spent, then you’re really getting into a, into a net. You’re getting into a, it’s a problem that’s hard to solve, right?

Like, you know, you could, okay, I need better hardware. I need to rethink doing this in SQL Server. Yeah. You know, I mean, like, if it’s a parallel plan, you can like, maybe try a higher DOP to see if like recruiting more CPUs is beneficial. But like, if it’s a single threaded plan, you know, this is like, like you look at the weight stats, like, that’s actually another good thing that I think Microsoft added a query plans was query level weight stats.

And like, you look at the query level weight stats, and it’s like, like three milliseconds of SOS scheduler yield. And you’re like, this thing wasn’t even like getting beaten up by any other queries. This is just like on its own, like almost in a vacuum.

Like it was doing all the work that you asked it to do as well as you asked it as well as it can. Like that, like that kind of performance problems like wild to me, because I’m like, like, where do you go from there? I would put weight stats in the query plan is probably there’s, well, I don’t know about second.

I mean, if I had to pick the top here, I would say query store in general. Yep. The actual operator statistics and weight stats in the actual plans, like those three things can get you so much and game changers. Absolutely.

Right. But like, and you know, like people immediately do talk about query store. I don’t know if they say useful things. But like, I mean, like, like, thinking back to my old stack exchange days, like, you know, like, I mean, like, even there, people aren’t often talking about high-level statistics or weight stats. It’s just the usual 2008 R2 baseline and stuff.

Yeah. But going back to, you know, you leading off with the most difficult example. Thanks for that, by the way. Right into the meat.

I mean, like, I have a much narrower worldview than you. The performance problems I see at the query level, it’s almost never what you’re describing. No, but.

There’s usually some obvious thing. Oh, yeah, same. But like, like, that’s the one that sticks. Yeah. That’s the one that sticks out to me as being like, the most challenging one to solve. Because not only are you looking at this query plan, but you’re looking at this query plan, like on a, on a call with a client who’s just like expecting you to do something magical to fix it. And you’re just, and you’re staring at it and you’re like, I, I, I think I can do something with this.

Just give me, give me a few minutes to think about it. Like. All right. Well, I don’t have your problem. Yeah.

Under the gun a little bit. I can. So it’s when you bring in the guy who’s wearing a suit and tie, right? Yeah, definitely. Bring in some Oracle consultants. That’s right.

Um, so. Going back to step one of my process, like why is query slow? Um, so, so give some examples of like what I feel acceptable answers are. Yeah.

Um, I, uh, looked at the query weight stats and there’s a ton of SOS scheduling. Yeah. Or I looked at the query weight stats and there’s some CPU throttling cause it’s in some dinky Azure SQL.

Oh yeah. The throttle weights. I love seeing those. It’s waiting for a memory graph or there’s a big temp TV spill and it’s like level 100 and it’s, it has a lot of time.

Or it’s, you know, or there’s an index pool or there’s an S loop joint and on the inner side, it’s scanning a table a million times. Um, oh, that scan under a top is probably one of the things I look for the most.

Yeah. Yeah. Um, so for me, like, you know, and you can also look for these news. It’s spending too much CPU time or just spending too little, you know, you know, too much being it’s doing a lot of work, but it’s inefficient too little being the elapsed time is much greater than CPU time because it’s the waiting in something, whether it’s IO or cloud throttling.

Yeah. Or whatever. Yeah. Yeah. We’re live. So like, like, like I’m looking whenever possible to get that level of specificity for like, you know, like, okay, the query slow.

Why is it slow? Like, what is it doing that makes it slow? Yeah. So I try to make that, that, that, that my, my, my first step. Um, so in the second step, which, you know, admittedly is a luxury of someone who isn’t on a call with someone, you know, with people who are like waiting for the instant answer has a gun to their head.

Yeah. Why did SQL Server pick that plan? And you know, like, I mean, like sometimes you just can’t get that answer or it’s impractical. No, I mean, yeah. It’s, I mean, it’s an optional thing.

Yeah. I think it’s helpful when you can do it. Yeah, totally. Like that’s where I think, you know, things like query and index hints, uh, can be really useful if you, if you have the time to spare to like use those, because you might, you might, you might like look at a query plan and be like, no, okay.

Like, you know, like, I, I think this is a dumb, uh, dumb choice. Uh, let’s see what happens if we do my choice instead, you know, like sometimes you might be right.

Sometimes you might be wrong. Uh, but like, I think, I think it is useful when you can to experiment. Um, with query hints and index hints and, uh, you know, stuff like that as much as possible, just to figure out like, like, like from like a query tuning point of view, like, am I on the right track and thinking that this is just the wrong plan for the query, or is this like the best possible plan for the query given what we have to work with?

That’s something that I’ll do sometimes too. It is a great way to try to narrow down the answer to question too. Yeah.

Um, I had one the other day, which was, I had a cluster index scan, the nurse have an instant loop. Yeah. There was an index that, that would have worked. Yeah. But optimizer wasn’t picking it. Yeah.

So I forced it to pick it and I saw the cost a little bit higher. Yep. So, you know, like, cause like if you’re not getting the query plan you want, it’s either the cost is too high or it just wasn’t considered. Yeah.

Or, or yeah, like it wasn’t considered either because like the optimizer ruled it out early on or the, like, or the, I’m sorry, the optimizer costed it out or the optimizer never got to an exploration point where it, it found that plan. So, and, and, and, and, and the case I had the index, which would have worked. Yeah.

Had a lot of included columns. And I, and I guess I had so many that, you know, due to, due to how it was costed, the optimizer thought, oh, I might as well just, just do the scan because this index is so wide. You know, I mean, there’s like a million other.

Yeah. nonclustered index that competed for space with the clustered index. That’s, that’s special. Well, I mean, it was. Have you tried defragmenting the index? I feel like it was one of those areas where the optimizer was wrong and, you know, the optimizer could have been more sophisticated and making that decision.

Yeah. Anyway, I ended up creating a similar index with fewer included columns and it, it went for that. Okay.

That’s a, that’s a, that’s a really interesting demo, I think. And this is definitely like, isn’t a common thing I do. In fact, I, I hate creating indexes that have the same key columns, you know? Sure.

Yeah. Because, you know, some very generous people who make tools for the community, their, their tools will throw a little warning. Like, hey dummy, you’ve got some, got some indexes, probably don’t need. Yeah.

Did it, did you ever consider that the index you created was so, so superior that you could have dropped the other index? With all the includes. I didn’t want to, I didn’t want to think about that.

It’s not, not a habit of a highly successful performance. To be fair, the, the actual table itself was, what was on the small side? The problem was no executions.

Yeah. So, you know, like, like, if it was a big table and the index was taking a lot of space, you know, then, then I might not. Sure. But, you know, like in this case, it’s like, well, whatever, you know? Yeah.

It’s this tiny table, two indexes. We have more important things to do. Or live and let index, I guess. And I actually, the poor job, I read my notes here. So, to, to give an example of the first step, you know, figuring out why the query is slow.

Um, I remember a question asked on Stack Exchange. Someone was like, yeah, this query took 10 seconds. That’s slow.

They were nice not to provide an actual plan. And some well-meaning answer said, oh, you know, I see you have four XML paths in your select and that’s slow, you know, you shouldn’t do that. Use string ag or something.

Yep. And I looked at the actual plan in the operating statistics. Mm-hmm. And the four XML part took like 10 milliseconds. Yeah.

Out of the 10 seconds. Now, you know, like if you’re in like the 2008 R2 era where the runner operator statistics, like, you know, maybe doing that kind of like analysis level, like I’m just going to read the select query and then, you know, dogmatically, you know, think about, oh, like, like what things in the query are bad. Like, oh, it’s slow query that’s bad.

From the path that’s bad. And then, you know, just rewrite the query until it gets faster. Mm-hmm. But SQL Server wants you to know why the query is slow. That’s true.

It wants you to know. It’s true. You have the tools. And, you know, if you let it whisper into your ear and you will listen to it, then you would know that, okay, I can just ignore this four XML path bar completely because it’s only 10 milliseconds. Yeah.

Well, I mean, the stuff that I usually, like, not to get too off track, but like, especially like sub queries in a select list like that, it’s like SQL Server does, I think, a pretty decent job of doing a pretty decent job of doing as much joining and filtering as possible to get to the number of rows that you actually need to, like, do anything with in those sub queries.

Usually those aren’t the big problem unless, like, something in that sub query, like, doesn’t have a good supporting index. Like you talked about, it ends up with an eager index pool or something. Like, there are definitely situations where, like, you know, they can be a problem, but usually, like, by the time you get down to, like, your select top 10 query, you have 10 rows that go out and do that sub query stuff.

They’re usually not going to be the problem. Usually the problem is getting to the point where you do the sub queries in the select list. Maybe that’s true, you know, 16 years ago.

Maybe, yeah. You know, it’s still going to be in everyone’s mind because, you know, Microsoft does these things and community ignores them. Yeah, well, I mean, you know, it’s a funny thing because, like, especially, like, old forum posts, like, someone would post a query.

Actually, I mean, it still happens. Like, you know, like, I’m still on Stack Exchange. I think even today there was a question where someone was just like, this query is very slow and all I do is post the query and they’re like, any pointers?

And you’re like, no. Like, sure, there’s, like, there might be, like, stuff in there that you, like, you know, could, like, nitpick and be like, well, I wouldn’t do this this way. I wouldn’t do this this way.

But I don’t necessarily know that that’s what’s making the query slow. You know, like, you could see a perfectly written query and be like, crap, it takes, that thing takes 20 seconds, really? And then, like, you get the query plan and you realize that, like, either, like, there are just no, no indexes to help that query or it’s on, like, an Azure SQL DB, like, P negative three instance with, like, like half a hyper thread and, like, like 256 megs of RAM.

And you’re like, well, okay, obviously, some other reasons behind that. But, like, you know, I find, I find the, like. Just diagnosing what’s wrong with the query based on the query text is quite often misleading.

I don’t think that’s the best way to get the answer most of the time. Like, there are some things in queries that are good to note in the way that they’re written that, like, you should keep in mind for later when you’re looking at the query plan. But, I mean, if you’re not looking at the query plan for something, like the actual execution plan, there’s a very, very small chance that you’re going to be able to solve the problem, like, like, like realistically, or like, you know.

I wouldn’t go play it that far. Because then you want to have a national plan and they just won’t give it to you. Yeah.

You know, it’s making things. Well, then it’s just best effort. Right. Yeah. But, you know, I mean, like, you know, I wouldn’t go as far as to say that you won’t be able to. And I’m sure the way you say it, you don’t really mean it because, you know. No, I don’t mean you would never be able to solve it.

I just mean that, like, you know, it just makes a task much harder and it’s just a lot more expended effort and a lot more like, oh, well, try this. No, that didn’t work. Okay, well, try this.

Like, you know, it’s just a lot harder. But for a long time, that’s like all people had to deal with. So people probably like mentally made a checklist of things that seemed to help in the past and now, like when they just see the query text, they’re like, oh, well, you know, try these three things, maybe change the query this way and that’ll do it. But, you know, like I said, it’s just overly, like an overly dogmatic way of doing things.

If I’m stick with CERA 2017 or higher, as you know, you know, you can get weight stats in the query store. I mean, they’re organized poorly, but even that alone, you know, even if you don’t have an actual plan, like, you know, like, like, there’s just so many more hints and breadcrumbs and information out there, even without an actual plan, as long as you’re on it. Like, well, I think 2016 is 2016 still supported.

It’s still supported, but it’s, it’s on its, it’s on its way out. Yeah. Like if you’re not on an SP3, like you’re, it’s not supported. Like SP3, I don’t, I don’t pay a lot of attention to the, like the expiration dates on things.

But, you know, like, like, from my point of view, if you’re still on SQL Server 2014, and you have like a, you know, business performance critical, like server, like you’re, you’re screwing yourself, even if it was a situation where like, you know, the vendor won’t support, like, you know, whatever compatibility level, blah, blah, blah. You know, it’s, it, like, like, upgrade to a version that where you can get like reasonable feedback on like, what’s what’s happening, like give yourself as many tools and as much available as you can to figure out what the performance problem is like staying compat level 20, whatever, you know, I don’t care. But, you know, that’s, that’s like, that’s like almost the least of my concerns.

I’ve seen more problems from up in compat levels and from staying at a compat level. But, you know, like getting to a root cause of performance problems, like has just slowly, I think, gotten easier and easier as Microsoft adds like insights and tooling and stuff to the, to the product that allow, you know, like allow you to do your job better. Pretty soon, I suppose, one of us could look us up if we really wanted to.

Pretty soon, all the support versions of SQL Server will have query store wait stats available. And, I mean, it’s, it’s something to think about. You know, like, like you said, they’re all going to have operator times in the actual plans.

They’re going to have wait stats in the actual plans. Who knows what else might get added to actual plans to make things like a little bit easier to figure out. You know, like my, for me, I wish, I really do wish query plans were more portable.

So that, like, as a perform, like, like when SQL Server introduced like a DBCC clone database, right? And you could give someone a stats only copy of your database. And like, you could, they could like run queries against it.

And like, granted, the IO stuff wouldn’t be there. But like, you know, you could like see, at least see SQL Server’s like decision making process if you like change stuff. Like, like adding an index was stupid, but whatever.

I wish that query plans were more portable in that, like, you could hand someone a, like an executable query plan. And it would almost be executable for you where you could like change things about the query and see what would have happened, like in the context of like what’s stored in the XML when it, like, like with your changes in it. Like, I, like, I wish that there were like better ways to like, like, just give, make that sort of stuff available.

Like, like, I know it’s, I know it’s a pipe dream. I know there’s probably like 10 billion technical reasons why that’s never going to happen. But I do wish that it was more, there was like a way to like sandbox a query plan so that you could like change things about the query and see what happens.

As soon as Connor’s restraining order is up, we’ll get on. Well, Joe Sack is back at Microsoft. So maybe, maybe there’s light at the end of the tunnel.

Really? Yeah. Really? Yeah. Like a couple months ago. Yeah. St. Joe is back at Microsoft. You know, it was a couple of years ago.

I tried emailing him at Microsoft and it bounced back. And it was like the worst day of my life. Yeah. I cried. And it’s like, man. When he, when he, when he told me he was going to MongoDB, I was like, it’s over. Like, it’s, it’s, it’s over.

That’s it. So, so, so for the benefit of our audience, there are some, his title was, was the program manager? Yeah.

But yeah, I think so. I don’t know. It was like program manager. He had like 10 billion people reporting to him. There are some people. There’s some people who are just like so helpful and pleasant to work with.

It’s kind of hard to believe. Yeah. And because, you know, it’s like when you’re working with a vendor, they’re often standoffish and not transparent. You know, like think about all the bad experiences they have with vendors, right?

Like the most, not our fault, not our fault people on the planet. But Joe Sack never, never had, never had a bad conversation. The real deal.

Never, never like, never had a question. Like never sent him a question where it was just like, like, that’s not a good question or like, you know, that’s, you know, can’t answer that, blah, blah, blah. Anything like that.

He’s always just like, if he doesn’t know, he’ll get you to the right person or he’ll ask the right person internally and like loop them in or something. Like fantastic. I wish he ran SQL Server.

Should I run the show? I’ll ask him. All right, good. I’ll ask. He’s in London at SQL bits right now. So I’m probably not going to ask today, but some point in the future.

So going back to the topic of the show, just be how great Joe Sack is. I mean, I’d be fine with that. I mean, I feel like he’d be embarrassed, but I’d be fine with that.

I think he’d be fine with that. But going back to our current topic. So step two, why did SQL Server pick the slow pan and it’s supposed to not do that? Yeah.

So to get some more real world examples. I mean, sometimes it had no choice. Like for example, Corey’s running for member grants a long time. Yeah. It could be, it could be your query is asking too much memory. It could be, there’s too many other queries asking for too much memory.

It could be, you know, memories too low. Like it may have just not had a better option. Someone could have really messed up resource governor. Yeah.

Or, you know, like there may be an assumption built in there, which isn’t being followed by your workload or your configuration. And to give an example on those lines, someone asked a question on Stack Exchange or their query took 30 seconds. Mm-hmm.

They wanted help skiing it off. And I looked at the plan. It was spending like 30 seconds like with like throttling IO weights. I remember that.

And it was like, it was like, it was like an, like an eight kilobytes spilled attempt DB. Yeah. That like caused all of that. Yeah.

And I mean, it’s, and you know, like you, you might look at attempt DB something. Okay. Well, it’s going to attempt DB is bad. I should re-rep my query to not do that. But in this case, like it was spilling so little data to temp DB, like trying to live in the world where like, oh, you spill eight kilobytes of data.

Yeah. It takes five seconds. 30 seconds. Yeah. It’s just like, no, it’s just like, no, there’s no way.

Like the problem is not the query. The problem is you’re running SQL Server on that. What’d you say? That S minus three. Yeah. Something like that. Resource. And I think it was that happy ending.

And he was able to convince someone to not run SQL Server on a toaster. And then, and then his problem. Yeah. No, but yeah. You know, like, like, like, that’s one of the things where I’m like understanding the why. And then in this case, you’re able to pick a fix, which solves the problem.

Right. Right. Like if you just fix that query, then you’re going to have a million hundred queries, which, which dare to spill, you know, a few pages, the temp DB. I think the important takeaway there is that sometimes it is not the query’s fault that it is slow.

Yes. Yes. And, you know, like for, for some white stats, you know, I mean, it depends on the weight set in the scenario, of course. So what are some examples?

Well, it’s, you know, like if you, if you have a lot of temp DB weights, you know, it, it, it, it, it could be, it could be your temp DB I was too slow. It could be there’s too many other processes all spilling. It could be your career to spill less, you know, it could be any more memory.

Yeah. Yeah. So like along those lines, I think one of my, one of my favorite weights to point out when I’m trying to help people solve, like, like, like solve a performance problem. Like at the workload level is like, if every single query is running and waiting on page IO latch SH, or I guess EX to some extent, if you’re looking at modification queries, but let’s just page IO latch SH is probably like the, like the more common of them.

Like, like, like you’re like either your buffer pool, like, like your memory for the system is just like not up to what you’re asking it to do. Or like, you know, like granted, like sometimes, sometimes indexing helps with that. Like queries have narrower indexes to read from that.

Everything’s a big clustered index scan, but like, like sometimes it’s like those query level weight stats are not just valuable for tuning a single query, but they’re helpful for indicating a problem with like an entire workload. Because like, you know, if like you, you pick out three problem queries, they, so they all take like 30 to 60 seconds and you like, you run them and like, you know, just right before your eyes, they all take 30 to 60 seconds. You look at the weight stats and like 80, 90, a hundred percent of that weight time is reading pages from disk into memory.

Like, you know, you, like there are certain numbers that you have to get in order in order for like queries to be reliably fast. And that’s one of, that’s to me, something like looking at the page IO latch weights, that’s a lot different from like when queries have like high parallelism weights or high SOS scheduler yield. That’s just like, like your queries just have like this hard limit.

They were never going to get faster than this until these other numbers get better. Either like you get rid of a lot of data or you are like archive it off in some way, or, you know, these queries have better indexes so that they don’t have to like scan your eight gig clustered index. They can like read from like a few hundred meg nonclustered index.

There’s like a lot of different things that go into it, but you know, the, the, the query level weight stats, you know, I guess my, my point is they’re not just valuable for a single query. They can be valuable, like tuning a workload, like in general. Yeah.

And I think this is making the same point you just made, but like, if you see high CPU weight time in a query, you know, that, that could be, there’s too much other stuff running. Or it could be your query’s running a lot. Yeah.

And it’s using too much CPU, right? Like, like it really depends on where your problem is. Yeah. No, but again, this is, this is, this is where the, the, the knowledge behind the process is so valuable. Yeah.

Right. Yeah. Like, you know, like for all, like all the things I’m saying, like, you know, step one, I’m definitely why it’s slow. Step two, figure out why it’s just people who are picked. Step three, pick a fix that’s permanent. Like all those take a lot of, well, the more background knowledge you have, the better. Like, like, like the more reasons, you know, why a query should be slow, the better you are reading execution plans, understanding weight stats, you know, like the more possible answers you have, I guess.

The more likely you are to find a good enough answer to solve a problem. Yeah. Like, and like, there, there’s a lot out there to be sure.

Yeah. Which I think is part of why I opened with saying, you know, this is my process. I’m just some guy. Maybe you can’t follow this yourself, but I think it’s a really good place to get to eventually if you can. Yeah.

Like, I mean, like, I just want to solve root cause. Yeah. Like, like for me, like one, like one very big difference that like I’ve sort of like noticed about myself recently is that like, I can almost guess how, like what’s written in a query by looking at the query plan. Now, like, like there are certain like things I, certain like patterns I see in a query plan.

I can almost guess what someone did in the query, like, like spot on. And like, like, I don’t know, like three, four years ago, I probably couldn’t have done that. But like, like, there’s a certain things that I see in a query plan.

I’m like, oh, well, there’s a join with an or clause in there. Or like, you know, stuff like that where like, you know, someone, like, you know, someone did something absurd and you see it in the query plan. You’re like, there’s like two or three reasons for this.

And I can probably guess which one it is. So like, like that, like getting to that level is hard, but it’s nice when you get there because it makes your process easier. No, definitely not.

I wore an extra loose fitting Adidas t-shirt today so that we, there would be no danger of bare chest. All right. Yeah. Because, no, you’re talking about a pretty serious power up here. I know.

No, that was, it was a cool thing to realize. And it was just like, oh, I can take that out. And I was like, yeah, feel okay. No. Yeah, no. So I think everyone watching, you know, you should aspire to get to that point if you’re not there.

Joe was there 10 years ago. That’s why I aspired to get there. Eh, well, it wasn’t that long. 10 years ago, I was a babe in the woods, you know. I was a big fish in a small pond.

So a few more examples of step two, like, you know, a pretty common one is cardinalium assessments. SQL Server thought, yes, it was one thing. It actually wasn’t.

So I picked a plan based on that. I don’t want to talk about that because we’re running out of time. We’re not running out of time. I don’t, I don’t have a hard stop. Do you? Oh, you don’t have a hard stop this time? No, no hard stops. Geez. All right, great. Sometimes, you know, query plans are written in a way that a nest loop join is forced.

Like, if you don’t have an equality on the join clause, you might be forced to nest loop join. There’s no support index. You’re kind of tying SQL Server’s hands.

It has to loop join. There’s no index. Either makes it small or doesn’t make it small. Both are bad in their own ways. But, you know, like, understanding all those possible reasons, you know, like, oh, well, where are some others? Well, it was a query plan timeout.

It just didn’t consider my plan. Or maybe it’s slow because it takes a long time compiling. Yeah. Man, I had a query that took, like, eight seconds to compile and, like, a hundred milliseconds to execute. You’re talking about your hard operator ones.

I think I actually hate the hard, like, the eight-second compile ones more than the hard operator ones. Yeah, that’s true. Share my own pain points on that.

Man. I don’t know if I have a – I mean, you know, there are all kinds of examples. But actually, that brings up something good. It’s like sometimes, like, a query might be perceived as slow and it’s something external to the query, the actual query execution.

It’s making it slow. Like, looking at – I think looking at compile time, well, I mean, well, that’s, like, fairly rarely been the root cause of, like, all of the slowness in a query. Like, there’s definitely been queries where I’m like, well, it takes 10 seconds to compile a plan and the query itself takes 10 seconds.

Like, neither scenario is good. But, like, if you have a – like, if you’re looking at a query and you, like, you’re just, like, looking at the operator times and you’re like, the SQL servers are reporting 100 milliseconds. But every – like, when I hit F5, like, it took 10 seconds.

Like, look at the compile time. Looking at compile time is, like, a fairly valuable thing. Actually, one of the more important ones, which is, oh, I ran the server procedure in the client and it was slow.

So, therefore, it must be a SQL Server. Yeah. Right? Like, if you go into step one, sometimes you can just prove, well, it’s not a SQL Server. Yeah.

Like, oh, okay, well, I think you used a second of CV time and sent 30 seconds waiting to send data to the client. Yeah. You know, this is not a SQL Server. Yeah. And, you know, like, if you’re not doing that kind of analysis, you’re doing the – I’m going to look at the SQL text and, like, reorder things. And change joins to applies and applies to subqueries and go back and forth.

Yeah. And you can add temp tables. Add indexes. Add recompile hint. Like, no, that’s going to work.

Yeah. I mean, at the end of the day, it doesn’t change the fact that you’re sending 5 million rows to, like, some application server that’s, like, dusty old nonsense. Right. I mean, you know, like, there’s all kinds of things application developers can do. Like, I had an issue where – recently where I thought – I thought, in my opinion, in my humble opinion, the start procedure was not sending a lot of data.

Yeah. But the developers, oh, you know, you’re sending 10,000 rows. That’s way too much data for tonight to handle.

And after some digging, they found they were doing some, like, super inefficient, like, time zone conversion in the client row by row. And that was, like, all the time. Yeah.

And they fixed that, and it was just so much faster. Yeah. And it was, like, great, guys. Thanks for checking on. I’m glad because we could solve this database problem together. But, you know, I mean, like, that’s a really important one.

You know, like, everything gets only on the database, and the database certainly has its sins and things it should apologize for. Well, you know, it’s easy to blame the big expensive thing in the corner that’s supposed to be fast for why things are slow. That’s true.

That’s true. You know, yeah, I don’t know. Like, yeah, I mean, there’s probably a good chunk of developers out there who point to the database before, like, looking at their own heat maps for figuring out why something was awful. But, you know, I think, like, blaming the database also isn’t always unreasonable.

Because, you know, when you think about the way, you think about a lot of shops, like, they may not have anyone who can, like, you know, be an expert witness for the database. Like, you know, a lot of, like, you might have, like, a big group of developers who’ll hit the database, but none of them, like, understand how to, like, analyze or, like, dig into the database to figure out if that actually is the problem. And, you know, like, you might run into a lot of situations where, you know, developers think that, you know, they’re doing everything right.

And their code is as precious as every Bloody Mary recipe in Brooklyn. But, you know, like, in reality, like, everyone’s screwing up. The database isn’t screwing up.

The database is doing the best it can with what it has. But, you know, like, sure, like, the database might be slow, but the database might not be slow for a reason that, like, you can claim on the database. It’s, like, something you’re doing to it or with it.

Or, like, you know, you have, like, 500 random application databases all on the same server. And, you know, everyone’s hitting them all at the same time, and that’s what’s screwing you up. So, you know, there’s just so much that goes into a performance investigation that, like, most people don’t make it to.

Like, most people don’t have that luxury, especially if you’re answering questions online, Stack Exchange, or whatever. Like, you don’t have the luxury of, like, going into someone’s environment and, like, poking around and doing, like, a full good analysis. You can, like, look at the query text and you can look at the query plan that’s presented to you.

But, like, a lot of the times there’s just more questions than answers than that. There are some problems when you just have to get super deep. I think it’s very rare, and it’s definitely less frequent for me nowadays.

So I remember once I had a problem where, like, an allocation order scan was too slow. Yeah. Like, that was the problem. Like, how do you solve that, right?

It was just this, or I had a problem where it was just even query tuning, but, you know, to build anticipation for future episodes. Switching a partition took a minute of CPU time. And we had chosen partition switching because, as documentation said, it was a fast and instantaneous operation.

But it was taking 60 seconds, and you just noticed. But you know who helped with that? Good old Dr. No.

Joe Sanky. Oh. St. Joe. St. Joe. It’s always St. Joe, isn’t it? All right.

Have we talked about step three at all yet? You’re about to. All right. Cool. I was going to say, we’ve spent a lot of time on one and two. Well, you know, I mean, they’re all important, and there’s a lot to say about all of them. So just to remind everyone, because we’re talking so much.

Step one, identify specifically why the query is slow. Step two, try to figure out why a SQL Server picked the slow query plan. Step three, how can you solve the problem permanently and completely so you never have to look at that ugly code again?

And for me, that’s a pretty big deal. I’m definitely not always successful. In fact, I’m often not successful.

But, you know, it’s, I mean, it works out sometimes. So I think before that, there’s a philosophical discussion to have, which is. Oh, boy.

Yeah. Well, I’m sure you face this problem many times where an user says, okay, this is too slow. And it’s your job to make it faster. But the question is, how fast does it need to be?

Right. Like, like answering that. Because. I think, you know, I think there’s some natural tension in some companies between developers and their managers. Because some developers might have a passion for making things fast.

But the managers’ performance improvement says, okay, this is a money losing thing. You know, I’m spending developer time and testing time. Yeah.

And I’m introducing risk to make things faster. And we’re not going to sell more licenses when anything comes faster. Yeah. Right. Like, you know, like if you take a very kind of brutal view of the world, you could say, you know, the stopper should be only as fast. Such that people don’t get angry enough to stop paying for it and switch to another business.

Yeah. When you say, like, you’re not going to get new customers. I’m like, fine, but you might keep some. Yeah.

And, you know, I mean, it’s like, you know, to be fair, sometimes things are just unacceptably slow. And, you know, even the managers will let you make performance improvements. But, you know, like, it is true that, you know, like there are some companies that say silly things like performance is one of the hills we stand on. They try to mark themselves as having fast software, which, you know, if you can do that, great.

But, I mean, I think in a lot of other cases, you know, like people want like or sales or whatever else, you know, they want new features, you know, that they want to be making more sales and trying to say, okay, well. Well, they just stay competitive with other products out there, platforms out there. Yeah.

No, I mean, like, it’s, I mean, I’m not saying. We’re ahead of other products or platforms. I mean, like, well, what I’m trying to say is one answer to the age-old problem of how fast does it need to be. Like, I think part of the answer is, you know, considering the economics of it, you know, like how fast does it need to be.

And, you know, like there are some industries and spaces where people do just have to be brutally competitive with each other. Maybe there’s a bunch of governments that you have to do or you’re doing online gambling or whatever.

Oh, stocks and stuff are probably. Yeah. Yeah. Oh, yeah. Right. But I think stocks is the case where people are willing to pay a ton of money to get fast performance. Yeah.

No, like they’re not going to complain about like SQL Server licensing. Yeah. They’re not going to be like, we need 50 more cores. No problem. Yeah. Yeah. Right. I mean, yeah, like that, like that’s kind of the opposite of where I was going, but, you know, I spent that money at a strip club last night. Like, whatever.

You spent the money on a bottle of wine, right? A bottle of Petrus from 1923 or something. How much would that cost for those of us? I have no idea.

I don’t even know if I had Petrus in 1923. I couldn’t tell you. You’re ruining your reputation. That’s not my reputation. My reputation is like a good mix of quantity and quality. I don’t want like, I don’t want a $10,000 bottle of wine.

I’m going to finish it in an hour. I mean, as someone who’s occasionally value alcohol, I try to pick something that that’s like in the middle. And yeah, it seems like I’ll, I mean, you, you chose Lagavulin, which is always the right choice.

No, see, there you go. Never, never going to go wrong. Getting back to what I was saying. Before we started talking about booze and strip clubs.

That’s right. I think that, well, what I would advocate for, you know, kind of just like being a serve others as, you know, like, if you can, like, like, like, try to make things fast. You know, like, I don’t think there are a lot of cases where a developer gets fired because, you know, instead of making it take five seconds, they may take three seconds.

Yeah. Right. So, you know, like, and I’ve worked at previous jobs. I would often sneak in performance fixes that I wouldn’t tell anyone about because, you know, like, it is hard to know what your developer is doing.

Yeah. You know, like, there isn’t anyone who’s like, you know, is going to bust me for spending an hour on Friday making this code faster when, you know, technically we could have gotten away with making it a bit slow. Yeah.

And, you know, like, of course, there’s, there are trade-offs, right? Like, you’re an ad index. Okay. That has its, that has its downsides. There’s certainly a level where you can take performance tuning too far.

Like, oh, I’m the ad in memory LTP to make this faster, which you’re not using. You probably shouldn’t. That’s not something you want to sneak in without telling anyone. Yeah.

Well, I mean, that’s for sure. I mean, so I think, you know, you need to kind of grade performance changes or performance improvements rather where, like, there’s like the heroic improvement, right? Where you go from something taking like, like, you know, easy numbers, like it takes a minute and you get it to take three seconds, right? That’s a heroic improvement.

If you make an improvement where it takes one second instead of three seconds, most people aren’t going to, like, necessarily feel that improvement, right? Like, like, by the time someone’s using an application, they click on something, the request gets sent, the database does something, sends back data, like the application processes and shows it to someone. They might not notice two seconds, right?

They might not, like, notice a two-second tick with something. They’ll notice like a, like a, like 50-second or 55-second tick or something. But, like, there’s like the heroic improvement, right, which is like what I think most people, like, should, like, not stop at, but that should be where most people reassess what improvement they can make past that. Yeah, I’m definitely not advocating against quick wins in all cases, you know.

I think that people are a bit too afraid of compiling. You know, maybe compiling was a big deal in 2008, but, like, you know, and I’m sure you’ve seen this at least once, probably many times, where, oh, the query takes a minute, it only compiles 20 times a day. You got to recompile, it now takes two seconds, right?

So you got to recompile. And, you know, maybe there’s more you could do to make it faster, but, I mean, like, is there anything better than making, like, a one-line gochain and just add and recompile and you’re done, right? I mean.

No, like, anything you do after that is more for you than for the query. That’s like your own edification. That’s just like, I know I can get this from five seconds to two seconds. Yeah, but, like, I mean, I don’t know.

I have the sense that, you know, maybe in the ancient times, people were too liberal with the recompile hints and compiling. Or maybe the software just wasn’t good enough. Yeah, sure.

There’s definitely the hardware was as good. Like, 32-bit software with, like, two and a half cores or something? I’d be afraid of recompiling all the time, too. And Microsoft has made improvements to just cut down on the number of compiles in general. Like, it’s some scenarios for sure.

Like, you know, I mean, definitely pay attention to the number of executions per day, but, like, it’s not executing that many times. I mean, recompile hints really not. Yeah.

It’s not evil. Yeah, and that’s also a case where I would look at compile time for the query, too, though. So, like, if compilation is actually, like, a, like, wall clock duration expensive thing, then I probably wouldn’t consider that. But, like, you’re absolutely right.

And, you know, people write queries in really dumb ways, right? Either, you know, lots of optional parameters or lots of local variables. And, like, when you throw that stuff at the optimizer, it’s just kind of like, well, you’re going to get my worst possible guess. So, like, adding a recompile hint just, like, takes the sting out of all that stuff.

And then, you know, there’s absolutely no reason to, like, avoid it. So, yeah, like, what I’ve been trying to say is, you know, when we’re talking about doing a fix and making it permanent, it’s nice that you can figure out, like, what the goal is. And it’s hard.

You know, like, I mean, we’ve been talking about it, and there’s still so much more that we could say. Yeah. So, you know, like, if you ask any just how fast it should be, it’s probably not the way you want to do it. No.

Because they’re going to say as fast as possible, right? Yeah, as fast as possible. Instantaneous. Instantaneous. Instantaneous. Yeah. So, you’re very often asking. And, you know, like, so hopefully you as a developer can find your end. And, you know, you can look at what other code takes and what past precedent or, you know, talk to your manager or whatever.

But, you know, like, hopefully you can get some idea. Or, you know, like, maybe it’s just somewhere, okay, you spend an hour on it. If you can make it faster than an hour, you just, like, stop there.

You know, like, just picking some strategy there. Well, I mean, particularly with recompile, it’s like, like, like, like I was saying, like, like someone just, like a bunch of optional parameters or local variables, right? And, like, like recompile is this absolutely the path of least resistance for fixing that stuff.

And, like, you’re, like, you can explain that to someone and be like, look, I can add, like, 20 characters to the end of this query. Or I can spend a lot of time turning this into parameterized dynamic SQL. And, like, then we’ll, like, and that would also, like, sort of provide the same benefit.

But that’s going to take me a lot longer to, like, write up and test and make sure everything is functioning right. Whereas this just does, you know, not the equivalent of that, but, like, performance-wise, like, the gains are the same. So, it’s like, you know, especially for something like SSRS reports or, like, you know, you see the stored procedures that get thrown together by those or for those.

And you’re just like, like, no, I’m not rewriting that as dynamic SQL. I’m not rewriting that as, like, you know, anything else. I was like, I’m just sticking a recompile hint on here.

And instead of taking two minutes, it takes, like, 15 seconds. And that’s good enough for this report. Like, query criticality is a big part of it, too, right?

Like, you know, people often, like, look at, like, the PlanCasher query store. And they’ll look at query costs and they’ll be like, look at how expensive this query is. We have to get it to cost less.

And I’m like, okay, well, what’s the context of the query? Like, is it running when people care about it? Or is it, like, is it running at a time when it could either, like, is it a user process? Like, or is it running at a time when it could impact user processes?

Like, there’s a lot of, like, important contextual stuff. Like, every DBA and developer I’ve ever talked to has had, like, a grudge against one particular section of code or portion of code. But it’s not a portion of code that would have, like, a real, like, fixing it would provide some real, like, improvement for the end user.

It’s just, like, something that they dislike. Something that, like, it’s like, oh, the last person who worked here wrote this code and it sucks and I want to fix it. And I’m like, well, to what end?

Like, who’s going to benefit from you fixing this? Aside from, like, you being happy that you threw out, you know, Joe’s old code from 19, actually, 2000, what year is it? 2015, 16?

Who’s going to be happy with you? I don’t know. I don’t have a train of thought, so I’m not aware.

I don’t know. I’m not aware of these derailments. Derailment problems. It looks so bad.

I have to do some other thing for the first time. No, I refuse to edit these. People are going to sit there with 30 seconds of you picking your chin hair. I’m feeling you, man.

I’m feeling the – this is like you on the call. The customer wants the instant answer. You’re sitting there trying not to sweat. Oh, man. All right.

Peround or sniffing? Yeah. Peround or sniffing? Peround or sniffing. Well, we should – we should – we should – we should – we should – I got it. I got it. That helps. I just had to say peround or sniffing three times.

Okay. So, step three. Solve the problem permanently. It’s one of those cases where, you know, having as big of a background knowledge about a particular server as you can is very helpful. Like, for example, you might have a parameter sniffing issue.

Like, you’ve got a query slow. You’re fair enough because it’s – because of parameter sniffing. I mean, there are a lot of ways to solve that, right? Yeah.

Like, you could literally add a semicolon in production. Yeah. And it’ll get a new cache plan. Yeah. And maybe that cache plan is better. And you might think you solved the problem. Yeah.

Right? Which is another way you can go off on a wrong path just by taking a query and, like, rewriting it without really thinking into why it’s slow. Because, oh, I fixed my problem. But you didn’t.

Like, it was a parameter sniffing problem. Yeah. And it’s like, I’m back. Yeah. I don’t know if I want to get into, like, the details of how you fix that. Aside from option recompile? Yeah.

Well, you know, it’s not running off. I got recompile. One thing that I like, which I think you don’t, is I’ll sometimes add an optimized or unknown hymn if it’s compiling a lot. And if it seems like the plan I get with unknown is just good enough, I mean, like, I’ve done it.

So I can do you one better on that. You ready? You ready to get blown out of the water on this one?

I recently fixed a parameter sniffing problem. Not with optimized for unknown. So it was a parameter.

Was it optimized for value? Yeah. So it was a kitchen sink store procedure with lots of optional parameters in it. And I happened to stumble upon a set of optional parameters that produced, like, a fairly good all-around plan, which was way better than the sway between, like, 100 milliseconds and, like, 30 seconds. And everything took, like, 800 milliseconds to, like, 1.5 seconds.

So there was a much narrower swing. Like, it wasn’t as good as, like, the fastest plan, but it was way closer to the fastest plan consistently than the worst plans that you would get with, like, just, like, letting it run normally. So I did optimized for with a set of values for each parameter, and it was gangbusters.

I’ve never felt so weird. I’m a proud of you. I’ve never felt so weird and dirty.

You need to go to the store and buy yourself some bigger shirts, you know? You’re going to be filling them out. That wasn’t a flex, man. I felt so guilty about that.

Because, no, I mean, I remember, I remember, but again, that’s not being dogmatic. I remember attending your, no, it’s absolutely right. I mean, I remember attending your pre-con and you, you erringly dismissed the optimized for value syntax. Did I, did I, did I dismiss optimized for values or optimized for unknown?

Yeah. I think it was both, but I couldn’t just remember. If you like, just have to record straight. I don’t, I don’t think, I don’t think I’ve ever had a big problem with optimized for values. And I don’t think I ever have because, you know, like, I don’t know, probably from some, at a mechanic session, like 15 years ago, where it was just like, like top 2 billion optimized for top equals one.

And I was like, cool, optimized for something is good. But like optimized for unknown is one of those things that like just years of consulting, I would see people use that all the time.

And as soon as you quoted that out, it was like, okay, like things were okay. And so like, I, I optimized for unknown. I recognize, I fully recognize that like once every like three, it’s like, it’s like a weird moon cycle. It’s like once every three to five years, you’ll find a situation where optimized for unknown is dope.

But most of the time I’m just like, man, I don’t know. As a consultant, as a consultant, it’s like once every three to five years. For furry tuners, hey, here’s a trick you can use once every three to five years.

Just mark today on your calendar. And three years from now, you can try it. And four years from now, you can try it. And five years from now, you can try it.

No other time can you try it. I have a much shorter cycle personally for achieving good results with optimized for unknown. But that could be due to the nature of my workload. Very well could be.

The way things I do. I do think that parameter stuffing is, you know, I think it’s like the easiest problem to solve in the wrong way. Right.

Because as we said, you can literally do anything. Yeah. And the problem might go away only to come back. So, you know, if you can identify those cases. And, oh, I guess on that point, you know, we’re talking about the importance of actual plans.

You know, maybe the problem happened yesterday and you’re not seeing it today. One thing I do is I go in the query store, get the optimized, or I get the parameter cache values. Then I add an optimized for value hints with those cache values and get an actual plan of that.

And that almost always works. So, I don’t know, like, how well known that is. But if you’re looking to get an actual plan for a previous parameter setting problem, that’s the method I use.

So, I think in a sort of recent blog post, I detailed what I do, which is fairly similar to what you do, where I’ll create a temporary store procedure with the query text and the parameter values that are stored in the XML. And then I can just rerun that temporary store procedure with different parameter values and see what gets sniffed and what order I have to execute stuff in. And, you know, like, so, like, I love that you get the compile time parameters from query store.

But, like, I wish that there was some, like, way to log the, like, runtime parameters so that you could figure out exactly, like, which parameter sets, like, fight with each other. Because it can get really complicated. Like, you have, like, you know, five, six, seven, ten parameters to deal with.

And you’re like, okay, like, well, which ones do I have to change in order to get this to be bad? What do I have to change them to in order to get this to be bad? And, you know, you don’t always have the answer quickly because, like, you don’t know which parameter it is.

And worse, you don’t know, like, what values got passed in. It could be something as stupid as, like, like a date range being, like, usually a day and getting expanded to, like, six months. Or it could be something, like, just, like, getting, like, a different user ID or something in there.

Like, like, that part of parameter sniffing I find difficult. Because I don’t always have access to everything that I need to, like, like, over time figure out, like, which columns are involved. Which ones have skew in them.

Which ones might not have skew on their own but have skew when they’re, like, put together. There’s a lot of really tough stuff that can go on in that. But I think just, you know, from a general troubleshooting perspective, you know, there are things you can look for in the query store that make life easier. So, like, when you’re looking at query store, you know, queries that on average finish and, like, finish very quickly but might have, like, a max CPU or duration that’s really high.

That’s, like, one really good telltale, one really good sign of parameter sniffing is that, like, usually this is real fast. But sometimes it’s real slow. And sometimes that query will have multiple plans associated with it.

And, like, getting those multiple plans might get you the right parameter values to test to figure out, like, what I have to do to make things slow or not. And, like, another thing that makes that, can make that incredibly complicated is, like, if sometimes passing in a null value makes things slow. Because then you’re just like, how would I have guessed null?

Joe is scoring me on my answer right now. He’s grading me. We could spend a whole session playing about query store.

Maybe we should. Yeah. Maybe we can get it to the right, yeah. Like, you have all those secret server connectors. You can get it to the right person.

I can’t even get anyone to make the GUI better. Oh, I was talking about the GUI. Oh, yeah. No, I’ve been complaining about that for years. No one listens to me.

It was only after many years and great effort was I able to get Azure Data Studio decoupled from SQL Server Management Studio. Oh, that was your… Yeah.

Erin linked it in her blog post. She said this highly upvoted item. Wait. Whoa, whoa, whoa, wait. Wait. You created like a connect item? And I think…

Yeah, I created it. I created… That’s good. Whatever they call it. It’s not connect. Yeah. Yeah. But yeah, I created a user connect voice idea, learn doc. And it was highly upvoted.

And it just so happened that a highly upvoted item also aligned with the future roadmap for SQL Server Management Studio. And so the decoupling was achieved. That’s one of those slight metrics.

It’s like, oh, well, you know, coincidentally, we didn’t care about that anyway. Dear community, we love you so much that we went out of our way to fix this highly upvoted item.

Yeah. Meanwhile, like adding dark mode to Management Studio and putting the debugger back have like 5,000 upvotes. I had like 400 or something.

It was like… Just happened to align. But your account, you know, the Eric Darley name counts for… Oh, I don’t know. It’s funny because like I complained about it like two or three years ago and I haven’t said much about it since.

And then like, just like out of nowhere, it was like SSMS 20. It was like Azure Data Studio is gone now. Cool.

More examples. You got index full. Sometimes you can just add an index. And that becomes… Sometimes not. So one thing I actually wanted to say earlier, you just jogged my memory on. Like following a…

Like we talked about earlier, like following a process is great. Like having the knowledge to back that process is better. So like knowing why spools get chosen would be a really valuable thing when you’re tuning query plans that have spools, obviously. But like knowing like why there’s a spool there in the first place is huge, right?

Like why is there… Like there’s an eager index spool. Like what does that mean and why is it there? Or like there’s a lazy table spool. What does that mean and why is it there?

So like knowing that stuff is not stuff that comes easily or overnight. Like… Like… So like you might see one of those things. And you might even look at a query plan and be like… There’s like 30 seconds in this eager index spool.

Or there’s 30 seconds in this lazy table spool. But like unless you know why that spool is there and how to correct that spool… Like you might be at a pretty big disadvantage when you’re trying to like, you know, follow… Like, you know, just look at…

Oh, just look at the query plan. It’ll have all the problems. All the answers to your problems. It’s like… Like, okay, but do you have the answers to why that query plan is what it is? Like that’s another really challenging thing, I think.

And like imparting that knowledge is hard. Yeah. So I mean, to be fair with spools, it’s often easy to make spools go away if you create the index. Now, you might not be creating the best index.

Or you might be in a situation where you have to be very careful with your indexing because of your weird workload or hardware. But I don’t know. Like for me personally, I would say that, you know, figuring out like…

I mean, I have the privilege of being able to add indexes whenever I want. So those aren’t so bad for me. I mean, you know, I’m sure there are some companies that don’t wish I have to like justify their indexes to some, you know, round table.

Oh, like I work with people who are like… Who don’t want to do anything. Yeah. Like I work with people who like, where like index changes have to like A, get approved by a vendor and then go through like company change management processes. And then they have to like get added to like the database and source control and then they get rolled out.

So like I’ll make, I’ll make suggestions to some people and like those changes won’t end up anywhere near the product for like months. The real world is a scary place. It is.

It sure is. Full of dangers. But then you also like, you know, consulting, you’ll work with people who are like, you know, look, if you don’t give us some answers, we’re going to be out of business in like three to six months. So we can just do whatever.

And you’re like, cool. We can put on the gloves. Cronality estimates.

Yeah. Big cause of performance. Yeah. Legacy cardinality estimator for life. You know, like if you can get to the place where you have the knowledge and experience. Okay. You know, like, you know, this part is slow. It’s slow because the current estimate is bad.

And I have a way to fix that current estimate permanently. Whether it’s temp tables or recompile or another thing. Right. You know, like, like that can be a real game changer. Yeah.

You know, you’re solving the root cause you’re doing it quickly. Hopefully. And hopefully that select query, like, never comes back. To be worked on again. Unless that cardinality estimate was great for a specific set of parameters and awful.

Sounds like a parameter setting issue. Oh. If you have parameter setting problems, you should rewind, like, 15 minutes ago or whatever. Try to gain something from that jibber jabber.

Yeah. So I think that’s enough about those. So I’m going to state the three things that I do twice more. All right.

This is one of those times. Yeah. So first one, identify specifically why the query is slow at the operator level or weightsat level or similar level. Second step, as much as you can, try to figure out why SQL Server did the slow thing that you don’t like. Third thing is try to implement a full and complete fix, which means you’ll never have to work on that problem again.

And it’s much easier said than done. But, you know, when you can, I mean, I feel like that’s the best way to do queries, you know, not to overstate it. Or understate it massively.

Or understate it. I don’t know. You know, it would be nice if we had some knowledgeable guest who would, like, back me up or who’d say I’m wrong. You know, either one’s fine with me. Well, I mean, I think, you know, I agree with the process.

I don’t know how to get the people watching there. Yeah. No, I, yeah.

I mean, maybe we’ll get some really helpful blog or YouTube comments about people agreeing or disagreeing. I don’t know. Maybe people will, like, have training they recommend. Yeah, maybe they will.

Maybe. I’m sure someone out there has SQL Server training where they could learn these things. Someone. Someone. Someone credible and reliable. I think there are a few odds and ends. Like, you need to have monitoring set up.

You know, I recently saw a question where someone’s like, we don’t have query store because DBAs say it’ll add too much overhead. And, like, you know, I’ll acknowledge based on the version of SQL Server in the workload, that is true for more than zero people. You know, there exists an environment where that’s actually true.

But I think in the vast, vast majority of cases, that’s just some grumpy dogmatic person who thinks 2008 R2 was the best version. And, you know, I don’t want to split out my workload. Or, in fact, I’m going to run SQL Server with the dashed X parameter because, you know, that’s how hardcore I am.

And we don’t need any of this fancy monitoring and whatnot. And my response to him, which I don’t know if it helped, but it was, you know, like, you know, like, like your DBA is like they have monitoring for the things that they’re responsible for. So for them to tell you, you don’t get monitoring for the thing you’re responsible for.

Like, if they want to say query store is too much overhead, well, then they have to give you something else. They have to give you something. You need to have something.

Yeah. If you have nothing, then you are basically putting your fingers in your ears because SQL Server wants to tell you why it’s slow. But that information isn’t getting there.

Yeah. So, like, I definitely agree with the three steps in your process. I would, when, you know, for what I, I don’t know, maybe it’s me having sort of tunnel vision for what I do. But, like, you might get handed a slow query and it might not be obvious why the query is slow.

So, like, you know, you do have to have, like, a level or two above and a level or two below those three steps. They might not be, like, steps in, like, your core process, but they have, like, they have to be, like, you know, they have to be in the tool belt or whatever you want to call it. Or, like, you know, you might need to look at the server as a whole to see if the entire server is just struggling and nothing is keeping up.

You know, you might have to look more closely, like, underneath and figure out if, like, you’re hitting some weird SQL Server problem that is not necessarily the fault of the query. But, like, you know, something, something deeper in there that’s just, like, like, just not going well. So, like, there’s definitely, like, this is an example.

Like, like, I was working with a client and they would be, like, like, every once in a while the server slows down. And, like, I spent so much time working on this problem. And, like, they were right.

Like, they would, like, like, everything would run fine for a while. Something would happen. There would be massive blocking and deadlocking. Queries would grind to a halt. They’d restart the server. Everything would be fine.

Like, I identified a bunch of, like, query and index level stuff that I was just, like, well, like, I see what the blocking is. I see what the deadlocking is. Like, there’s definitely some, like, ways we could tune the queries and make this work. But, like, cool.

Like, go ahead and fix that. And, like, everything worked fine for a while. Like, actually, everything worked great, like, much better than usual for a while. But then the same thing happened. And, like, so I started, like, digging more and more and more. And what I found was every once in a while with, like, almost no predictability or regularity, every once in a while they would get a stack dump on error logs.

And when the server stack dumped on the error logs, that would precipitate all the other problems happening. And I don’t know what that stack dump was, but it left SQL Server in such a confused and riled state that everything else would go wrong. So, like, you know, when you’re, like, troubleshooting, like, if your job is only, like, like, someone hands you a query and make it faster, those three steps will always serve you well.

But if, you know, you’re dealing with, you know, like, a big complicated system, you know, there’s definitely, like, a level higher and a level lower that you need to be, you need to be willing to, like, explore. Or, like, if the answer isn’t obvious from, like, you know, going through those three things. I would say that’s prior step one, but I guess that’s kind of self-defeating in that what I’m really saying is to be a good query tuner, you have to sometimes do non-query tuning things.

Which, I guess, is something that you can say. I don’t know how practical that is for people to follow. But you’re definitely right in that, you know, like, part of step one is figuring out, is the problem the server?

And, you know, hopefully the company isn’t in such a bad state where your developers have to figure out if the server has some kind of problem. I mean, I’ve definitely been there myself, and I’m sure many others have been there. I mean, especially, I mean, you might not even have a DPA.

No, very few people do. I mean, that’s the modern way of doing things. But, yeah, you’re right. You know, like, I mean, even just you could generalize that as just knowing when to ask for help if you needed to. Especially from consultants who specialize in this piece.

I mean, speaking of help, I think another good thing we’re going to do is, you know, like, don’t be a big fish in a small pond. Like, if you really want to improve your skills or even just kind of benchmark yourself or even just invest in yourself, like, helping other people get free on the air. And that is a pretty good way to do it.

Because, like, if you can, you know, take some horribly written question from someone you don’t know, whose system you can’t access, who’s probably trying to trick you, and they’re not giving you all the information you need. And if you can solve their problem, think about how easy by comparison your problems will be. Where you’re on, you’re catching the server.

You can get the actual plan. You can do everything you want. You can try things with query hints. It’s just so much easier. You know, like, if you’ve got the stomach and willpower for it, it’s definitely a way to level up your query tuning. And, you know, maybe you’ll be at the department store and you’ll be buying a bigger shirt and Eric will be next to you also buying a bigger shirt.

Buy a much bigger shirt than you. That’s about it. But I think people should have good tools, like have the latest version of SSMS.

Don’t use it. Yeah, I still connect to SERS and the version of SSMS is, like, you know, 2008 R2. Of course.

Because I wouldn’t have been. So I think, you know, like, aside from the habitual stuff or the habit-related stuff, I don’t say habitual because that is weird context. But aside from, like, the habit-driven stuff that you mentioned, I think, you know, it’s, like, two that I hit on before, like, you know, don’t be overly dogmatic about what you think is bad.

You know, stay analytical about, like, what problem you have in front of you. But also, like, I think focus is really important for query tuners because it’s really easy to get distracted by a lot of the stuff going on. And sort of, like, not even necessarily, like, around the query, but just, like, you can handle, like, a big store procedure.

And it has, like, you know, 50, 60 different queries in it. It’s really easy to get bogged down in, like, a bunch of, like, a bunch of queries that you, like, mentally, like, assign some wrongness to. That you’re, like, oh, I got to fix this, I got to fix this, I got to fix this, I got to fix this.

And that’s where, like, you know, like, stay focused on things. The thing that, like, what will help you stay focused on tuning the right part of the store procedure is getting the actual plan. So, like, make sure that you’re focusing on the right thing in whatever you’re working on to make it faster.

So, like, you have a get hint at a store procedure, run the store procedure, get the actual plans, focus on the slowest parts of that plan. And then, like, you know, make sure that you’re staying focused on the right parts of things. You know, like, don’t get, like, you know, like, put on your horse blinders and, like, don’t get, like, worked up about, like, the 60 queries in the store procedure.

Worry about the ones that are causing the problems. I’m going to share one more. All right.

You want to call it after that or do you want to? I’m going to get on it there. Yeah. All right. I know. I’m not used to talking so much. I’m not going to get so much. Yeah. I’ve been on phone calls since, like, 9 a.m., so this is the end of my talking. You’ll probably hear it.

So, like, this is, like, my end of pre-con day voice. Struggling. You need to drink the tea, man. Tea.

You need to drink the tea. Tea. Lagavulin tea. That’s what I need to drink. I guess this is an inquiry to an egg, so maybe it’s a bad egg. No.

But it illustrates some of the principles. So I had a T-SQL performance problem. Yeah. Well, I don’t know what it’s called. Like, the SP session context. Oh, yeah, yeah, yeah.

The newer one. That was taking, like, three seconds to run. Yeah. And the application design always called that the RxC-curing query. It was saying some metadata that was sometimes used.

Yeah. So sometimes you’d have issues where, you know, like, if the query, like, the real work took a millisecond, you have the three seconds of session setting a variable to memory. Which is, like, totally outrageous.

Yeah. Absolutely. Right? So sorry. This is something I had to figure out. And since I had logging of, you know, which queries were slow, I was able to see the issue only happened sporadically. And, in fact, it seemed to happen, like, almost exactly every two hours where you’d see this spike of SP session set whatever is suddenly taking way longer than that should.

And I was just like, oh, is this some locking thing? Is this some weird internal SQL Server process thing? Like SQL Server having to clean up contexts and stuff?

Yeah. Yeah. Right? Is there some kind of, like, intention? Garbage collection? Yeah. Yeah. Like, I’m Googling for this. I, like, can’t find anything. Yeah. What’s going on? And I figured out it was related to memory allocation, which makes sense, I guess.

Sure. I mean, like, it’s not doing a lot. Like, you’re just setting some stupid value to some memory. Yeah.

Like, I mean, why should that take long? And, you know, I eventually figured out that there was some IT scanning process. Every two hours, it would do some.

Like, it was trying to record, I guess, performance side of the server. But I guess it was, like, extremely heavy-handed. It would just, like, I don’t know what it was doing.

Like, I’m not good enough at all that stuff to tell you. But it was doing something that was very expensive to the server. And, like, it was actually making everything slow down.

But the SP session setting was so obvious and such an obvious red flag that that’s the thing I noticed. So the problem had nothing to do with SQL Server, nothing to do with how we were calling that. But it was some weird IT thing.

It was just thrashing the server. It sounds like Sean Ghilardi blaming the filter drivers for everything. Why does SQL Server do IOS? I want you to get Sean on this program.

And that’s going to be the first question I answer. I’ll see what he says. I’ll shoot. I will text him and Joseph and see if they’re willing to test it. You probably shouldn’t tell him when asking that.

I think that would be the one thing that would draw him in so that he could yell at you. Joe Sack will probably say that Microsoft prohibits him from it. I mean, he’s still an important guy, right?

I’m not sure, actually, what his new title is. And I’m not sure if I’m allowed to say what he’s working on. Yeah, I’m not going to.

But, I mean, if I knew his title, I would say his title. I don’t think that’s a big deal. He’s Papa Sack. But, yeah, I mean, he’s working on cool stuff. And I don’t know if he would be able to speak freely about the cool stuff he’s working on.

But, you know, it would be nice to have a general SQL Server fireside chat. We can all wear sweaters for that one. I’ll put, like, a fake fire background on the green screen.

Good. Yeah, all right. So, for our loyal viewers, you know, maybe next time or time after that, we’ll have a guest and we can grill. Yeah.

Maybe. Maybe not. Probably not. See if we can get Paul on here wearing a mask. Fake voice thing. Or we can get a victim of SQL Saturday with, you know, with the voice and the face.

Oh, yeah, yeah, yeah. That’d be a good story. SQL Saturday’s darker in the belly.

He can share his hearing experiences. We’ll see. We’ll see. I will. I will. I will. I will. What do they say?

I will send out feelers. Do people say that? Yeah. I don’t know. Yes. Feelers. Send out feelers. All right.

Thank you for listening to the Bit Obscene radio show. There was also for a few seconds at the beginning a television show. So I’m Erik Darling. That’s Joe Obish over to the side there.

He’s going to hit the links now. And I don’t know. Thank you. Thank you for watching. It’s been an hour and 40 minutes. Goodbye. Bye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Join me In Boston May 10 For A Full Day Of SQL Server Performance Tuning Training

Spring Training


This May, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning.

All attendees will get free access for life to my SQL Server performance tuning training. That’s about 25 hours of great content.

Get your tickets here for this event, taking place Friday, May 10th 2024 at the Microsoft Offices in Burlington.

Here’s what I’ll be presenting:

The Foundations Of SQL Server Performance Tuning

Session Abstract:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.

Event Details:

Get your tickets here for this event!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About sp_getapplock And Error Handling In SQL Server

A Little About sp_getapplock And Error Handling In SQL Server


Video Summary

In this video, I delve into the nuances of error handling and connection scoping when using `sp_getapplock` in SQL Server. Specifically, I explore how choosing between session or transaction as the lock owner can significantly impact your approach to error management. By walking through practical examples, I highlight the importance of proper error handling techniques, especially when dealing with locks that might not be released automatically upon an error. I also discuss the limitations of `try-catch` blocks in SQL Server and explain why sometimes they may fail to catch certain types of errors, leading to potential issues like lock leaks.

Full Transcript

Erik Darling here with Darling Data for Business Enterprise. Put some more descriptors on my company name. Maybe that’d fool people into buying more stuff from me. Oh, I’m kidding. Y’all buy plenty of stuff from me. You should still buy more, but I’ve already bought a decent chunk anyway. So this video, I want to talk a little bit about something that came up in recent conversation with a group of dear friends, and that is error handling and sort of connection scoping with sp-get-app-block. Because sp-get-app-block is a very, very underused feature of SQL Server that can solve a lot of problems for you, but needs some sort of special handling.

So when you use sp-get-app-block, you have two options. You can use it like this, and I’ve shrunk myself down because there’s a bit of text on the screen to begin with. I may embiggen myself when there’s fewer texts on the screen. So when you use sp-get-app-block, you can have the lock owner either be a transaction or you can have it be a session.

There are good uses for both, and there are good reasons for both. I think most commonly I see people use, and by people, I mean mostly me, use session because you either might not be operating in the context of a transaction, because when you use transaction, you have to be in a transaction. You have to create a big begin transaction.

So I end up using session mostly, either because A, there’s no transaction to begin with, or B, there might be multiple transactions. I don’t mean nested transactions. I don’t mean begin tran, blah, blah, blah, begin tran, blah, blah, blah.

I mean, like, you might have different groups of queries nested into different transactions, and you might, like, for you would have to, like, recall sp-get-app-block for each one of them. Now, there are totally valid use cases for that, but they’re pretty rare, and they’re probably not worth covering in any great detail here. So I usually don’t use sp-get-app-block like that, but it does have an upside, and that is that when you use transaction, then the app-block either commits or rolls back with you.

When you use session, that doesn’t happen, and that can change how you do error handling with sp-get-app-block, because if you don’t handle errors correctly when you use session as the lock owner, you could have stuff get stuck for a really long time. So let’s look at another thing that’s interesting, and that is that there are errors in the world in SQL Server. Well, not the whole world, the SQL Server’s world, that are unaffected by try-catch, right?

So these are errors that you can hit where you will not get to a try-catch block. You will just either return the error message up at a higher place, or your session will be dead and you won’t hit the try-catch block. So that’s cool, too, right?

So the first one is warnings or informational messages that have a severity of 10 or lower. We typically don’t worry about those too, too much, do we? The other one is errors that have a severity of 20 or higher that stop the SQL Server database engine.

Be careful of those severity 20 plus errors. And then also things, I’m going to zoom down a little bit further there, attentions. So like if you have a client, like if you have a query timeout of like 30 seconds and you get a query attention, blah, blah, blah, blah, then that also won’t get you to a try-catch block.

Also, if you use a kill command, you will not get to a try-catch block. There are some other, again, sort of like higher level errors that will prevent you getting to a try-catch block. Compile errors, statement recompilation, object name resolution, that’s what all that stuff says up there.

And so those are also things that will not, will like cause an error, but not the kind of error that a try-catch block will try to catch. Or maybe it will try and it’ll just drop the ball. I don’t know.

Could go either way on that, right? No, we’ll have to work on, we’ll have to have SQL Server shag some fly balls this spring. All right. So we have this table, cutely named lockme, and we’re just going to start that thing from scratch. And we are going to stick one row into lockme.

And it’s going to have a value of zero, which is the only number that exists in every language. Not true. Demonstrably not true.

Not everyone calls it zero. Anyway. You know what? There’s a lot of dead space on the screen. Let’s get big darling back here. That big darling.

There we go. That enormous noggin taking up your screen. There we go. Punch the white space. So the first thing I’m going to do is show you what happens when, now we’re not going to be using the spgetapplock that we did up there.

We’re going to use spgetapplock and the lock owner is going to be session, or session. Either way, right?

And we’re going to get a lock. We’re going to update the table lockme. And then we’re going to simulate some crap happening for like 10 seconds. And then we’re going to throw an error.

All right. And then after we, after that error occurs, we’re going to do this. And we’re going to, well, we’re not, we’re actually not going to do this. The store procedure is going to go kaput.

Right? And notice there’s no error handling here. There’s no try catch block. Right? There’s just a commit transaction, which is the stupidest way to do anything in SQL Server. If you’re going to start a transaction, you should definitely have a catch block because who knows what will screw up in there.

And you might have stuff that you need to clean up if something goes wrong. All right. And the second store procedure is going to do nearly the exact same thing in the same stupid way with no error handling. Right?

So let’s create both of these just to make sure that, well, just to make sure, because I have two other versions of these a little bit further down and I need to show you what happens in different scenarios. So we’re going to, we’re going to rather than like quote in and out a bunch of additional code, we’re just going to create them as new procedures, each one.

So in a couple of new windows over here, I have this. And you’ll notice that I have a specific release app block call in both of these because we’re going to need it. And then over in this window, I have SP who is active.

All right. So what we’re going to do is run sp get app block here. And well, sp get app block one here and sp get app block two here. And when we get, and let’s look at the, let’s look at the who is active.

We have some blocking going on. All right. And what’s cute right now is that we know that we had a 10 second wait for in there, but this, this lock wait is just going up and up and up, right?

We’re like 20 seconds. Now this, this one, the second query is still in SP get app block two getting blocked. Right.

And this first one has thrown an error that says divide by zero error encountered. Now it says the query completed with errors, but it’s still holding onto locks. All right.

So we need to run release get app block here, which will allow this one to run. Right. And if we run that again, we’ll say, yeah, that doesn’t exist. But if we run this one now, or rather we, this one is able to run now. Right.

Look at what, what’s happening is, well, nothing. It, it, it finally, it finally ran and failed like a millisecond before I got to who is active. So whatever we hit this. Right.

And now we have a divide by zero error. And we have, we, we’re still going to have an open lock. Right. So we actually have to run this here to, to release the app block that that thing took. Okay.

Not an ideal situation, not an ideal situation at all. So let’s go down a little bit further and let’s look at some error handling. Let’s look at the same store, two store procedures, but now there’s some error handling, but we don’t have the right kind of error handling.

So if you remember what the documentation said, if we used transaction here, then the rollback would take care of it. But if we don’t, we don’t have it, we don’t use a trend. We don’t use transaction here.

Then like we’ve still won’t clean that up just with a rollback. Okay. And like the, this store procedure is just a redo of sp get app block one. And there’s one right below it that does the same thing with a sp get app block two. So we’re going to do this.

We’re going to create these procedures. And we’re going to run sp get app block one here and sp get app block two here. And we’re going to see the same thing happen, right?

That wait for is going to take up to about 10 seconds. And at the 10 second mark, it’s going to throw an error and not be there anymore. And now we’re just stuck again.

Right. Because the rollback didn’t roll back the app block. So even with error handling that does a rollback, like checks up to the transaction and says, duh, get out of here.

Roll yourself back. You bum. Dirty bum. Pay your bar tab. This thing still gets stuck because we don’t actually do anything here. So same deal.

If we go away, single prompt, if we release this app block, and now we have no app blocks left to have and to hold. This one will get to about 10 seconds from whenever we killed that off. It’ll get to that divide by zero error.

And then we’ll still be stuck with a crappy app block. So let’s release this one. And let’s go look at what you should be doing if you’re using sp get app block.

Right. Which is you need to put your release lock. Here.

And here. All right. So let’s go run this. And create both of these.

And now, when we run this. And we run this. This should get to about the 10 second mark.

And remember, they’re still going to be blocking. Right. And we go look over here. They’re still blocking. But now, that ran. But in the try catch block, we should have released the lock. And now, when this thing has gotten to about 10 seconds.

Well, it’s funny because now with the error handling. Because we have the try catch block. It actually says, query executed successfully.

Because we handled the error. We handled the error. We actually didn’t surface the error or do anything. If you saw the query executed successfully in the last run, you wondered why. That’s why.

I should have stopped and said it then. Sorry about that. But anyway. Because we put the error handling in there. And now we have instead. Well, as well.

In addition to. We have release app lock here. And we have release app lock in the catch block. These things will have released their locks in the error handling. And we won’t have an app lock to release here.

All right. So if you’re using. SP get app lock for anything.

Or you want to use it for anything. You think it is a cool new bright shiny. And you’re all this. Even though it’s like old as hell. You see it’s a cool new bright shiny. And you want to use this SQL Server feature.

Just be mindful of a few things. Stuff that we talked about up at the top. Which is that. You have two options.

For the lock owner. You can do session. Which does not. Get rolled back. Then you have transaction. Which does get rolled back.

But you need like. Begin tran commit. Like it needs to be in a transaction. In order to use it. You can’t just use it without. The begin tran commit facilities. So.

That’s my advice. Be careful out there. Error handling is a. Difficult. An often tedious task. In SQL Server.

And remember. That there is a whole bunch of stuff. Both of these links will be in the. In the show notes. But it’s not a show. It’s just a video. Uh. Both of these links will be in the. The YouTube.

Additional information. But just keep in mind. Uh. Just be very careful with try catch. Because even with try catch. There are certain things. That won’t get you to a try catch block. Then you might need to be extra mindful. Of things.

Going on around. Your sp. Get app blocking. That might. Prevent. Your try catch block. Trying to catch anything. Alright.

Cool. Uh. That about does it here. Thank you for watching. Uh. I missed you terribly. Uh. I hope. I hope you.

Enjoyed yourselves. I hope you learned something. Uh. If you. If you found a. Find this video useful at all. Uh. Feel free to. Please give it a thumbs up. I beg of you. Uh. If I.

If I could get on my hands and knees. And you could still see me on camera. I would. I would get on my hands and knees. And beg you to like this video. I would also. Get on my hands and knees. And beg you to subscribe to this channel. Think of the children.

Uh. And. I will see you. In another video. In another time. In another place. So. Cool. Uh. Both. Big Eric.

And. Little Eric. Thank you for watching. Alright. Have a good.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Why INSERT/EXEC Causes Weird Blocking In SQL Server

Why INSERT/EXEC Causes Weird Blocking In SQL Server


Video Summary

In this video, I delve into the potential pitfalls of using `INSERT EXEC` in SQL Server, particularly when it comes to stored procedures. I highlight how `INSERT EXEC` can inadvertently wrap your inner procedure in a transaction, leading to unexpected blocking and even deadlocks if locks are taken during execution. By walking through two example stored procedures, one acting as an inner procedure executed via `INSERT EXEC`, I demonstrate how this setup can cause issues, especially when combined with `WAITFOR DELAY`. The video showcases a practical scenario where a simple insert operation results in significant blocking, emphasizing the importance of careful consideration and best practices when using dynamic SQL and transactions.

Full Transcript

Erik Darling here with Darling Data. And today we’re going to talk about why you need to be really careful with insert exec. Now, not necessarily insert exec with dynamic SQL because dynamic SQL is typically, I’m not going to say always, but typically limited to just one query, right? You build up a string, it’s usually a select, and then you execute it. Where you have to be careful of it is with stored procedures. And the reason why is because, and this is something no one believes me, is that when you do insert into a table and exec the stored procedure, that stored procedure executes in a transaction. Okay? So, if you take any locks during that stored procedure, you could be holding onto locks and causing blocking and even deadlocking with that particular query setup arrangement, if you will. I like arrangement better, I think. Sounds more French, right? Ooh la la. So, I’m going to walk you through two store procedures and I’m going to show you a blocking pattern with insert exec. Alright? So, that’s what we’re going to talk about today. A little about insert exec. So, this is the second store procedure. We’re going to call it, we’re going to think of this as the inner store procedure. This is the store procedure that our outer store procedure is.

going to execute inside of itself to insert into a temp table. So, what we’re doing here is we’re obeying the laws of good sequeling. We’re going to set no count and exact abort on. And then we’re going to delete from this table called lock me. And this is just here to show you that when you take a lock, you hold a lock. I know that we could truncate. I know we could do other things, right? Here, we don’t have a where clause, God forbid, but we’re just going to do a delete from this one row table because I don’t think there’s going to be much of a difference if we delete one row from a one row table versus if we truncate it. Do you? I don’t think so. Alright. The second thing we’re going to do in this procedure is we’re going to insert the result of this at at trend count. I guess it’s kind of like a global variable, right? Because it’s got two at’s in front of it. It’s like a global temp table has two pound signs. So, I guess this is kind of like a global variable. And I’m going to show you this because I want you to see, like I can’t select at at trend count in here because we’re going to be doing an insert exec with this procedure. And so, like, that select gets ignored, right? Because the select that gets run is this one at the end. So, I’m going to insert the result of at at trend count to show you that it does go up when the inner store procedure executes. And then I’m going to stick this wait for delay in here just to simulate long running queries, right? In real life, I know your store procedures. They don’t execute in sub millisecond time. They might execute in 5, 10, 15, 20. I’ve seen some of your store procedures out there execute for days.

So, don’t give me that nonsense. And then the final thing we’re going to do is we’re going to select out from this table. In the outer store procedure, this is insert exec 1, right? This is, I created them in this order because I didn’t want to deal with stuff. We’re going to follow the same pattern, be diligent T-SQL practitioners. We’re going to set no count in exact abort on.

In this store procedure, I’m allowed to just do a regular select here, right? Because this one is not being inserted into anything. And then after I select the trend count here, the reason I’m doing this is to show you what the trend count starts at, right? So, this is the starting point for the trend count. And the other one that we just looked at, that’s like the second step.

I’m going to create a temp table called Transporter. I guess that’s a Jason Statham movie. I don’t think I’ve seen it, though. So, you’re going to have to, you’re going to have to, no spoilers.

All right, no spoilers. You’re going to have to keep your mouth shut about what happens in the transporter. And then what we’re going to do is we’re going to insert, I’m going to cough. I’m going to insert the result of our second store procedure, our inner store procedure, into this transporter temp table.

And then I’m going to select star from transporter. And I’m going to get rid of this errant empty line at the bottom there. Are there any other errant empty lines? No.

All right. So, this is just insert exec running, right? I’m going to kick this off. And it doesn’t matter what query plans are for this, because really our main problem with this query is a 10-second wait for.

And when this thing finishes, we’re going to see two result sets come back. We’re going to see that we started with zero and that we ended or that we incremented to two. Crazy, right? Crazy.

How did that happen? All right. So, we’ve established that the inner store procedure gets implicitly wrapped in a transaction, right? Okay. Now, let’s run this again.

And let’s try to select some data from LockMe. And we’re going to come over here. We’re going to wait until this hits about eight seconds. And then we’re going to run who is active.

And what do you see here? Our wait for delay, which at this point had been waiting for 8.647… Sorry.

8,647 milliseconds, or 8.6 seconds, had been blocking our select query, right? This count big over here, this big counter, for almost six and a half seconds. All right?

That’s no good at all, is it? It’s no good whatsoever. Now, of course, you may not experience this because you are either a super genius who uses an optimistic isolation level, or you are some kind of groveling low-level fool who uses no-lock hints everywhere.

All right, we don’t like groveling low-level fools, do we? First percentilers. All right.

So, to wrap things up in today’s first video. Actually, I haven’t had a chance to record at all this week. I’ve just been very busy with things. Also, the weather has gotten nicer, so I’m a little bit more susceptible to going other places to enjoy other things.

But I’m going to record this video and one more video today. You might be able to guess what that other video is about if you diligently look at the titles of some of the tabs that I have open. I’ll give you a hint that it’s not going to be perfmon.

All right. Cool. So, when you use insert exec and you execute a store procedure, be very, very careful and mindful about what that store procedure is doing. Because if you are doing any inserts, updates, or deletes throughout that store procedure, blocks are going to get held until that store procedure completes.

And you could end up with very, very strange-looking blocking chains, especially if you put wait-for-delays in your query. I don’t necessarily recommend you do unless you’re writing, like, a cursor or a loop and you don’t want to just, like, burn out your CPUs on cursors and loops. So, that’s that.

Thank you for watching. Thank you for taking time out of this. I hope the weather is beautiful where you are. I think only our friends around the Rocky Mountains are having a tough day. All right.

So, I hope the weather is beautiful where you are when you see this. I hope you learned something. I hope you enjoyed yourselves. If you like this video, there’s a thumbs-up button somewhere on this screen. I’m not sure what the accessibility options are like in the thumbs-up.

I’ve never had to experience them. But I’m told there’s a thumbs-up button on the YouTube screen. You should hit it if you liked it. All right.

Smash, et cetera. If you enjoy this type of SQL Server informational content, you could pretty please give my channel a subscription. Subscribe.

Like and subscribe. All right. That’s what we’re supposed to say here. And I will see you in the next video that I’m going to record very shortly. So, cool.

Thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Darling Data Expands Its Social Media Influence Empire To TikTok #TikTok

Clampdown


Darling Data was born too late for MySpace, Friendster, Facebook, and Make Out Club (MOC).

While I patiently wait for 4chan to make business accounts available, I figured I’d try out something new.

So, you can find bite (byte?) sized chunks of my videos on TikTok here: Darling Data

I’ve also hired my wife (clearly a nepo baby) as a social media intern to slice, dice, edit, caption, #hashtag, and add complimentary gifs to the clips.

If I’m not careful, she’ll be better at SQL Server than me by the time this goes to international media outlets. I would have hired her to write this press release, but she went to bed.

Since I’m clearly not good at these things, please do the TikTok version of liking and subscribing over there, fellow kids.

This post was brought to you by the Senior Executive Management Team at Darling Data, a subsidiary of Beer Gut Magazine, Inc.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About RESOURCE_SEMAPHORE_QUERY_COMPILE Waits In SQL Server

A Little About RESOURCE_SEMAPHORE_QUERY_COMPILE Waits In SQL Server


Video Summary

In this video, I dive into a fascinating topic: resource semaphore query compile weights in SQL Server. I share real-world examples and demonstrate how these weights can cause significant issues, particularly when dealing with overly complex queries that take an excessive amount of memory to compile. By using tools like SP Quickie Store and spwhoisactive, I show you how to identify and mitigate these problems, emphasizing the importance of writing more manageable SQL queries to keep your database running smoothly.

Full Transcript

Erik Darling here with Darling Data. Recently voted by BeerGut Magazine Board of Editors to be the SQL Server Consultancy most likely to get sued by BeerGut Magazine. I’m not sure exactly what they meant by that. I’m not scared. I’m not scared of you, Joel. Sue me if you want. So, today’s video, we’re going to talk a little bit about resource semaphore query compile weights, because this is one that I’ve run into several times in recent consulting history, and it’s always fun to talk about because, I mean, it’s not happening to my server. It’s, you know, happening to someone else’s server. I just get to tell them why it’s happening, teach them what it’s happening, and I’m not sure what it’s happening. I just get to tell them how to fix it, and occasionally even if I’m really lucky, help them fix it, which is usually a process of breaking up those awful giant monolithic queries that developers insist on writing into smaller, more manageable little chunks.

So, I have in front of me, us, you, we, everyone, everyone who can see the screen, this query, which is a really stupid query. It is just a series of crossjoins from this tiny little post type table, which has like eight rows in it. The thing is, if you do crossjoin enough times, the way I have here, and I want everyone out there in developer land to have a newfound level of respect for me because I started my, my CTE numbering at zero.

Would also like to remark, these are probably the most readable CTE ever written. Perhaps that would be a nice award, Beargut Magazine. Perhaps, stop threatening me. Nice to you. So, the end result of this, and I’m going to show you the query plan for it, is a query that takes, well, I mean, first, let’s, let’s marvel at this estimated subtree cost.

Right? It doesn’t even fit on the screen when I zoom in. It doesn’t, I’m not sure it’ll even fit on the screen. I just have to scroll that one across.

If anyone wants to pause the video and count the number of zeros in there and figure out what this number would actually be, I’m sure there’s a national debt joke in there somewhere, right? Sure.

But the important, semi-important thing here, aside from that gargantuan estimated subtree cost, is that this query takes, well, takes about 740 megs of compiled memory. Which, I mean, you know, 740 megs, whatever, but that’s compiled memory.

That’s like, not a query memory grant. It’s not like a table that exists physically on the server. It’s just 740 megs to figure out what the hell to do with this query plan.

And it takes about 35 and a half seconds to compile. So, like, local factors involved, depending on a few things. This will take anywhere between, like, 25 and 45 seconds to compile a plan.

Again, if you fiddle enough with compatibility levels and cardinality estimation models, you might see something on the lower end. You might see something on the higher end.

So, let’s see. Okay. There are two things about SQL Server Management Studio that have been irking the bejesus out of me lately. One is random window switching positions.

I don’t know why that happens. And the other is sometimes, like, you ever, like, control and F to find something? And, like, you see a bunch of orange stuff on the sidebar highlight?

And then, like, if you, like, let’s say you’re in a store procedure and then you, like, hit F5 to change something in the store procedure. And it brings up the results, the thing where it’s just, like, query completed at whatever time. And then, like, you get rid of the results.

And then, like, the orange dots all move. And all of a sudden, like, where there are hits for things, it’s all wrong. It’s been making me feel like a crazy person. I don’t like it.

Anyway. Let’s make SQL Server. Let’s get revenge. Let’s make SQL Server feel like a crazy person. So what I’m going to do is show you a couple things about how you can spot this happening. And, well, I mean, we can talk a little bit about how to fix it.

But I think I already just told you. Stop raiding gigantic queries, you fools. So if we run these two queries while there is nothing going on, we will see that my server, based on the amount of memory that I have, has available to it 64 slots in the small query compile gateway, 16 slots in the medium query compile gateway, and one coveted spot in the big query compile gateway.

Since nothing is going on right now, nothing is very active. Well, I mean, for some reason, the small gateway is active. Probably some background tasks doing stuff, hanging out, maybe.

I don’t know. Maybe it’s active because something has to be active and nothing else is going on. I don’t know.

Some of these DMVs are just wacky and goofy, and I don’t wish to learn much more about them than I have to. So, based on these numbers, I don’t know why I threw that away.

Based on these numbers, I have 64 plus 16, 6 plus 4, 70, 80, plus 1, 81 total gateway slots for queries to compile in. Now, if you have a bunch of small, low-cost queries, you can compile, like, infinite numbers of those.

But as soon as queries require a little bit more effort and thought by the optimizer to compile a plan, they’re going to start flooding into these gateways. And as the optimizer, like, figures out how much stuff it’s going to put into compiling a plan for a query, they’re going to start traveling through these gateways until they, until, well, I mean, in this demo, they’re going to, one’s going to hit the big gateway.

The rest are going to get stuck in other smaller gateways. So, the way that we’re going to do that is by having a pink cursor, of course. And what we’re going to do is use SQL query stress, again, maintained by the lovely, talented Eric EJ.

And what we’re going to do is we’re going to run 100 and, I guess this should probably be 160. 161’s good enough. I guess with 81, it should be 162.

But, you know, you know what they say about off-by-one errors. We’re just going to roll with this because this will do a good enough job. We’re going to flood this server with requests. And we’re going to be connected into the remote DAC.

So we can see these things really start to have problems. All right. So this looks pretty stable now as far as the results go. So I’m going to hit cancel here.

Normally, I’d be game to let stuff keep running and, like, you know, show you what happens as, like, stuff finishes and moves. But you know what? I have to, I have to do a pre-con on Friday in Nashville.

And I don’t want to brick my laptop between now and then. So I’m going to be a little superstitious here and not push my luck. Just not pushing your luck is probably superstitious enough.

So anyway, what we’re going to see is up in this top result set where I’m interrogating the gateway DMV, we will see that all of these slots have become active and that all of these slots have accumulated a number of waiters. All these queries are waiting to compile a plan.

Right now, there’s only one query on my server that can compile anything. All right. And that’s this lucky. Oh, man.

It was almost number 13. Lucky query number 14. It’s like lucky number 11. All right. So this is the only one. I mean, again, I’m going to cross my fingers and hope that there’s only one null row in here and that every other row is showing resource semaphore query compile.

Ooh, la, la. What a beautiful demo that is. So only this one query is allowed in that big gateway. And if you have more memory available, you will have more gateways open to you.

I think once you get past like 128 gigs, you start getting more of them. I forget the exact number. I’m sure it’s documented somewhere very clearly and somewhere very easy to find.

Those Microsoft docs are just legendary for their goodness. Yeah. So one query can run and one query can do anything useful.

Now, again, you know, this is a demo query designed to do something terrible. In real life, your queries probably, I mean, they might also have nine CTE in them.

Who knows? All right. Y’all are crazy. But really, you would just be looking for like really big, complicated queries that SQL server would have to think particularly hard about when a plan is getting compiled for it. All right.

Now, if you’re using stored procedures or parameterized queries, generally, you only take a hit on this when they initially compile or if they recompile for some reason. That’s why I have the recompile hint on here because if this thing initially compiled a plan, other queries would just start using it and laughing at me.

Right. So like once you get past the compilation phase and a plan gets cached, other queries can be like, ah, gotcha. But in real life, this often doesn’t happen with usually just doesn’t happen with like one big, awful query. Doing stuff is usually like like a whole cavalry of awful queries just riding into your SQL Server and shooting the place up.

So one way that you can find queries that take a long time to compile is by using SP Quickie Store. And if you use it in expert mode, it will return additional result sets that have compilation metrics in them. And what I mean by that is you’ll have this top section up here and, you know, you might see there’s probably nothing in here that’s so eye catching, except, I mean, maybe the fact that we couldn’t render this this query plan is XML from the from the query store DMV.

I’ve been meaning to do something in here where like if I can’t render the plan, then I’ll like produce some query so you can get it as like just the raw XML as text. So you can save it off as a SQL plan and reopen it. Reopening query plan XML that’s too big to render as XML like in the database works just fine as opening a file.

I don’t know when I don’t know why Windows is so much better at it than SQL Server Management Studio. Who knows? Maybe it’s SQL Server. Maybe it’s Management Studio. It’s hard to tell. I mean, SQL Server has the error about 128 nested nodes of XML, so it’s probably SQL Server as usual.

I’m not going to let SQL Server off the hook. As much as I love to blame and get mad at Management Studio for things, SQL Server is no angel. But if we look here in the results, we’re going to have this section down here called Compilation Stats.

And just because I know the query that I care about, I’m going to look here and I’m going to say, well, you are query ID 14510. And this is ordered just by query ID ascending for some reason. I didn’t really put a lot of thought into the ordering here.

I should probably have some more useful metric and query ID be the ordering principle here. I’m not sure if I would make it total compiles or average compile stuff, but wow, what was 14509? Where are you in here?

Oh, you were the other. See, look at, there it is. There’s another instance of this query from when I was running it. Look at how prepared I was for this demo. But if we look at these two lines in the Quickie Store results for compilation stats, you will see one of these took about 30 seconds to compile. And another one took about 24 and a half seconds to compile.

So neither one of these from my brief tinkering hit the 45 second mark. But in the past they have. They’re just not in these results.

So sorry about that. I’m sure that’s totally disappointing. And you’re going to hate me forever. And you’re never going to forgive me. This injustice. This curse I put upon your land.

So yeah. What I’ve shown you today is sort of how you can figure out which queries are struggling with resource semaphore query compile. You generally want to look for things that took a long time to compile or stuff that if we come over here a bit, we’re going to see.

No, no, no. Where’s the compile memory line? There we go.

Some compile memory. Lots of compile memory in here. So you want to look for stuff that has high compile memory, probably high compile time. You can find that stuff in query store using spquickiestore.

You can see exactly how much you have available gateway-wise for compiling queries. And you can usually see queries that are stuck waiting for memory to compile by looking at spwhoisactive and seeing the resource semaphore query compile weight pile up through a whole bunch of queries. So if I have to leave you with anything, it’s don’t write queries like this.

Unless you need to show people resource semaphore query compile demos. But in real life, you shouldn’t be writing big, complex, monster queries. You should be kind to your SQL Server because it probably has a lot of other problems.

You don’t want to make them worse. Why pile everything on? It’s like making someone in AA quit smoking.

Why would you want to do that to someone? Giving up their only drill in life. Anyway, I’m going to…

I don’t know what I’m going to do. I suppose pack bags or something. I don’t really have a good game plan for tonight. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. If you like this video, I always appreciate a thumbs up. And if you like learning stuff about SQL Server, I would strongly suggest, before Beargut Magazine sues me into oblivion, subscribing to my channel so that you can get big, glaring, blaring notification every time I finish uploading one of these things.

And hopefully after it’s been processed up to HD, because you don’t want to watch these in low definition. You want to make sure that I am in vivid detail. Don’t let YouTube’s downscaling turn me into, like, Minecraft pixel art.

Turn it up to HD. Erik Darling in HD. It’s the way I was meant to be experienced.

Anyway, thank you for watching. This will probably be the last video I record this week. I’ve gone a bunch. And then next week, I’ll pick back up with some stuff for you, because I care about you.

I want you to know the awful things I know. Anyway, thank you for watching. Let’s do this. ification.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Join me In Boston May 10 For A Full Day Of SQL Server Performance Tuning Training

Spring Training


This May, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning.

All attendees will get free access for life to my SQL Server performance tuning training. That’s about 25 hours of great content.

Get your tickets here for this event, taking place Friday, May 10th 2024 at the Microsoft Offices in Burlington.

Here’s what I’ll be presenting:

The Foundations Of SQL Server Performance Tuning

Session Abstract:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.

Event Details:

Get your tickets here for this event!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Management Studio 20 Removes Azure Data Studio From The Installer

Odd Couple


I was quite publicly mystified by the coupling of Azure Data Studio into the SQL Server Management Studio installer. They’re different tools for different people.

This isn’t a tirade against Azure Data Studio, nor is it a victory lap since apparently the feedback item I posted was part of the decision to remove it.

This is purely informational, since the announcement that it’s not in there anymore is nearly as quiet as the announcement that it was being included back in SQL Server Management Studio 18.7, and only slightly louder than the availability of a command line switch to skip installing it.

Back when I initially complained about the inability to skip installing Azure Data Studio, there was a lot of talk about how SQL Server Management Studio would increasingly rely on it for new functionality. I suppose that was either untrue, or the roadmap changed significantly.

Quite cynically, I thought it was a cheap way to increase the install base of a new product, but who knows? I assume Microsoft has better telemetry about usage than binaries just existing. Again, who knows?

A further miffance was that you could download and install Azure Data Studio independently, but not SQL Server Management Studio.

If you read through Erin Stellato’s post about the SQL Server Management Studio roadmap, where version 20 was fully released:

For SSMS 20 we have removed Azure Data Studio from the installation.  In addition to being a highly voted item (Make ADS An Optional Install Alongside SSMS · Community (azure.com)), this aligns with the future extension work for SSMS 21 and we decided to make the change in the current release.

So, hooray! Dreams do come true, etc. When I was but a young boy, I dreamed that someday my desires would be aligned with future extension work for SSMS.

I’d also like to say here that being the public face of SQL Server Management Studio makes Erin Stellato about the bravest soul I know. It’s a tool that millions of people rely on, and nearly everyone actively complains about. Part of my RSS feed is not only posts from Microsoft about data platform related stuff, but also the comment feed. She has saintly patience in her responses to the comments on these things. Comments on the internet maintain their position and title as the worst things on earth. Erin deserves infinite credit for doing the job that she does as well as she does it.

If you’re keen on moving to SQL Server Management Studio 20, you can download it here.

If you’re still stuck using older versions because of a new bug or an old feature, you can skip installing Azure Data Studio by running the installer via the command prompt:

SSMS-Setup-ENU.exe /Passive DoNotInstallAzureDataStudio=1

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Join Algorithm Limitations In SQL Server

Join Algorithm Limitations In SQL Server


Video Summary

In this video, I delve into some interesting limitations of SQL Server’s join algorithms, particularly focusing on scenarios where the lack of an equality predicate can lead to unexpected errors or suboptimal query plans. I explore these issues through practical examples and demonstrate how implicit predicates work in nested loops joins. Additionally, I discuss a scenario involving right outer joins, which cannot be implemented as nested loops joins due to their nature, highlighting why SQL Server prefers left outer joins for such cases. By walking you through these examples, I aim to provide insights into optimizing queries and understanding the nuances of SQL Server’s query execution engine. Whether you’re working with stored procedures or dynamic SQL, this video offers valuable lessons on how to avoid common pitfalls and improve query performance.

Full Transcript

Erik Darling here with Darling Data. And I’m just now making the rather sad realization that I forgot to re-embiggen myself after recording the last video where I had to shrink myself down. So let’s figure out what an adequate portion of me is. Let’s move this around a little bit. Ooh, ooh, ooh, that’s too big. That’s a little too much me even for me. Let’s get that to a little bit of a bit. more reasonable place. I think this is pretty good. I just have to be really mindful about what I do with my head because when I do that, my head looks preposterously large, I think. I mean, my head is sort of preposterously large. Anyway, if you’ve ever seen me in real life, you know this, but the camera angle and sort of, I don’t know, I don’t know what to call it, depth of field thing just makes it so much worse. But anyway, I don’t know what to call it. Anyway, in today’s video, we’re going to talk about some joint algorithm limitations in SQL Server. These may and probably do exist in other databases. I don’t study other databases as deeply as I study SQL Server. So you’ll have to forgive me if I’m wrong about that. But I can’t imagine that it’s much different anywhere else unless there’s some other database out there that has some totally different joint algorithm that I’m just unaware of. So I’ve got a couple indexes that I created for most of the queries we’re going to be looking at. It’s, I mean, most of the plans are going to be getting our estimated plans, so they don’t matter too, too much. But a couple of them will be actual execution plans. And I don’t want to sit here waiting for those to show up. I don’t want to waste your time, mine, or look foolish.

The eyes of our overlords, whoever they may be. So, first, and this demo is going to play a part in why I’m showing you the next demo, too. But let’s get through this one first.

Oh, my good, good. Headphone hair in back, you’ll have to forgive me. Another reason not to put my head down too far, I suppose. All right, so this query, just get an estimated plan for this. SQL Server chooses a nested loops join plan.

Well, it doesn’t really choose it. It’s forced to use it. And it’s forced to use it because we don’t have an equality predicate. We don’t have column equals other column here. We have greater than equal to, less than equal to. Okay, so without an equality predicate, we cannot have a merge join, and we cannot have a hash join.

All right, so if we try to get an estimated plan for this, we will get an error. And SQL Server will say, query processor cannot produce a query plan because of the hints defined in this query. Please resubmit your query, and we’ll get back to you as soon as possible.

And the same thing will happen with a hash join hint. All right, so you do this, and look at it. It’s that same error. If you didn’t know you could edit query text in the messages tab, it makes it really easy to sort of frame the text a little bit better. If you’re taking screenshots or something, don’t be afraid to do that, so then you don’t have to send one of those, like, preposterously long, skinny screenshots, or, like, assemble your own screenshot later, and everyone’s going to think you, like, photoshopped it.

Like, wow, that’s not a real thing. Liar. It’s completely unrealistic. Bad demo writer. Bad DBA. Bad developer. Whatever. Whatever they may think of you. So, I think I’ve talked about this a little bit before, but stuff like this is often why, if you’re the type of person who has, like, calendar or date tables, or even number tables, and your database has utility tables, I’m not saying you shouldn’t, but when you join to those, you are often joining on, like, greater than, equal to, less than, equal to, and some arrangement there, and, like, if you’re joining, like, if you have, like, really big results set, and you ended up in, like, you end up with a nested loops join because you have to without an equality predicate, that can really drag a query down because SQL Server can only do a nested loops join, and if you get caught in a nested loops join, I don’t know if you’ve ever, like, done a cross join between, like, a small table and a big table, but those get stuck in nested loops hell, and the same thing can kind of happen here, right?

Because, like, especially if that nested loops join ends up in, like, the wrong side of the query plan, like, you know, somewhere where, like, you know, you end up doing, like, billions and billions of nested loops joins into a table because SQL Server has to, well, that can be a pretty bad time for performance.

I’m not saying not to use those tables. I’m just saying that just be careful where you use those tables in your queries. Sometimes you have to, like, get queries down to a smaller result set, like temp tables, and then join those. Other techniques can work, too, like, you know, cross-apply with a big top in it or something like that and can sort of tell SQL Server where you want things to happen a little bit better.

But moving on to, like, where the lack of an equality predicate can cause some sort of unexpected errors, I learned about this from a Craig Friedman blog post from, like, forever ago, but it still works today, but we’re going to talk a little bit more.

We’ll talk about some stuff. We’ll get there. Just stick with me. Hang on tight. Stick your hands deep into your pockets and try to grin and bear this information. So if we create this procedure, and the deal with this procedure is that we are getting a count from posts joined to itself where the ID in one side is the parent ID in the other side.

So we’re sort of just, like, counting, like, questions and answers, right? We’re associating the answers with the question. We’re joining from the question to the answer, whatever order that happens in. It doesn’t really matter much.

But if we create this, and then, sorry, before we go any lower, then we also have this where clause where p.parentID equals this parent ID parameter, which is set to this number.

And we have an option hash join hint down here. Now, if we try to run this, we’re going to get an error message. Very similar, actually identical to the error messages that we just saw, where a SQL Server is going to say, no plan for you.

No query. Can’t do it. You’re screwed. Now, the way to see why that happens is to run the query without the hash join hint in it. So let’s run this, and now let’s execute the store procedure, and let’s see what happens.

And that is, well, something kind of cool. So this isn’t necessarily the fault of, like, the query writer, and it’s not necessarily the fault of the hash join.

There’s a query optimization that kicks in. You can think of it as implied predicates. You can think of it that way because that’s what it’s called. The predicates are implied because our join columns have an equality predicate, and our where clause also has an equality predicate on a value.

So what happens in the query plan is SQL Server. Let’s zoom in on this a little bit. Let’s see if we can make sure that everything is going to work out all right here.

Let’s frame that up nicely. Let’s act professional for once in our lives. So up at the top, we have a clustered index something. I don’t know why SSMS chooses to cut this off.

It’s not like there’s anything. Oh, that’s from my, I mean, I understand why this is where my hand cuts off. Maybe I should make myself bigger so I can reach more places on the screen. I don’t lose fingers.

But we have a nonclustered index. This is going to be a seek. And I’ll show you that in a second. And then down here we have an index seek on my non-cl. I’ve never had my non-cl seeked into before, but rather nice, actually.

Rather pleasant. So we have a seek on both sides of the joiner. I’m going to zoom in on exactly what happens. We seek on the clustered index to where the ID column is equal to the parent ID parameter.

And that’s on the outer side of the join. On the inner side of the join, where we seek into a non-cl. That is, of course, going to be a seek predicate here on parent ID.

So this, again, implied predicate means that a predicate will apply to both sides of the join because we have a where clause column that matches one of our join condition columns. And it’s all equality predicates.

So when we look at the nested loops join, we actually don’t have any join predicate here. We just output the stuff for the count function. So the reason why the hash join hint throws an error is because we’ve removed the equality predicate from the join.

Rather, the optimizer has removed the equality predicate from the join, pushed it to either side of the join, and then basically just kind of brought back everything because all the rows would match. There’s nothing to actually filter out at the join.

The join can’t get rid of any rows. And SQL Server knows that because of the implied predicate. So it basically just assembles everything and shuffles it off. And that’s cool, but it can also lead to some sort of surprising errors sometimes if you add a hash join hint to a query, and that query starts failing with different parameter values because SQL Server implies the predicates.

So this can—it’s not just stored procedures with parameters. And also, like, you can get around this with recompile hints, but if you declare a variable and do the same thing, we will get the same query processor error.

It is worth noting, I think, that this doesn’t happen for some reason with a literal value. I don’t quite get why. SQL Server does a hash join here, and we still imply the predicate.

All right, it’s right there. You can see it down the bottom, 184, whatever. And we have it down here. So I don’t quite get that, but I’m willing to admit that I don’t quite get it. But this could happen with a literal value if you had forced parameterization enabled for your database.

For some reason—well, I mean, literal values skirt a lot of optimizer stuff, so maybe this is just another one to add to the list. So you’ve seen a couple examples now of limitations with join algorithms where there’s a lack of an equality predicate and even a sort of surprising one where there wasn’t a quality predicate, but the optimizer optimized it away.

Now, I want to talk about something a little bit more interesting because I wrote a post recently about joins. And in that post, I made fun of write joins a little bit, probably a little bit more than I should have.

I’m not actually that militaristic about write joins. I guess. I don’t know. I still think that they’re kind of stupid, but really what I do hate are Venn diagrams to explain joins.

That’s obnoxious. Stop doing it. So I have an optimizer compatibility level hint here just to not get adaptive joins because the adaptive joins just make what I’m trying to explain a little bit more confusing.

It’s really just want to show you the join type and whether it’s left outer or right outer. So I’ve got these two queries, one using a left join and one using a right join.

And when we run these, if you’ll note, SQL Server quite often thinks that right joins are goofy too. And SQL Server has rewritten both of these as left outer joins. Left outer joins.

SQL Server took our query, carefully crafted to use a right join and said, no, thank you. I’m going to do a left join instead. Okay.

Well, that’s interesting. Now, if we write our query and use this set of hints, so now I say loop join and do this, right? So we got hash joins before.

If we write this to use loop joins, SQL Server is going to do the same thing. SQL Server is going to say, right joins are stupid, right? Right join, nah. Left join.

But this is a little bit more of an optimizer limitation than just an optimizer being like, I hate right joins type thing. And I’m going to show you exactly what I mean.

So if I add in another hint here, so there’s a new hint in this query, say force order. All right. So I’m saying you write, you create this query plan in the order that I wrote it.

We can do that successfully with the left join. SQL Server says, no problem. Gotcha, pal. You’re good. If we try to do that with a right join, SQL Server is going to say, no.

We cannot write loop join this query. And if we write force order, then SQL Server is going to obey the order of the joins, but we’re going to go back to a hash join.

We do a right outer join, but it’s back to being a hash join. Now the reason why this is an optimizer, or rather an algorithm limitation, is because these are outer joins.

We need to return a value and another value, or like values and whatever values exist, or values and a bunch of nulls when they don’t exist.

And it gets, it’s a little bit easier to see when you visualize it like this. All right. So let’s run these two things.

And what I have here is the wonderful, lovely, talented generate series function. And this is counting from one to 20, skipping, like taking steps of two.

This is counting from one to 30 using steps of three. So if we look at the values that come back from here, right? And the first result set, we have one, three, five, seven, nine, 11, 13, 15, 17, 19, which is, again, counting to, well, would have counted past 20, but by limited to 20, but we get up to 19, right?

One, three, five. So we’re basically doing every third number. And the second one would, I mean, would have gone up to 30, but, you know, counting by, we’re skipping three. So we get one, four, seven, 10, 13, 16, 19, 22, 25, 28.

So if we were to run this as a left join between this result set and this result set, we would find, what is it? One, seven, 13, and 19.

And the second result set, right? And those would join, but then for three, five, nine, 11, 15, 17, we would get that back with a null, right?

So we would get value plus another value where they matched and then value and null where they don’t match. Now, the reason you can’t do this with a right join is because you wouldn’t know what rows exist. In the outer table, right?

Like, think about if we, if we did this as a right join, you would be like, oh, well, here’s one. Returning one in one. And then you say four. And then you say, well, there’s no fours here.

There is a three here. So I could, but like SQL Server doesn’t know to return three and null because it only has the values that are on the outer side of the join to start going to try to find. So right outer joins technically don’t work as a nested loops join because of the way the algorithm is implemented, right?

Like you, like you can’t possibly get the right results with the right outer loop join because you need to figure out if this matches and this matches and return this plus any matches or return this plus a null. It gets a little bit easier to see if you actually write the join out because what you get here is exactly what I’ve been describing.

When we find a match, we get a value and a value. But when we don’t have a match, we get a value and a null, right?

And we don’t know, like, like whatever values are in here that didn’t match. SQL Server is not going to be like, well, hang on, hang on, back it up. We didn’t have a four, but we had a three. So I get a return to three and a null.

It’s just not going to work well. There’s not like a good efficient way to do that with a right loop join. So right joins is a wise man. Once said, if you never want to see nested loops joins, write all your queries is right joins, but the optimizer might say beat it.

This is just kind of a funny limitation. I don’t expect anyone out there to actually try to avoid loop joins for their entire life. But there we have it.

Anyway, these are just some interesting join algorithm sort of limitations. I mean, they’re not even like bad limitations. They’re just things you should be aware of.

Like, a lot of people say limitation, like a lot of people infer or imply that limitations are bad things. But I think quite often limitations are good things because performance would not be great if we didn’t have some limits on the way some things are implemented.

Some of those limits keep the wheels on. This is, I think, a pretty good example of that.

There’s also, you know, the right outer loop join just couldn’t possibly give you correct results in a timely fashion. It just wouldn’t work. It doesn’t work because you take a row, you’ll look for a row, but if you start on the wrong side, you just, you’re going to miss values that you should be bringing back.

So, it could be a pretty bad time. You wouldn’t, you wouldn’t want any more incorrect results than you get with all your no lock hints anyway. So, that’s about it for this.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. If you like this video, give it the old thumbs up. If you like this kind of content, please subscribe to my channel so that when I publish videos, you can immediately be notified, drop everything, and, and, and watch them and, and learn more sequels because apparently that’s important.

Yeah. Databases, database, data science, all that stuff. Seems to be, seems to be pretty useful these days.

Seems to be pretty in demand. So, the more you know, the better off you are. Especially, you know, anyone can return correct results. If you can, if you can return correct results faster than everyone else, you’re in better shape than a lot of folks would be.

So, yeah. Learn your sequels. I guess. I should, I should learn my sequels too.

Anyway, I’m going to go, I don’t know, do something else fun and interesting now. Maybe I’ll record more. Maybe I won’t. We’ll see what happens. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

The How To Write SQL Server Queries Correctly Cheat Sheet: INTERSECT And EXCEPT

Who Am I Kidding?


I have never once seen anyone use these. The most glaring issue with them is that unlike a lot of other directives in SQL, these ones just don’t do a good job of telling you what they do, and their behavior is sort of weird.

Unlike EXISTS and NOT EXISTS, which state their case very plainly, as do UNION and UNION ALL, figuring these out is not the most straightforward thing. Especially since INTERSECT has operator precedence rules that many other directives do not.

  • INTERSECT gives you a set of unique rows from both queries
  • EXCEPT gives you a set of unique rows from the “first” query

So, cool, if you know you want a unique set of rows from somewhere, these are good places to start.

What’s better, is that they handle NULL values without a lot of overly-protective syntax with ISNULL, COALESCE, or expansive and confusing OR logic.

The tricky part is spotting when you should use these things, and how to write a query that makes the most of them.

And in what order.

Easy Examples


Often the best way to get a feel for how things work is to run simple queries and test the results vs. your expectations, whatever they may be.

I like these queries, because the UserId column in the Comments table is not only NULLable, but contains actual NULLs. Wild, right?

SELECT
    c.*
FROM dbo.Comments AS c
WHERE c.UserId IS NULL
AND   c.Score > 2

INTERSECT

SELECT
    c.*
FROM dbo.Comments AS c
WHERE c.UserId IS NULL
AND   c.Score > 3
ORDER BY
    c.Score;

Running this will return results where a Comment’s Score is greater than 3 only, because that’s the starting point for where both query results begin to match results across all the columns.

Note that the UserId column being NULL doesn’t pose any problems here, and doesn’t require any special handling. Like I said. And will keep saying. Please remember what I say, I beg of you.

Moving on to EXCEPT:

SELECT
    c.*
FROM dbo.Comments AS c
WHERE c.UserId IS NULL
AND   c.Score > 2

EXCEPT

SELECT
    c.*
FROM dbo.Comments AS c
WHERE c.UserId IS NULL
AND   c.Score > 3
ORDER BY
    c.Score;

This will only return results from the “first” query (often referred to as the left, or outer query) with a Score of 3, because that’s the only data that exists in it that isn’t also in the “second” (or right, or inner) query.

Both queries will find many of the same rows after Score hits 2 or 3, but those get filtered out to show only the difference(s) between the two.

In case it wasn’t obvious, it’s a bit like using NOT EXISTS, in that rows are only checked, and not projected from the second/right/inner query, looking for Scores greater than 3.

Again, NULLs in the UserId column are handled just fine. No ISNULL/COALESCE/OR gymnastics required.

I’m really trying to drive this home, here.

In The Year 2000


SQL Server 2022 introduced some “modern” ANSI syntax. By modern, I mean that IS DISTINCT FROM was introduced to the standard in 1999, and IS NOT DISTINCT FROM was introduced in 2003.

While no database platform adheres strictly or urgently to ANSI standards, waiting 20 years for an implementation in SQL Server is kind of really-extra-super-duper son-of-a-gun boy-howdy dag-nabbit-buster alright-bucko hold-your-horses listen-here-pal levels of irritating.

Think of all the useless, deprecated, and retired things we’ve gotten in the past 20 years instead of basic functionality. It’s infinitely miffing.

Anyway, I like these additions quite a lot. In many ways, these are extensions of INTERSECT and EXCEPT, because the workarounds involved for them involved those very directives. Sort of like Microsoft finally adding GREATEST and LEAST, after decades of developers wondering just what the hell to do instead, I hope they didn’t show up too late to keep SQL Server from being bullied by developers who are used to other platforms.

We can finally start to replace mutton-headed, ill-performing syntax like this:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id = p.LastEditorUserId
  OR (p.LastEditorUserId IS NULL);

With stuff that doesn’t suck, like this:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id IS NOT DISTINCT FROM p.LastEditorUserId;

The query plan timings tell enough of a story here:

sql server query plan
i’ve been waiting for so long

But not everyone is able to use the latest and greatest (or least and greatest, ha ha ha) syntax. And the newest syntax isn’t always better for performance, without additional tweaks.

And that’s okay with me. I do performance tuning for a living, and my job is to know all the available options and test them.

Like here. Like now.

The Only One I Know


Let’s compare these two queries. It’ll be fun, and if you don’t think it’s fun, that’s why you’ll pay me. Hopefully.

SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id IS NOT DISTINCT FROM p.LastEditorUserId;

SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id = p.LastEditorUserId
WHERE EXISTS
(
    SELECT p.LastEditorUserId FROM dbo.Posts AS p
    INTERSECT
    SELECT u.Id FROM dbo.Users AS u
);

Here’s the supporting index that I have for these queries:

CREATE INDEX 
    LastEditorUserId 
ON dbo.Posts
    (LastEditorUserId)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

It’s good enough. That’s what counts, I guess. Showing up.

20th Century Boy


At first glance, many queries may appear to be quite astoundingly better. SQL Server has many tricks up its sleeves in newer versions, assuming that you’re ready to embrace higher compatibility levels, and pay through the nose for Enterprise Edition.

This is a great example. Looking at the final query timing, you might think that the new IS [NOT] DISTINCT FROM syntax is a real dumb dumb head.

sql server query plan
gimme a second

But unless you’re invested in examining these types of things, you’ll miss subtle query plan difference, which is why you’ll pay me, hopefully,

The second query receives the blessing of Batch Mode On Row Store, while the first does not. If we use the a helper object to get them both functioning on even terms, performance is quite close:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id IS NOT DISTINCT FROM p.LastEditorUserId 
LEFT JOIN dbo.columnstore_helper AS ch 
  ON 1 = 0;
sql server query plan
i guess that’s okay

In this case, the slightly tweaked query just slightly edges out the older version of writing the query.

I Can’t Imagine The World Without Me


There are many ways to write a query, and examine the performance characteristics. As SQL Server adds more options, syntax, capabilities, and considerations, testing and judging them all (especially with various indexing strategies) becomes quite an endeavor.

I don’t blame developers for being unaware or, or unable to test a variety of different rewrites and scenarios. The level of understanding that it takes to tune many queries extends quite beyond common knowledge or sense.

The aim of these posts is to give developers a wider array of techniques, and a better understanding of what works and why, while exposing them to newer options available as upgrade cycles march bravely into the future.

Keeping up with SQL Server isn’t exactly a full time job. Things are changed and added from release to release, which are years apart.

But quite often I find companies full of people struggling to understand basic concepts, troubleshooting, and remediations that are nearly as old as patches for Y2K bugs.

My rates are reasonable, etc.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.