In case you haven’t noticed, I haven’t been writing blog posts lately. Mostly I’ve been writing training material, and boy, are my fingers tired.
Since the training material is getting recorded, making videos of stuff usually makes for a good practice run.
People love complaining about free stuff. Like Americans in Europe when they realize you have to ask for ice in your water.
Anyway, a step I have always been bad at has been providing a written summary of what’s in the video. It’s a step I can never bring myself to take after I’ve spent who knows how long writing and recording and thinking about the zillion other things I have to do with myself. It’s also not something that comes very naturally to me.
I hate doing it. I hate paperwork. I hate checkboxes. Why do I need to do this? The video title is self-documenting.
Pumping Irony
So, hey, LLMs exist. LLMs can do things that I don’t have the mental energy to get set up to do.
I’ve been putting Claude Code to work on a bunch of stuff that would… Let’s be honest, it would never happen if I had to learn all the stuff around the perimeter of what I want to do. It’s a bit like the cloud in some ways. Building something used to be a rather monumental effort: domains, database servers, app servers, firewalls, and so on. All the stuff that would be a real impediment kinda became a click-a-few-buttons, spend-a-few-bucks, you’re-free-to-do-the-thing-you-want-now effort.
Where does the irony come in? Well, I got it in my head that I could have ol’ Claude build me a process that would:
Download batches of YouTube videos
Use local LLMs to generate summaries, chapters, and transcripts
Automatically update my blog posts and YouTube videos with the appropriate text
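In skeleton form, that’s a loop over a batch of videos, something like this. Every function name below is a hypothetical stand-in for the real steps, not my actual code; the logging is the part that matters, since it’s what lets a half-finished batch get picked back up:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

# Hypothetical stand-ins for the real steps; names are illustrative only.
def download_audio(video_id):
    return f"/tmp/{video_id}.m4a"

def transcribe(audio_path):
    return "transcript of " + audio_path

def summarize(transcript):
    return "summary of " + transcript

def process_video(video_id):
    """Run one video through the pipeline, logging each step so a
    half-finished batch can be resumed from the log file later."""
    audio = download_audio(video_id)
    log.info("downloaded %s -> %s", video_id, audio)
    transcript = transcribe(audio)
    log.info("transcribed %s (%d chars)", video_id, len(transcript))
    summary = summarize(transcript)
    log.info("summarized %s", video_id)
    return {"video_id": video_id, "transcript": transcript, "summary": summary}

results = [process_video(v) for v in ["abc123", "def456"]]
```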
I think right now, it’s mostly working. I’m starting with old videos so mistakes aren’t as obvious and public.
Back to the irony: The very thing that allows me to do this is probably also the thing that makes it far too late to do any of this, ha ha ha.
You’re all spare parts. Don’t forget it.
Along The Way
The idea started off simply enough, and to be honest most of it just kinda worked.
Pythonista
YouTube and WordPress have APIs
Downloading videos is easy
Local ollama server makes running LLMs on my M3 simple
The LLMs I’m using (whisper for transcripts, qwen for summaries) don’t cost anything
It’s mostly just a matter of writing stuff to logging files and working off that
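Since ollama exposes a small HTTP API on localhost (port 11434 by default), the summary step is basically just a POST with a prompt. A rough sketch; the model tag and prompt wording here are placeholders rather than what I actually run:

```python
import json
import urllib.request

def summary_payload(transcript, model="qwen2.5"):
    """Build the request body for ollama's /api/generate endpoint.
    The model tag is a placeholder; use whatever `ollama pull` fetched."""
    return {
        "model": model,
        "prompt": "Summarize this video transcript in one paragraph:\n\n" + transcript,
        "stream": False,  # return a single JSON object instead of a token stream
    }

def summarize(transcript, host="http://localhost:11434"):
    """Send the prompt to a local ollama server and return the generated text."""
    req = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(summary_payload(transcript)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

payload = summary_payload("We talked about SQL Server 2025 Standard Edition limits.")
```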
Probably the only part that made life interesting was WordPress. Sometimes the semantic match from YouTube video title to WordPress post title wasn’t very confident. Sometimes it would find multiple posts with similar names. We had to expand the search in order to narrow it down, by falling back to (gasp!) keyword searches for YouTube URLs in post bodies, and then even further to what sort of link was used. Text links are different from embed links, after all.
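In sketch form, the tiered matching looks something like this. The threshold, tier names, and helper functions are all mine for illustration; the real code is messier:

```python
from difflib import SequenceMatcher

def title_confidence(video_title, post_title):
    """Cheap stand-in for the semantic title match: a fuzzy string ratio."""
    return SequenceMatcher(None, video_title.lower(), post_title.lower()).ratio()

def match_post(video_title, video_id, posts, threshold=0.8):
    """posts: list of {"title": ..., "body": ...} dicts pulled from WordPress.
    Returns (post, reason) or (None, "ambiguous")."""
    # Tier 1: fuzzy title match, but only if exactly one post clears the bar.
    scored = [(title_confidence(video_title, p["title"]), p) for p in posts]
    confident = [p for score, p in scored if score >= threshold]
    if len(confident) == 1:
        return confident[0], "title"
    # Tier 2: keyword search for the YouTube video ID in post bodies.
    hits = [p for p in posts if video_id in p["body"]]
    if len(hits) == 1:
        return hits[0], "url"
    # Tier 3: prefer embed links over plain text links when several posts match.
    embeds = [p for p in hits if f"youtube.com/embed/{video_id}" in p["body"]]
    if len(embeds) == 1:
        return embeds[0], "embed"
    return None, "ambiguous"
```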
So how’s it going now? Well, it takes some number of hours (depending on video length) to download and process the audio tracks. The YouTube updates are easy; it’s just mapping one URL to the other. I have to manually check WordPress for some things before applying those, but that’s not so awful. It’s just reviewing a control file.
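A control file for that review step doesn’t need to be fancy; a JSON-lines file with a confidence score per pending update gets the job done. The field names here are made up for illustration, not my actual format:

```python
import json

# One pending WordPress update per line; field names are hypothetical.
control_lines = [
    '{"video_id": "abc123", "post_id": 42, "action": "append_summary", "confidence": 0.95}',
    '{"video_id": "def456", "post_id": 77, "action": "append_summary", "confidence": 0.55}',
]

def needs_review(entry, threshold=0.9):
    """Flag low-confidence matches for a human look before touching WordPress."""
    return entry["confidence"] < threshold

entries = [json.loads(line) for line in control_lines]
flagged = [e for e in entries if needs_review(e)]
```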
Sure, there have been some bugs and hiccups along the way. For example, refining the post matching: some posts got updated with summaries and transcripts that shouldn’t have. There’s also a weird thing happening where some posts will get the YouTube URL copied in a second time, but that doesn’t happen all the time. Anyway, I guess my point is: Yes, it’s imperfect, but it wouldn’t exist and I wouldn’t be doing this if I couldn’t have a robot do the monkey work. It does not take a tremendous amount of self-reflection for me to admit that if I had to do this myself, there would be a whole hell of a lot more bugs and hiccups.
After all, I’m a visionary. I’m a big picture kind of guy. I know where I want things to get to, but I’ll be darned if I can write a lick of python or C# or whatever else to do it. This probably shows in my stored procedures in a lot of places, too. I am not the most gifted logical thinker. I’m forgetful. Sometimes I’m in a rush. Sometimes I have one thought and get distracted by seeing a million things that I need to correct, do better, or forget how it works along the way.
A million rabbit holes later, I can barely remember the thing I was supposed to be doing.
For Your Enjoyment
What you should see over the next days/weeks is old blog posts with videos getting updated with summaries and transcripts, YouTube videos getting updated with summaries and chapters, and then finally, blessedly, to the great delight of Kevin Feasel (and potentially only Kevin Feasel), new videos having all this stuff when they publish.
Thanks for reading.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the latest updates for SQL Server 2025 Standard Edition, discussing its key features and limitations alongside my co-host Sean. We explore how the increased memory and CPU limits can benefit many users without the need to jump to Enterprise Edition, while also highlighting some areas where Standard Edition remains somewhat restricted, particularly in columnstore indexing and certain performance optimizations. We touch on practical considerations for those considering a migration from older versions or Enterprise Editions, as well as the potential benefits of moving to managed instances or Azure SQL Database. The discussion covers a range of topics, including basic Availability Groups (AGs) and the challenges they present compared to more advanced features available in Enterprise Edition. Throughout the video, we aim to provide insights that can help database administrators and IT professionals make informed decisions about their SQL Server environments.
Full Transcript
Welcome back to our presumably second 15 minute episode of the Bit Obscene radio program. We’ll be talking about Standard Edition for SQL Server 2025 this go around. I do want to make a quick announcement here that this beautiful Darling Data T-shirt, I wore this during one of the days of the pre-con that Kendra Little and I did at PASS Data Community Summit in 2025, or well, I guess a few weeks ago now, and I will be auctioning this off. The proceeds of all of the auctioning will go to the Darling Data Home for Little Wanderers. So we do hope that this shirt will sell for quite a bit of money. It’s still… Did you wash it? No, it’s unwashed. It still smells like pre-con smear. Fresh Erik smell. Yeah, so you got a whole day of me in a shirt, on a shirt. So, hopefully, that will increase the price. Anyway, welcome back. We’re going to kick this off by… Let’s start with Sean this time, because Sean was…
Oh, baby. Sean was blissfully quiet during the IQP talk. So we’re gonna pick on Sean and make him talk about standard edition of SQL Server 2025 here. Sean, take it away. What is your opening salvo? Sean, take it away. I think there’s two things to talk about here, and I think they’re both good, right? You got the standard developer edition. Correct. Which technically meets standard edition requirements of this talk. And two is the increased limits for memory and CPU. I don’t see either of these as a negative in any way, other than more people complaining that it doesn’t have the features of Enterprise Edition.
But, you know, I mean, you opted to up the memory, 256 gig, which… Still pretty good, especially if you look at cloud offerings. Yep. Go see our talk on cloud. And, you know, given how much RAM costs now, right?
Oh, yeah, with the recent… No, it’s crazy. I bought RAM for 200 bucks a year ago, and now it’s like 700. Yeah. Wow.
What should I auction my laptop off? I have 128 gigs in my laptop. I feel like I should auction it with a micron. The brick of gold. Yeah, right? Yeah. Nice. If any of it’s micron, you might be able to get sentimental stuff now since they’re exiting the market.
I don’t actually know what’s in there. I don’t know. I could… If I thought either of my kids were destined for college, this would be their college fund. I was gonna say at this point.
Yeah. I mean, I have points in a liquor store. But, aside from that, I don’t know. Some of them are… The lesser of four sockets. Some of them are e-cores, so we don’t really count those.
Good. The lesser of four sockets or 32 P-cores. Everything is the lesser of four sockets. I agree with Sean. It seems like Microsoft did a good thing for once, and it’s hard to even complain about it.
I’m speechless. Joe agreed with me. Is that allowed in his contract? Fully agreed.
Well, the reason I don’t agree is because normally you say things that are wrong, but now you said something that was right. So I agree with you. It’s simple. One thing you mentioned is the Resource Governor addition on Standard, which could be huge for certain workloads. Memory grants specifically.
Yeah, using columnstore. It’s really easy to get these ridiculous 25% memory grants, which you might not want. I’m a big believer in Resource Governor for columnstore. I don’t know how you use it without it, but apparently.
Just not for CPU. No. Yes. Yes. Not for CPU. That’s correct. For memory and for maybe tracking parts of the workload to do reporting on it, but not CPU or CPU limiting or that stuff.
Yeah. People touch CPU and, to go back to an earlier joke, you know, if you just configure it correctly, all these problems go away. So.
So if you just configure it correctly. Our friend Forrest has a good blog post about Resource Governor and CPU, which I would argue people should read before ever touching it. There should be like a GUI checkbox in Management Studio where you go to check the box and you have to take like maybe a 10 question quiz.
Yeah. On it. Got to go read Forrest’s blog post and then take the quiz. There are a lot of options that should require some quizzing before changing.
Like you should have to like, like maybe, maybe like a sobriety test. Like you should have to like. Well, I’m crazy.
Yeah. I never use Management Studio again. That’s bad. ADS is going away. That’s true. You’ll never, never catch me in there. But no, I think Microsoft finally sort of like caught up with stuff. You know, for years I looked at Developer Edition as like a gateway drug to Enterprise Edition, because people would make all their stuff on it and then roll out Standard Edition and be like, ha ha, we’ve gotten away with it.
And then standard edition just had so many crappy things about it that, you know, they need to run to enterprise edition pretty quickly. But I mean, 2016 added a lot of the items into standard edition. Well, yeah.
I think those were good things. They know that they were good things. And I’m not, I’m not arguing with that. But from my perspective, where Standard Edition is still woefully hobbled is around columnstore, where everything is limited to a DOP of two. Yeah.
And like, you know, granted, there are some situations where, you know, batch mode at DOP 2 is still better than row mode at DOP 8. But even those are a little bit tricky at times. So I think that, you know, the next thing would really have to be opening up some of the perf stuff to be cross-edition, because some of it is, and it gets confusing to figure out. Like, scalar UDF inlining is Enterprise and Standard.
And it’s like there’s like there’s some stuff where it’s just like, no, it’s a net positive for society. We should do this. And then there’s other things are like, no, that’s enterprise only.
And I just feel like hobbling batch mode on Standard Edition is one of those, like, mean, robber baron moves. It’s just like, your family starves this Christmas. You didn’t pitch enough coal for me.
It’s like, ah, crap. Like, so that that that to me, like, you know, there’s still stuff about standard edition where I’m like, nah, like, don’t you can’t use that. But I do think Microsoft did did some good deeds with it for 2025. And I’m going to try not to gripe too much.
Eric wants something to complain about. So well done. Yeah. Like the batch mode and MAXDOP 2 thing. I forgot about that. Like, you know, they’re going to finally have like a brand new paradigm for parallel processing. It’s going to be super good.
It’s going to scale super high. It’s going to solve so many of the problems with row mode parallelism, you know, skewed rows in the repartition streams, very busy schedulers.
But you can only run it at MAXDOP 2. It’s like giving you the smallest possible taste of a drug. That’s how you get people, right?
The first one’s free. Small taste. Just start. I mean, I guess so. Yeah, that is an interesting limitation. Well, that’s not true.
I mean, I’ll defend Standard Edition a little bit, but I wouldn’t call myself a Standard Edition fan. I mean, Standard Edition is great for a lot of people, though. It’s just, so is Express.
Well, that’s what I’m saying. Like, 2025 made Express even better. I agree. Agreed. But standard edition solves a really, again, a really unique area where Express isn’t quite enough. But Enterprise is obviously ridiculous in terms of getting a license and running it.
I think standard really does fit that role very well, especially with some of the security items that came through in the previous versions and stuff like that. But honestly, there’s a lot of things that people could run with standard edition that they don’t. And, you know, they use Enterprise.
And conversely, I think there’s a lot of things that people run on Express edition. And it’s like, I want to smack you. Stop. Just go to standard. Like, ah.
You know, especially now that standard edition can use, what is it, 32 cores? Yeah. Like, come on, man. Give me some better DOP options on batch mode. I want.
Well, if you think about the hardware, too, outside of, I know everyone’s getting super hot and heavy over ARM, which I’m going to argue is not the correct microprocessor architecture to run database stuff on. Even, you know, the ARM 64 EC stuff and whatever. But I’m sure I’ll get roasted for that.
It’s fine. I don’t really care because, you know, you’re wrong anyway. But the amount of CPUs, at least for, like you were saying previously, you know, your laptop had some E-cores in it. The amount of CPUs that are x86-64 and are full execution units hasn’t really gone up super high.
Right? Especially not ones where if you want to take, you know, you’re talking about columnstore, if you really want to take advantage of a lot of that stuff, and even a lot of the new features where you’re getting crazy efficiencies from. I mean, you want those SIMD instructions, right?
You want AVX. You want all that stuff. You’re not going to get that on these crappy ARM and these crappy other CPUs or E cores or whatever. So, I mean, 32 cores, I think, is a really good spot, even by today’s standards, for actual decent execution of stuff with having full processor capability and what that entails.
Sean, I’m curious. Do you think some standard edition customers end up kind of being blocked from doing a migration to, like, managed instances or Azure SQL database just because they’re getting so much savings and licensing costs that going into a cloud platform that eliminates the difference between standard and enterprise just doesn’t make sense for them? No, because typically those people are going to go to that anyway because their CIO told them that we need to be cloud first.
Well, but in the maybe rare case where the CIO wants to report of, well, we’re paying X today, we’d be paying Y on the cloud. You know, if you’re standard, that’s always going to look better for you in terms of comparison, unless you’re doing, like, Azure VMs or something and you still stay in standard. Yeah, I wouldn’t say it would.
My personal opinion, I don’t think it’s a blocker because you’re already running standard locally unless you’re really, you’re going to be blocked from migrating for some other reason, right? Like, you can’t put your data there or whatever. And if you are going to migrate, it’s going to be outside of, you know, idiot telling someone that we have to be there.
You’re going to do it for other reasons, such as like, you know, we want more controller or longer backups. We don’t have the infrastructure for it or we really don’t need 64 cores, right? Or 32 cores.
We really only need four and we actually may save money by doing it because we only need a four core thing on MI or hyperscale or something. So I don’t think it’s really blocking anyone from doing it. I think that’s obviously nice that you get the enhancements, but I don’t think many of those people are doing it for the enhancements.
Like, I don’t think it’s the standard having new stuff is really going to block any of that. If you were going to do it, you’re going to do it first. If you have a legitimate reason, you were going to do it anyway.
If you don’t have a legitimate reason, you were going to do it anyway. So let’s suppose you’re on 2008 R2, which we all know is the best version of SQL Server. You’re on Enterprise and you don’t have SA.
Well, I suppose that’d be obvious if you’re on 2008 R2. What are the big things to look out for if you’re considering switching from Enterprise to Standard, other than columnstore, which of course you’re not using because you’re on 2008 R2? Yeah, going from an earlier version like that, there are a lot of things to consider, and you are actually going to be upscaled.
So what was in 2008 R2 Enterprise is just going to be in Standard Edition today, which is great. I mean, that’s awesome, right? You’re actually people need to upgrade, you know, get off of Windows XP and, you know, anyway, there’s a lot of good stuff.
I think the things actually to watch out for are going to be some of the stuff we talked about previously, like IQP and things of that nature, where your workload’s not going to run the same. I mean, you’re going to get stuff that was slow is going to be fast, stuff that was fast is going to be slow, and stuff that was slow is going to stay slow. It’s going to be a mixed bag.
Like, that’s really the biggest thing. Good. We’re talking about a more modern version of SQL Server, right? Like from 2019 or so, because the IQP stuff is pretty recent. So, like, yeah, but think about it, if you don’t have SA, is it free to upgrade?
Is that how that works? Yeah. If you have SA, it’s free to upgrade. If you don’t, you don’t. So I don’t know if there’s been enough time where like, if you don’t have SA, you’d be looking for an upgrade. But look how many places are running in 2016, which is end of life, right?
I mean, there’s so many places around 2016 where a lot of those, I know you prefer not to. That’s the, you don’t want to look at those skeletons. We definitely didn’t retire a SQL Server 2016 instance last week.
That definitely didn’t happen. That’s not something that happened. I’m just going to say that for the record. What do you think about the basic AGs?
So let’s say I’m a very cost conscious customer. I’m going to use standard. Like what, what’s the starting point for HADR? Yeah.
The starting point is definitely. So technically database mirroring is still in there, right? You can still technically use it, but I will slap anyone who does. Like I will find you and I will slap you. Some people are into that.
I, well, it might be Slap as a Service. Then you’ll get an invoice for it. Yeah. But that’s actually a really great question, Joe, because it does offer the basics.
Like I’m not, I’m personally not a fan of basic AGs. Just in general, you get, you know, to what you were saying before you get just enough where it’s sort of helpful, but not really because of how it’s implemented. Like you get that taste and yes, that’s, that could be an impetus to upgrade.
But if you need that level, you should have been an enterprise anyway. I guess that’s just an argument there, but for the basic AGs, I do. I love them in terms of you actually get decent functionality for it.
I just hate that you can’t, there’s no upgrade path. And what I mean by that is you put in the basic AG. Yeah.
You can’t do your readable secondary. You can’t do a bunch of stuff for the reason why you would actually use it. Uh, but if you’re saying just for HADR, you kind of get the HADR, right? You kind of get it.
It’s not a hundred percent, but you get 50% of the way there, and you can play with it and say, okay, this may actually work for this one, you know, maybe-critical system or something, that’s where we go. We don’t upgrade any of the other licenses.
Sure. But I don’t like that there is no upgrade path. So what I mean by that is let’s say you create a basic availability group and you say, this is, this is awesome.
We love it. We want to go to enterprise. We tried developer edition, you know, as a, as an example for enterprise and it worked really well based on what we’ve done. This and that.
Not now, what do you do now? You take a replica out upgraded to enterprise and do what? Yeah, exactly.
As someone who is very ill-informed about this, I did ask a question on Stack Exchange four years ago, and I think you were still boycotting at the time because you didn’t answer personally. So you really let me down there.
So the question I asked was basically, I’m reading about basic availability groups. You can’t do integrity checks on the secondary. This seems really bad.
Like, is this even tenable? And the answers were generally, well, if you’re willing to like fail over often so you can integrity check your secondary, then sure. That’s what everyone wants to do.
Yeah. Fail over into an integrity check. Yeah, that’s definitely what everyone wants to do. But speaking of a taste, I’ve heard rumors that we’re going to have a video dedicated to how 2025 changes pages.
Have you heard that rumor, Eric? I have. I have. I’ve actually heard that rumor about someone on this radio program producing that video. Yeah.
That person should really get it done. Yeah. That person should get it done. Anyway, I have two thoughts on this, and we can close out whenever. One is it two.
I was giving you the perfect way to close out, man. Let it go. 2008 R2 is not the best version of SQL Server. 2008 is the best version of SQL Server. Because that was before Microsoft changed their licensing model.
So I have a client on SQL Server 2008 Standard Edition, because they have a lot of memory, and they do not pay per core. So they’re balling out there.
And the other thing is that I do not understand why anyone would go the route of a basic availability group when a failover cluster and log shipping still exist in the world. Yeah.
Yeah. Yeah. You’re on Standard Edition. Don’t bag it. Failover cluster and log shipping. Yeah. Failover cluster and log shipping. Sounds like you’re trying to answer my question. I heard a rumor that log shipping was updated in 2025.
I can mark it. Really? Accept it. Wait. Wait. Wait. Hold on. Stop the presses. What got updated in log shipping for 2025? It breaks when you upgrade, right?
Aside from that. Oh. Wait. It’s called a feature, Joe. Wait. Did you want more than that? Yeah. I wanted more ammo for my log shipping love. It supports TDS 8.0.
Oh. Well, that doesn’t do anything for me. I’m sorry. You’re more secure, Eric.
I was already secure. I was already quite secure. All right. Well, I think that’s enough about Standard Edition. Thank you all for watching. I don’t know what we’re going to talk about in the next video yet.
It is undecided, but from the look on Joe’s face, it will be something very serious. So whatever we decide to talk about, you will listen to and enjoy. Anyway, thank you.
Goodbye. Goodbye. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I dive into the world of SQL Server 2025’s intelligent query processing, discussing it alongside my co-hosts Prop Joe Obish and Sean Gilardi. We explore topics like AI-driven cardinality estimates, which can sometimes lead to unpredictable performance issues, as well as the challenges with parameter-sensitive plans and abort query execution hints. We delve into how these features might impact real-world scenarios for DBAs, especially those working with third-party applications where they have limited control over queries and indexes. The conversation highlights both the potential benefits and the pitfalls of relying on such advanced features in SQL Server, offering insights from practical experience to theoretical considerations. Join us as we dissect these complex topics and share our thoughts on how Microsoft can better support DBAs in their daily tasks.
Full Transcript
All right, welcome back to the Bit Obscene radio program. I have two candidate temporary co-hosts with me today, Prop Joe Obish and Sean Gilardi. This episode of the Bit Obscene radio program is brought to you by Meridomo 100% Cacao, Sin Azucar, apparently. Brand new sponsor. They sent me this box. It’s lovely. You can eat it and your face gets a little numb and you feel a little sweaty and you just want to talk to people. So I like it. So I took a bite of that so I could talk to these two. Today we’re going to be talking about SQL Server. Actually, we’re going to do five episodes in a row. Short episodes, hit and runs about SQL Server 2025. The first topic is going to be, you can see the anticipation on Joe’s face, it’s going to be intelligent query processing. I don’t like how you committed us to doing five because now we have to do five. Whereas you had been like ambiguous, then we could have. The ambiguous number five. Well, we’re doing, we’re doing five. And if we do more, then they’re called bonus episodes.
Oh, we have to do, oh man. And if we do, if we do fewer, not less, we do fewer, Joe. Excuse me. Yeah. Okay. So have a little respect. If we do fewer, then it doesn’t matter because I will do two by myself and just pretend that I fired both of you. Okay. All right. So we’re good to go. Start us off, Joe. How do you feel about intelligent query processing, the state of intelligent query processing in SQL Server 2025? So the way I feel about it is, long time fans of the pod will know that I strongly believe in fixing a query performance issue once, and then you fix it forever and you never have to look at it. This is different from the consultant approach of wanting as many billable hours as possible, but I’m not a consultant, so I can get away with that. And, you know, the thing about intelligent is it kind of means unpredictable.
Hmm. Right. So erratic, even. I’m going to show you my pets. So this is my, my pet rock. It is very predictable. It, uh, never poops anywhere it shouldn’t. It never chews on wires. Very house trained. It is perfect for me. And this is how I want the query optimization to be. I don’t want any surprises. I don’t want any. Very, very intelligent. Um, oh, Joe, you were born just a little too late.
Changes. Right. Oracle had that. They called them rules. If you, uh, like, if we just get into the, um, well, you can have cardinality estimates that change through AI or something, right? Like, what does it even mean? Um, like, I have a performance issue yesterday. I tried to reproduce it.
Today, I’m getting a new AI cardinality estimate. I can’t reproduce the issue, or something’s working today and it gets worse in the future. Cause as we all know, all of us other than Sean, sometimes better estimates can lead to worse performance. That is true. I’m sure you’ve seen it. That is true.
So like with respect to the entire feature, uh, I guess, I guess it’s a bit of a, uh, design choice. Cause I remember someone, maybe it was, uh, Pablo Blanco, told me that in the legacy CE, there are a lot of, like, special cases Microsoft developed over the years.
Presumably this was back when, uh, you know, Connect actually worked. So people would complain about their issues and Microsoft would hear their complaints and fix it. And part of the new CE, new being 11 years ago, is they got rid of a lot of those special cases, for better and worse. And I do wonder if this AI cardinality estimate, like, if that fits in that model or not. Um, so to give an example of something I saw recently, as we all know, other than Sean maybe, if you’re filtering on a column that stores, like, when the row was created, let’s say you do like, uh, creation time greater than GETDATE(), you expect there to be very few rows returned by that query.
Yeah. Yeah.
Yeah. Yeah. It was greater than GETDATE() AT TIME ZONE anything. Hmm. Suddenly that’s an unknown inequality. Pretty good. Hmm.
We don’t know anything about that. So now we’re in the, uh, 30% default guess situation. And I think the biggest delta you can get for AT TIME ZONE is 26 hours or so. Uh, you know, in real life, it’s probably going to be only a few hours.
Uh, so from the point of view of, like, you know, I want my query optimizer to know things. Like, it could constant fold AT TIME ZONE, but it doesn’t. Or it could treat AT TIME ZONE as not really being there.
Cause you’re generally shifting by a couple hours. Personally speaking, I wouldn’t expect to get a totally different cardinality estimate just because I’m shifting a parameter by a couple hours. But like, that feels like the kind of thing that might be a special case in the legacy CE.
Maybe, I don’t know, but it’s certainly not in the new CE. And, you know, if we’re talking about all of this AI stuff, it kind of feels like those types of fixes might be dead, because, like, why do special cases when you can just have AI do everything? Yeah.
Right. I mean, in the specific case of AT TIME ZONE, I have never trusted that function in a query, uh, aside from maybe, like, unless I need to, unless I’m using it, like, in the select list to figure things out. Uh, most of the time, if I need to do anything with time zones, I’m not using AT TIME ZONE.
I’m figuring out like what the difference in hours is with that time zone and then doing date math based on that difference. Like I, which is probably more accurate to be honest. Cause yeah.
Cause yeah. Cause like then you can, you get, you constant fold, like, whatever that one scalar value is and you’re good to go. So AT TIME ZONE can screw itself. I hate that thing.
Uh, it ruined three of my nights when I was working on stuff in Query Store. So I don’t, I don’t even want to go there. Um, as far as intelligent query processing stuff in general goes, uh, you know, like, I appreciate the attempts and efforts to make things better, but it all just feels a bit bolted on at times. And it feels like all this stuff where it’s like, we’re going to do this heuristically and we’re going to, like, figure it out.
And like, you know, whatever. So, like, the two things that I have the biggest, uh, gripes with so far are, uh, batch mode on rowstore. Um, there is at least a newly discovered hint from our Russian friend to override batch mode heuristics. So we have that going for us.
No, granted there are a million other ways to override batch mode heuristics anyway, but the details of that stink. But, um, the other one is the parameter sensitive plan optimization. Uh, the way that it, it gauges skewness and worthiness of kicking in is nonsense.
And the way that it buckets things is absolute garbage. And I, I wish that like anyone in the world with like half, uh, half a sense would, would take another look at how that, how that thing works and fix it. Because it makes no sense to have like a bunch of like middle values get bucketed because those middle values are not the highest value or the lowest value.
So like you’ll have stuff with like 10 million rows sharing a plan with stuff that has like a hundred thousand rows because they’re not the highest or the lowest. And I feel like that, that whole scheme is just the, one of the dumbest things that I’ve seen choice wise in SQL Server in a long time. Um, um, dot feedback is another one that belongs in the garbage bin.
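If the PSP bucketing is biting you the way described here, it can be switched off. These are the documented knobs in SQL Server 2022; the sample query uses the Stack Overflow database mentioned later in the conversation.

```sql
/*
Turn parameter sensitive plan optimization off database-wide...
*/
ALTER DATABASE SCOPED CONFIGURATION
    SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = OFF;

/*
...or disable it for a single query with a hint.
*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p /* Stack Overflow database */
WHERE p.PostTypeId = 2
OPTION (USE HINT('DISABLE_PARAMETER_SENSITIVE_PLAN_OPTIMIZATION'));
```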
Oh, wait, wait, hold on. Why don’t we talk about parameter sensitive whatever as a group? Okay. Before we go into that. Okay.
Yeah. It seems like, Joe, you’re almost asking for the old Oracle-style rules, where you say: join this 10%, do a nested loop. Always do this, always run it this way.
And you kind of lock it in because you know better. Oh no. That’s what it sounds like. So I’m asking for a way to be able to communicate possible query optimization improvements to a top Microsoft man like yourself, and then maybe there’s a small chance we get that in the product.
The AT TIME ZONE example is entirely theory based in terms of what’s reasonable. Let me word it this way: the more information you have during query optimization, the better, in general, right?
Generally, AT TIME ZONE isn’t some scalar UDF black box where we have no idea what the output is going to be. It’s a very well-defined thing.
It has a well-defined range and a well-defined way it’s going to change your data. If Microsoft wanted to, they could enhance constant folding, or just cardinality estimation in general, to work better with AT TIME ZONE.
They could. For sure. Now I’m not saying that they should, but it feels like that type of tweak is maybe going by the wayside.
I’m not looking for anything like Oracle. Oracle is way too expensive, and Oracle is its own set of problems. Um, so with respect to —
We can get to PSP, but I wanted to also get your feeling on abort query execution. Man, we’re just jumping around, and I don’t have an opinion on that. Do you have an opinion on that?
All right, go ahead. Oh, I have a strong one. Oh, go ahead. Oh man. Wait, we’re going to mix everything up and our poor radio viewers are going to be confused. Maybe we should go one at a time. So, the abort query thing is a major whiff by all of Microsoft, the entire organization, top to bottom.
The fact that all it does is kill a query if it starts running is absolute gobbledygook. If there were any sense to anyone, then that thing would have a number that you could supply to it,
to tell a query when to time out. Cause most of the time, if you’re a DBA, it’s not like: oh, I never want this query to run. You’re like: this query has five seconds to finish.
And if it doesn’t finish, it’s over. Just having a query get killed immediately — I mean, I suppose there are some circumstances where you would put zero in and be like: no, you don’t get to run. But for the most part, that hint would be absolutely gorgeous for off-the-rails reporting tools like Looker and other crap like that.
You want to give those queries a chance, but if they miss their window, then the query just gets executed. There was something like that in Resource Governor, but you need a trace flag for it. It was the max CPU time thing.
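For reference, here’s roughly what that Resource Governor option looks like. `REQUEST_MAX_CPU_TIME_SEC` caps CPU time per request; by default exceeding it only fires an event, but with trace flag 2422 (SQL Server 2016 SP2 and later) the query is actually aborted. The workload group name is made up, and a real setup would also need a classifier function to route sessions into it.

```sql
/*
Cap CPU time for requests in this group at 5 seconds.
*/
CREATE WORKLOAD GROUP reporting_timeout
WITH (REQUEST_MAX_CPU_TIME_SEC = 5);

ALTER RESOURCE GOVERNOR RECONFIGURE;

/*
Without this trace flag, the cap only raises the
CPU Threshold Exceeded event instead of aborting the query.
*/
DBCC TRACEON(2422, -1);
```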
But unless you have a trace flag, that doesn’t really pan out. And I just want to point out, too, that the use cases — not necessarily for abort query execution, but for a lot of this stuff — can be different. Joe and yourself are doing a lot of actual query tuning, system tuning, design help and work and stuff like that.
A lot of poor DBAs — and I was one for a long time — got handed very crappy applications bought from some third party company that did God knows what. And they said: you can’t touch it.
You can’t do anything. Also, we’re going to call you when everything breaks and runs slow. Yep. And I’m like: what do you want me to do with it? You’re telling me I can’t even make a plan override for any of these.
I can’t do a plan guide. I can’t do anything, because it would void the warranty on the app. So some of this stuff doesn’t make sense for people who are actually tuning queries and getting stuff done. But for those people who have to deal with some of this crap that comes through, it’s not ideal, but it’s better than nothing.
I mean, sure. But I think Eric’s point is, it seems very similar in engineering complexity to have the feature as-is compared to having the feature where you can set a number of seconds. Because the feature as-is is effectively killing after zero seconds.
Maybe. And Eric just wanted to be able to tweak that one number. Is that accurate? That is correct. I, I think I can see both sides.
Like, to Sean’s point, that was my interpretation of the use case too. Cause you could just do it yourself if you had to, right? Well, we have the query timeout, right?
You can set a query timeout, but by then the query has already run for that long and used those resources. Well, in the application you can, but — yeah, well, I thought we were talking about being locked out of a third party application. You can’t.
Oh yeah. Go ahead. Because I’m sure people have done things like: I’ll throw an agent job up every minute, and if I see queries running longer than X seconds matching a certain pattern, then I’ll kill those queries. So instead of doing that disaster, you could use this, if you’re okay with just not letting them run at all.
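The "agent job disaster" being described would look something like this. The program name and threshold are invented, and this is the workaround under discussion, not a recommendation.

```sql
/*
Sketch of a watchdog: find one long-running request from
a particular application and kill it. Names are hypothetical.
*/
DECLARE
    @session_id integer,
    @kill nvarchar(20);

SELECT TOP (1)
    @session_id = der.session_id
FROM sys.dm_exec_requests AS der
JOIN sys.dm_exec_sessions AS des
  ON des.session_id = der.session_id
WHERE des.program_name = N'SomeReportingTool' /* hypothetical */
AND   der.total_elapsed_time > 30000; /* milliseconds */

IF @session_id IS NOT NULL
BEGIN
    /* KILL does not accept a variable, hence the dynamic SQL */
    SET @kill = N'KILL ' + CONVERT(nvarchar(11), @session_id) + N';';
    EXECUTE (@kill);
END;
```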
But you know, like that’s, that’s evaluating the feature from maybe too narrow of a use case. Whereas Eric is dreaming bigger and greener and with more rainbows. We’ll wash that.
I mean, my question here is: how many queries — okay, let’s just say we’re in Sean’s world from when he had a real job. You have these crappy third party applications, and you’re not allowed to change a query, an index, a setting, anything like that. All you can do is, I don’t know, restart SQL Server if everything locks up.
Okay, there you are. And like, you know, you’re, you’re, you’re, you’re in that world, you’re in that place in time. And Microsoft is like, here’s this hint.
And you’re like, cool. Now I can just have these application queries not run. And now all of a sudden, you know, end user Bob and end user Sally are like, well, it’s time to go to work and do my job. And they, they click execute.
And it just comes back with an error that’s like: your query has been killed. Like, what can you reasonably block in an application that’s not going to mess up an end user’s day? You have to give it a chance to run.
And if it can’t run, you say: your query timed out. That’s nice and friendly. If someone hits whatever button to submit something, and it hangs out for a little bit, and then it just immediately returns with: you’re not allowed to run that query —
then they’re like: oh, cool. I can’t do my job. Who do I talk to now? To me, it’s just wholly unrealistic to have that situation. And look, end users are annoying, right?
Because like, you might have like a bunch of like, like business intelligence people who are going to be running stuff through management studio. And they can easily change like one or two little things about a query and have it completely bypass whatever you set up for that abort thing. So like, like, what’s the point?
Like, like, get, like, give the query a chance. If the query can’t finish within like, I don’t know, five, 10, 30 seconds, kill it off. Like, if it used the resources that caused a problem, it spiked things up for that time. Fine.
But then now it’s over. Right? Like, you’re done. I have an example for you actually from, from real life even. Oh boy. Like a developer messes up a query, which is like checking a queue. Uh huh.
And instead of running once per minute, it runs a billion times a day. Mm-hmm. Which is something I’ve seen. Yeah.
So for that case, you don’t want your billion-times-per-day query to even run for one second. Okay. Fine. I would view it as picking the lesser of two evils: the entire application is slow and down, versus, well, we can turn off this queue thing for a couple hours and process the data manually later.
And then you just might not care that much about this thing which is suddenly running a billion times a day when it wasn’t supposed to be. Okay. But if one of your developers messes up a query like that, you turn off the queue service and you say: hey, developer, make a hotfix with a WAITFOR DELAY in there.
To me, you’re probably not going to find an enterprise application that screws up that badly. Granted. Okay.
There are cracks that everything can fall through, but for the most part, you’re not going to run into a situation where you’re just like: this query can never run. So you’ve never used Teams. No, you’re right.
I specifically wrote avoiding Teams into my consulting contracts. So you’re effectively saying the same thing, right? You’re viewing the kill-at-zero-seconds thing as too narrow, and in your technicolor dreams you want to be able to change that from zero to X, which would cover more use cases — which I don’t think anyone would disagree with.
But I mean, you know how it is. Everything Microsoft does, it’s never done right the first time, right? Yeah.
But the problem — the bigger problem for me — is that there’s often not a second time with a lot of these things. So you’re just stuck with the V1 of things and you’re like: cool, never going to use that now. Perhaps we can add some intelligence to the development process for intelligent query processing.
Perhaps we could. You could call it I-I-Q-P. But how do you know how to intelligently fix a query if it’s not allowed to run?
You’ve got to give it a chance. You could even have a whole thing that kicks in and it’s just like: this query has been aborted because it didn’t finish in 10 seconds, like a hundred thousand times.
Like, let’s really think about this one. Let’s really do our best to make this one better. Speaking of doing our best.
Yeah. I believe we agreed to a specified length for the video and we’re way over it. But with respect to parameter sniffing, your favorite thing in the world — my understanding of the… what’s this thing even called?
Parameter sniffing. Plan optimization. The parameter sensitive, parameter sensitive plan.
Oh, parameter. Yeah. Parameter sensitive plan optimization. Do you know if it still only works for equality predicates? Yes.
Only equality predicates. There are a number of strange rules. That one feels weird to me. Like maybe my world view is too narrow, but — I don’t think it is. You got the gloves on.
That’s why it feels weird. More fake news from Microsoft. Right. Most of the parameter sensitivity
issues I see relate to inequality predicates. Yep. On dates, where you know they’ve passed in today, or a year ago, or ten years ago. Yep.
And it’s not so much equality predicates — at least not in the workloads that I know of. So the whole thing kind of feels not exciting.
Yeah. No, I mean, in my consultant demo world, it’s very easy to find equality predicate issues in the Stack Overflow database. You have the post type ID and vote type ID stuff.
And there are wild swings in the row counts per post and vote type ID. So it’s very easy to show (a) it not working when it should with those, and (b) the bad bucketing that happens even when it does work. But I think generally you’re right, in that a lot of the problems that I see with parameter sensitivity are with bounded date ranges, where for the majority of the queries that come in, they’re looking at the last hour or day of data or something.
Or the current working day of data, whatever it is. And then every once in a while, someone will come along and say: oh, well, I need to see six months of crap now. And then that plan just bites it.
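One blunt fix for that bounded date range pattern: recompile when the window can swing from an hour to six months. This uses the Stack Overflow database mentioned above, but the procedure itself is invented for illustration.

```sql
/*
Hypothetical procedure: when @StartDate/@EndDate windows vary
wildly, OPTION (RECOMPILE) lets each execution get a plan
sized for the actual range, at the cost of compile CPU.
*/
CREATE OR ALTER PROCEDURE dbo.CountPostsByDate
    @StartDate datetime,
    @EndDate datetime
AS
BEGIN
    SELECT
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.CreationDate >= @StartDate
    AND   p.CreationDate <  @EndDate
    OPTION (RECOMPILE);
END;
```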
Yeah. So I feel like there should be something — Microsoft already has enough crazy rules around date things in query plans, like GetRangeWithMismatchedTypes and GetRangeThroughConvert, that it feels like a very natural fit for
parameter sensitive plan optimization to have special rules around dates, where it should be able to look at them and be like: the last time this ran, it was for the last four hours of data, and hey, this person’s looking for a much bigger range.
We should probably figure something else out. That seems like a good special case to me, but what do I know? Maybe part of what makes it tricky is that you often see start date and end date parameters together.
So then you have to optimize for the pair of parameters, which I’m sure is way too much. No, you only have to engineer for the distance between them. Well, right.
But if your stored procedure has like a hundred parameters, and you as the expert know that these two parameters really matter the most — the problem has shifted from one equality predicate, to one inequality predicate, to the right pair of inequality predicates. It certainly seems possible, but I still would like them to do inequality first.
Yeah. For me, it feels like the optimizer is smart enough to pick up on those patterns while a query is being optimized. It’s looking at the parameters, it’s looking at the columns, and it’s looking at the way they’re being assessed.
It’s pretty easy to infer from all of the various trees and parsings and bindings and all the other stuff that goes on that, oh yeah, these two pair up together. Sure, if you just saw a stored procedure and a list of parameters, it might not be obvious at first — you might have something called start date and something called end date, and maybe some queries use start date and some use end date and they don’t pair up like that. Fine.
There are edge cases, but as soon as you start really getting into looking at a query and trying to figure out a plan, you have a pretty good sense of when they’re paired and when it’s probably a good idea to figure something out. Oh, Sean, Eric thinks it’s easy. Will you have good news for us in SQL Server 2021?
Or 2030 or whatever, whatever else. Yeah, whatever. When are they going to have local variable deferred compilation? No idea.
Probably never, because think about how many behavior changes you’d get from that. Well, I know, but the option should be on the table. We’ll just do that.
Add that to the connection options. Yeah. Option: Eric’s additional option. Everything is just a database scoped config or a query hint now. There’s just, there’s probably —
Just look under preview features. Yeah. Speaking of preview features. No, we’re not going to switch topics. We won’t do that. We won’t do that to our nice listeners. We should switch topics.
You could throw a local variable into a one-row table variable and then maybe get some deferred compilation on that. But you still don’t have a histogram on it.
Well, they should improve that and add a histogram. Ah, well, then you just have a temp table by a different name. Yeah, but you have a temp table without all the temp table issues. Well, I guess that works if they just don’t do the caching.
If it’s a single row, it’s going to be no problem. Yeah, sure. There, right there. We figured it out. Okay. So table variables can have histograms, as long as it’s a single row. Right?
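The trick being joked about here, written out: under SQL Server 2019’s table variable deferred compilation, the optimizer sees the real row count at execution time, though still no histogram. The tables are from the Stack Overflow database mentioned earlier.

```sql
/*
Stash the local variable in a one-row table variable so
deferred compilation gives the optimizer an actual row count.
*/
DECLARE @local_variable integer = 2;
DECLARE @filter table (VoteTypeId integer NOT NULL);

INSERT @filter (VoteTypeId) VALUES (@local_variable);

SELECT
    c = COUNT_BIG(*)
FROM dbo.Votes AS v
JOIN @filter AS f
  ON f.VoteTypeId = v.VoteTypeId;
```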
Sean, you’re taking notes. Yep. You see my list? Okay. Yeah. I saw it. How do we feel about DOP feedback? I actually had an opportunity back in 2019, when I was on Microsoft’s campus — a very intelligent gentleman who I respect greatly, but who also has no internet presence,
I’m not going to say his name. Asked me if I wanted to give feedback on the DOP feedback feature. I said, this seems not very useful.
You shouldn’t do it at all. Which was, I guess, not the feedback Microsoft wanted in that case. Because sadly, they still did it.
I looked at it when it was 2022 and thought it was pretty flawed. Should I just go for it? Yeah, go for it.
From what I remember… That’s a long silence. Well, it’s hard to word. I’m for it.
I don’t know how to word things sometimes. We’re reading from disk on this one. One of the big problems with — Azure PaaS — row mode parallelism is that you can really get hammered on performance depending on how busy the server is. Yeah.
Way more than batch mode, way more than serial queries. There’s a demo on a blog post on your site where I think the performance difference was something like 25 to 1. Is this the round robin?
Not round robin. It was the… Demand? Well, this is where demand would be way better. Ah, okay.
It was some very simple query with row mode repartition stream operators. And if you had like a single CPU that was very busy. Yep.
The query runtime got like 25 times worse. Yep. Due to the yielding. Mm-hmm. So I don’t know how you can effectively do DOP feedback when you can’t account for things like: the server is busy during these hours,
And this is when I really care about performance. Yep. Because the thing is gathering data all the time.
And if you start gathering data after a server restart, maybe that’s at 2 a.m. Well, all your parallel queries are doing great at 2 a.m., because there’s very little work on the server.
Yep. So there’s that big issue, which maybe it’s better in 2025. I have no idea.
I think they’re kind of cagey on the details. I don’t know. The other thing that bugs me about it is that I’ve never personally tuned a query to anything but MAXDOP 1, or, you know, really encouraging parallelism for a query. Yeah.
Like I’m not doing: oh, will this query run at DOP 2, and this one at 4, and this one at 6, and this one at 7 because I’m feeling lucky that day. And even worse for the poor soul who does have to do that: this feature isn’t actually respecting it.
Because if you asked me how to design it, I would say: if a person or an AI added the literal MAXDOP X hint to the query, presumably someone looked at that query, and that DOP should not be overwritten by whatever automated process this thing is running. But that’s not how it works. I don’t know.
I don’t like it, Eric. I don’t like it either. I think it’s goofy because it only adjusts DOP down. It doesn’t adjust DOP up. It would never look at a query running at DOP 4 and be like: oh, DOP 8 is better, magically.
But really, the biggest flaw in it for me is that it will never adjust down to DOP 1. It only adjusts down to DOP 2. And I remember hearing at some point that they couldn’t accomplish that without a recompile.
But we all know that’s nonsense, because when SQL Server comes under enough memory or CPU pressure, it will run a parallel query plan at DOP 1 internally. So it doesn’t actually need a recompile, and there’s really no reason for it not to get down to DOP 1.
But the thing that I would much rather see happen, if we’re talking about parallelism feedback, is: fix the parallel page supplier. If you have a parallel query that’s ending up with incredibly lopsided threads — there’s a DMV that shows that, so it’s detectable by SQL Server — you could actually do something meaningful with whatever algorithm is being used to distribute rows to threads, in order to balance that on a future execution.
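One place that thread skew is visible, if lightweight query profiling is enabled: rows per thread, per operator, for currently running queries. This may or may not be the exact DMV Erik has in mind.

```sql
/*
Rows per thread per operator for running queries;
wildly uneven row_count across thread_ids = lopsided threads.
*/
SELECT
    deqp.session_id,
    deqp.node_id,
    deqp.physical_operator_name,
    deqp.thread_id,
    deqp.row_count
FROM sys.dm_exec_query_profiles AS deqp
ORDER BY
    deqp.session_id,
    deqp.node_id,
    deqp.thread_id;
```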
And that would be far more meaningful to me than some other feature sitting in the background going: DOP 8? Nope. DOP 6? Nope. DOP 4? Nope. DOP 2? Mm, DOP 2, just leave it. I think it’s just a waste of time. You reminded me of another complaint I had, which is that it seemed very focused on preventing recompiles.
But if we’re talking about tuning these big parallel queries — I’ve never thought: oh man, I have this parallel query that runs for 30 seconds, and I have to avoid recompiling it no matter what. Like that’s my number one priority:
no recompiles. Maybe that was just decided by, uh, cookies, and it can’t be overturned. I don’t think cookies is on that one.
If you’re able to move past that, I actually had a good idea just now. What if we had a batch mode heuristic in IQP where, if you have a row mode query with terrible distribution on the repartition streams and tons of parallelism waits — yeah —
the next time you execute it, enable batch mode heuristics for it. Well, that would be nice. That would be sweet. Joe mode. Yeah. That would be way better in my mind than taking a terrible row mode parallel query and maybe downgrading it.
Yeah. Cause if you think about a query that’s running properly — as in, it’s doing batch mode or demand-based parallelism —
Yep. I can’t think of a single case where: oh, I have this great batch mode hash join, it’s running at DOP 8 — whoa, whoa, whoa, that’s terrible.
We need to knock that down to four. That would never happen, because the internals are just so much better suited to it. If you’re running at a DOP that’s too high — for batch mode or demand-based parallelism or broadcast parallelism —
it doesn’t really matter unless you’re running out of worker threads. But again, if you’re in batch mode, you’re not going to run out of worker threads anyway. Oh, all right, Sean. Eric and I figured it out. Yep. We’ve — ship it.
You’ve got ship it. Nothing else to do. It’s all done. Yep. You guys did it. We did. We did. Congratulations. Thank you. We should get hired as co-PMs to just fix SQL Server. Co-PMs.
Yep. Anything else on IQP to make it IIQP in the future? Hmm. Nah. I’ve talked about everything I want to.
Yes. Wait, wait, did you want to complain about the sp_executesql thing? I mean, my complaint with that is I don’t understand why you would want a similarly bad behavior to occur.
Compile storms. Yeah. So I’ve had a number of clients where they’ve had stored procedures run, and all of a sudden one of the queries is compiling a plan, and you have compile locks, and now you have a bunch of other copies of the stored procedure waiting to compile a plan, or waiting to use the plan that the current stored procedure is compiling.
And I fixed that at quite a few places by using sp_executesql, so that now you just have a bunch of queries compiling independently. None of them really sit around waiting.
Everyone’s happy. They don’t have one stored procedure that just locks up a million other copies of the stored procedure. And now that’s all going to get broken, right? Let’s screw up my whole system.
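A sketch of the fix being described: the hot statement moves into sp_executesql, so it compiles (and, historically, compile-locked) separately from the outer procedure. The procedure name is invented; dbo.Posts is the Stack Overflow database again.

```sql
/*
Hypothetical: carve a problem statement out of a big branchy
procedure into sp_executesql so it gets its own compilation.
*/
CREATE OR ALTER PROCEDURE dbo.BigBranchyProcedure
    @PostTypeId integer
AS
BEGIN
    EXECUTE sys.sp_executesql
        N'
        SELECT
            c = COUNT_BIG(*)
        FROM dbo.Posts AS p
        WHERE p.PostTypeId = @PostTypeId;
        ',
        N'@PostTypeId integer',
        @PostTypeId;
END;
```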
I had a very good system. I had a very good thing worked out, and now I’m getting screwed on it. Now you get more hours. I have a question on that, actually, because I’ll admit I’m not familiar with compile storms, or so-called compile storms.
So is this the way it works: you could have many sessions running a procedure, there’s no compiled plan, the first one to start compiling gets some type of lock resource, and the others wait.
Yep. The first one finishes compiling, there’s now a compiled plan, and the other sessions stop waiting and use the compiled plan from the first one. Correct.
So in what scenarios would that be bad? The one I’m thinking of is maybe the CPU you’re compiling on is like super busy and the first compile takes a long time.
So it holds a lock for a long time. Is that an example? It can be a lot of things. Yeah, there are two main examples that I’ve dealt with. One of them is — I don’t even know if this is a word at this point — mitigatable by async stats updates.
Cause sometimes the long compile lock is because of a long synchronous stats update. But then you’ll run into other times where it’s just a long query compilation.
And for whatever reason, that just sucks the life out of the server. Like I said, Joe, as a real-life example, there was a stored procedure that was tens of thousands of lines long, and it would sometimes take minutes to run the compilation on that.
And — I don’t know if you remember when microservices were super hot. Everything was a microservice. Still kind of is.
Still kind of is. Yeah. It’s just not super hot to say anymore. No. So we’ve moved on. Yeah. CIOs don’t get jacked up about it all over the place anymore. So, anyway, you would have this, and then the microservice people go: oh, we need to spin up another microservice — and the query didn’t execute,
so we need to rerun it. Check it again in one second. All right. Now submit it a hundred more times. And all of a sudden you have a 32- or 64-core server with 15,000 new connections hitting it,
all trying to run that query, because it didn’t run the first time, because microservices, and you don’t care — why are we just doing fire-and-forget and hoping everything goes well? I don’t understand that example, because if it’s taking multiple minutes to compile, wouldn’t it be worse to have many sessions all doing the multiple-minute compile?
Cause, cause I feel like you’re complaining that, you know, the code’s written very poorly. Well, you need to. Right.
But you do have to pick one or the other. If you have the compile storm, you have a bunch of sessions waiting on the lock. If you don’t, you have a bunch of sessions all using CPU and other resources, trying to compile the plan.
Like they’re both bad, aren’t they? Or, or like, I don’t really see why having a bunch of sessions waiting on a lock is worse than having a bunch of sessions all doing the multiple minute compile at once.
I mean, for that example, a better case would be: let’s say it only takes 15 seconds to compile. A lot of these, when I’ve worked on them, it’s been thousands of connections all coming in, all running the same query.
And if you let them all go and run and compile and do their stuff, then you’ll start getting somewhere. It’ll start making headway.
But if you let them all block up, then they continually block up. I mean, you’re still going to hit a point where you DDoS yourself, but there’s really not a good way — even if you take out the compilation storm, it’s still not good.
Right. Yeah. All right. So I think you’re saying there are some internals reasons why getting those hundreds or thousands of compile locks at the same time is going to gum things up.
Not to blame you guys in this case, cause it sounds like a pretty annoying problem, but there’s something about the lock manager, or whatever’s even doing it, that makes it unable to efficiently untangle all those thousands of compile locks.
Well, just think about it: if you have 20,000 connections all sleeping, waiting on one lock, and then the lock gets freed up, you have to go walk through 20,000 sessions and mark them all, you know, runnable.
Just the fact of doing that makes it ridiculous, right? You’re doing more overhead that way than to just say: actually, sit there, I’ll pick one of you.
None of you would get to run if we did it the other way; at least this way, one of you gets to run. Yeah. Yeah. My experience has been — maybe my example was a bit oversimplified, where there’s one query in a stored procedure that locks things up for whatever reason.
But when you have incredibly long stored procedures with a lot of IF branch logic, stuff like that — rather than compiling a plan for everything along the line in one go, putting stuff into dynamic SQL gives you some separation of duties, or putting stuff into other stored procedures gives you some separation of duties, some deferred compilation.
And a lot of the time that does solve a pretty big problem with the compilation locking stuff. I’ve just dealt with so many weird permutations of it that the dynamic SQL thing is a very obvious answer to me.
But I just really hate that dynamic SQL is now going to have that same sort of compile locking behavior. Because that really undoes a behavior that I’ve relied upon for — I think the first time I ever actually had to do that was early 2020.
So for me, that feature is a big loss, because it undoes a very reliable and dependable behavior that fixed a lot of problems. Sounds like we’re all looking forward to UQP in the future, right?
Yes. New Q, New QP. U QP.
U as in intelligent. Yes. About that 15 minutes. Yep. All right. Well, I’m good on this topic. Yep.
Me too. All right. We’re going to end this one here, and we’re going to talk about Standard Edition next — if anyone can still talk after all this. Yep. Yep. I’m doing it.
I’m doing it to all of you. All right. So we’re going to stop this recording, and thank you all for watching. I hope that you’ve found a way to survive this verbal intercourse. All right.
All right.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I dive into an absolutely fascinating query plan that Erik Darling from Darling Data explores on a Friday afternoon, perfect for winding down and enjoying some SQL Server shenanigans. We delve deep into the intricacies of a segment top operator within a cross-apply, revealing how it handles ties in user IDs and dates, which is both amusing and enlightening. The query plan itself is a delightful mix of expected and unexpected elements—like the batch mode hash join running alongside row mode operations, leading to some very peculiar bitmap handling that challenges our understanding of SQL Server’s execution plans. It’s a must-watch for anyone interested in the nitty-gritty details of how SQL Server processes complex queries, especially those involving multiple aggregations and joins.
Full Transcript
Erik Darling here with Darling Data. And today we are going to have an astoundingly obscene amount of fun, looking at what I think is a fun query plan. And since I have all the power here and I can do whatever I want, we’re going to be publishing this on a Friday. So everyone has a fun Friday and gets off to the start of a great weekend. Down in the video description, all sorts of useful links for you and me. You, so you can have SQL Server help or training, and me, so I can get money. It’s a wonderful trade-off. You know, there’s all sorts of labor theories involved here. If you want to hire me for consulting, buy my training, or support this YouTube channel, you are, I mean, I would say you are free to do all those things. You are free to do all those things, but none of those things are free. If you want to ask me office hours questions, the price of office hours questions is going to go up to $10,000 a question at the stroke of midnight, 2026. So get the free ones in while you can. And then, of course, if you like this channel and all the shenanigans therein, please do like, subscribe, tell a friend, all that good stuff.
After a long winter of solitude, I will be springing to life first at the Data Tune conference in Nashville, and that’s March 6th and 7th. And then Data Saturday Chicago, March 13th and 14th. Tickets are on sale for my Advanced T-SQL pre-cons at both of those. So if you’re in the area or just planning on attending, you don’t even have to actually attend. You can just buy a ticket and then have yourself a little Ferris Bueller day, and I won’t know the difference.
Just how many people showed up. I don’t know how many people bought tickets. I guess there’s an extra lunch. I don’t know. Now everyone gets an extra taco. Donate a lunch. You did a good deed. Look at you. What a sweetheart you are. Anyway, it is December. We have all our arms. We’ve got a ghost. Let’s go look at what I think is an interesting query plan.
So there’s an old Paul White blog post about the segment top query plan. And I want to explain a little bit about that before I get into the other parts of why this query plan is interesting. So we’ve got this sort of thing inside the cross-apply here, right?
This whole thing is inside a cross-apply. And we are selecting everything from the badges table where the date column in the badges table equals the max date column for each user, right? So we might expect to see the badges table referenced twice in the query plan because we have to get the max from the badges table here.
And then when we do all this stuff, we have to get the max per user ID, right? Because we’re correlating this here. So we might expect to see a join in this query plan, but whoa, we do not, right?
We have one scan of the badges table. There is not a hidden scan anywhere else. I promise you there is nothing hidden anywhere else in here, which is great.
And then we have this segment operator, which I’ll show you the details of that in a moment. And then we have a top operator. So this is all fun stuff.
Now, if we look at the segment operator, we’ll see the segment operator is grouping by user ID. So this is getting the max. This is sort of getting the max date per user ID. We don’t have that aggregation in here just yet, but that’s what the top is for.
And the top is a very interesting top because the top is running as top with ties, right? And there’s a good reason for that, right? User ID and date.
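For reference, the query being walked through looks roughly like this. I’m reconstructing it from the narration against the Stack Overflow sample database, so the exact column list and aliases are assumptions; the important part is the correlated max-per-user predicate, which is what the optimizer collapses into a single scan with a segment and a top (with ties):

```sql
SELECT u.DisplayName, ca.Name, ca.Date
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT b.Name, b.Date
    FROM dbo.Badges AS b
    WHERE b.UserId = u.Id
    AND   b.Date =
    (
        /* max badge date per user, correlated back to the outer row */
        SELECT MAX(b2.Date)
        FROM dbo.Badges AS b2
        WHERE b2.UserId = b.UserId
    )
) AS ca;
```

Even though the badges table appears twice in the text of the query, the plan reads it once: the segment operator groups rows by user ID, and the top with ties keeps every row matching the max date within each group.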
And I’m going to show you the good reason for that now. We have to digress from this query for a moment so I can show you.
When you’re looking for data like this, because user ID is not unique and date is not unique, there is some chance that there might be some users who have a max date that ties, right?
So, like, you might have multiple badges that you got at the exact same time. And that does actually occur in the data, right? So if we run that same query, but we group by user ID and date, and I’m going to filter out down here and say only things where the count is greater than one, and we run this, you see that there are a whole bunch of users who have gotten a whole bunch of badges at the exact same time.
It’s like they were just awarded all these badges in one go. I think this is because the process in Stack Overflow that awards badges runs at some interval and looks for people deserving of badges and says, hey, you get a badge.
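The check for tied dates described above is just a grouped count with a filter on it; something like this sketch, with column names assumed from the Stack Overflow sample database:

```sql
/* users who received more than one badge at the exact same time */
SELECT b.UserId, b.Date, COUNT_BIG(*) AS badges
FROM dbo.Badges AS b
GROUP BY b.UserId, b.Date
HAVING COUNT_BIG(*) > 1
ORDER BY badges DESC;
```

Adding the badge name to the GROUP BY, as the next step does, shows that even identical badges can be awarded to one user in a single batch.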
So some of these people got, like, a whole mess of badges all in one shot, right? Back when Stack Overflow was a popular website that people used and, like, you know, like, did stuff on, you know, you get all these badges for doing stuff.
So there are a lot of those. And even if we go a step further, right, and we look at this grouped by an additional element, now we’re also going to group by the badge name, right?
If we run all this, right, we still find that there are people who got a whole bunch of badges awarded at the same time, right? Just look at these top sort of numbers here.
Someone got four popular question badges at once. A bunch of people did. Look at this. Then a bunch of people got these revival badges.
I don’t know what the revival badges mean. I don’t know. Honestly, it’s been a long time since I cared about what any of the badges mean. But, you know, they got a whole bunch at the same time. Good for you, right?
You revived and populared a whole bunch of stuff. But let’s come back to this query because this is the one I think there’s interesting things in. And so we’re going to highlight this and run it with the whole thing this time, right? We’re sort of done with that initial middle section, the stuff going on in the cross apply.
And where this one gets interesting to me is we’ve already sort of examined this part, right? We have the index scan. We have our segment top section over here.
But where things get kind of neat to me is that over here we have a batch mode sort, right? You know, that’s maybe not the biggest deal. No, that’s not very exciting.
But over here we have a batch mode hash join, right? So we can see this hash match running in batch mode. But where things get kind of weird and funny to me, and I don’t know how far back this goes, is that SQL Server has had bitmaps forever, right?
We have row mode bitmaps and we have batch mode bitmaps. And if we right click on the hash join operator, we’ll see that the hash join is a bitmap creator and that we created optimized bitmap 1025, right?
So this is all like running and created in batch mode. And you would normally expect to see a bitmap applied like way down over here. But apparently batch mode bitmaps do not agree with rowstore indexes, right?
There’s just like they just can’t sort of line up on that stuff. That’s just sort of, I think it’s just a bit of a technical limitation with things here. So because this runs in row mode, we can’t apply the batch mode bitmap over here.
If this executed in batch mode, then I think we could. But we got row mode execution on the reads. So we don’t get the batch mode bitmap pushed all the way down here.
Instead, the batch mode bitmap runs in this kind of funny row mode filter, right? So we have this filter operator over here. And that is where the bitmap gets applied, which is very strange, right?
It’s just a weird thing to think. Like, what happened to this bitmap? Like, did you get converted to row mode? Like, are you still batch mode? Can you be created in batch mode and you can only run in row mode?
Like, what is happening in here? It is a very, very strange, very silly query plan to me. So, of course, like when you see a lot of these plans, right? And you’re like maybe troubleshooting performance.
This might stick out to you as looking very strange, because I always tell people that you should be suspicious of filters in query plans. Almost any time you see one, it signifies some non-relational thing that had to be expressed via a filter. It could be a very long, complex, almost non-sargable where clause.
It could be something where like you need to filter on a row number or filter on like a filter on a count. Like if we go back to the count query down here, we’ll see there’s a filter operator. But that makes total sense for there to be a filter operator here.
Because we don’t know how many rows grouped, right? We don’t know how many of the grouped rows will have a count of more than one. So, we have to generate that whole result and then filter out after we have like done the counting.
So sometimes you get a filter operator right where you would expect one, like things like this. But other times you get these weird filter operators.
And I always tell people to be very, very suspicious of filter operators. In this case, this filter operator is not that big a deal. I mean, granted it would be nice if, you know, any of this could run in batch mode.
Or this did run in batch mode and we could push the bitmap down further. But instead we get this row mode, batch mode bitmap thing over here. And we do a whole bunch of filtering.
You’ll see that this reduces rows from 1,333,900 to 10,000 rows. But coming back to what I was talking about earlier with the top that runs with ties, right? Top with ties over here.
What I think is pretty amusing is that this being an inner join and all, right? We get 10,038 rows from this, right? Just from seeking into the user’s table.
But then we get 10,054 rows down here. So some of the people in here would have that same, like, tied max creation date per user ID thing going on. So I thought that was very interesting as well.
Because we end up with 10,054 rows from down here. Rather than eliminating any rows because we only got 10,038 from here. So I just thought this plan was adorable and interesting and had a lot of funny things going on.
And I wanted to talk about it. And I chose to talk about it with you. Because you’re the only people who would care if I tried to tell my wife and kids about this. They would just leave the room.
They might even leave the state, right? Country even. Who knows? I would never see them again. Anyway, happy Friday.
Thank you for watching. I hope you enjoyed yourselves. I hope you had… I hope you… I don’t know. What else? That’s it. Anyway. Get back to drinking, you lazy bums.
Alright. Goodbye. Goodbye.
Stored Procedure IF Branching and Deferred Compilation In SQL Server
Video Summary
In this video, I delve into the concept of deferred compilation in SQL Server, specifically focusing on how it can mitigate issues related to if branching within stored procedures. I explain that when SQL Server initially compiles a plan for a procedure containing if branches, it often does so based on initial parameter values, which can lead to unexpected behavior later when different parameters are passed. To illustrate this, I demonstrate an example using temporary tables and show how deferred compilation can help in such scenarios by deferring the actual compilation until the temp table is populated. Additionally, I provide tips for those interested in consulting services or training, including reasonable rates for my consulting work and membership options to join our channel community.
Full Transcript
Erik Darling here with Darling Data. And this video is going to be about how deferred compilation can help with some if branching issues that you might see in SQL Server. One of the big problems that you run into with if branching is that all of the plans in your procedure get compiled when SQL Server first compiles a plan for the stored procedure, right? So if you have if branches and you’re thinking, well, for the compile-time values that I pass in, SQL Server is never going to look at this branch because we’re not going there yet: wrong. It compiles a plan for the entire procedure based on the initial set of compilation values. So if some of your parameters have null values assigned to them, you might run into some very strange stuff when that branch actually does execute, because you will be executing it with real values. So that can get very strange. But there are, of course, exceptions to that. And I don’t mean option recompile at the statement level or with recompile at the stored procedure level. There are things in your stored procedure that might make that untrue and cause deferred compilation. So we’re going to look at an example of that today. If you would like to hire me for consulting, because that’s what I do for a living: I consult, and I make SQL Server deliver faster in exchange for money.
It’s my gig. It’s my whole thing. And remember, my rates are reasonable. Yeah. All right. Been a while since I said that on camera, I think. If you’d like to buy my training, also at a reasonable rate, you can do that. If you would like to become a channel member, you have a variety of reasonable rates to choose from. Monthly plans are as low as $4. Four. Crazy, I know. You can also ask me office hours questions. That is free. Well, it’s gonna be $25, $50, because you’re all clearly spending far too much money on illicit substances, and it is degrading your ability to form coherent questions.
So, you need to start giving that money to me instead. I’m a safe keeper for it. And of course, if you like this channel content, please do like, subscribe, tell a friend, all that good stuff. You can tell a family member too, if you have a family member who you would like to torture with these videos. Advanced T-SQL training will be coming to DataTune Nashville March 6th and 7th, and March 13th and 14th at Data Saturday Chicago. Two wonderful events. Get out there in the world, you know, wear some sunglasses, look cool. Maybe we can get incoherent together. Who knows?
Don’t know what that would look like. But for now, we must muddle through December, January, and February somehow. All right. Let’s look at this thing. When you use temporary objects in stored procedures, for everybody, that’s going to mean temp tables. For people in certain SQL Server situations, that would also mean table variables.
If you are getting the table variable deferred compilation intelligent query processing feature in your query plans, well, only you can find that out. I can’t tell you that. You have to figure that out. But for temp tables, you can see this happening. So let’s actually add, down here, a DBCC FREEPROCCACHE and a GO, just in case.
So we’re going to look at this stored procedure. A temp table is going to get created, but it’s not going to get used, right? We’re going to have two joins to a temp table. I don’t need to put data in the temp table; it’s just to show you that the temp table will cause deferred compilation.
And we’re going to have that in both of these queries. But first, let’s make sure this stored procedure runs without using a temp table. Let’s do a DBCC FREEPROCCACHE and run the stored procedure using the reputation parameter first, right? So if we do this and we look at the execution plan, and granted, I haven’t created any helpful indexes for this.
That’s really not a performance demo. It’s just a behavioral demo. We look at the parameter list, and we will see that over here we have compile and runtime values for reputation of 500,000, right? So the initial compilation for the stored procedure was done with reputation at 500,000.
And that is also getting the actual execution plan, the runtime value for this execution. The reputation parameter gets used in this query against the users table. The score parameter gets used against the post table in this query.
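Pieced together from the narration, the demo probably looks something like this sketch. The procedure name, predicates, and values are my assumptions, not the exact demo code, and the tables are from the Stack Overflow sample database:

```sql
CREATE OR ALTER PROCEDURE dbo.if_branches
(
    @Reputation integer = NULL,
    @Score integer = NULL
)
AS
BEGIN
    IF @Reputation IS NOT NULL
    BEGIN
        SELECT COUNT_BIG(*) AS records
        FROM dbo.Users AS u
        WHERE u.Reputation >= @Reputation;
    END;

    IF @Score IS NOT NULL
    BEGIN
        SELECT COUNT_BIG(*) AS records
        FROM dbo.Posts AS p
        WHERE p.Score >= @Score;
    END;
END;
GO

/* start from a cold plan cache, then run each branch in turn */
DBCC FREEPROCCACHE;
EXECUTE dbo.if_branches @Reputation = 500000; /* @Score compiles sniffed as NULL */
EXECUTE dbo.if_branches @Score = 9999;        /* runtime value 9999, compile time value NULL */
```

In the second half of the demo, each branch gains a spurious left join to a temp table created at the top of the procedure, which is what defers compilation of those statements until they actually run.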
So if we run this now for score equals 9999 (that’s four nines), we look at the execution plan and do the same thing as last time. We look at the parameter list.
Where are you hiding from me? There we go. We will see that the runtime value was 9999. But that’s not right.
That’s not right. The compile time value for this is null. All right. So SQL Server sniffed a null value, and it’s doing cardinality estimation based on that null value. We have a one row estimate here, right?
So we got 12 rows back. SQL Server estimated one row for null, and we found 12 rows based on what we were looking for. Is that weird? Maybe, but not really.
There were four rows with a score greater than or equal to 9999, but there was a one row estimate from the null compile time value. So this is something worth noting.
If you have lots of if branching in your stored procedures, and lots of different parameters get used in different if branches, then as you pass things in, you might actually supply different values up here.
Things can get pretty weird, right? It’s like another layer on top of parameter sensitivity issues. But what I want to show you now is what happens if we put these left joins in. I’m only using these very spurious left joins to show you the behavior of involving a temp table.
This is the deferred compilation that you get from all temp tables, and from some table variables, depending on your SQL Server version and edition and all that good stuff. So we rerun this and recreate the stored procedure.
Let’s clear out the plan cache just to make sure. And what we’re going to do is the same old boring thing here: we run this for reputation equals 500,000 first, and we get about the same thing happening for the first execution, where the compile and runtime values are both 500,000.
But now if we run this for score equals 9999 and we look at the execution plan, of course, you’ve got to flip that around a little bit.
Now, in our parameter list, we have a compile and runtime value of 9999. So if you’ve heard me go on and on about if branching in stored procedures, and how the behavior that we looked at without the temp tables can mess you up: if you’re using temp tables in your if branches, you might not see that exact behavior, because SQL Server might be deferring compilation of those queries until the temp table is populated and the query runs against it.
Right. So until SQL Server has to compile a plan that uses that temp table, you might see this instead of the other behavior where it sniffed a null, like in the last set of runs. So just something kind of interesting there.
That’s about it on this one, as usual. There’s a very loud truck horn outside. I don’t know if that’s picking up on the microphone, and I won’t know until the recording is complete, but it was perfectly timed with the silence there.
Anyway, I hope you enjoyed yourselves. I hope you’ll learn something and I will see you in tomorrow’s video. Uh, I forget what day tomorrow is.
I usually do, but we’re going to be there and we’re going to have fun. All right. Thank you for watching.
A Neat Trick with Using SELECT to Assign Variable Values
Video Summary
In this video, I delve into the nuances of variable assignment in SQL Server using `SET` and `SELECT`. Specifically, we explore how these commands behave differently when no rows are found, highlighting the peculiarities of `SELECT`, which silently keeps the previous variable values when the result set is empty, while `SET` overwrites them with nulls. I demonstrate a clever workaround to make `SELECT` behave more like `SET`, especially useful in scenarios involving multivariable assignment within loops. This technique ensures that your code remains robust and avoids potential infinite loops or unexpected behavior when dealing with non-existent rows, providing a valuable lesson for anyone working with variable assignments.
Full Transcript
Erik Darling here with Darling Data. And we have a fun little video today. If you watched the earlier Learn T-SQL course (and if you were kind enough to purchase it, well, thank you), or if you saw some of the preview videos here on YouTube, in one of them I talk about some of the differences between set and select when assigning variable values, and how the behavior is strange. Set is kind of annoying because you can’t do multiple variable assignments with it. You can with select, but select and set act differently when a row is not found: variable values are either maintained or overwritten with nulls.
So in this video, we’re going to talk about how to make select behave more like set when you’re doing multivariable assignment. I mean, technically it would work with single variable assignment too, but you may not want to write this into every single query. Anyway, down in the old video description, boy, oh boy, if you want to hire me for consulting, you can do it.
The power is in your hands. As long as the credit card is in your hands, the power is in your hands. You can buy my training.
You can become a supporting member of the channel. Likewise, you can do things for free. Free stuff.
Ask me office hours questions, though that may soon cost $5. $10. Maybe up to $25. Because with some of the questions that come in, I think some form of angel dust was involved.
And if you enjoy this content, you can, of course, like, subscribe, and tell a friend or two or three or a thousand. However many you have, just break out the old Rolodex, flip through, shoot everyone a fax, say, hey, check out that Darling Data YouTube channel about SQL Server. Out in the world, yeehaw!
Nashville and Chicago in March. Back to back weekends, 6th and 7th, 13th and 14th. I will be doing pre-cons at both on Advanced T-SQL. And you should go.
You should come see me. You should go to the events. You should get out and support the wider data community before, you know, I don’t mean to sound morbid here, but any time you do something could be the last time you do it. So, get out into the world and do some, do some good and fun things, like come to data platform events.
Anyway, Merry Christmas. Let’s look at how to make select behave more like set. So, the first thing we have to do to set this adventure up is look at a couple selects from the users table.
You will notice that if I select, and I’m going to show off a fun SSMS 2022 thing here. I’m going to dynamically zoom in on these results. ID5 returns Mr. John Galloway.
I don’t know John personally. He seems wonderful. I appreciate when, I appreciate that he is in my first name boat where everyone probably looks at his name like in an email or something like that and still finds a way to spell it incorrectly when they type it. So, we have John Galloway here for ID5.
And then for ID6, we return no rows, right? There is not a single result down here. There is nothing. I’m not hiding anything from you, I promise. But that is for ID6.
Now, if we were to do this, let’s say under normal circumstances, and we were to run this set of queries here, where we declare some variables, and we set them equal to stuff for ID5, and then we look at the contents, and then we set them equal to ID6, and we look at the contents. These are the results that we get back, right? Because no row is produced for ID6, nothing is overwritten, right?
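The behavior being shown can be sketched like this; the variable and column names are hypothetical, against the Stack Overflow sample database:

```sql
DECLARE @display_name nvarchar(40), @reputation integer;

/* Id = 5 exists, so both variables get assigned */
SELECT @display_name = u.DisplayName, @reputation = u.Reputation
FROM dbo.Users AS u
WHERE u.Id = 5;

/* Id = 6 returns no rows: the SELECT assigns nothing,
   and both variables silently keep their previous values */
SELECT @display_name = u.DisplayName, @reputation = u.Reputation
FROM dbo.Users AS u
WHERE u.Id = 6;

SELECT @display_name, @reputation; /* still the Id = 5 values */
```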
We may just have the same variable values in here. Golly. Anyway.
Pardon me. We have this here. So, this can get people into a lot of trouble if they are unaware of this behavior and they are assigning variables to something in a loop. And then all of a sudden, they stop finding new values.
And so, they just keep either assigning nothing to this row and, like, reprocessing whatever row is in that current set of variables. Or, I don’t know, just, like, infinite loop, right? They’re just, like, they’ll, like, process this thing multiple times, or they’ll end up in an infinite loop processing the same thing over and over again, right?
Not a good time. Something to be very much aware of when writing anything that does variable assignment. Now, set, of course, doesn’t do this, but you can’t do multivariable assignment with set, because what happens when you try is you quickly find out that you can’t, right?
We would have to write three separate set queries against the users table in order to have that work. So, something that I picked up while working with a client was that you can have a dummy row, right? And you can do something like this with the values clause.
You could, of course, just say select null or something if you felt like it. You don’t have to use values, but I like to use values because I don’t have enough of them. Quite frankly, there’s a real shortage of values in the Darling household.
So, every opportunity I have to use values, gosh darn it, I try to get it in there. So, what you can do is something like this where you say select from values and then outer apply to your query that would do the variable assignment and then at some point do the variable assignment out here. You couldn’t do it in here.
That wouldn’t work. So, if we do this and actually I think we have a few things to run down here. Run all this stuff. Now, what we have is John Galloway set when we hit ID 5 and then we have overwritten those rows with null.
Or rather, we have overwritten those variables with null values when we did not find a row for ID 6. So, this is a nice way of making sure that your loop hits some null check condition or something when it doesn’t find anything to do. And you can do this without having to like figure out like, hey, what are the current, like what are the last set of values that I just processed?
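The dummy-row trick looks roughly like this sketch (variable and column names are my assumptions): since the VALUES clause always produces exactly one row, the OUTER APPLY pads it with nulls when no match is found, so the assignment always fires:

```sql
DECLARE @display_name nvarchar(40), @reputation integer;

SELECT @display_name = x.DisplayName, @reputation = x.Reputation
FROM (VALUES (0)) AS dummy (d) /* guaranteed single row */
OUTER APPLY
(
    SELECT u.DisplayName, u.Reputation
    FROM dbo.Users AS u
    WHERE u.Id = 6 /* no matching row */
) AS x;

/* both variables are now NULL, which is what SET would have produced */
SELECT @display_name, @reputation;
```

That null result is exactly what a loop can check for to know it has run out of work.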
Do I have that same set of values? Or like something like that. You know, I guess you could, you know, if you’re feeling fancy, you could do something with like row count and be like, well, did I find any rows there?
No? Okay. Well, this is just another way of doing that. This might actually be a little bit more flexible in a lot of ways because, you know, if you find nulls, you might have to go do something else or something.
Right? You don’t want to necessarily like break out of stuff. You just might want to say, hey, these are all null. Or, hey, one of these is null and it shouldn’t be.
Or, hey, two of these are null and they shouldn’t be. Or something like that. Right? There’s like all sorts of things that, you know, having this assigned would allow you to, you know, maybe explore within your loop and say, hey, that ain’t right. So, just a neat kind of trick that I learned about.
And I thought you would enjoy. That’s all I had for this one. Thank you for watching. I hope you enjoyed yourselves.
I hope you learned something. And I will see you in tomorrow’s video. I do forget what the topic is. You’ll have to forgive me. But it is one of these three windows over here. So, there is something more to talk about.
Isn’t that wonderful? All right. Thank you.
A New Query Hint to Override Batch Mode on Row Store Heuristics
Video Summary
In this video, I share an exciting new query hint that I recently learned from a fellow data enthusiast, who prefers to remain anonymous for now. This hint allows you to override SQL Server’s batch mode heuristics, which can be particularly useful when the default settings don’t quite meet your needs. I demonstrate how this works by showing a simple example where applying the “use hint override batch mode heuristics” option changes an index scan and top sort from row mode to batch mode on rowstore, highlighting its potential benefits. Additionally, I point out some quirks in SQL Server Management Studio 2022 that might affect your experience when using this feature, such as a keyboard shortcut issue with the execution plan tab. Overall, this hint offers a handy workaround for situations where other methods of enabling batch mode on rowstore are not feasible.
Full Transcript
Erik Darling here with Darling Data. And we have an exciting video for you today. Because, not me, but someone out there in the world who is very handy with a debugger, I’m going to protect their name because I don’t necessarily want them to get yelled at by Microsoft in case this is explosive information. But there was a new query hint that I was made aware of from Russia with love. And it is something that I think I am going to find very, very useful in my life. And it is a query hint that allows you to override the batch mode heuristics. And by this I mean SQL Server in 2019 introduced the batch mode on rowstore feature.
The problem is that even with that feature, like SQL Server 2019 Compat Level 150 Enterprise Edition, even if you check enough boxes to get to the point where SQL Server will start applying heuristics to your queries, those heuristics may not always kick in when you want them to. And you may like have to find yourself doing stupid things in order to have batch mode on rowstore kick in for your queries. And you may not want that because the heuristics might be stupid. If you remember my parameter sensitivity training videos, I talked a lot about the parameter sensitive plan optimization and how it has heuristics and decides when or when not to kick in, which you may or may not disagree with.
And then furthermore, of course, we talked about its poor bucketing strategies, but that is way too much for this video. We have other cool stuff to talk about. So down in the video description, you’ll see all sorts of helpful links. You can hire me for consulting, you can buy my training and you can become a supporting member of the channel.
These are all things that do require you giving me money. You can do things for free. We all like free things.
You can ask me office hours questions. You know, I do appreciate a thoughtful question from the audience. And of course, if you enjoy this content, you can like, subscribe, tell a friend, all that good stuff.
If you want to see me out in the world, it’s going to be a few months, but hey, that’s okay. It’s going to be cold and you’re not going to, you know, you wouldn’t like me when I’m cold anyway. Data Tune in Nashville, March 6th and 7th.
Tickets are on sale for pre-cons there. And Data Saturday, Chicago. Well, tickets are on sale for pre-cons and a regular event in both of these things. And then Data Saturday, Chicago, March 13th and 14th.
I’ll be there as well with the pre-cons. So buy my book. Buy my book. Buy my book.
And so come see me out in the world when I’m nice and warm. Maybe I’ll wear some shorts. Who knows? We might get crazy. Anyway, I remembered to change my deck image because we are now in the December month. And I better watch out.
I think there’s a mistletoe on there. So there might be some smooches in your future. But it is a Christmas miracle. All of our friends have all of their arms. I think little Timmy over there grew his arm back.
So we can all… We still have a ghost in here. A leftover ghost. A Christmas ghost. Not sure if this is past, present, or future.
But we have a Christmas ghost. Leftover from Halloween. So I guess we didn’t take all the decorations down. Some of our decorations have been repurposed.
Anyway. Let’s talk about this hint. So first thing I’m going to do is just show you this wonderful hint. It is a use hint.
So we need to do the whole OPTION thing. And we need to say USE HINT: override batch mode heuristics. Rolls right off the tongue, as many of these hints do. This hint will not appear in the sys.dm_exec_valid_use_hints DMV.
Much like some of our other favorite hints, like ENABLE_PARALLEL_PLAN_PREFERENCE. Which, you know… Again, since Microsoft started using this in their code to create the DiskANN vector indexes.
I assume it’s safe for everyone to use in production. Because Lord knows they don’t test crap. But we’ve got this one now.
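As a sketch, here is what the syntax looks like in a query. The underscore spelling of the hint name is my assumption from the spoken form, the hint is undocumented and could change or vanish without notice, and the query itself is just a made-up stand-in against the Votes table:

```sql
/* Hypothetical demo query. The hint name's underscore spelling is
   assumed from the spoken form; it is undocumented and does not
   appear in sys.dm_exec_valid_use_hints. */
SELECT TOP (100)
    v.PostId,
    COUNT_BIG(*) AS records
FROM dbo.Votes AS v
GROUP BY v.PostId
ORDER BY records DESC
OPTION(USE HINT('OVERRIDE_BATCH_MODE_HEURISTICS'));
```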
How nice. How lovely. So the first thing I want to do is just show you the query plan for this without the hint applied.
And if we run this and we look at the execution plan. It’s nothing terrible. But we can see that this index scan… This is not like, wow, look at how much better performance is. This is just to show you that it’s there and functional.
Then this is like the first demo query that I had where I could show you that quickly. And I just wanted to get this out quickly. So, you know, I guess more interesting stuff will happen in the future. But we can see the actual and estimated execution mode for this index scan is row.
Likewise, we can see this Top N Sort also occurring in row mode. All right. Cool.
Most of the other operators in the plan are not eligible for batch mode anyway. For example, the repartition streams. Or rather, the parallel exchanges in general. So gather streams, distribute streams, repartition streams, blah, blah, blah.
Nested loops, not eligible. This sort would be eligible. But this top would not be eligible. But now… Actually, this is a good time to tell you about a strange buggy thing that currently is a problem in SSMS 22.
SQL Server Management Studio 2022. Is that Ctrl+R, the keyboard shortcut that allows me to quickly sort of hide results, doesn’t work from the execution plan tab. I’ve opened up an issue about it.
But you have to either click back to the script or click back to messages or results in order to hide them. Which is a little strange. I don’t know how or why that happened. But I’m sure Erin and co. will get that fixed very quickly.
But anyway, if we allow this hint to work its magic and do its thing. Override batch mode heuristic. Look how patriotic this is.
Look at this red, white, and blue over here. Hello, dark mode. If we run this now and we look at the execution plan, we will see this index scan now operates in batch mode on rowstore. Right?
There’s batch mode and there’s rowstore. And we will also see that our Top N Sort over here is also in batch mode, also on rowstore. But you can see the batch happening there.
This sort I don’t think gets it. Oh, thanks, tooltip. No, this sort for some reason remains batch mode free. But these two operators over here changed to batch mode. So if you are in a situation where, and again, Ctrl+R coming back to bite me.
If you’re in a situation where you want to see if batch mode on rowstore will work for you. And for some reason the other more common tricks for getting it to work are not options for you. Such as creating an empty non-clustered columnstore index filtered to a result that can produce no rows.
Or, you know, doing the left join on 1 equals 0 to a columnstore organized object that is empty. And you can’t make those changes. Then, you know, this hint might be the one for you.
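For reference, here are hedged sketches of the two more common tricks mentioned above, with made-up object names. The contradictory filter in the first one guarantees the columnstore index can never contain a row:

```sql
/* Trick 1: an empty filtered nonclustered columnstore index.
   The contradictory filter means it can never hold a row. */
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode_enabler
ON dbo.Votes (Id)
WHERE Id = -1 AND Id = -2;

/* Trick 2: a left join on 1 = 0 to an empty columnstore-organized
   table, appended to the query you want in batch mode. */
CREATE TABLE dbo.batch_mode_helper
(
    i integer NOT NULL,
    INDEX cci CLUSTERED COLUMNSTORE
);

SELECT
    v.PostId,
    COUNT_BIG(*) AS records
FROM dbo.Votes AS v
LEFT JOIN dbo.batch_mode_helper AS h
    ON 1 = 0
GROUP BY v.PostId;
```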
I think the other nice and convenient thing about this hint is that you can create plan guides and stuff in Query Store using it. And then you could, you know, have SQL Server use query plans and stuff that apply the hint.
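As a sketch of that idea, assuming SQL Server 2022’s Query Store hints feature, and assuming this undocumented hint is accepted there (I have not verified that), you could pin the hint to a query without touching its code. The query_id below is a placeholder:

```sql
/* 12345 is a placeholder query_id you would look up first in
   sys.query_store_query / sys.query_store_query_text. Whether an
   undocumented hint is accepted here is an assumption. */
EXECUTE sys.sp_query_store_set_hints
    @query_id = 12345,
    @query_hints = N'OPTION(USE HINT(''OVERRIDE_BATCH_MODE_HEURISTICS''))';
```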
So there are upsides to this that perhaps the other tricks do not make available to you. Anyway, something I learned. Something I thought I’d pass on to you.
Because I like you. I think you’re smart and you’re funny. You’re good looking. Everyone likes you. And you, I don’t know.
Let’s just roll out some superlatives here. I think you’re the most. What? Just that. Anyway, thank you for watching.
I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where we will talk about something equally compelling in the world of SQL Server. All right.
Goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
When does make sense to use tables in user DBs rather than tables in tempdb to hold intermediate results in large stored procedures? Assume no hardware difference on user db drive vs tempdb drive.
We are currently suffering from parameter sniffing issues in our stored procedures (SP) due to the heavy use of IF statements for validation, permissioning and case handling. We are aware that we could use dynamic sql to improve the problem as it would only create and cache execution plans when it’s triggered, as opposed to regular sql statements where plans will get generated even if it’s not executed for the first time the SP is called. Does this mean that we will have to change every sql statement to dynamic sql (bloating the whole SP)? Alternatively, we thought of transferring our SP code to the app level and trigger each sql statement / blocks of SQL statements with Dapper, however let the backend app handle the IF statements. Would this help with parameter sniffing – given that it wouldn’t reuse plans like SP if it isn’t triggered? We are not really familiar we Dapper either, and we would like to know what you think about this (disadvantages and trade-offs). We are also open to other suggestions to tackle this challenge. Thanks in advance 😀
Azure SQL Db have Automatic tuning (identifying expensive queries, forcing the last good execution plan, adding indexes, removing indexes). Does it work?
Erik this is reErik this is really P*ss*ng me off and I can’t find the answer (and yes I’ve tried AI). How do you stop SSMS (22) from putting square brackets [] round all object names? Arrgghhh please help me before I bite my own head off with frustration. (from one of your loveliest channel members) xxx
is it okay to truncate table tables? i read aboit dropping them being bad.
In this video, I dive into a series of community-submitted questions during an office hours session. I tackle topics ranging from when it makes sense to use tables in user databases versus tempdb for intermediate results in stored procedures, to dealing with parameter sniffing issues and the pros and cons of using dynamic SQL or moving logic to the application level. I also discuss Azure SQL DB’s automatic tuning features and why they fall short of expectations. Additionally, I address the frustration of SSMS 22’s habit of adding square brackets around object names and offer potential solutions. Lastly, I explore whether it is acceptable to truncate temp tables within stored procedures for better tempdb management. It was a packed session with plenty of valuable insights!
Full Transcript
Erik Darling here with Darling Data and it is a fine Monday, which means we need to have an office hours. Hey, you didn’t see that coming, did you? In which I answer five community submitted questions of varying degrees of quality. And I don’t know. I don’t know. Maybe I should figure out a way to let people vote on these things. I don’t know. Anyway. Down in the video description, you will see all sorts of helpful links. You can hire me for consulting, buy my training, become a supporting member of the channel. All of these things do require money. But you can do free stuff as well. Like, ask me these office hours questions. Maybe I should start charging a dollar for those. Because then people might put a little bit more into them. Who knows, though? And of course, if you like this content, you can always like, subscribe, tell a friend, tell multiple friends if you have multiple friends. Your mother and I would be very proud of you if you had multiple friends. If you would like to catch me out in the world, I have a couple confirmations for starting off the 2026 conference season. I should have some more of these coming up. There’ll be some pass-on tour dates, probably, and, well, some other stuff, too. We’ll see how life goes.
But Datatune Nashville. I actually announced this a little bit early. My fault. They just officially announced pre-cons today. Tickets go on sale today. So if you go to the Datatune Conf website, you should be able to figure out how to buy a ticket to come see me. I’ll be doing Advanced T-SQL at Datatune. And then Data Saturday Chicago. More Advanced T-SQL. Someday I’ll learn my lesson, right? But tickets are on sale for the pre-cons there. So please do go ahead and buy those and come see me and hang out as the weather starts turning more baseball-y and we can all get back to wearing normal clothing, I hope. But with that out of the way, oh, you know what? It’s December. I actually have a new image for this. I just haven’t changed it yet.
But so I’m going to switch over to the Christmas one for the next video, I promise. Typical oversight on my part. Not updating the images there. But I have a brand new one for the month of December. So we’re going to get all Christmassy up in here. But anyway, we’ve got, again, some questions here. And ZoomIt is being a pain in my entire body.
And let’s start way up at the top. Let’s make sure we have one, two, three, four, five. We have five questions. I counted correctly. We are off to a great start, my friends. When does it make sense to use tables in UserDBs rather than tables in TempDB to hold intermediate results in large stored procedures?
Assume no hardware difference on UserDB drive versus TempDB drive. I would say almost never. Unless the data that you’re loading is something that you, is like something maybe for like a staging table.
And if the process fails, you don’t want to like restage it. You know, like you might want to preserve the data to say what the hell happened. Then it makes total sense to use a physical table in a user database rather than a temporary table.
The rest of the time, it’s pretty much a hard no from me on that. I can’t think of a lot of exceptions to that rule off the top of my head. But like, honestly, unless you need to preserve that data for some, you know, forensic analysis of, again, like, what-went-wrong-here type thing, I can’t see a good reason for that.
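A minimal sketch of the one exception described above: a permanent staging table in a user database, with made-up names, so a failed load leaves the rows behind for forensic what-the-hell-happened analysis instead of vanishing with the session like a #temp table would:

```sql
/* Hypothetical staging table; all names are stand-ins. */
CREATE TABLE dbo.orders_staging
(
    load_id    integer       NOT NULL,  /* which load attempt produced this row */
    loaded_at  datetime2(0)  NOT NULL DEFAULT SYSUTCDATETIME(),
    order_id   integer       NOT NULL,
    payload    nvarchar(MAX) NULL
);

/* If downstream processing fails, the rows for that load_id are
   still here to inspect or re-run, unlike a #temp table. */
```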
And ZoomIt is still being a pain in all of my body. Let’s see. Oh, well, it’s letting me draw more things.
It’s just not letting me get unzoom. That’s fun. There we go. Hey, thanks, buddy. All right. This is a long question.
It’s really just like, hey, maybe you need a consultant. We are currently suffering from parameter sniffing issues in our stored procedures due to heavy use of if statements for validation, permissioning, and case handling. We are aware that we could use dynamic SQL to improve the problem, as it would only create and cache execution plans when it’s triggered, as opposed to regular SQL statements, where plans will get generated even if it’s not executed, the first time the SP is called.
Does that mean that we will have to change every SQL statement to dynamic SQL, bloating the whole SP? Alternatively, we thought of transferring our SP code to the app level and triggering each SQL statement or block of SQL statements with Dapper, and letting the backend app handle the if statements.
Okay. Would this help with the parameter sniffing, given that it wouldn’t reuse plans like an SP if it isn’t triggered? Well, we are not really familiar with Dapper either.
And we’d like to know what you think about this, disadvantages and trade-offs. So, it sounds like the problem you’re hitting is not the classic parameter sensitivity issue, where the cached plan for one set of parameter values is not a very efficient plan for a different runtime set of parameter values. What you’re dealing with more is the null sniffing thing.
In general, and you kind of went all over the place with the bloating-the-plan-cache stuff, but usually what people get concerned about in these situations is this: let’s say you have a stored procedure, and I actually have a similar video, not exactly this, but a similar video, coming up, I think probably tomorrow or the day after. The problem that you run into is that when you have multiple parameters for a stored procedure, and maybe one set of parameters is not supplied values and SQL Server sniffs a null for them, then you run into issues.
So, in general, you really only want to do the dynamic SQL where the sniffing occurs, right? I’m not terribly familiar with Dapper either. I have run into many people who didn’t feel like writing dynamic SQL and just handled the what-am-I-going-to-run logic in the app code.
That’s totally fine. That’s totally valid. I don’t have a problem with you doing that. If you want to learn Dapper and figure out how to, you know, run that stuff, go ahead and do it.
But, you know, the typical caveats apply to that: make sure that your parameters are strongly typed in Dapper, and make sure that your code is parameterized, right? Those two things are going to be the big important things.
Otherwise you will have a completely different set of terrible problems. In general, when you run into this issue, the only code that needs to be put into the dynamic SQL is the code that would potentially sniff a null parameter value, and then on execution use a terrible plan based on that null parameter value.
So I think that’s my final thought on that. No, you don’t have to make everything inside the stored procedure dynamic SQL. You could certainly group batches together in dynamic SQL, but, you know, that gets a little funny looking to me.
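A minimal sketch of the pattern described above: only the sniff-prone statement goes into parameterized dynamic SQL, with predicates added only for parameters that were actually supplied, so a sniffed null never shapes the plan. All table, column, and parameter names here are made up for illustration:

```sql
CREATE OR ALTER PROCEDURE dbo.search_things
    @owner_id integer     = NULL,  /* hypothetical optional parameters */
    @status   varchar(20) = NULL
AS
BEGIN
    /* Build only the sniff-prone statement dynamically; each branch
       of the search gets its own plan. */
    DECLARE @sql nvarchar(MAX) = N'
        SELECT t.id, t.owner_id, t.status
        FROM dbo.things AS t
        WHERE 1 = 1';

    IF @owner_id IS NOT NULL
        SET @sql += N' AND t.owner_id = @owner_id';
    IF @status IS NOT NULL
        SET @sql += N' AND t.status = @status';

    /* Strongly typed parameters keep the statement parameterized
       and plan-cache friendly. */
    EXECUTE sys.sp_executesql
        @sql,
        N'@owner_id integer, @status varchar(20)',
        @owner_id = @owner_id,
        @status = @status;
END;
```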
Azure SQL DB has automatic tuning: identifying expensive queries, forcing the last good execution plan, adding indexes, removing indexes. Does it work? Let me ask you a question.
If something like this worked, do you not think that you would see a lot of people just on the roofs, on the balconies, singing its praises, right? Tearing up their SQL Server training, taking down their SQL Server. Like we don’t have to worry about it anymore.
Azure SQL DB have automatic tuning, identifying expensive queries, forcing last good plan, adding indexes. Indexes, removing indexes. We would just not have to be concerned for one moment.
Would we? The answer is it sucks. I’ve seen people use it. The index tuning garbage, the forcing a last good plan.
Well, it forces the last better plan, but that might not be a good plan. And sometimes it doesn’t even force anything. And other times the forcing fails.
And then you end up with a whole different set of terrible things. So, no, if something like this truly worked, you would hear about it. This would be a major breakthrough in database management.
But it doesn’t work. It tries to work, but it doesn’t work. So, no.
No, it doesn’t. No, it doesn’t. Eric, this is Rerick. Hi, Rerick.
Nice to meet you. This is really pissed. Well, I guess passing me off. Well, that wouldn’t be any better.
I don’t know. Anyway, and I can’t find the answer. And yes, I’ve tried AI. Well, I’m glad that you used AI and then you finally, you know, deigned yourself to ask a human being. How do you stop SSMS22 from putting square brackets around all object names?
Arg. Please help me before I bite my own head off with frustration. From one of your loveliest channel members, XXX. Well, all right, lovely channel member.
I have good news and I have bad news for you. The good news is that your head will probably taste lovely. The bad news is, did I say bad news already?
Anyway, the bad news is I don’t think that option is in SSMS22. I went through all the, like, script your stuff out settings. And I didn’t see anything about not putting square brackets in there.
The one thing that I can maybe recommend is Redgate SQL Prompt, which has a lovely keyboard shortcut that says remove square brackets. And it’ll just strip them all out. Otherwise, it’s like, you know, the whole find replace thing on the square brackets.
But that’s not really fun to do every time, is it? I don’t know. The good news is that SSMS22 is under very active development.
And one could very reasonably make a feature request to say, hey, I miss being able to remove square brackets from the scripting options. Maybe, maybe, maybe you could add that in for me.
There might be a good reason why they remove that. I’m not sure what it would be. But, you know, it doesn’t hurt to ask the people who build the product. What’s up with that?
Is it okay to truncate table tables? I read avoid dropping them bad, being bad. I’m going to assume you meant temp tables.
Sorry. I’m going to assume you meant temp tables. And, yeah, truncating temp tables is okay.
I’ve not run into a problem with people doing that. Dropping them in a high concurrency environment can certainly add some tempdb contention into the mix, because you’re messing with how SQL Server caches and reuses temp table stuff.
That’s for explicitly dropping them, though. Yeah, at the end of the stored procedure, like just DROP TABLE, pound sign, whatever.
That can all certainly not be great. But truncating them is okay. I actually have one customer who creates some very, very large temp tables along the way in their procedures.
And, like, midway in the stored procedure, they might be done with specific temp tables, and they will truncate those to just clear up some tempdb space, so they don’t have to wait for the full end of the stored procedure. Because the stored procedures, despite a lot of tuning, do very, very big calculations and aggregations.
So they are time consuming. And it does help manage tempdb space for them to truncate the larger temp tables when they’re finished using them within the stored procedure.
So that isn’t… To me, that’s an okay thing to do. Table tables. All right.
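A minimal sketch of that pattern, with made-up names: truncate the big temp table the moment you are done with it, but let it go out of scope on its own rather than dropping it explicitly:

```sql
CREATE OR ALTER PROCEDURE dbo.big_calculations
AS
BEGIN
    CREATE TABLE #stage_one
    (
        id    integer NOT NULL,
        total bigint  NOT NULL
    );

    /* ... big inserts and aggregations into #stage_one ... */

    /* Done with it mid-procedure? Truncate to hand tempdb space
       back early. Don't DROP it explicitly: that can interfere
       with temp table caching under high concurrency. */
    TRUNCATE TABLE #stage_one;

    /* ... the rest of the long-running work ... */
END;
```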
That’s good for me here. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where we will talk about some… Oh, I don’t know.
Some other grim database reality, I suppose. What else? What else is there? No joy in Mudville, huh? All right. Cool.
Thank you for watching.
In this video, I delve into the challenges posed by parameter sniffing in stored procedures, focusing on the parameter sensitive plan optimization feature. I explore how heuristic weaknesses and poor bucketing practices can lead to suboptimal query performance. Using a practical example of a vote type ID parameterized procedure, I demonstrate how the lack of skewness threshold adherence results in inconsistent execution plans that significantly impact performance for certain input values. The video also highlights the limitations of recompiling stored procedures as a workaround and discusses the frustration with Microsoft’s approach to handling these issues, suggesting potential improvements like smarter bucketing or additional query variants.
Full Transcript
All right, we are back in action. We are doing part three of our parameter sensitive plan stuff. So now we’re going to look at problems with the parameter sensitive plan optimization. So like this thing rolled out and I have not seen any real tweaks or improvements to it. This could have been something real cool, but apparently instead we’re just getting the same Fabric, Fabric everything. It would be nice if Microsoft treated its enterprise database product with the respect it deserves, but nope. Nope. So one might think that with a feature named the parameter sensitive plan optimization, that it might act sanely and rationally in its effort to optimize parameter sensitive plans. One might need reminding that we also got availability groups. And availability groups seem to rarely make things more available. Usually it’s quite the opposite. So there are two main issues that we have to deal with when the parameter sensitive plan optimization kicks in. The first one is heuristic weaknesses. All right, and the second one is poor bucketing practices. I would call this poor bucketing hygiene. In fact, I’m going to make that change now. Let’s say poor bucketing.
Hygiene. Hygiene. Ah, I spelled that wrong. There we go. I might still be spelling that wrong. I before E except after C, but that looks funny to me. I don’t know. I’ll fix it later. Maybe we’ll just go back. So we’ll be killing it later. Exactly. And here is the vote sniffing stored procedure. So this thing has one parameter, vote type ID, that’s an integer.
It selects some data from the votes table. And the only thing in the where clause is, of course, our one parameter here. But then for everything that it finds that matches with this vote type ID, we have to say we want things where this didn’t happen, right? Not exists: select from badges, join to posts, and, you know, this other stuff in here, right? Again, not important business logic. It’s just enough to get us a good demo. So if we were to look at the statistics histogram for our index on the votes table that leads with vote type ID, we might think that this looks pretty skewy. If you’re wondering what this crazy number is, it’s 37 million, right?
This is the most common number. And this is the most common vote type ID, which is two in the votes table. The least common is vote type ID four, with 733, right? So if we look at that, we’ll see that was that crazy number in there. If we convert this to something more readable, we get 37,332,000, etc. Let’s see if this works. Oh, it does. Look, I can zoom in on the results with my mouse in SSMS 22. Look at that nice clean number. Anyway, I know that zooming in is going to bite me sometime. So bear with me here. But because the least frequent value in the histogram is vote type ID four at 733 rows, and the most frequent is vote type ID two at 37 million rows, we do not meet the minimum skewness threshold.
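The threshold arithmetic here, written out as a quick check. The times-100,000 skewness rule is as described in the video; since it is undocumented behavior, treat it as an observation rather than a contract, and the row counts are the approximate ones from the histogram:

```sql
DECLARE @least_common bigint = 733,       /* vote type ID 4 */
        @most_common  bigint = 37332000;  /* vote type ID 2, approximate */

SELECT
    @least_common * 100000 AS threshold,  /* 73,300,000 */
    CASE WHEN @least_common * 100000 <= @most_common
         THEN 'skewed enough: PSP optimization eligible'
         ELSE 'not skewed enough: PSP optimization skipped'
    END AS verdict;  /* here: not skewed enough */
```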
And the reason for that is something that I said earlier: the least common value times 100,000 has to be less than or equal to the most common value, something along those lines. But anyway, 733 times 100,000 is 73.3 million. 73.3 million is greater than 37 million. So we do not meet the minimum skewness threshold for the parameter sensitive plan optimization to kick in. And we can see that if we run this, and we say, hey, vote type ID four, what’s your query? What’s your plan?
All right. We get this, right? And it takes zero mil, I guess it takes four milliseconds, right? Okay, you got me four milliseconds. If we run this for vote type ID one, oh, that is big. Look at that. It’s a bit much. All right. That’s a little more reasonable.
If we run this for vote type ID one, this will get very, very slow. Worse, if we run, if we were to try to run this for vote type ID two, we would have a very, very bad time, right? So if we come and look at this saved off execution plan, we will see that this runs for four minutes and 53 seconds in total. Most of the pain in here, well, we spent like 11 seconds up to this point, and then we spend nearly five minutes with this sort spilling. So this is particularly not a good time, right? This is the opposite of a good time. This is not party mode.
What’s annoying is that this is a batch mode sort. Batch mode sorts that spill are much, much, much, much slower than row mode sorts that spill. If we come into the query plan and we look at the wait stats, though... actually, this is a nuisance to do here. Let’s just look at the plan XML and let’s scroll down.
If we look at the wait stats for this query, we will see almost nothing of use or value, right? We see about four seconds. Oh, sorry. Not even four seconds. We see about 1.2 seconds, one second and 237 milliseconds, worth of wait time for this query. Microsoft is ashamed of the wait stats for, I think, a lot of the stuff that goes on in here. In an honest world, we would see BPSORT, which is the wait that crops up when we sort data in batch mode. And then we would see either SLEEP_TASK or IO_COMPLETION, depending on the type of spill that we get. So this doesn’t go well. And the plan for vote type ID one is equally noxious, where this thing now takes about six seconds to finish, or I guess about seven and a half seconds to completely finish, with a bunch of time spent in the sort over here, which spills a bit as well. So these two other queries using the plan for vote type ID four did not go well. And despite the, I think, giant skewness of the range of values for vote type ID in the votes table, the parameter sensitive plan optimization does not kick in for it. So if we recompile this and we run this for vote type two first, you know, this will be okay at around four or five seconds, right? We run this, we get... oh, wow, 2.2 seconds. Something, something, something cool happened. So this runs pretty quickly, right? We get this query plan back. I don’t really have any complaints about this for vote type ID two, nor do I have any complaints about this for vote type ID one, right? Vote type ID one, how long do you take? Under a second, right? I’m totally okay with this. This is fine. I don’t mind when vote type ID one and two share a plan. This is much better than the alternatives.
But then when vote type ID four uses this, it’s almost a total waste of time, right? Look at this execution plan. Vote type ID four went from taking like four milliseconds to now like almost a full second to complete because we have a big parallel plan, lots of hashes and scanning and startup costs.
And now vote type ID four is using a bunch of memory, too. So vote type ID four, this plan is way overkill for this one. And we don’t love this. But what we can do is insert a dummy row into the table so that we meet the statistics skewness threshold, because one times a hundred thousand is a hundred thousand, and 37 million is greater than a hundred thousand, right? So if we set identity insert on and we put one row of dummy values into the votes table, and then we set identity insert off because we’re done doing that, and then we update statistics.
And I do have to do this with a full scan. I tried to get this to happen predictably with lower sampling rates that took a little bit less time, but instead we’re going to spend 15 seconds making sure we get it right. Right? Rock solid demos. That’s what we care about. We’ll waste a little bit of time getting that correct. So let’s throw a recompile on that stored procedure just in case. And what I want to show you is how we know that the parameter sensitive plan optimization is now kicking in. If we run this for vote type ID zero, this is the dummy row that we just inserted.
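A hedged sketch of the dummy-row maneuver described above. The column names, values, and procedure name are stand-ins for whatever the votes table and demo actually use:

```sql
/* One dummy row makes the least common value count 1:
   1 * 100,000 = 100,000 <= 37 million, so the skewness
   threshold is now met. Values are placeholders. */
SET IDENTITY_INSERT dbo.Votes ON;

INSERT dbo.Votes (Id, PostId, VoteTypeId, CreationDate)
VALUES (-1, -2147483648, 0, GETDATE());  /* vote type ID 0: the dummy */

SET IDENTITY_INSERT dbo.Votes OFF;

/* Full scan so the histogram reliably picks up the single row. */
UPDATE STATISTICS dbo.Votes WITH FULLSCAN;

/* Hypothetical procedure name for the demo. */
EXECUTE sys.sp_recompile @objname = N'dbo.vote_sniffing';
```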
We get back this, right? And I know it looks weird that there’s a post for this. It’s because I had to insert a value of negative 2.1 billion into that row. I couldn’t insert a null, because post ID doesn’t accept null. So it looks like we actually have a post associated with this, which is a little crazy, but yeah, you know, these things happen. I guess I could fix that with a case expression: case when post ID equals negative 2.1 billion then null, else the value. But anyway, if we look at the query text for this, we’re going to punch these little ellipses over here, and way down at the bottom, we will have this stuff, right? This option plan per value thing. We have all this stuff going on in here, and we will have our predicate ranges in here, and, you know, I don’t know, whatever. But the important thing is that since this is a very uncommon value, this gets query variant ID one. Okay. Now if we run this for vote type ID four, then we get query variant ID two, right? Come over to the execution plan and we look down here. I’m not going to expand the whole thing again. I’m just going to focus in on the query variant ID. Vote type ID four gets query variant ID two. Okay. All good. Everything fine so far. If we run this for vote type ID two, right? We’re going to get our four or five second plan and this is great, right? Or two seconds now, right? Something, something miraculous must have happened. Now I wonder what’s going to be weird next. But anyway, this gets query variant ID three, which is fine. I don’t mind query variant ID three here, right? Because we get a different plan. Vote type ID two gets the plan that it is fastest with, and no one goes home crying. The trouble is that if we run this for vote type ID one, vote type ID one will be bucketed alongside vote type ID four, right? And we get the same plan that we got last time.
I mean, I guess it's about a second and a half faster, six seconds versus 7.5 seconds, but I'm still not thrilled with this. And if we look at the query text that we got for this one, what do we get? Query variant ID two. So vote type IDs one and four are still sharing plans.
And when we look at how this breaks out, it's really unfortunate. This is the way that Parameter Sensitive Plan optimization will treat this data: up at the top is vote type ID two, which is unusually common, and down at the bottom is our dummy row, vote type ID zero, which is very uncommon. Every other vote type ID, despite massive skew (3.7 million, 3.5 million, 3.5 million, 2 million, 1.2 million, 800,000, down to vote type ID four at 733 rows), gets bucketed in together.
If this is not a clear sign that someone does not like you, does not respect you, and does not care about you, I don't know what is. Because this could obviously be vastly improved upon, but no, this is what we get. I don't love it. I don't love it one bit. So let's get rid of our dummy row (goodbye to you) and update statistics again with a full scan. I guess I don't understand some of the rationale in there. I feel like there could be an additional query variant ID, or maybe some smarter bucketing, something along those lines, to make this a little bit more reasonable. But, you know, someone had to build Fabric.
It was probably all the people who could have been improving this. So let's just make sure we got rid of our dummy row successfully. That looks good; we start at one now instead of zero. So what's really annoying, especially with the situation we just saw, is this: let's say you thought, okay, maybe I can force one of those queries to use a different plan. The trouble is that Parameter Sensitive Plan optimization does something similar to using dynamic SQL in a stored procedure, in that it sort of detaches the statements that get the different plans from the stored procedure itself, so they all get different query IDs. For the last three executions we just ran here, we got query IDs 31884, 31883, and 31882. Let's say we wanted one of these to use a different plan. We can't cross boundaries here, so query ID 31884 can't use plan ID 6890. That part I don't disagree with: if you could force a query ID to use any plan ID you wanted, you could have a completely different query, different tables, nothing would line up, nothing would make sense. So I'm glad that gets avoided. But it's a little frustrating that we can't re-bucket. We can't choose which buckets things go into; we can't say "I want this value to use this query variant ID." We don't have a way to guide the heuristics in a way that would make sense for us.
So this ain't great. There is a stored procedure where you can force a query ID to use a particular plan ID, but it wouldn't work for us here, because the query ID and plan ID boundary can't be crossed in the way we want. We've done a fair bit of character assassination now in our dealings with Parameter Sensitive Plan optimization, both for it not kicking in heuristically when I believe it should, and for the way it buckets things. But I'm sure it will work great for you. I'm sure that when you go use it in real life, everything will be peachy keen, perfect, and you'll never have to seek a bit of help, do a bit of tuning, nothing like that.
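For reference, the stored procedure in question is Query Store's `sp_query_store_force_plan`. Using the IDs from this demo (they will be different on your system), the cross-boundary attempt looks like this:

```sql
/* Plan forcing only sticks when the plan_id was generated for that
   same query_id. These IDs are the ones from the demo above. */
EXEC sys.sp_query_store_force_plan
    @query_id = 31884,
    @plan_id  = 6890;
/* SQL Server refuses this, because plan 6890 belongs to a
   different query variant's query_id. */
```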
So most of the time with parameter sniffing, it's a matter of comparing different plan choices, and often making some query or index adjustment to give the optimizer fewer choices and guide it toward a plan that works reasonably well for everyone. That can't always happen, though. Some spans of data are just far too different for that to work. Sometimes you can fix some stuff, fix some indexes up, things like that, and SQL Server will choose a plan that's generally good for any set of parameter values. You can totally get to that point.
Other times you do have to break things out further, and dynamic SQL gives us a way to generate queries and guide SQL Server to the correct optimization path for different values. One way we can fix the current stored procedure we're looking at, though, is just by using a temp table. Since there's one part of this code that is sensitive to parameters, the part that hits the Votes table, and that one part of the query really messes things up, if we isolate it and dump everything from Votes for whatever vote type ID we pass in into a temp table, we can get pretty good performance across a variety of vote type IDs.
Pretty good, I'm going to say, for everything except vote type ID two. For vote type ID one, we do a simple insert into our temp table, and now the second query in here finishes in zero milliseconds. Vote type ID four runs this and uses the same plan.
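A rough sketch of that temp table rewrite. The procedure name and column list are placeholders, since the demo's full select list isn't shown here:

```sql
CREATE OR ALTER PROCEDURE dbo.VoteSniffing_TempTable /* placeholder name */
    @VoteTypeId integer
AS
BEGIN
    SET NOCOUNT ON;

    /* Isolate the parameter-sensitive part: materialize only the
       matching votes, then run the rest of the query against that. */
    CREATE TABLE #Votes
    (
        PostId integer NOT NULL,
        UserId integer NULL,
        CreationDate datetime NOT NULL
    );

    INSERT #Votes (PostId, UserId, CreationDate)
    SELECT v.PostId, v.UserId, v.CreationDate
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId;

    /* ...the rest of the original query goes here, joining to #Votes
       instead of filtering dbo.Votes directly... */
END;
```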
And everything finishes perfectly fine here; this is maybe even faster than it was before. The problem we'd run into is if we were to run this for vote type ID two. That would take a lot longer, because we'd be dumping 37 million rows into the temp table rather than the row counts we had for all the other values. So things get a little more bleak there. If I had to deal with a situation like this in real life, I might go for a hybrid approach: an outer stored procedure that figures out if I'm running vote type ID two, and if so, runs the regular version of the stored procedure, where the query takes four or five seconds. If the vote type ID is not equal to two, it uses the temp table approach. So the vote sniffing procedure would do the normal thing for vote type ID two, and the temp table approach would run for anything that was not two, and all this stuff would be fine. That's one way of handling this situation.

But we can get even more creative when we start using dynamic SQL. Like I said before, there are all sorts of fun things you can do, and you can choose how you want things bucketed. You might say: for any of these vote type IDs, just add 1 = (SELECT 1); for vote type ID two, you're special, you get 2 = (SELECT 2); and for these vote type IDs, add 3 = (SELECT 3). Those literals will get you three different execution plans. You can also add in different hints depending on what values come in. For example, you might say: for these, I really like a merge join, hopefully a serial merge join; for vote type ID two, I really like a hash join.
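The hybrid routing described above can be sketched as a thin outer procedure. Both inner procedure names here are placeholders:

```sql
CREATE OR ALTER PROCEDURE dbo.VoteSniffing_Hybrid /* placeholder name */
    @VoteTypeId integer
AS
BEGIN
    IF @VoteTypeId = 2
        /* the one value that does better with the original big plan */
        EXEC dbo.VoteSniffing
            @VoteTypeId = @VoteTypeId;
    ELSE
        /* everything else does better with the temp table version */
        EXEC dbo.VoteSniffing_TempTable
            @VoteTypeId = @VoteTypeId;
END;
```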
You might say: for all these other vote type IDs, I really prefer a loop join. So there are all sorts of things with dynamic SQL that you can do and control a lot better (I spelled "option" wrong, that's silly of me) than when you're just allowing SQL Server to use Parameter Sensitive Plan optimization. You might even go a step further and say: I know from testing, not just a feeling, I know from testing, that when I search for any of these vote type IDs, I do best when I optimize for vote type ID one. And of course for vote type ID two we should optimize for vote type ID two, because vote type ID two is crazy. And for maybe this group of vote type IDs, I tested all this and vote type ID 15 works best. You might also go a step further and say: hey, I want each of these vote type IDs to cache and reuse its own plan. You could do that by tokenizing the dynamic SQL, replacing that token with the vote type ID at runtime, and executing for that specific vote type ID. SQL Server will generate a new plan per vote type ID, cache it, and reuse it for as long as it stays in the plan cache. And if you're in a situation like I talked about earlier, where you have date range searches, and for the most part the ranges are very small and narrow, an hour or a day of data, and those searches are very fast because you're all set up and indexed for them, but every once in a while someone throws a big date range out there and everything grinds to a halt and gets terrible; well, you can even do something like this and say: if these dates are more than three months apart, then I want to recompile. Right?
Just say: come up with a new plan, and then probably go back to the normal plan after that. Or, you know, maybe do a little extra work around recompiling (I've actually lost how that sentence works, but that's okay). Dynamic SQL is your friend. Just a few things about it: make sure it's parameterized.
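Putting the bucketing literals and per-bucket hints together, a hedged sketch. The procedure name and the bucket membership are made up for illustration:

```sql
CREATE OR ALTER PROCEDURE dbo.VoteSniffing_Dynamic /* placeholder name */
    @VoteTypeId integer
AS
BEGIN
    DECLARE @sql nvarchar(MAX) = N'
    SELECT COUNT_BIG(*) AS records
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId
    ';

    /* Each literal changes the query text, which gets each bucket
       its own cached plan, plus a join hint per bucket. */
    IF @VoteTypeId = 2
        SET @sql += N'AND 2 = (SELECT 2) OPTION (HASH JOIN);';
    ELSE IF @VoteTypeId IN (1, 3, 5)
        SET @sql += N'AND 1 = (SELECT 1) OPTION (MERGE JOIN);';
    ELSE
        SET @sql += N'AND 3 = (SELECT 3) OPTION (LOOP JOIN);';

    /* Still parameterized: the literals pick the bucket,
       the parameter carries the value. */
    EXEC sys.sp_executesql
        @sql,
        N'@VoteTypeId integer',
        @VoteTypeId;
END;
```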
If it's not, you're not doing a good job. Make sure that any object names (database, schema, table, and so on) are wrapped in QUOTENAME to prevent any weirdness. Make sure it's formatted nicely, so when you print it out, you can read it. And if it's coming from a stored procedure, make sure there's a comment in there that tells you which stored procedure it's coming from.
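A minimal sketch of those hygiene points, with hypothetical names throughout:

```sql
DECLARE
    @SchemaName sysname = N'dbo',   /* hypothetical inputs */
    @TableName  sysname = N'Votes',
    @VoteTypeId integer = 4,
    @sql nvarchar(MAX);

SET @sql = N'
/* generated by dbo.VoteSniffing_Dynamic */
SELECT COUNT_BIG(*) AS records
FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' AS v
WHERE v.VoteTypeId = @VoteTypeId;';

PRINT @sql;  /* formatted so you can actually read it */

EXEC sys.sp_executesql
    @sql,
    N'@VoteTypeId integer',
    @VoteTypeId;
```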
And if you're generating dynamic SQL in the application, it's really helpful to put a little comment in there that tells you which module executed it, so that when you find it, you can trace it back to the part of the application that generated it. So with that, I am done. I did manage to fit this into three videos; I don't need to jump out the window. This is a fantastic day. That's just about it. Thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I will see you next week, where we will start afresh with all sorts of office hour-y things and stuff. All right. Thanks for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.