Parameter Sensitivity Training Part 3
Video Summary
In this video, I delve into the challenges posed by parameter sniffing in stored procedures, focusing on the parameter sensitive plan optimization feature. I explore how heuristic weaknesses and poor bucketing practices can lead to suboptimal query performance. Using a practical example of a vote type ID parameterized procedure, I demonstrate how the lack of skewness threshold adherence results in inconsistent execution plans that significantly impact performance for certain input values. The video also highlights the limitations of recompiling stored procedures as a workaround and discusses the frustration with Microsoft’s approach to handling these issues, suggesting potential improvements like smarter bucketing or additional query variants.
Full Transcript
All right, we are back in action. We are doing part three of our parameter sensitive plan stuff. So now we’re going to look at problems with the parameter sensitive plan optimization. So like this thing rolled out and I have not seen any real tweaks or improvements to it. This could have been something real cool, but apparently instead we’re just getting the same fabric, fabric everything. It would be nice if Microsoft treated its enterprise database product with the respect it deserves, but nope. Nope. So one might think that with a feature named the parameter sensitive plan optimization, that it might act sanely and rationally in its effort to optimize parameter sensitive plans. One might need reminding that we also got availability groups. And availability groups seem to rarely make things more available. Usually it’s quite the opposite. So there are two main issues that we have to deal with when the parameter sensitive plan optimization kicks in. The first one is heuristic weaknesses. All right, and the second one is poor bucketing practices. I would call this poor bucketing hygiene. In fact, I’m going to make that change now. Let’s say poor bucketing.
Hygiene. Hygiene. Ah, I spelled that wrong. There we go. I might still be spelling that wrong. I before E except after C, but that looks funny to me. I don’t know. I’ll fix it later. Maybe we’ll just go back. So we’ll be killing it later. Exactly. So let’s look at our troublesome vote sniffing stored procedure. So this thing has one parameter, vote type ID, that’s an integer.
It selects some data from the votes table. And the only thing in the where clause is, of course, our one parameter here. But then for everything that it finds that matches with this vote type ID, we want things where this didn’t happen, right? NOT EXISTS, select from badges, join to posts, and you know, this other stuff in here, right? Again, not important business logic. It’s just enough to get us a good demo. So if we were to look at the statistics histogram for our index on the votes table that leads with vote type ID, we might think that this looks pretty skewy. If you’re wondering what this crazy number is, it’s 37 million, right?
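For reference, a procedure shaped like the one described might look roughly like this. This is a hedged reconstruction, not the video's exact code: the procedure name, column list, and the filler logic inside the NOT EXISTS are all assumptions layered on the Stack Overflow sample schema.

```sql
/* Hypothetical sketch of the demo procedure -- names and the
   NOT EXISTS filler logic are illustrative assumptions. */
CREATE OR ALTER PROCEDURE dbo.VoteSniffing
    @VoteTypeId integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT
        v.PostId,
        v.UserId,
        v.CreationDate
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId /* the only predicate on Votes */
    AND NOT EXISTS
    (
        /* stand-in business logic, just enough for a demo */
        SELECT 1/0
        FROM dbo.Badges AS b
        JOIN dbo.Posts AS p
            ON p.OwnerUserId = b.UserId
        WHERE b.UserId = v.UserId
        AND   p.Id = v.PostId
    );
END;
```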
This is the most common number. And this is the most common vote type ID, which is two in the votes table. The least common is vote type ID four with 733, right? So if we look at that, we’ll see for that, like that was that crazy number in there. If we convert this to something more readable, we get 37,332,000, etc. Let’s see if this works. Oh, it does. Look, I can zoom in on the results with my mouse in SSMS 22. Look at that nice clean number. Anyway, I know that zooming that in is going to bite me sometime. So bear with me here. But because the least frequent value in the histogram is vote type ID four at 733 rows, and the most frequent is vote type ID two at 37 million rows, we do not meet the minimum skewness threshold.
And the reason for that is something that I said earlier: the least common value times 100,000 has to be less than or equal to the most common value, something along those lines. But anyway, 733 times 100,000 is 73.3 million, and 73.3 million is greater than 37 million. So we do not meet the minimum skewness threshold for the parameter sensitive plan optimization to kick in. And we can see that if we run this, and we say, hey, vote type ID four, what’s your query? What’s your plan?
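If you want to eyeball that ratio yourself, you can pull the extremes out of the histogram with `sys.dm_db_stats_histogram`. A minimal sketch, assuming a statistics object named `VoteTypeId` on the votes table (the name is an assumption), and treating the 100,000 multiplier as the rough heuristic described here rather than a documented contract:

```sql
/* Rough check of the PSP skewness heuristic described above:
   least common frequency x 100,000 must not exceed the most common
   frequency. The statistics name 'VoteTypeId' is assumed. */
SELECT
    min_eq_rows = MIN(h.equal_rows),
    max_eq_rows = MAX(h.equal_rows),
    meets_threshold =
        CASE WHEN MIN(h.equal_rows) * 100000. <= MAX(h.equal_rows)
             THEN 'yes'
             ELSE 'no'
        END
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.Votes')
AND   s.name = N'VoteTypeId';
```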
All right. We get this, right? And it takes zero mil... I guess it takes four milliseconds, right? Okay, you got me: four milliseconds. If we run this for vote type ID one — oh, that is big. Look at that. It’s a bit much. All right. That’s a little more reasonable.
If we run this for vote type ID one, this will get very, very slow. Worse, if we run, if we were to try to run this for vote type ID two, we would have a very, very bad time, right? So if we come and look at this saved off execution plan, we will see that this runs for four minutes and 53 seconds in total. Most of the pain in here, well, we spent like 11 seconds up to this point, and then we spend nearly five minutes with this sort spilling. So this is particularly not a good time, right? This is the opposite of a good time. This is not party mode.
What’s annoying is that this is a batch mode sort. Batch mode sorts are much, much, much, much, much slower than row mode sorts. If we come into the query plan and we look at the wait stats, though — actually, this is a nuisance to do here. Let’s just look at the plan XML and let’s scroll down.
If we look at the wait stats for this query, we will see almost nothing of use or value, right? We see — oh, sorry, not even four seconds — we see about 1,237 milliseconds, so one second, 237 milliseconds worth of wait time for this query. Microsoft is ashamed of the wait stats for, I think, a lot of the stuff that goes on in here. In an honest world, we would see BPSORT, which is the wait that crops up when we sort data in batch mode. And then we would see either SLEEP_TASK or IO_COMPLETION, depending on the type of spill that we get. So this doesn’t go well. And the plan for vote type ID one is equally noxious, where this thing now takes about six seconds to finish, or I guess about seven and a half seconds to completely finish, with a bunch of time spent in the sort over here, which spills a bit as well. So these two other queries using the plan for vote type ID four did not go well. And despite the, I think, giant skewness of the range of values for vote type ID in the votes table, the parameter sensitive plan optimization does not kick in for it. So if we recompile this and we run this for vote type ID two first, you know, this will be okay at around four or five seconds, right? We run this, we get — oh, wow, 2.2 seconds. Something, something, something cool happened. So this runs pretty quickly, right? We get this query plan back. I don’t really have any complaints about this for vote type ID two, nor do I have any complaints about this for vote type ID one, right? Vote type ID one, how long do you take? Under a second, right? I’m totally okay with this. This is fine. I don’t mind when vote type ID one and two share a plan. This is much better than the alternatives.
But then when vote type ID four uses this, it’s almost a total waste of time, right? Look at this execution plan. Vote type ID four went from taking like four milliseconds to now like almost a full second to complete because we have a big parallel plan, lots of hashes and scanning and startup costs.
And now vote type ID four is using a bunch of memory too. So vote type ID four, uh, this, this plan is way overkill for this one. And we don’t love this, but what we can do is we can execute, or rather we can insert a dummy row into the table and so that we meet the statistics skewness threshold because one times a hundred thousand is a hundred thousand and 37 million is greater than a hundred thousand. Right? So if we set identity insert on and we put one row of dummy values into the votes table, and then we set identity insert off because we’re done doing that, uh, and then we update statistics.
And I do have to do this with a full scan. I tried to get this to happen predictably with lower sampling rates, which took a little bit less time, but instead we’re going to spend 15 seconds making sure we get it right. Right? Rock solid demos. That’s what we care about. We’ll waste a little bit of time getting that correct. So let’s throw a recompile on that stored procedure, just in case. And what I want to show you is how we know that the parameter sensitive plan optimization is now kicking in. If we run this for vote type ID zero — this is the dummy row that we just inserted.
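The dummy row trick, sketched out. Column names, the sentinel values, and the procedure name are assumptions for illustration; the point is that a single row for a new value makes the least common frequency 1, and 1 × 100,000 = 100,000, which is comfortably below 37 million, so the heuristic trips:

```sql
/* Insert one dummy row so the least common VoteTypeId frequency
   becomes 1, satisfying the skewness heuristic described above.
   Column names and sentinel values are illustrative. */
SET IDENTITY_INSERT dbo.Votes ON;

INSERT dbo.Votes (Id, PostId, UserId, VoteTypeId, CreationDate)
VALUES (-2147483648, -2147483648, -2147483648, 0, GETDATE());

SET IDENTITY_INSERT dbo.Votes OFF;

/* Full scan so the new row reliably lands in the histogram */
UPDATE STATISTICS dbo.Votes WITH FULLSCAN;

/* And mark the procedure for recompilation, just in case */
EXEC sys.sp_recompile @objname = N'dbo.VoteSniffing';
```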
We get back this, right? And I know it looks weird that there’s a post for this. It’s because I had to insert a value of negative 2.1 billion into that row. I couldn’t insert a null, because post type ID doesn’t accept null. So it looks like we actually have a post associated with this, which is a little crazy, but yeah, you know, these things happen. I guess I could fix that with a case expression — case when post type ID equals negative 2.1 billion then null else zero. But anyway, if we look at the query text for this — we’re going to punch these little ellipses over here — way down at the bottom, we will have this stuff, right? This option plan per value thing, right? We have all this stuff going on in here, and we will have our predicate ranges in here and, you know, I don’t know, whatever. But the important thing is that since this is a very uncommon value, this gets query variant ID one. Okay. Now if we run this for vote type ID four, then we get query variant ID two, right? Come over to the execution plan and we look down here — I’m not going to expand the whole thing again, I’m just going to focus in on the query variant ID. Vote type ID four gets query variant ID two. Okay. All good. Everything fine so far. If we run this for vote type ID two, right? We’re going to get our four or five second plan, and this is great, right? Or two seconds now, right? Something miraculous must have happened. Now I wonder what’s going to be weird next. But anyway, this gets query variant ID three, which is fine. I don’t mind query variant ID three here, right? Because we get a different plan. Vote type ID two gets the plan that it is fastest with, and no one goes home crying. The trouble is that if we run this for vote type ID one, vote type ID one will be bucketed alongside vote type ID four, right? And we get the same plan that we got last time.
Uh, I mean, I guess it’s about a second and a half faster at six seconds versus 7.5 seconds, but I’m still not thrilled with this. Right? And if we look at the query text that we got for this one, what do we get? Look at query variant ID two. So now vote type ID one and four are still sharing plans.
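For readers who haven't seen it, the dispatcher text being described looks roughly like the fragment below. This is reconstructed from memory of real PSP query texts, so treat the exact syntax, the ObjectID, and the 100 / 1,000,000 cardinality boundaries as approximate:

```sql
/* Approximate shape of the PSP dispatcher option appended to the
   statement text. QueryVariantID 1 = low bucket, 2 = middle,
   3 = high. Object names and boundary values are illustrative. */
option
(
    PLAN PER VALUE
    (
        ObjectID = 1237579447,
        QueryVariantID = 2,
        predicate_range
        (
            [StackOverflow].[dbo].[Votes].[VoteTypeId] = @VoteTypeId,
            100.0,
            1000000.0
        )
    )
)
```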
And when we look at how this breaks out, it’s really unfortunate. So this is the way that the parameter sensitive plan optimization will treat this, right? Up at the top is vote type ID two, which is unusually common. Down at the bottom is our dummy row, vote type ID zero, which is very uncommon. And every other vote type ID, despite massive skewness — right, 3.7 million, 3.5 million, 3.5 million, 2 million, 1.2 million, 800,000, down to vote type ID four at 733 — gets bucketed in together.
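To see that skew in your own copy of the data, a simple group-by does it. Everything between the two extremes lands in the same middle bucket under the three-way split described above:

```sql
/* Row counts per VoteTypeId -- under PSP's three-way bucketing,
   every value between the extremes shares one query variant. */
SELECT
    v.VoteTypeId,
    vote_count = COUNT_BIG(*)
FROM dbo.Votes AS v
GROUP BY v.VoteTypeId
ORDER BY vote_count DESC;
```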
If this is not a clear sign that someone does not like you, does not respect you, and does not care about you, I don’t know what is, right? Because this could obviously be vastly improved upon, but no, this is what we get. All right. This is what we get. I don’t love it. I don’t love it one bit. So let’s get rid of our dummy row. All right. Let’s say goodbye to you, and let’s update statistics again with the full scan. And, you know, I guess I don’t understand some of the rationale in there. I feel like there could maybe be an additional query variant ID, or there could be some smarter bucketing or something along those lines in order to make this a little bit more reasonable. But you know, someone had to build fabric.
It was probably all the people who could be improving that. So let’s just make sure that we got rid of our dummy row successfully. That looks good. We start at one now instead of zero. So that’s great. So what’s really annoying, especially with the situation that we just saw there, is let’s say that, you know, we were like, oh, well, maybe I can force one of those queries to use a different plan. The trouble is that the parameter sensitive plan optimization does something similar to using dynamic SQL in a stored procedure, in that it sort of detaches the statement in the procedure that gets the different plan from the stored procedure itself. And so they all get different query IDs. So — oh, stop jumping — for the last three executions that we just had here, we got query IDs 31884, 31883, and 31882. If we wanted one of these, let’s say, to use a different plan, we can’t cross boundaries here. So like query ID 31884 can’t use plan ID 6890. That I don’t disagree with, because of course, if you could force a query ID to use any plan ID you wanted, you could have a completely different query, and the query plan would just be, like, different tables and different everything. Nothing would make sense. Nothing would line up. So I’m glad that this gets avoided, but it’s a little frustrating that we can’t re-bucket. We can’t choose which buckets things go into. We can’t say, I want this value to use this query variant ID, right? We don’t have a way to sort of guide the heuristics in a way that would make sense for us.
So this ain’t great, right? There is a stored procedure where you can force a query ID to use a particular plan ID, but it would not work for us here, because the query ID/plan ID boundary can’t be crossed in the way that we want. So we’ve done a fair bit of character assassination now in our dealings with the parameter sensitive plan optimization, both in it not kicking in heuristically when I believe it should, and for the way that it buckets things. But I’m sure that it will work great for you. I’m sure that when you go and use it in real life, everything will be peachy keen, perfect. You’ll never have to seek a bit of help, do a bit of tuning, nothing like that.
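The procedure being alluded to is Query Store plan forcing. It exists and works, but only for a plan_id that was produced for that same query_id, which is exactly the boundary that can't be crossed here. A sketch, using the example IDs from the video (these will differ on any other system):

```sql
/* Query Store plan forcing. The IDs are the example values from
   the video; on your system they will be different. This succeeds
   only when the plan_id belongs to the given query_id -- forcing
   across PSP query variants fails, as described above. */
EXEC sys.sp_query_store_force_plan
    @query_id = 31884,
    @plan_id  = 6890;
```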
So most of the time with parameter sniffing, it’s a matter of comparing different plan choices and often making some query or index adjustment to give the optimizer fewer choices and sort of guide it towards a plan choice that works reasonably well for everyone. That can’t always happen, though. Some spans of data are just far too different in order for that to work. Like sometimes you can fix some stuff — fix some indexes up, things like that — and SQL Server will choose a plan that’s generally good for, like, any set of parameter values. You can totally get to that point.
Other times you do have to break things out further, and dynamic SQL gives us a way to generate queries and guide SQL Server to the correct optimization path for different things. One way that we can fix the current stored procedure we’re looking at, though, is just by using a temp table, right? Since there’s one part of this code that is sensitive to parameters, which is the votes table, and there is one part of the query that really messes things up, if we isolate that one part of the query and we dump all the stuff from votes, based on whatever vote type ID we pass in, into a temp table, we can get pretty good performance across a variety of vote type IDs.
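A sketch of that temp table split — object and column names are assumptions, and the NOT EXISTS body is stand-in logic, not the video's exact code:

```sql
/* Temp table rewrite: isolate the parameter-sensitive query against
   Votes, so the expensive part runs against a known row count.
   Names and the NOT EXISTS logic are illustrative. */
CREATE OR ALTER PROCEDURE dbo.VoteSniffing_TempTable
    @VoteTypeId integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    CREATE TABLE #votes
    (
        PostId integer NOT NULL,
        UserId integer NOT NULL,
        CreationDate datetime NOT NULL
    );

    /* Step 1: the only statement that touches the parameter */
    INSERT #votes (PostId, UserId, CreationDate)
    SELECT v.PostId, v.UserId, v.CreationDate
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId;

    /* Step 2: the heavy lifting, now against the temp table */
    SELECT t.PostId, t.UserId, t.CreationDate
    FROM #votes AS t
    WHERE NOT EXISTS
    (
        SELECT 1/0
        FROM dbo.Badges AS b
        JOIN dbo.Posts AS p
            ON p.OwnerUserId = b.UserId
        WHERE b.UserId = t.UserId
        AND   p.Id = t.PostId
    );
END;
```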
I’m going to say pretty much — I’m going to say everything except vote type ID two. So for vote type ID one, we do a simple insert into our temp table, and now the second query in here finishes in zero milliseconds, right? Vote type ID four — we run this, and it uses the same plan.
And everything finishes perfectly fine here, right? This is maybe faster than it was before. The problem that we would run into is if we were to run this for vote type ID two, right? This would take a lot longer, because we would be getting 37 million rows rather than dealing with the number of rows that we had for all the other ones. So this gets a little bit more bleak. If I had to deal with a situation like this in real life, I might go for a hybrid approach. I might have an outer stored procedure that figures out if I’m running vote type ID two, and then I might just run the regular version of the stored procedure, where the query takes four or five seconds. And if vote type ID is not equal to two, then I might use the temp table approach. So the vote sniffing procedure would just do the normal thing and run the query for vote type ID two, and the temp table approach would run for anything that was not two, right? So that’s one way of handling this situation. And then all this stuff would be fine. But we can get even more creative when we start using dynamic SQL. Like I said before, there are all sorts of fun things that you can do, and you can choose how you want things bucketed. So, you know, you might say something like: for any of these vote type IDs, just add one equals select one. You might say: for vote type ID two, you’re special, you get two equals select two. And you might say: for these vote type IDs, add three equals select three. And those literals — the one equals, and the two equals, and the three equals — will get you three different execution plans. You can also add in different hints depending on what values come in here. So for example, you might say: for these, I really like a merge join, hopefully a serial merge join. You might say: for vote type ID two, I really like a hash join.
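That literal-injection bucketing could be sketched like this. The bucket memberships, hint choices, and names are all invented for illustration; the mechanism is that different statement text hashes to a different cached plan:

```sql
/* Dynamic SQL bucketing: append a harmless literal predicate per
   bucket so each bucket gets its own cached plan, plus a join hint.
   Bucket assignments and hints here are made-up examples. */
CREATE OR ALTER PROCEDURE dbo.VoteSniffing_Dynamic
    @VoteTypeId integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE @sql nvarchar(MAX) = N'
    /* dbo.VoteSniffing_Dynamic */
    SELECT v.PostId, v.UserId, v.CreationDate
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId
    ';

    /* Different text per bucket = different cached plan */
    IF @VoteTypeId IN (4, 9)     /* tiny values   */
        SET @sql += N'AND 1 = (SELECT 1)
        OPTION (MERGE JOIN);';
    ELSE IF @VoteTypeId = 2      /* the monster   */
        SET @sql += N'AND 2 = (SELECT 2)
        OPTION (HASH JOIN);';
    ELSE                         /* everything else */
        SET @sql += N'AND 3 = (SELECT 3)
        OPTION (LOOP JOIN);';

    EXEC sys.sp_executesql
        @sql,
        N'@VoteTypeId integer',
        @VoteTypeId;
END;
```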
You might say: for all these other vote type IDs, I really prefer a loop join. So there are all sorts of things with dynamic SQL that you can do and control a lot better — oh, I spelled option wrong, that’s silly of me — than when you are just allowing SQL Server to use the parameter sensitive plan optimization. You might even go a step further and say: I know from testing — right, I don’t just think, I don’t just have a feeling, I know from testing — that when I search for any of these vote type IDs, I do best when I optimize for vote type ID equals one. And of course, for vote type ID equals two, we should optimize for vote type ID equals two, because vote type ID equals two is crazy. And then for maybe this group of vote type IDs, I tested all this, and vote type ID 15 works the best. You might also go a step further and say: hey, I think that I want each of these vote type IDs to cache and reuse its own plan. And you could do that by tokenizing the dynamic SQL and then replacing that token with the vote type ID at runtime, and then executing for that specific vote type ID. And SQL Server will generate a new plan per vote type ID, cache it, and reuse it for as long as it stays in the plan cache. And if you’re in a situation kind of like I talked about earlier, where you have date ranges that you’re focusing on, and for the most part these date ranges are very, very small and narrow — like an hour or a day of data — and these searches are very fast, because you are all set up and indexed and everything’s good for these. But then every once in a while someone throws a big date range out there, and everything grinds to a halt and gets terrible. Well, you can even do something like this and say: you know what, if these are more than three months apart, then I want to recompile. Right?
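Those last few ideas, sketched as fragments. Every value, bucket, and threshold here is invented for illustration, and in real code you would fold the chosen hints into a single OPTION clause rather than appending several:

```sql
/* Fragments illustrating three more dynamic SQL levers.
   All values, buckets, and thresholds are made-up examples;
   in real code, combine hints into one OPTION clause. */
DECLARE
    @VoteTypeId integer  = 4,
    @StartDate  datetime = '20130101',
    @EndDate    datetime = '20130701',
    @sql nvarchar(MAX)   = N'/* base query text would go here */';

/* 1) Optimize the shared bucket plan for a tested-best value */
IF @VoteTypeId = 2
    SET @sql += N' OPTION (OPTIMIZE FOR (@VoteTypeId = 2));';
ELSE
    SET @sql += N' OPTION (OPTIMIZE FOR (@VoteTypeId = 15));';

/* 2) Per-value cached plans: bake the literal into the text so
      each distinct value hashes to its own plan cache entry */
SET @sql = REPLACE(@sql, N'{{vote_type_id}}',
                   CONVERT(varchar(11), @VoteTypeId));

/* 3) Wide date ranges: only pay for a fresh plan when it's worth it */
IF DATEDIFF(MONTH, @StartDate, @EndDate) > 3
    SET @sql += N' OPTION (RECOMPILE);';
```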
And just say: come up with a new plan, and then probably go back to the normal plan after that. Or, you know, you might have to do a little extra work to, like, recompile a plan — re... re... I’m actually lost on how that sentence works, but that’s okay. But dynamic SQL is your friend. Just a few things about it: make sure it’s parameterized.
Otherwise, you’re not doing a good job. Make sure that any object names — database, schema, table, etc. — are wrapped in QUOTENAME to prevent any weirdness. Make sure that it’s formatted nicely, so when you print it out, you can read it. And if it’s coming from a stored procedure, then make sure that there’s a comment in there that tells you which stored procedure it’s coming from.
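Those hygiene points, sketched in one place — the procedure and table names are illustrative assumptions:

```sql
/* Dynamic SQL hygiene, sketched: parameterized values, QUOTENAME-wrapped
   object names, readable formatting, and a tracing comment.
   Names here are illustrative. */
DECLARE
    @TableName  sysname = N'Votes',
    @VoteTypeId integer = 4,
    @sql nvarchar(MAX);

SET @sql = N'
/* generated by dbo.VoteSniffing_Dynamic */
SELECT v.PostId, v.UserId, v.CreationDate
FROM dbo.' + QUOTENAME(@TableName) + N' AS v
WHERE v.VoteTypeId = @VoteTypeId;';

PRINT @sql; /* formatted nicely, so you can read what you generated */

EXEC sys.sp_executesql
    @sql,
    N'@VoteTypeId integer', /* values are parameters, never concatenated */
    @VoteTypeId;
```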
And if you’re generating dynamic SQL in the application, it’s really helpful to put a little comment in there that tells you which module or which thing executed it, so that when you find it, you know how to trace it back to the part of the application that generated it. So with that, I am done. I did manage to fit this in three videos — I don’t need to jump out the window. This is a fantastic day. That’s just about it there. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you next week, where we will start afresh with all sorts of office hour-y things and stuff and things. All right. Thanks for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.