Indexing SQL Server Queries For Performance: Missing Index Requests Are Bad And Dumb
Thanks for watching!
Video Summary
In this video, I dive into the often-overlooked topic of missing index requests in SQL Server. While many people might jump straight to creating indexes based on these suggestions, I argue that it’s crucial to take a step back and actually analyze the query plan. Missing index requests can be misleading; they don’t always point to the most impactful changes you could make. In this video, I demonstrate how simply following missing index scripts without deeper analysis can lead to significant performance degradation. By walking through an example where creating the suggested indexes resulted in a query that took over two hours instead of 39 minutes, I highlight the importance of understanding the broader context and potential alternatives, such as using batch mode or leveraging early aggregation techniques. The key takeaway is that missing index requests should not be your primary focus; they are just one tool in your performance tuning arsenal, and it’s essential to critically evaluate their impact before implementing them.
Full Transcript
Erik Darling here with Darling Data, failing my Darling Data dandiest. And I have a special video for you. I’m recording it a little bit later than I thought I would, and I think the reason why will become apparent in just a moment. But we’re going to talk about missing index requests in this video. Because I don’t know what I’m going to say. I still, to this day, see a lot of people talking, almost bragging about implementing them. Like, like, cool, you followed the instructions on the bottle. Alright. We’re all proud of you. Ding dong. And how missing index requests do not really always expose the performance problems that queries to the performance problems that queries are having. And how sometimes you’re going to actually have to get up off your lazy keister and actually look at the query plan and figure things out. You can’t just, like, I don’t know, use whatever bargain basement script you got off the internet that scripts out every missing index request and spend some time staring at them and then create them and act like you did your job. Because that’s crappy. So, let’s, we’ll talk about that. But before we do, we need to talk about four dollars. Four of them. If you, if you would like to support the work that I do in this channel, you can, you can give me four dollars a month by clicking on the little link in the video description right down there, right about where my desk chair is. You can’t see via the magic of green screens. And that’s a nice thing to do.
Pretty spiffy of you if you do that. If four dollars a month is beyond your means. If you just don’t have four bucks, well, you can like, you can comment, you can subscribe, and all that stuff is free. At least, at least it is for now. Who knows what you two will start charging for that. That’d be weird. If you are in need of SQL Server consulting, because you are the type of lazy person who just creates missing indexes based on what SQL Server is, you know, you can get a lot of money. SQL Server is telling you might help. And you actually want someone who will do some more better work than that. I am available to do all these things. And it’s been requested by the editors at Beer Gut Magazine that I clarify their position on me being the best SQL Server consulted in the world. Apparently, that title is not valid in New Zealand. So, everywhere else, I win. Some places you can’t beat, though.
If you would like some fantastic SQL Server training, you can get all of mine for about $150 and you don’t have to subscribe to that. That is just a one-time payment. And then you can watch it for the rest of your life over and over again until something finally sinks in, which hopefully won’t take you for the rest of your life because I hope you live a long life. And I would be sad if it took that long for these lessons to be learned.
Once again, I will have news about upcoming events later in 2025. Right now, I’m just enjoying living life in the moment. With that out of the way, let’s do this thing. Let’s talk about missing index requests. So, I’ve run this query, and this is why this is getting recorded later than usual.
This thing took 39 minutes and 5 seconds to run. All right. 39 minutes and 5 seconds. Now, SQL Server is telling us that it wants an index on the comments table.
It wants that index to be, have score as a key column and include user ID, well, as an included column, somewhat obviously. I guess that’s a little bit of a spoiler there. But let’s use our heads for once.
Let’s not just be missing index idiots. Let’s actually look at the query plan. And let’s see, how long did SQL Server spend touching the comments table over here? 998 milliseconds.
What happens if we create that index? Can SQL Server spend 0 milliseconds touching the comments table? Maybe.
Maybe. But if you take 998 milliseconds away from 39 minutes and 5 seconds, you have 39 minutes and 4 seconds. It doesn’t seem like an index that is going to be very helpful, does it?
Does it seem like an index that is going to really do much for us? Now, missing index requests are, of course, sort of, well, this one is interesting. And this one is interesting because if you grab the XML from this thing, there are actually two missing index requests.
SSMS only shows you one in green, but there are two that get logged in the XML of the query plan. If you are the type of person who is smart and likes to go digging into query plans and you hit the properties of things and you expand the missing indexes thing, you will see both of them in here. They show up.
It’s not very friendly for scripting, of course. If you look at these things, you kind of have to keep expanding these little plus signs until you have some idea of what’s going on. But they’re both in here, right?
Now, let me ask you a question. Between you and me, I know none of us in here are born and bred lifelong mathematicians, but is 998 milliseconds, is that really 22% of 36 minutes? I don’t think so.
I don’t think that’s the impact that they have in mind here. I think SQL Server thinks it can reduce the effort of running this query by 22%, but I don’t think that we’re going to see those results if we’re looking at whatever time-telling device you care about. Maybe you have an hourglass or a stopwatch or a grandfather clock or, I don’t know, maybe you’re just real good at counting exact seconds.
But I don’t think that 22% of anything is going to disappear if we create that missing index. Do you? I don’t.
I’ll be honest with you. I don’t see that happening for us. The problems in this query are very much because of spills, right? This is where things start to get real hairy. We spend three minutes and 59 seconds up to this point.
Well, sort of. This looks a little weird. The operator times, God bless, this is 13 minutes and then we go to 36 minutes. So something weird is happening in here, right?
This is not a well-performing query. We’re having some problems. So what I did ahead of time and part of the reason why we’re up so late recording is because I actually created both of the missing indexes that SQL Server asked for. There were the two on the comments table.
And then if you keep getting the estimated plan after you create indexes, SQL Server will keep suggesting indexes. And so I kept doing that. And eventually SQL Server suggested one on votes and then one on posts.
And so after I created the one on posts, I ran the query. And you can’t see it because my armpit’s in the way a little bit. But we’re going to make this magic work.
We’re going to do a little Hollywood. We’re going to do some Michael Bay style special effects. And we’re going to scroll over this way. And you’re going to see you’re not reading that wrong. That’s not two minutes and seven seconds.
There was not a 37 minute improvement. That is two hours and seven minutes. If we look at that query plan, I’m going to have to be real careful hitting control in our here. That is not two minutes.
That is two hours, seven minutes and three seconds. This query has all sorts of problems. Many problems.
Gigantic problems. Two hours and seven minutes worth of problems, really. And SQL Server still thinks that a missing index on the user’s table is going to be what solves that problem. But if we go all the way over here and we look to where we touch the user’s table, that was 36 milliseconds out of two hours.
What does SQL Server say? We could have an impact of nearly 35% on this query. Wow.
Wow. Would that really fix this? No. No. We would still be screwed. We would still have a query that runs for, well, I guess two hours, seven minutes and minus 36 milliseconds probably. That just might do it.
Just sounds about right anyway. So this thing just really gets screwed up in all sorts of places. You know, there’s 44 minutes in here and 36 minutes in here. And, well, we spent 46 minutes up until here.
So this thing is really screwy. The funny thing is that, really, no amount of missing index is going to help this thing. But if all you do is look at missing indexes and think, well, if I create that index, something will get better, right?
I created, like, five indexes. And it got worse. It went from 39 minutes to two hours and seven minutes.
That’s not an improvement. That’s a worsement. It’s not a good showing. What missing indexes tell you are a very brief opportunistic, a passing thought, really, by the optimizer.
Well, I’m creating this query plan. You know what? It might be better if we had this thing, right?
Oh, it might be okay. Maybe things would get better if we had this. But it doesn’t really know. It doesn’t really understand what’s going to happen after you create that index. After you create an index, you could get an entirely different query plan.
All of a sudden, the costs are all different. Maybe it wouldn’t be 35%. Maybe it wouldn’t have a 35% lower cost. Maybe it would come up with a query plan that cost more. What can we do?
This is quite a conundrum. What missing indexes don’t tell you are ways that you can tune a query that have nothing to do with indexes. It has everything to do with how you write the query and maybe even how you get this query to run.
Right? So what a missing index request won’t tell you is that if you run this query and you use batch mode, if you do something, well, for the sake of making sure that you don’t think I have any weird things up my sleeve, I just allowed this query to run in compat level 160.
Now, cardinality estimation under compat level 160 using the new cardinality estimator is very hit or miss. So I’m letting this one go. Sometimes when I use the optimizer compatibility level hints so that I can get intelligent query processing features like batch mode on rowstore, I pair it with the legacy cardinality estimator because it’s generally better.
I like it better anyway. So this one, I just let it fly. And this all finishes in about eight seconds.
All right. So we made this query better either from the original one by about 36 minutes or 39 minutes or by the second one after I created all the indexes by two hours and seven minutes.
And if you notice, this query is not using any nonclustered indexes. This query is using just all clustered indexes. It’s just it’s scanning all of them.
Right. We’re not doing anything better here. We’re not doing anything more efficient when we touch these indexes, except for the fact that they are all happening or probably the majority of them are happening in batch mode. All right.
So batch mode works a lot of magic on queries that do stuff like that and have lots of problems running in row mode. All right. This query looks incredibly dissimilar to what we had before, but it’s the exact same query. This is what I ran here.
This is what I ran here. All right. It’s the exact same query over and over again. This is what I ran here. Let me scroll back up to where this thing is. It’s the exact same query, just in a different execution mode. All right.
And another thing that missing index requests won’t tell you is that sometimes you can get things better by using a temp table. All right. We create a little table here called precheck, and we dump a partial aggregation of the results we care about into that temp table, and then we join that temp table off to just one other query.
We can get this thing down to three seconds, right? I mean, close enough. There’s 1.7 seconds there.
There’s 1.8 seconds there. So, okay. So maybe a little closer to four seconds. That’s fine. But we’re still benefiting from batch mode. We’re still getting some results there.
So we could run this query just as it is and see a quite significant improvement from, you know, 40 minutes to over two hours to eight seconds. That might be good enough for most people, right? We could also, you know, if we’re allowed to rewrite the query, we could use a temp table.
We can get that down to three seconds. Probably three seconds, probably preferable to eight seconds, depending on the activity, right? Depending on your level of impatience.
And one thing that missing index requests are also not going to tell you is when the optimizer just misses serious opportunities, like early aggregation. Early aggregation is one of the most powerful things you have up your sleeve as a query tuner because the optimizer often misses this. The optimizer often doesn’t think, oh, I can group by some stuff first.
I could group by these columns that I’m joining and doing other things with, and I could join fewer of them because I’ve aggregated early. So if we could rewrite this query to do one early aggregation as part of a derived join, and another aggregation as part of another derived join, and I could join those two derived joins together, and I could even use batch mode as I’m doing all of these things, and I could get this query to run in about a second and a half.
And, of course, SQL Server still thinks that we’re missing an index on the user’s table, which, you know, granted, in the other query where we would have saved like 35 milliseconds on a query that took, you know, 40 minutes or two hours, well, that’s not too significant, but maybe 40 seconds out of 1.425 seconds.
Maybe if we were really just itching to get this thing down to 1.38 seconds or something, what would be, I don’t know, I actually don’t know anyone who would do that. But if you’re out there, I want you to leave now.
I would appreciate if you just step away from the computers. You can buy a book on rock collecting. It might be better for everyone if you would just do that.
So if there’s a message here, it’s that missing index requests should not be a big part of your query tuning efforts. Every once in a while, missing index requests do line up with something that you should do or an index that you should have in some way. But for me, when I’m dealing with this sort of thing, the last thing that I generally care about is missing index requests.
Even when there’s a missing, like even when you go and run the query and get the actual execution plan, you might see the missing index request line up with, let’s, Adam Mechanic calls them data acquisition operators. So there’s two forms of operators.
There’s data acquisition. There’s data processing. Data acquisition is, of course, whenever you read from a thing, right? So whenever you read from an index, whenever you do one of these things, you are acquiring data. Whenever you do one of these things like join or aggregate data or compute scale or filter, you’re processing data.
So there’s acquisition and processing. So if the missing index request does at least line up with when you are acquiring data, that taking a long time, it’s probably worth thinking about creating an index similar to the missing index request.
But another thing that’s a very big deal is that the missing index requests often don’t give you the best possible index. There are a lot of things missing from how it decides what columns go in the key and what columns go in the includes or even what columns show up in there at all. So be very, very careful out there with them.
Don’t just think that because there’s a missing index request, you could make anything meaningfully better or faster. You got to run the query. You got to get the actual execution plan. You got to see where time is spent.
And then you have to figure out how to reduce that time. You don’t have to figure out how to reduce logical reads. You might create indexes and reduce logical reads like in this query. I don’t know.
Did logical reads go down? Maybe, probably. Who knows? Who cares? I don’t look at them anymore. They’re stupid. You might reduce logical reads, but you might have a query that went from 40 minutes to 2 hours and 7 minutes. I don’t know.
It’s not a good thing to judge things by. What you care about is making this final number go down. This is the metric that makes you a query tuner.
Going from 40 minutes or 2 hours to 1.4 seconds, that’s query tuning. Reducing logical reads, if it happens as a byproduct of this, cool. If not, whatever.
This over there in the corner, that’s the metric that you care about. And there’s one final go around on this. Adding missing indexes is not what got us there.
All right. So with that out of the way, I hope you enjoyed yourselves. I hope you learned something. I hope you will not care so much about missing index requests in the future because they make you lazy and dumb. They make you a bad performance tuner.
What you should care about is figuring out why that query is actually slow because it’s probably not going to be all those missing indexes. Anyway, I’m going to go finish my drink and go to bed. Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.