What Missing Index Requests Miss In SQL Server

What Missing Index Requests Miss In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into why missing index requests in SQL Server often end up being more frustrating than helpful. Erik Darling from Darling Data takes a humorous approach while discussing the intricacies of indexing and query performance. He highlights how SQL Server’s suggestions can be misleading, especially when dealing with simple queries that return results quickly but still generate missing index recommendations. I share tips on identifying truly problematic areas in your database through practical examples and explain why certain factors like trivial plans, non-sargable predicates, and existing indexes might prevent useful missing index requests from appearing at all. By focusing on real-world scenarios and query performance metrics, this video aims to provide a clearer understanding of when and how to address indexing issues effectively.

Full Transcript

Erik Darling here with Darling Data. Sorry, I’m going to straighten up my tie a little bit. Make sure that’s dangling in the right place. Someday I’ll come up with a better way of wearing this thing, but for now you all just have to deal with me having a strange lollipop poking out of my shirt. When I put it up here there’s just too much stuff. I want to avoid stuff. I don’t like the unpleasant stuff. No one likes the unpleasant stuff. Well, some people do. I suppose 4chan still has an audience. Anyway, in today’s video we’re going to talk about why missing index requests stink. And I think you’ll all enjoy this one. But before we do that, guess what? Yeah, it’s that time. If you would like to show your support for my humble channel with a small monthly donation, as low as $4, you can sign up via the link. down in the description below. If you don’t have $4 a month, I totally understand. Those bookies are bloodthirsty maniacs. Stop betting on football. Well, actually, baseball, I guess, is what’s wrapping up now. But it would make me very happy if you would like and comment and subscribe to the channel because that’s the kind thing to do. That’s the nice thing to do. If you need help with your SQL Server, for any of these things, that’s my job. I fix SQL servers. So give me a ring. You can contact me through any variety of ways. I seem to be everywhere. If you would like to get some high quality, low cost training, you can get that for life.

But for the for the for the low cost of around 150 USD with the discount code spring cleaning. If you would like to catch me live and in person, the only way to do that right now is to go to Seattle in November. If you would like me to come closer to you for whatever reason that is, please, please nothing nefarious. I I’m a nice boy. You can tell me what events might happen near you that might be good for me to go to because I like going to events. It gives me something to do other than be hung over every weekend. So I’m kidding. I haven’t had a hangover since 1994. I’m just I’m just I’m just that good at drinking. But anyway, let’s talk a little bit about why about why I have become disillusioned over the many years by missing index requests. So if all if all is gone, according to prophecy, I’ve already created this index, and I have just wonderful. What I’m going to do is I’m going to start off by running these two queries. And I think you might be somewhat shocked.

And amazed and amazed and astounded at what ends up happening when these two queries finish running. They’re not anything typically like particularly complex, at least as far as you know, like production queries go. Often I see production queries that are hundreds to thousands of lines long, with way more joins and conditions and sub queries, CTE stacked thousands of lines high. And, and, and, you know, there’s a lot to dig through with those, of course, but these are fairly simple queries. It’s actually the same query twice.

The only difference is one asked for the top 100, and one asked for the top 500. And at the end of the day, you know, they both return, you know, slightly different results, because from one, we asked for the top 100 and get 100. And for the second one, we asked for the top 100, and we get 158. But that’s okay, because that’s not really the point. The point is that this first query takes 15 seconds. You can see that there, spends 14, well, just let’s call it about 14 seconds, even creating an index for us, and doesn’t tell us that we need an index. Right? It doesn’t say this eager index pool means SQL Server created an index when this query ran. And notice that when we ran a query again, SQL Server was just like, well, I guess we’re going to build that index again. Right? It does, does the same thing. And it takes the same amount of time, right? Just about 14 seconds there.

What’s the most frustrating part about the whole thing, is that SQL Server only tells us that we need an index on the post table, on the score column, including the user ID column. The thing is, when we touch the post table, we spend 10 milliseconds here, and 91 milliseconds here, and 124 milliseconds in total, dealing with data in the post table. SQL Server does not tell us that an index would help down here, where we spent 15 seconds. SQL Server thinks that this part of the plan behind my giant head is where we need to focus our indexing efforts, which is just not true. It’s not, it’s not a good solution. Where we need an index is on the badges table, so that SQL Server doesn’t have to create an index every single time these queries run.

Notice that the index that we created happened a lot faster than the index that got created in the query plan. And now this all happens very quickly. And SQL Server is still insistent that we need an index on the post table. Right? And this is the kind of stuff that you have to watch out for. Right? SQL Server is going to tell you that you need indexes in places where you don’t need indexes. It’s going to tell you that you need indexes that almost already exist. It’s going to ask for very duplicative stuff. It’s not very good at this game. Notice these queries. They all have slightly different where clauses. And that’s okay. The important thing here is that every single one of these queries runs very quickly. We have four milliseconds. We have three milliseconds. We have three milliseconds again. We have two milliseconds.

And every single one of these queries tells you that an index would impact the query by some giant percent. We have 99.6 up there. We have 99.5 over there. I think there’s a 99.7 up a little higher. At least there was before. Yeah, look at 99.7% impact on what? On three milliseconds? Are you out of your mind? How would that impact three milliseconds in a meaningful way? You see, all these missing index requests are things that SQL Server comes up with based on its usual shenanigans with costing and estimates.

And it doesn’t take into account anything that actually happens when the query runs as far as figuring out what indexes are necessary. There are some things that will prevent missing index requests from even showing up in queries where they might help too. All right. So remember, all those queries it finished in two milliseconds, SQL Server was like big time index.

Add it now or else I quit. I’m going home. But certain things like trivial plans, if you look at these two queries, you can sort of see the signs of trivialness up here because I did not have any of this brackety or parametery stuff around my query. And I certainly did not ask SQL Server to treat my query as a trivial plan, but yet it did.

And we have this trivial plan where SQL Server tells us no index could possibly help this query go any faster. And then we have this query down here where SQL Server has now decided that an index would be so much more helpful than not having an index here. All right. So that’s one thing that can prevent missing index requests from showing up.

Other things that can prevent missing index requests from showing up are a lack of sargability or is a lack of sargability. We’re talking about one thing that spans multiple queries here. Now, the textbook definitions of sargability is wrapping one or more columns and functions, even these lovely built-in ones that turn pink.

Concatenating columns together to ask SQL Server for stuff, right, or even concatenating a value onto a column will do the same thing. And also doing anything kitchen sinky like this. This will happen with store procedure parameters too.

So if you need to figure out if, you know, a column is like a parameter or if that parameter is null, SQL Server will not be forthcoming about missing indexes, like being helpful for any query like this. Granted, that’s for sort of a good reason because you couldn’t seek into the index with any of these query patterns. But sometimes having a smaller version of the index to scan is helpful, especially if all you’re doing is scanning a clustered index and that clustered index is gigantic.

It still might be helpful to know, hey, you know what, an index would be pretty okay here. I wouldn’t mind that. Other times when SQL Server will not tell you that an index would be useful is when there’s a crappy index in place already.

Or maybe not even crappy. Let’s just say mediocre. Or maybe this index was great for a different query, but this index is not great for another query that ends up using it.

That’s something else where a SQL Server will be like, oh, but I got this index. I can just use this one. It’s fantastic.

I’m just going to stick with this thing. It’s the best index I’ve ever seen. If we run these two queries and we look at the query plans, we’re going to see quite different execution things going on. Now, granted, this isn’t terribly slow.

And we have an index seek over here. It did have to go parallel to be 227 milliseconds, and we did have to sort some data, which meant we had to ask for memory. This isn’t a very, very big deal for these queries.

But look what happens when we rearrange the order of the index key columns. In this one, we have creation date and then score. In this one, we have score and then creation date. Because the score predicate is much more selective in both of these queries than the date range predicate that we’re passing in, we’re looking for basically 11 years of dates.

And if you think this is abnormal, I don’t know, maybe you should try consulting because this stuff is everywhere. If we look at these two query plans, the second one that helps us seep to the scores first, way faster. Here, right, zero milliseconds this thing took to find that one row.

This one over here, 226 milliseconds to find that one row using DOP8 threads. Jeez, if this query weren’t at DOP1, how long would it take? I don’t know.

What’s 226 times 8? Bigger number, right? So often SQL Server will make up for shoddy indexing with parallel execution plans. But for that first query, you know, SQL Server is not going to sit there and be like, hey, good idea here. Index, please, now.

Sometimes the columns you select will have an impact on SQL Server giving you a missing index request, too. If we look at these two queries, and we look at the execution plans, the first query, which is almost identical to the second query, has no missing index request. SQL Server says, nothing would help here.

We’re doomed. All we can do is scan this clustered index. That’s our only choice. It’s our only hope. You’re just, this is it for us. For the second query, SQL Server’s like, wait, no.

We could impact this query by 91.2% if we had an index. Golly and gosh, what happened? Well, in the second query, all I did was quote out this body column.

Now, this isn’t always going to happen. Sometimes you’ll have a select star query and SQL Server will be like, well, they asked for the whole table. Might as well just put the whole table in the includes.

Smart, right? It’s a good index design pattern. Thanks, cookies. You know, it’s really, yeah. Why not? Go for it.

Disk is cheap. Why not just do that? But in here, the only difference is coding in or out the body column. And then SQL Server all of a sudden decides that an index would be great. Sometimes other things will impact whether missing indexes show up or not, like the types of predicates you use.

Sometimes SQL Server will look at range predicates and think, no way. Other times SQL Server will look at equality predicates and think, awesome, I can do that. Even though these both logically ask for the same thing.

This query just wants the parent ID 184618. And this one wants a parent ID greater than 184617 and less than 184619. So that can only be 184618.

These queries both want the same thing. But SQL Server only thinks that the equality predicate could be helped with an index. And gosh, look at the impact that that would have.

99.9969. I didn’t make that up. I couldn’t possibly make these numbers up. SQL Server comes up with these dirty, filthy numbers on its own. It’s got a nasty mind, SQL Server.

This query down here, SQL Server is like, nope, nothing would help. Again, we’re doomed. We’re just lost here. So missing indexes.

They often show up when they wouldn’t help. They often show up on tables in the query plan that would be of no value to add additional indexing to. And they don’t show up when they would help.

There are all sorts of situations that we looked at in this video where a SQL Server could have been like, hey, yeah, actually an index would have been pretty all right for this. There are all sorts of things that will prevent missing index requests from appearing, too. Eager index rules, trivial plans, non-sargible predicates, mediocre existing indexes, columns that you select sometimes, and the predicates that you use sometimes.

You will probably find some caveats to the last two points there at various junctions in your query tuning life. But these are things that can affect whether or not SQL Server gives you a missing index request. It gives you a missing index request, singular.

Now, I don’t want you to feel bad about this. I don’t want you to feel at all ashamed. I don’t want you to, you know, if this is your kink, if this is all you’ve got going for you, my rates are reasonable. But I used to love missing index requests because I really thought that SQL Server was trying and thinking really hard about what missing indexes would help for a workload.

It’s just like how I used to love the plan cache. I used to think the plan cache was full of all this fantastically useful information until I really started working with missing indexes and the plan cache. And the more that I worked with them, the less I liked them.

They have a lot in common as far as sort of, you know, unpredictability and instability and, you know, just not being able to really, like, tell you what would be the best thing to do in certain situations. But I really, I just end up getting misled by them over and over and over again. And the right set of indexes wasn’t obvious until I started really looking at the queries that were running and figuring out what indexes would actually help tune them.

If we even needed an index to tune them at all. There were lots of queries that I looked at where SQL Server was like, missing index, missing index. And I was just like, we’re good there.

There’s a different index we need. It’s elsewhere. You know, there’s probably a joke about princess castles or something that would be great if I was into jokes about princess castles. So what you should pay attention to when you are tuning queries or looking for queries to tune is you should look for queries that are slow.

That is wall clock time. That is a metric that you can meaningfully affect the amount of time from point A to point B. Right.

From hitting F5 until results are completely sent back to SSMS. You would want to tune queries that users run. Right. Because usually those are the ones users complain about. Now, there are going to be exceptions to that because you might find at certain points that there’s some other process that runs while users are trying to do stuff.

And maybe that messes up user queries. That can totally happen. And you should totally feel free to fix those queries, too.

And then maybe once you fix those, the user queries will kind of calm down a bit. And you can usually figure out which user queries still need some work and some help. Just an example.

Like you might have some background process that like does stuff and it might block user queries. It’s not the user queries that are slow. They’re blocked. They can’t make progress. Why?

Because read committed is a garbage isolation level. You’re welcome. So when you’re doing this, get actual execution plans. I have a video coming up about what to do if you feel like a query runs too long to get an actual execution plan. So just hang on tight for that.

Look at the operator times and those query plans to find the slow parts or the slowest parts and see if an index would indeed help in those slow parts. If you have a missing index request for a slow part of a query, you have been blessed like no other. Because you don’t have to actually type the index definition.

But you do have to review that index definition. Because SQL Server does not put a lot of thought into the order of key columns. SQL Server also does not put a lot of thought into how columns outside of the where clause might make good index key columns.

It’s up to you to figure out if columns that you’re joining on, ordering by, grouping by, partitioning by, all that other good stuff might make useful key columns too. And you should fix those issues. Don’t just create a bunch of missing index requests and think your job is done because it’s not.

And finally, don’t fall for scam imaginary metrics. Things like logical reads, missing index impact, and query operator costs will all lead you astray. They will lead you down the wrong path as far as which queries you should focus on.

You should avoid these things like the plague. Now, physical reads, of course, are a bit different than logical reads. And it is possible that a query that does a lot of logical reads will sometimes do a lot of physical reads.

But that would show up as duration anyway, and that’s the thing you want to fix. No user is going to come complaining to you about how many logical reads your query does. No user is going to thank you for reducing logical reads.

They’re going to thank you for making their queries faster. So stick to that stuff. And I think that’s about it. I’m going to, I don’t know, upload this as I am wont to do and schedule this. And then, I don’t know, record something else because I’m feeling real fired up today.

Feeling extra firecrackery. Me and Bats getting down. Bats doesn’t have, Bats is, Bats is running on empty. Not me though.

I’m running on love. Cool. Thanks for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.