The SQL Server Performance Tasting Menu: When Missing Index Requests Make Performance Worse



Video Summary

In this video, I delve into the nuances of missing index requests in SQL Server, particularly focusing on instances where indexes aren’t truly missing but are still causing confusion. I share a real-world scenario from my consultancy work, highlighting how a seemingly problematic query with a missing index request might not actually need an index at all due to its infrequent execution and low impact on overall performance. The video explores the importance of domain knowledge in making informed decisions about indexing, emphasizing that while SQL Server’s suggestions can be valuable learning tools, they should be carefully evaluated before implementation.

Full Transcript

This video, we’re going to talk about missing index requests. Or rather, not-missing index requests: indexes that weren’t really missing, but that we got complaints about anyway. Now, I can sympathize with the situation a lot, because I’m a consultant, and a lot of the time I’m seeing a server for the first time, and sometimes the only time. And I can also sympathize with people who are new to this. New DBAs at a new job, right? You get a new job, you look at a database, and you’re like, what is going on with these indexes? Or maybe you’re just new to index tuning and you’re not really sure what to do or where to start. SQL Server’s missing index requests can be super helpful, at least as a learning tool, for figuring out exactly what kinds of problems indexes can solve, and what they can’t. Because there are things indexes can’t help with: you might create a great index, but if your predicates aren’t SARGable, you scan everywhere anyway, and you’re like, oh, what happened? But anyway. This is close to a situation that I ran into with a client, where there was some weird code on the server. And the weird code looked sort of like this, where a long time ago, some rogue user, can you believe that a user did something crappy and unexpected?

I can’t. I can’t imagine why a user would do that. Telling lies on the internet? Who would do such a thing? So, there was some code in here to fix a parameter, to make sure we weren’t passing a negative value in over here. And for the most part, that was fine, because we had this index on OwnerUserId, and that’s one of the predicates here. That was good enough to seek to a normal, smallish range of values here. Friday is catching up with me. Rather, Thursday night is catching up with me. So, that index was serviceable enough to get us here. We could do some additional predicate evaluation against the clustered index here, but things were generally fast.
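The setup described above looks roughly like this. It’s a sketch, not the client’s actual code: the procedure name, the table and column names (borrowed from the Stack Overflow demo database), and the literal values are all assumptions.

```sql
/* A sketch of the "fix the parameter" pattern described above.
   Procedure name, schema, and literals are invented for illustration. */
CREATE OR ALTER PROCEDURE dbo.GetPostsByParent
    @ParentId INT
AS
BEGIN
    SET NOCOUNT ON;

    /* The rogue-user fix-up: copy the parameter into a local variable
       and clamp it so a negative value can't sneak into the query. */
    DECLARE @SafeParentId INT = @ParentId;
    IF @SafeParentId < 0
        SET @SafeParentId = 0;

    SELECT p.Id, p.Score, p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = 22656          /* served by the existing index */
    AND   p.ParentId    = @SafeParentId; /* evaluated in the key lookup  */
END;
```

The important detail for later is that local variable: @SafeParentId is what ends up in the WHERE clause, not the parameter itself.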

So when we go and run this query and look at the query plan, it finishes quickly, right? 0.064 milliseconds. Fast query. And this has a missing index request on it, too. SQL Server’s like, oh, we could make this faster. And I’m like, I don’t really think you know what you’re doing anymore, but yeah, sure. If we had an index on those three key columns, we could technically make this faster. We wouldn’t have to evaluate predicates in our key lookup here. Fine. Technically, you are right, SQL Server. Technically, you’re right.

Technically. But if I see a query running in 0.064 milliseconds, that’s probably not going to be the one where I’m like, oh, we’ve got to fix that, it’s an emergency. That’s not the query I want to focus query and index tuning on. This next query might look a little more like something I’d want to fix. This thing runs for a couple of seconds.

If you look over at the query plan, it’s fairly expensive, costing 3,473 query bucks. It’s like a Gucci tracksuit query cost, right? It’s astronomical. And this has a missing index request on it, too: it wants, you know, ParentId, CreationDate, LastActivityDate, PostTypeId.

There are all sorts of things in there where we can go, oh, if we had an index, this would be so much faster. The thing is, this is one of those queries where, if you’re looking at a server for the first time, or the only time, you might see this missing index request and think, man, we could really cut this expensive query down. But this is the sort of thing that takes some domain knowledge to make really good long-term decisions about.
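For reference, the missing index request above translates to something like this. Treat the key order and the key-versus-include split as assumptions; SQL Server’s suggestions don’t specify much beyond equality, inequality, and included columns.

```sql
/* Roughly what the missing index request amounts to. Which columns land
   in the key versus INCLUDE is an assumption for illustration. */
CREATE NONCLUSTERED INDEX ix_posts_maybe_not_missing
ON dbo.Posts (ParentId, CreationDate, LastActivityDate)
INCLUDE (PostTypeId)
WITH (SORT_IN_TEMPDB = ON);
```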

Now, realistically, if we’re looking at a server for the first time, analyzing the query plans, looking at the missing index DMVs, digging into the data, this might jump out at us as a great index to create. We might say, we’ve got to get this in there. But it takes domain knowledge to know that, hey, this query only runs at night.

This isn’t a query we care about all that much. It can run for two seconds. It’s going to populate a few things, it’s going to run a hundred times, it doesn’t really matter a lot. It’s going to happen at midnight when no one’s there anyway.

No one can hear you scream. No one can hear this query scream. But maybe, I don’t know, we look at things and decide, you know, I’m going to give this index a chance. I’m going to give this index a shot because... make up a reason.

Make up a reason in your head. Leave a comment with a great reason for why we’re creating this index. But now, after we create that index, our original query that ran in 0.064 milliseconds no longer runs in 0.064 milliseconds. It runs for more like 10 seconds to do the same thing.

And if we look over at the query plan, we can see our fancy new index being used, but not being used very well. We have a very bad guess with this index: it’s guessing that about two rows are going to come out of here, when we really get about 6,223,223 rows out of there.

That’s a lot of rows. And we get this guess that we’re only going to have to read 3.0508 rows from this index to get those two rows out. And then we’re going to have to do this key lookup.

Well, we estimated the key lookup at about 1.8 executions; the actual number of executions was 6,223,223. So we had an issue with this guess. And that issue comes from the fact that we declared a local variable in the stored procedure.

This local variable means we’re going to get what’s called a density vector guess. I have a post about this that, by the time this video comes out, will have been out for a while, so you can look forward to that.

I don’t want to spoil too much, but it’s about this sort of thing. We declared this variable in here, and that means we’re getting a not-good guess. So if we show the statistics for the new index we created, we’ll get a couple of numbers back.

Those numbers are the number of rows in the table, this 17 or so million, and this all density. We’ll multiply those together in a second, but first I want to show you that if we were using the actual factual histogram, we would have had a much, much better guess for the number of rows that were going to come out of there.

All right, there it is: we have that six million number here. This isn’t dead-on accurate, and, sorry, not because we didn’t sample the entire table; it’s off for some other weird reason.

We did sample the entire table. I’m not sure why it isn’t perfect-perfect, but who knows. Anyway, if we multiply these two numbers together, the number of rows in the table and the all density of the ParentId column, we get that 3.0508 number back, which is the number of rows SQL Server thought it was going to have to read to get two matching rows out.
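Here’s what that looks like, sketched out. The statistics name matches the hypothetical index above, and the density value is back-computed from the numbers in the transcript, so treat both as illustrative.

```sql
/* Show the density vector for the new index's statistics. */
DBCC SHOW_STATISTICS ('dbo.Posts', ix_posts_maybe_not_missing)
WITH DENSITY_VECTOR;

/* With a local variable, SQL Server can't use the histogram, so the
   estimate is:  table rows * all density of the leading key column.
   With ~17,000,000 rows, an all density of roughly 0.00000018 yields
   the ~3.0508-row guess, instead of the ~6 million rows we actually get. */
```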

So things are a little weird in there. This isn’t all the fault of the index we added; the code is partially to blame, because the local variable is what’s giving us the bad guess.

But again, this is one of those crazy domain knowledge things, where you have to understand why this code exists and how it got there. And maybe adding that index was not necessary, because the query that wanted it only runs at night, and the query that ended up using it ended up using it badly.

So, you know, normally, heck yeah, I’d just want to drop that index. Why bother with it? It’s not going to help anything, and it’s going to hurt things. But it might also start a conversation about that piece of code.

Now, even with that index still on there, what if we change this stored procedure a little bit to recompile? If we add an OPTION (RECOMPILE) hint to the statement, then SQL Server can do something called the parameter embedding optimization.

It can sniff this local variable; it’ll replace it with the actual value that gets passed in. And if we do that, even with that other index there, and we run our stored procedure, we’ll go back to the original query plan that finishes quickly.
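In the sketch from earlier, that change would look like this, with names carried over from the hypothetical procedure above:

```sql
/* With OPTION (RECOMPILE), the parameter embedding optimization lets the
   optimizer treat @SafeParentId as its runtime value at compile time. */
SELECT p.Id, p.Score, p.CreationDate
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND   p.ParentId    = @SafeParentId
OPTION (RECOMPILE);
```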

We’ll still get a missing index request for it, but that’s neither here nor there. Now, sure, OPTION (RECOMPILE) works fairly well, but another option might be to use dynamic SQL.

What dynamic SQL can do is take that local variable and let us parameterize it. Since the dynamic SQL runs in a different execution scope than the stored procedure, it has no idea what the lineage or etymology of that local variable is, because it arrives as a proper parameter in the dynamic SQL batch, right?

So if we change our stored procedure to use dynamic SQL and execute that, it will also, despite the presence of that crummy index, use a fast execution plan right there. Cool.
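Sketched against the same hypothetical procedure, the dynamic SQL version might look like this:

```sql
/* The local variable becomes a real parameter in a separate batch, so the
   optimizer can sniff its value instead of falling back to density. */
DECLARE @sql NVARCHAR(MAX) = N'
SELECT p.Id, p.Score, p.CreationDate
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND   p.ParentId    = @ParentId;';

EXEC sys.sp_executesql
    @sql,
    N'@ParentId INT',
    @ParentId = @SafeParentId;
```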

So, look, it wasn’t all the fault of that poor consultant who told you to add that index. There was some other stuff at play, some other stuff we had to think about. But SQL Server 2019 offers a little bit of hope for this sort of thing.

There’s a new missing index DMV, which I’ve blogged about, that will, as long as the query that asked for the missing index is still in the plan cache, let you tie the query that asked for a missing index to the missing index request.
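That DMV is sys.dm_db_missing_index_group_stats_query. A query shaped something like this ties requests back to query hashes; the join shape is a sketch, and results only survive while the asking query’s plan is cached.

```sql
/* SQL Server 2019+: connect missing index requests to the queries that
   generated them, while those queries are still in the plan cache. */
SELECT mid.statement,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migsq.query_hash,
       migsq.last_user_seek
FROM sys.dm_db_missing_index_groups AS mig
JOIN sys.dm_db_missing_index_group_stats_query AS migsq
    ON migsq.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details AS mid
    ON mid.index_handle = mig.index_handle;
```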

So that’s fun. That’s great if you’re on SQL Server 2019 and you have a happy, healthy plan cache. Otherwise, I don’t know, you may need some fancy-pants expensive monitoring tool to catch this sort of thing.

But right now, I just want people to be a little bit more cautious with missing index requests. Please at least try to correlate one to the query that’s asking for it before suggesting it or creating it.

If you can’t do that, then maybe put it down as something to be on the lookout for long term, or something you can add in a lower environment (dev, test, QA) and run some kind of example workload against.

That’s usually a helpful thing to do, if you have that sort of time on your hands. But workload knowledge is pretty important when you’re doing query and index tuning, because you can run into all sorts of weird regressions like this that can backfire.

Anyway, thanks for watching. I’m hopefully going to get this video recorded so that the audio comes out in both of your ears. If it doesn’t, it’s not my fault.

Camtasia is fickle with this sort of thing. I don’t really know how to fix that. But anyway, I’m going to go do another one of these. I don’t know, maybe I might eat something first, but who knows?

Anyway, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.