Why Partially Fixing Key Lookups Doesn’t Work In SQL Server
Video Summary
In this video, I dive into why adding columns to an index to fix every key lookup might not always be the best approach. I share insights from my recent blog post on how fixing predicates in key lookups can sometimes be sufficient to avoid performance issues but highlight that there are scenarios where it’s not practical or efficient. Using the Stack Overflow database as a case study, I demonstrate through query plans and statistics histograms why adding every column to an index might introduce unnecessary overhead, especially for frequently updated tables with large string columns. This video aims to provide a deeper understanding of SQL Server’s costing mechanisms and encourage thoughtful index design rather than knee-jerk reactions.
Full Transcript
Erik Darling here with Darling Data. Still muddling through the holidays, if you can believe that. Anyway, in today’s video, it will hopefully be a nice brief video, I want to talk about why you can’t fix every key lookup just by adding some columns to the index, to an index. And this is, I do have a blog post fairly recently, I don’t actually, I’m not even sure right now if it’s, actually no, it did publish, about how fixing predicates in key lookups can be good enough to avoid some of the really slow crap that can happen with key lookups. However, not all key lookups have predicates in them. And there are some cases, where you might really be, like, gung ho about fixing the key lookup completely by adding, like, a bunch of columns from the select list to the include list of your nonclustered index. And that can have a lot of downsides for a lot of reasons. Right? Because now you, now, every time you modify the base table, you have a, I mean, inserts and deletes, for sure. But, you know, updates, if the columns that you’re now adding to the key of this index are in the include, now you have some additional considerations for when you, for when you update the table. All right? Take up more space on disk, more space in memory. If you are changing things frequently, then, you know, you have some additional transaction logging stuff to think about, some additional locking and blocking stuff to think about.
So, and, you know, and especially if you have columns in your table that are strings, particularly big strings, you know, as you approach the numerical max of nvarkar or varkar columns, either 4,000 or 8,000, or if you have max data types in your table, then all of a sudden you’re looking at potentially storing and having a lot of just really big indexes in your table. So, really big indexes if you start trying to fix key lookups that way. So, what I want to show you is the costing mechanism behind lookups and something that might even be surprising to you because it was certainly surprising to me when I learned about it or when I discovered it through vigorous testing, staring at query plans.
So, I’ve already got an index on my users table in the Stack Overflow database and I’ve updated the statistics to use a specific distribution. I mean, it’s not that I got it naturally once, but then like when I went to redo the demo, sometimes I would get different statistics and sometimes those different statistics would make the lookup versus non-lookup demo fail kind of miserably. So, in order to avoid that, I use a specific statistics histogram that I know gives me the outcomes that I want.
That’s this whole crazy thing here that makes for a really long scroll bar. This is the most interesting thought to ever go through my mind, which is nicely lined up with my ears. Anyway, so what I want to show you right now is how the reputations 23 and 51 are distributed in the users table.
So, if we run this and we look at the results, for reputation 23, there are 13,542 records. And for reputation 51, there are 13,116 records. Very close, right?
Almost suspiciously close in values. There’s reputation fraud going on there. Anyway, if I run this query to look for users with a reputation of 51, and remember 51 has 13,116 rows associated with it.
If I run this, we look at the query plan, and we quite naturally get a key lookup plan. Right here. 13,116.
Oh, look, that’s a good statistics histogram. We nailed that. That was correct. That was on the nose. And, of course, we have this missing index request, where SQL Server says, just go ahead and add every column to this index. All of them.
So, that includes, well, let’s make this a little bit easier for everyone to see all in one go here. If we look at what SQL Server wants in there, we have about me. That’s a bar car max.
We have location. We have website URL. We have display name. These are all string columns. They’re not the location, display name, website URL. Those aren’t maxes, but, you know, it’s a lot of work just to have, to avoid a key lookup.
And this query isn’t slow either, right? If we look at how long this thing takes, it’s 31 milliseconds. I don’t know if I need to maintain an index of that girth over a 31 millisecond query.
Maybe not the best thing in the world there. And if you have questions about why I have 1 equals select 1 at the end of these queries, you are free to visit my website.
Because the question that everyone asks every time they see a query of mine that uses that is, what’s the point of 1 equals select 1? And I have to say, what is the point of 1 equals select 1? What could it possibly be?
Could it be explained to you in a blog post on my site? Did you perhaps type, what’s the point of 1 equals select 1 into the wrong place? Did you mean to leave a comment or did you mean to type that into Google? Or Bing, DuckDuckGo, or whatever search engine wants to sponsor my next video.
That’ll be the one that I ask about. But when we run that same query for reputation equals 23, I’m going to run this two different ways.
I’m going to run this once. I apologize that it’s still highlighted. I know that looks a little hacky, but you’ve dealt with worse from me, so we’re just going to roll with it here. So this one, of course, is just a regular select and where.
And this query down here tells SQL Server that I want to use my nonclustered index instead. All right, so use the nonclustered index. Gosh darn it.
Now, if we look at the query plans for these, you know, this one takes 200 milliseconds about. I don’t know where my thing is. I mean, I don’t know where my cursor is.
Other things I’m acutely aware of. My books, my phone, where these lights are blinding me. Anyway, this query takes about 200 milliseconds.
The cardinality estimate on that is, again, spot on, right? Exact abundo. We’re not missing anything. The stats histogram that I created, beautiful. Doesn’t get much better than that.
Of course, because SQL Server’s query optimizer is so incredibly biased against random IO, it chooses to scan the clustered index rather than seek into our nonclustered index, dive right in, find a few rows, and do lookups to match them in the clustered index.
That’s this whole section here, right? We have our index seek, where I told SQL Server which index to use. And as usual for a key lookup, oh, that wasn’t what I wanted.
Let’s redraw this one. Let’s make sure we’re holding the control key. And let’s put a square here. And as usual with a key lookup, we have a nested loops join. So we take one row from over here, put it into the nested loops join, and then we go find it down here.
That’s what a key lookup is. It joins two indexes together. And unlike in the blog post that I had written recently about predicate lookups, if we look at the key lookup here, all we have is this big old output list.
And all we have in that green text up here that we already looked at is SQL Server asking for us to include every single column in the table in our nonclustered index.
Which again, we don’t really want to do. Now, I totally get it. Select star is a bad practice. You do select star queries, you’re asking for trouble, you’re a bad DBA or developer or whatever you fancy yourself, whatever title your company gave you so they can chronically underpay you.
That is what you are a bad version of. The thing is, when it comes to key lookups, they have the exact same cost regardless of if you’re getting all the columns or if you’re just getting one of the columns that is not part of your nonclustered index.
So coming back up here, this is the index that we created. The reputation column. And since the clustered primary key of the users table is on a column called ID, only that column is inherited by the nonclustered index.
No other column in the table is magically part of our nonclustered index definition. So the problem is that whether we get every column or we just get one column that’s not part of our nonclustered index, key lookup has the exact same per loop cost.
So if we come over to the query plans for these and we look at the key lookup, it has an estimated operator cost of 37.715 query bucks for the select star query and it has an estimated operator cost of 37.715 query bucks for the query that just asked for one additional column.
So technically speaking, to SQL Server, it makes no difference at all to the query optimizer whether you’re selecting one column that’s not the nonclustered index or you’re selecting all the columns that aren’t the nonclustered index.
Doesn’t care. Each loop through has the same fixed cost. So it doesn’t matter at all how many columns you’re selecting.
One column missing from your nonclustered index that’s required by the query accrues the exact same cost as 100 columns missing from your nonclustered index that are required by the query.
And that is absolutely fascinating. So let’s talk for a quick moment here. It would be really stupid, tremendously, monumentally, outrageously stupid if you had every nonclustered index include every column in the table or try to account for every column that you might someday maybe perhaps select from a query to do a thing with maybe.
Per chance, they say. We can sort of figure out why by asking ourselves a couple questions. First, what is the clustered index?
What is a clustered index in SQL Server? It’s every column in your table logically ordered by your clustered index key column or columns.
So higher table logically ordered by for the user’s table would be the ID column which for various reasons makes a pretty good clustered index. A nonclustered index is every column you may include in the clustered index.
All right? And then the key of the whatever columns you put in the key of the nonclustered index. All right? So we order all this stuff. We have the includes down here.
If you let’s say thinking about the user’s table let’s scroll back up a little bit and let’s just look at this select star query. So right now the clustered index is everything in here everything every column in the table logically ordered by this column.
If we were to make a if we were to follow the missing index recommendation that SQL Server wants and we were to put a nonclustered index keyed on reputation and include every other column in the table we would have every column in the table logically ordered by reputation which would just be like in everything but in everything but the name because you can’t have two clustered indexes on a table that nonclustered index would be almost like another clustered index wouldn’t it?
It would be every column in the table ordered by some column. You don’t need two clustered indexes. You don’t. Simply don’t. If you needed two clustered indexes Microsoft would allow you to create more than one clustered index.
So please don’t do that. Please don’t follow the missing index recommendations that include every single column in the table.
It’s often unnecessary. Please spend some time evaluating the speed and efficacy of your queries before you go creating super wide indexes to cover various things because you’re not going to be happy after you create like 10, 15, 20 of those suckers on your table and all of a sudden everything is blocking the deadlocks because every index is getting locked all the time.
Cool. All right. So it’s a snowy day here in New York City and I don’t know maybe I’ll go make some hot chocolate spike it with something extra spicy.
I hope you what is it enjoyed yourselves suppose I hope you learned something if you sat here for 15 minutes and didn’t learn anything well.
Oh maybe maybe I’ll do better in the next video. I hope you enjoyed yourselves. If you like this video at all in any way whatsoever even if you have to rewatch it with the mute button on or if you I don’t know have to have some nitrous before you watch it if you like it in any way shape or form feel free to give the thumbs up button down below a little taparoo.
If this is the sort of SQL Server content that you have any sort of appreciation for I do like getting new subscribers so that I can get more likes and so that I can annoy more people every time I publish a video with Hey Erik Darling did something did you know?
So yeah that’s that thank you for watching
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.
One thought on “Why Partially Fixing Key Lookups Doesn’t Work In SQL Server”
Comments are closed.