Lookup Costing Is Really Weird In SQL Server

Video Summary

In this video, I delve into a peculiar aspect of SQL Server’s query optimizer: lookup costing for heaps versus clustered indexes. I explain why the cost estimates for these operations are identical despite the structural differences underneath. Through practical examples, I demonstrate that while a clustered table results in more logical reads during key lookups, SQL Server does not penalize that operation in its cost estimates. This video aims to clarify common misconceptions about clustered indexes and heaps, emphasizing their appropriate use cases in different scenarios. Whether you’re a seasoned DBA or just starting out with SQL Server, understanding these nuances can greatly enhance your query tuning skills.

Full Transcript

Erik Darling here with Darling Data, and in this video we’re going to talk about something that I just think is weird. Not good or bad or ugly, just kind of strange to me. We’ll talk about some reasons why things might be the way they are, and I will try to remain not complaining about this, just explaining. I don’t want to complain, I want to explain. Sometimes I really need to complain, and you’ll hear it, but this one is just a nice walkthrough of one aspect of lookup costing that I think is really weird. Now, something that I will normally complain about is that SQL Server is very much stuck in the spinny disk era of I/O costing generally. This is different from that, so you don’t have to worry about that. This is just the difference, well, the lack of a difference, really, in lookup costing between heaps and clustered indexes. Alright, so let’s go with that. So I’ve created a temp table already ahead of time, because, well, you can’t see it because I’m in the way, but it takes about six seconds, and I’ve wasted enough of your time babbling.

So, watching me move data around is probably not on your, like, top ten list of ways you’re going to spend your life. So, I’ve already created this table called votes. It is a heap. There is no indexing on it currently. We’re going to look at how indexes change things in a minute. But, yeah, right now there’s just no order to this thing. It is just a heap of nonsense.
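The setup isn’t shown on screen, but based on the description, a minimal sketch of it might look like this. The source table and all object names here are assumptions (the Stack Overflow sample database’s Votes table is a common demo source); the key point is that `SELECT … INTO` with no index on the target creates a heap:

```sql
/*
  Hypothetical setup: copy a votes table into a new table.
  SELECT ... INTO creates a heap, because no clustered index
  is defined on the target. Names below are assumed, not
  taken from the video.
*/
SELECT v.*
INTO dbo.HeapVotes   /* the video just calls it "votes" */
FROM dbo.Votes AS v;
```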

And if we run this query right here to look at the sort of page or index structure stuff for the heap, we’re going to have this big, long, confusing table name. But we’re going to collapse that a little bit, because no one needs to see that. Absolute barrage of underscores.

And if we zoom in here, so this is a heap. So it has an index depth of one and an index level of zero to contain everything. This is just how heaps look. All 242,798 pages are just flat across, kind of.

And you can see the record count is almost 53 million rows. So, a fairly impressive row-to-page ratio, I think. You know, sort of.
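The exact query isn’t legible in the video, but the page and level numbers described here match what `sys.dm_db_index_physical_stats` returns in `DETAILED` mode, one row per index level. A sketch, with the table name assumed:

```sql
/*
  Inspect page and level structure. DETAILED mode returns one
  row per index level; a heap shows index_depth = 1 and a
  single row at index_level = 0 holding every page.
*/
SELECT
    ips.index_type_desc,
    ips.index_depth,
    ips.index_level,
    ips.page_count,
    ips.record_count
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'dbo.HeapVotes'), /* table name assumed */
    NULL,    /* all indexes */
    NULL,    /* all partitions */
    'DETAILED'
) AS ips;
```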

Don’t worry, we’re not going to look at fragmentation. So, if we come down here and run this query, we have to scan the whole table, necessarily, because it is a heap and we have no index with which we can find this data. If we look at the IO output, you’re just going to have to, like, trust me.

Oops, that was the wrong Windows key. You’re just going to have to trust me that this is the votes table, because I’m going to just knock this whole thing out. So, it’s not in the way.

But, yeah, we’ve got to read all 242,798 pages. We’ve got to scan that whole thing. Right. And, of course, this has a scan count of nine, because it is a parallel query plan at DOP 8. And, for some reason, that gets counted as nine scans.

Right. So, the coordinator thread also apparently gets included as a scan. But, this whole thing just ran at DOP8.

Right. So, look at this. DOP 8. If we come over here. Look. No, you can’t really see much there. But, if you look at the number of executions, it is also eight here. But, STATISTICS IO counts it as nine.

So, I’m only using it for convenience. I don’t use STATISTICS TIME and IO when I’m tuning queries generally. I look at the stuff in the query plans, because it’s there and it’s easier.

I don’t like switching over to a messages tab to look at stuff. So, there. We got that. Anyway. Let’s go add a filtered nonclustered index to this table.

So, this filtered nonclustered index is going to get us exactly to the data that we care about in our where clause. Alright. So, good for us.

We’ve done it. I’m going to create this index. And, we’re going to look at what changes. Alright.
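The index definition isn’t readable in the video, so here’s a hypothetical version. The column name `BountyAmount` and the filter value are both assumptions; the point is that the filter predicate matches the query’s WHERE clause exactly, so the index only contains the rows we care about:

```sql
/*
  Hypothetical filtered nonclustered index: only rows matching
  the WHERE predicate are stored in the index, which is why it
  ends up tiny relative to the 53-million-row table.
*/
CREATE NONCLUSTERED INDEX ix_bounty
ON dbo.HeapVotes (BountyAmount)  /* column name assumed */
WHERE BountyAmount = 500;        /* filter value assumed */
```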

So, first, let’s look at the index structure again. And, now we’re going to have a couple extra rows in there for the new index that we just created. Alright. Let’s get rid of this stuff. And, we will have…

We still have the heap with its 242,000 pages. But, now we have a nonclustered index with an index depth of two. Alright.

Now, 2,286 rows in this index are on six pages. Which is also a very… I think it’s a very, very good row-to-page ratio.

It’s excellent. Good job, SQL Server. He nailed it. Alright. And, just because the bounty amount column is an integer and there are only a couple thousand qualifying rows, we just don’t need a lot of data pages to fit that one column, filtered down to that one little bit of data.
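The query he runs next isn’t shown clearly either, but any query that filters on the indexed column while selecting columns the index doesn’t cover will produce the plan shape he describes: a seek on the filtered index plus a lookup per row to fetch the rest. A hypothetical version, with the same assumed names as above:

```sql
/*
  Seek the filtered nonclustered index, then do one lookup per
  qualifying row to fetch the columns the index doesn't contain
  (SELECT * asks for all of them). Against a heap, that lookup
  is an RID lookup; against a clustered table, a key lookup.
*/
SELECT v.*
FROM dbo.HeapVotes AS v
WHERE v.BountyAmount = 500;  /* matches the index filter, assumed */
```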

So, if we go run this query now… Let me see. Hello. And, we look at the execution plan. We have an RID lookup.

Right? An RID lookup, where RID stands for row identifier, because that’s how SQL Server keeps track of unique rows in a heap. We don’t have a clustered index, so we get a row identifier instead. We’re not going to talk about, you know, when heaps are good or bad. This is just something that I find kind of interesting. So, if we come over here and we look at the STATISTICS IO output… Let’s delete this line because there’s nothing useful in there.

And, once again, we’re going to get rid of this incredible amount of underscores. And, we’re just going to zoom in on this. We’re going to see that we did 2,294 logical reads.

Right? So, that’s close to the 2,286 rows. We just had to do some extra reads and some navigating. So, close enough to, like, the number of rows that are actually in the nonclustered index.

It’s fine. Right? But, if we go over and look at the query plan. And, we look at the costing.

Eh, come on, buddy. Help me out here. I do that so that the tooltip shows up in a way that I don’t block it or it doesn’t, like… It’s not, like, cut off or anything annoying. So, if we look at the costing for this.

We have 2,286 executions. And, then we have our estimated costs. Right? Remember, costs are not actual, real-life anythings. They are unitless, meaningless metrics that the optimizer uses to come up with an execution plan.

Right? It’s just… It’s internal stuff so that SQL Server can, like, figure out what it thinks the cheapest plan will be. Right?

So, we have 2,286 executions. We have the total operator cost of the lookup is 7.46-something query bucks. And, then the estimated IO and CPU costs are very, very low.

But, if you add those together and then multiply them by 2,286, you actually get a slightly higher number. It was, like, 7.5-something. So, I’m guessing that there’s, like, some cost reduction that happens after, like, the initial read.

Because SQL Server does assume a cold cache. Right? It doesn’t assume anything is in memory when a query starts up. So, it assumes all the IO is going to have to be done on disk.

Which, you know, we’ve talked about this. But, that’s how things start up. So, maybe there’s some cost reduction after the initial, like, lookup thing. This is, like, oh, well, now the data is going to be in memory.

So, these might be a little cheaper. So, close enough, though. Right? Like, just remember those numbers: 0.003125, 0.0001581, and the 7.46856. Right?
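The arithmetic he’s describing is easy to check: the per-execution estimated CPU plus IO cost, multiplied by the estimated executions, slightly overshoots the operator’s reported total, which is consistent with some discount applied after the first (cold-cache) read:

```sql
/*
  Per-execution estimated IO + CPU cost, times the number of
  executions, compared with the operator's reported total.
  0.003125 + 0.0001581 = 0.0032831; times 2,286 is about 7.505,
  a bit more than the reported 7.46856.
*/
SELECT
    2286 * (0.003125 + 0.0001581) AS summed_cost,            /* ~7.505 */
    7.46856                       AS reported_operator_cost;
```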

So, all very easy numbers to remember. No problems there. Now, let’s backtrack a little bit. And let’s touch our table one more time with a unique clustered index. Now, when I add this unique clustered index, two things are going to happen.

The table is now going to be a clustered table. Right? Clustered table is a good term that I heard from Tim Chapman, smart fellow who works at Microsoft. He used that term rather than clustered index because some folks do get confused when they hear clustered index.

Some folks think there’s, like, this magical heap structure and also a clustered index on top of it. But there isn’t; you just cluster the table. Right? So, it’s a good term to use there. Right?

Good thing to remember. So, that’s going to happen. And the other thing that’s going to happen is this index is going to get rebuilt using the clustered index key column ID as the row identifier rather than the internal rid that a heap uses.

This took a lot longer when I wasn’t using a filtered index because you basically had to create one 52 million row unique index to cluster the table. And then you had to rebuild the 52 million row nonclustered index. And that was not a good use of leisure time.

You’d get an incomplete on that in kindergarten. Very, very bad use of leisure time. So, now that we have a clustered index and a nonclustered index, let’s look at how that table and index stuff is laid out now.
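The clustering statement isn’t shown, but from the description it clusters the table on its ID column. A sketch, with index and table names assumed; note that creating the clustered index also silently rebuilds the existing nonclustered index so its row locator becomes the clustering key instead of an RID:

```sql
/*
  Cluster the table on its Id column. This converts the heap
  into a clustered table, and rebuilds the nonclustered index
  to use the clustering key (Id), rather than an RID, as the
  row locator in its leaf level.
*/
CREATE UNIQUE CLUSTERED INDEX cx_id   /* index name assumed */
ON dbo.HeapVotes (Id);                /* table/column assumed */
```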

So, the first thing you’re going to notice most likely is that we no longer have that single heap row. Oh, I forgot to do that. I’m so forgetful.

Well, let’s zoom in on this now. So, the first thing you might notice is that we have… oh, zoom, be my friend today. This changed a little bit, didn’t it?

Right? We no longer have that one row for the heap and the two rows for the nonclustered index. Now, we have three rows for the clustered index. So, there’s one page up at the very top of the index, like the root page. And there’s 395 records in there.

And then there’s 395 pages with 242,000 records. And then there’s 242,000 pages for 52 million records. And, like, we have a much deeper index now, don’t we?

We don’t just have that flat heap structure. We have three rows, or rather, three levels of index. So, that’s interesting.

But what’s even more interesting, I think, is that when we run this, and we look at the messages tab, again, we’re going to get rid of this work table line because it’s useless. And we’re going to delete all these underscores.

And we’re just going to zoom in and look at the logical reads. So, now, we have 7,015 logical reads. Interesting, huh?

We did a lot. We did about 3x the logical reads as when we just had the heap. Because now we have to navigate all that clustered index stuff. And now, I’m not saying clustered indexes are bad.

Because, I think, in SQL Server, for transactional tables, they’re pretty great. In SQL Server, heaps are pretty great for staging tables and stuff. But I think one thing that probably happened a long time ago is someone might have been testing clustered indexes versus heaps and come across maybe a plan with lookups in it.

And they might have seen something like, oh, maybe it’s a little faster with the lookups against the heap, because we don’t have to navigate the whole clustered index thing. And maybe we did fewer logical reads without a clustered index.

My goodness. You’ve got to have those low logical reads or else. So, someone might have been comparing them, and they might have been like, well, clustered indexes are going to make everything slow.

Why would we want them? This sounds terrible. Why would we ever have a clustered index? We do all these logical reads now with a clustered index. Bummer.

But, oh, now if we look at the query plan, what’s interesting, right, is now we have a key lookup rather than that RID lookup. But, if we zoom in on this and we look, we have the same number of executions, but we also have the exact same costing, right?

So, SQL Server doesn’t actually cost key or RID lookups any differently. It’s the same numbers in there: 7.46856, 0.003125, 0.0001581.

It’s the same numbers. So, there’s no cost difference to SQL Server when it’s looking at either heaps or clustered indexes. And these queries, I mean, there’s so few rows that the timing on these is almost indistinguishable.

So, this isn’t like a performance test. This is just something to show you how the costing is the same. Now, you might be wondering why the costing is the same.

And I was, too. And I was gently reminded by a client in New Zealand that in early versions of SQL Server query plans, there was no distinguishing between key and RID lookups.

Everything was just called a bookmark lookup. It was one single operator. I think it was around SQL Server 2005 when that changed, when you got some distinguishing between key and RID lookups. So, that’s probably why there’s just one unified costing strategy for key, RID, or bookmark lookups, despite the fact that there is a physical difference in structure between clustered tables and heap tables.

So, it is kind of weird. And it might be kind of misleading, depending on how you approach query tuning, and depending on how you design your nonclustered indexes as well. Say you were doing a basic sanity test of, like, should this table have a clustered index on it. Which, I mean, I don’t know how many people are starting brand new databases from scratch these days.

Most of the databases I see have been around since, I don’t know, forever. Like, if you cut them open, there’s a lot of rings. But, yeah, it’s one of those things where, if you were in the very early days of SQL Server and you just didn’t understand this, or you’re building a new database now but maybe your knowledge is not so expansive about clustered indexes, primary keys, nonclustered indexes, query plans, all this other stuff.

You might, you know, test having a heap versus having a clustered index and think, wow, look at this query. It does all these extra logical reads of the clustered index. It’s three nanoseconds slower.

Better just leave that clustered index out. But that’s a pretty big mistake, I think, for most SQL Server workloads. So you should probably not consider that as part of your strategy. Heaps: great for staging tables. Clustered indexes: very good for transactional stuff.

Anyway, that’s about enough of this. Thank you for watching. I love you.

I don’t know. Maybe I do. Maybe I’m, maybe, maybe I do, maybe I don’t. Maybe I’m indifferent. Maybe we just haven’t met yet. Maybe someday I will love you. When I, if I do love you, I will always love you.

Darling data promise. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you’ll watch more videos. If you, if you like this video, the thumbs up button lets me know that you liked it.

Otherwise, I just see views and I don’t see thumbs ups. And I think, wow, what happened? If you like this sort of SQL Server content, I do try to publish as often as possible, both my blog and the videos. So you should subscribe to my channel if you want to get more of this stuff.

Because I don’t know where else people get it from these days. So yeah, thank you for watching and I will see you in the next video, whenever, whenever that may be. I love you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.