A Little About Forwarded Fetches In SQL Server

Video Summary

In this video, I delve into the fascinating world of SQL Server heaps and forwarded fetches. Starting off with a brief detour through my upcoming speaking engagements and consulting projects, including Zoomit’s recent webinar, I transition to explaining why understanding these concepts is crucial for database administrators and developers alike. We explore how rows in a heap can move around due to data growth, leaving behind “forwarded record pointers” that point to their new locations—a phenomenon that doesn’t occur with clustered tables. By creating a sample table and inserting a specific number of rows, I demonstrate the process of updating these rows and observe the resulting forwarded fetches through DMVs (Dynamic Management Views). This hands-on approach helps clarify how nonclustered indexes inherit certain information from the base heap or clustered table, adding depth to our understanding of SQL Server’s internal workings.

Full Transcript

Your friend, Erik Darling here with Darling Data. And we’re going to take a small break from the T-SQL preview material to talk about… Well, I mean, it’s all SQL Server here all the time, isn’t it? We are not traitors to our beloved SQL Server. I don’t know, maybe someday I’ll get bored enough to do some videos about like DuckDB or something. I don’t know. Well, my luck, you know, I do a video about DuckDB and then like Databricks would buy them for $10 billion. Uh, I don’t know. It seems, it seems like the fastest path to becoming a billionaire these days is just to fork Postgres and do something.

So, uh, perhaps, perhaps my business plan for 2026 will be fork Postgres, do something, become billionaire. Seems pretty logical to me. Anyway, uh, today we’re going to talk about something special that happens with, uh, heap tables or, uh, tables that do not have clustered indexes on them in SQL Server called forwarded fetches. Uh, and we’re going to try to answer the question of why SQL Server uses forwarded fetches rather than, you know, well, something else.

Uh, so, I don’t know. Buckle up for that, I guess. We’re here. We’re here. We’re here for it, aren’t we? It’s the whole reason we woke up this morning just to do this. Uh, if you like this channel and you would like to sign up for a membership, you can do so, uh, in the, in the, in the, in the links down in the video description.

So, uh, you should, you should use it. Probably. Uh, if you would like to hire me to do some consulting work for a SQL Server, uh, I, I do, I do that. And of course, according to the, the beer gut magazine magic quadrant, I am still the number one SQL Server consultant in the world outside of New Zealand.

So, uh, we’re, we’re, we’re still, we’re still very proud of this, this magic metric. Um, if you need a health check, performance analysis, hands-on tuning, dealing with SQL Server emergencies, and of course, uh, training your developers, whipping them into the fine shape so that you, you don’t have SQL Server performance emergencies anymore, and you can, you can finally see your family on the weekend.

Well, I’m, I can, I can do all that stuff. And as always, my rates are reasonable. Uh, if you would like to buy some, uh, performance tuning content from me, I have about 24 hours of it. Uh, it’s available for 150 US dollars of just about, or thereabouts, when you use the 75% off coupon code right there.

This is also fully assembled for you down in the video description. Uh, if, if you would like to, uh, get, get in on the presale, for my new T-SQL course, T-SQL with Eric.

It’s a me. Uh, you can, you can buy the, at the presale price of $250. That will go up, uh, after the summer when, when the advanced material, uh, becomes available. Um, and then it’ll be $500.

This is, of course, companion content to the pre-cons that Kendra Little and I will be teaching about T-SQL at Pass Data Community Summit in November. And speaking of speaking, I am doing a whole bunch of speaking.

Uh, but Breadgate is kind enough to, uh, hire me as a, hire me as a full-time roadie on their, their Pass On Tour event. So I’m going to be helping them load in and load out. And, uh, they, they’ve apparently given me an empty room to, to just talk to myself in.

So if you want to come hear me talk to myself about SQL Server, you can do so in New York City, August 18th to 20th, Dallas, September 15th to 17th, and Amsterdam, October 1st to 3rd. And then, of course, for the, the, the, the Grand event, uh, the, the Grand Bon event, not the Grand Mal event.

That’s not, those aren’t good. Uh, Pass Data Community Summit taking place in Seattle, November 17th to 21st. Uh, and with that out of the way, let’s talk about these heaps and these forwarded fetches and, I don’t know, whatever else comes up along the way.

So we’re going to do a little bit of talking first, and then I’m going to, I’m going to demonstrate some things for you. Because what’s, what’s life without demonstrations? One, one wonders.

One wonders aloud about these things. Uh, so with heaps, all right, that is a table that does not have a clustered index on it. And it may have nonclustered indexes on it. And it may even have a non-clustered primary key.

It may have unique constraints. It might have foreign keys, but it is still a heap. It is still an unorganized clump of pages, uh, with no clustered index. SQL Server still requires a way to uniquely identify each row in the heap.

Uh, this is an important thing in database. This is an especially important thing for SQL Server to do things like, uh, lookups, right? When you can still create a nonclustered index on a heap, on a few columns, and you might have a query that, that queries those few columns, plus a bunch of other columns.

And then SQL Server still has to go get those columns from somewhere. And, and SQL Server needs a way to identify which rows to go look up and, and provide those columns for. Um, and, and to do so, it uses something called a RID, a row identifier.

Wow, that is an apt name for something. Remember when Microsoft used to be good at naming stuff? Not so much anymore.

Uh, but that RID is based on three components, uh, three metadata components about, uh, the row itself. That is the file ID of the file that the row lives in. Uh, the page ID of the page that the row lives on.

And the slot ID of the page that the, the row lives on more, more, more specifically. Um, so, uh, we, we have all that going for us, right? Uh, Microsoft has called this, this storage mechanism very efficient, or, uh, incredibly efficient, or some modifier plus efficient.

And I forget the exact wording they use. Uh, but when you have a unique clustered index or a clustered primary key, the key of that index is used instead. Right?

Of a, in place of a row identifier uses the key of the, the clustered index. Uh, when you have a non-unique clustered index, then the key plus a unique-ifier. Unique-ifier is the single hardest word in the English language to spell. Uh, the day that they throw that out at one of those national spelling bees, I’ll start believing that they’re actually using.

They actually mean something. Um, a unique-ifier is a four-byte integer. Uh, Microsoft was a real cheapskate on that one.

Uh, because you can run out of them. But the unique-ifier is a four-byte integer that gets added to rows when a non-unique value is inserted into your clustered index. Right?

So, uh, if you, if you have a, uh, uh, uh, if you decret, rather, if you create your clustered index is, without saying it’s unique, but you never add a unique value, then you don’t have to worry about the unique-ifier. Right?

You maybe should have called it unique, but you didn’t. You know, you forgot the, you forgot the word. Uh-oh. Uh, and so you have this, uh, this potentiality for a unique-ifier getting added. And like I said, the unique-ifier is an integer.

Four bytes. And you can run out of them. There’s even an especially devilish message, uh, that, that tells us about this. Uh, and, uh, that message is, of course, message ID on the, the triple six.

If Zoomit will wake, wake the hell up. Uh, and that message says that the maximum generated unique value for a duplicate group was exceeded for index with partition ID. Uh, that is, that is a big int, isn’t it?

I-64. Uh, and, uh, dropping and recreating the index may resolve this. Otherwise, use another clustering key. Just upend your whole life, ruin your whole weekend, because Microsoft couldn’t be bothered to use a big int. Thanks, Microsoft.

They’re really great at ruining weekends, aren’t they? Those of you with availability groups know this especially well, don’t you? You’re particularly keen on, um, just how many nights and weekends have been ruined by your high availability solution. Uh, but anyway, the point here is to sort of answer the question of why, the, why SQL Server uses forwarded record pointers when rows move around within a heap.

Right? Uh, these are, these are, these are the ones that are working for multiple people. And these of course occur, these forwarded record pointers occur, when a row, the values on a row, grow to the point where that row no longer fits on the page it was initially assigned to.

SQL Server will move it to a new data page, not to a new data page, just to a different data page that has enough space on it. It might be another existing data page that has adequate space, or it might be a brand new page.

Who knows, right? Crazy things happen. But when that happens, when a row moves, it leaves behind this little pointer that says, I have moved here.

I live down here now. I’ve changed, I’ve left my apartment. I’ve moved out. This is my new address. It forwards that down there. So that’s different than what happens when you have a clustered table.

Now, the term clustered index is good because, you know, it is an index. But it does lead people down this strange path of thinking that the clustered index is a separate copy of the table.

It’s just another index on the heap. When it’s not, I find that I find the terminology clustered table prepares people mentally more correctly for exactly what a clustered index is. So when you have a clustered index on a table and a row no longer fits on a page, SQL Server will take roughly 50% of the data pages and put them on a new data page.

All right, that is called a page split. Now, one of the key things about, like, why SQL Server has these rids and uses and does these things is that nonclustered indexes on your tables inherit certain information from the base table.

The base table could be a heap. The base table could be a clustered table, right? In other words, a table with a clustered index on it. When you have a heap, the rid is added to your nonclustered index. When you have a clustered index, cluster table, non-unique, for non-unique, nonclustered indexes, they get added as a key column.

Like after, like, you know, if you have, like, just a single key column index, your clustered index key will be the index key after that. If you have a multi-key index, then your clustered index will be the unique, well, right, sorry, your clustered index will be added to the key after whatever columns you’ve explicitly listed out.

If you have a unique nonclustered index, then the clustered key gets added as include columns to the index. All right. With all that stuff out of the way, let’s move on and let’s create a table and then let’s, we’re going to drop it if it exists, I suppose, and then we’re going to create it.

And then we are going to insert a very special number of rows, it’s not really special at all, into the table. 104, 8, 5, 7, 6. All right.

Okay. That many rows went into the table. There’s nothing special about those rows. And now we are going to add a nonclustered index to that table. All right. Following this, I saw on that date column right there. Okay.

And so we’re going to add this and our table is still a heap. We just have a nonclustered index on it now. All right. Nothing, nothing terribly special about that. Now let’s look at the table itself. I’m going to run this query.

And this table, this query is going to tell us some stuff about our heap. All right. Namely, it’s going to tell us that we are, our table is named El Heapo. It lives in the DBO schema.

Right now we have zero forwarded fetches. We have had 1, 0, 4, 8, 5, 7, 6 records inserted into the table. And we have had no updates or deletes against the table. I’m not entirely sure why this number does not quite match this number, but, you know, what’s a few rows difference amongst friends, I guess.

So to start things off, because our table is just freshly inserted into, there has been no opportunity for rows to move around, for a pointer to occur. All right.

So if we run this query and we look at the DMOS performance counters for forwarded records, this number will be the same before and after this query runs. All right.

So we have, oh boy, that’s tough to frame, isn’t it? We have a counter value. Let’s move that up a tad. Let’s rearrange things a little bit. There we go. That should make life easier. We have 4, 6, 0, 2, 5, 1 up here and 4, 6, 0, 2, 5, 1 down here.

So this counter has not budged. Now, one thing I do want to point out is that the a string column is an envarchar 200, right? So it’s 200 potential bytes of characters.

But when we did our insert, we didn’t fill up every single row with the full size, right? We did some, we did a little bit of funny math on here. And I don’t quite understand why IntelliSense is giving me red squiggles when everything worked correctly.

Perhaps SSMS 21 will eventually fix this issue. I’m not sure, but anyway, we didn’t fill up every row to its potential, which means that we can update rows to their full potential.

And what does that mean? We can have some forwarded records. So what we’re going to do is come over here.

Now, what I want to, what I’m doing in this window is pointing out two things. And this is going to look different when we add the clustered index and come back to this. I’m going to point out two things here.

One is that page splits do not occur with a heap. Heaps are just big flat structures, right? There’s no order to them. They’re not B trees the way indexes are, like both clustered and non-clustered. So we don’t have page splits when we have heaps.

So let’s run this. And what we’re going to do is update a bunch of rows in this table. And we’re going to see after the insert runs that there are no page splits that happen, right?

But because we don’t have page splits now, remember this, there were zero forwarded fetches here when we started. Now, when I rerun that query, we have had a bunch of updates. And those updates have created forwarded fetches, which means when we come back over here, right, and we run this query, which hits our heap table, we are going to have our forwarded fetch counter.

I should remember to move that, shouldn’t I? Our forwarded fetch counter is going to increment before and after, or after this query runs. We start with 546105, and now we have 55184. Now you may notice a slight difference here too, in that all of these columns now have a whole bunch of stuff in them that was not in there before.

The eyes look a little bit shorter, but that’s really only because they’re much thinner. These big fat letters down here take up way more visual space on the screen, but I promise you there are 200 lines going across there.

So anyway, if we look at the query plan for that select, we will see some of the stuff that I had pointed out earlier, where we have a seek into the nonclustered index. Next, then we have a nested loops join, and that nested loops join joins us back to the heap in order to fetch the columns that are not in the nonclustered index that we created.

Remember, the nonclustered index that we created was just on that date column, but we are selecting everything from the heap table. And when we look down here, we will see the seek predicate of the…

This is the bookmark, right? The BMK. And this is the RID that I was talking about earlier, right? So the RID lives in there now.

Now, you’re not going to see anything about the forwarded fetches in the execution plan. But now is as good a time as any to tell you about why SQL Server does forwarded fetches rather than just move the row entirely.

The answer is because of what I was talking about with that inheritance, right? So think about how SQL Server stores a RID, right? It’s the file ID, page ID, slot ID.

If every time we updated a row and it moved, and if we didn’t use a forwarded record pointer, we would have to not only update the heap, we would have to update the clustered index. Sorry, we would have to update all of the nonclustered indexes to, like, rearrange things so that, like, the row identifier was now rewritten to page ID, file ID, slot ID, right?

So it saves a lot of work with nonclustered indexes to do that. Now, I guess, you know, if you have a heap with no nonclustered indexes, it might make sense to just, you know, like, rewrite the thing since it’s just the heap anyway, but we are not that lucky.

But, oh, stop zooming, fool. But if we look over here, right, and if we look at the execution plan for the update that we did, right, like, we certainly scanned the heap and we certainly update the heap, but we do not also update the nonclustered index because we updated that string column, but we did not update any columns that are part of the nonclustered index.

So we only had to update the heap for this. So that is why SQL Server uses forwarded fetches. It is an optimization to help us from, help us not need to do unnecessary write work to nonclustered indexes to now, like, rewrite the, all of those so that they have to identify rows with the new file ID, page ID, slot ID of where the row moved to.

We can just use a forwarded record pointer to go find that row when we scan through the table and hit it. Now, what we can do to get rid of those forwarded record pointers is rebuild the table, right? Oops, we are not in the right database, are we?

Shame on me. But when we, when we, when we, when we rebuild the table, notice that we rebuild both the table and any nonclustered indexes on the table. I don’t know why that just showed up there.

I’m not sure what key combination I hit. So this is rebuilding the, the heap, right? We rebuild the whole heap here and we rebuild the nonclustered index on the heap here. Doing this will get rid of the forwarded fetches.

This will rewrite both of the indexes so that we, this will rewrite both of the indexes so that the new file ID, page ID, slot ID for all of these things just kind of, you know, is now the new row location, right? We can remove all the forwarded record pointers, rebuild the heap and like to get rid of the forwarded record pointers, rewrite the new rows with the new file ID, page ID, slot ID, and then rewrite the, like rebuild the nonclustered index so that it has that new link to it. So let’s come back a little bit.

And now let’s, let’s, what I’m going to do is run this whole portion to drop the table. And now this is going to be a clustered table. And because this table, because this table has a clustered index on it, this query is no longer going to return anything.

But what, what, and you know, like if I run this query now, rather than having a bookmark lookup, we have a key lookup. This is looking up based on the key of the clustered index rather than a bookmark or a rid lookup for the heap. But what I, what I do want to do real quick before we close this thing out is show you what an update does when there’s a clustered index.

So before when we ran this, we had the results, there were no page splits before and after the update query ran, or rather there were no, there were no new page splits after the update query ran. But now when we run this with the clustered index, you will see that there were quite a few page splits that occurred. Now, are page splits good, bad, ugly?

Well, I guess, you know, if you, if you, if you have enough of anything happening in a database, it can get bad. Right. You can certainly produce a lot of extra work if you have a lot of page splits happening.

I guess transaction logging and, you know, IO and stuff, but whether it’s, whether it’s, whether it’s a significant enough problem for you to attempt to address in some way is, is left as an exercise to the reader. So anyway, that is about as much as I can fit into this video about why SQL Server uses forwarded fetches in heaps rather than like doing anything else. And sort of some differences between clustered tables and heap tables.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video where we’re going to talk about a little bit about sub query unnesting and why the optimizer is kind of a goofball about that unless you, unless you do some extra stuff. So anyway, I’m going to just throw this out there and we’re going to go, go do this.

Honestly, we’re just going to wait for this to upload and then record another video. So whatever. I’m going to catch my breath first.

Anyway, goodbye.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.