Table Variable vs Local Variable Cardinality Estimates in SQL Server

Video Summary

In this video, I delve into the fascinating world of local and table variables in SQL Server, particularly focusing on how they behave differently when it comes to cardinality estimation. After exploring these concepts in yesterday’s video, I decided to dig deeper by comparing temp tables with table variables through a series of tests designed to reveal their unique behaviors. By running various queries that include equality and inequality predicates, we observe how the lack of statistics on table variables can lead to unexpected estimates. This exploration not only reinforces the importance of proper indexing but also highlights the subtle differences between local and table variables in cardinality estimation. Stay tuned for tomorrow’s video where I will address some lingering questions about table variable deferred compilation and its implications for query optimization.

Full Transcript

Erik Darling here with Darling Data. If you watched yesterday’s video, I don’t have a great intro for this one, I’ll be honest with you. Yesterday, we asked and answered the question: why do local variables behave the way that they do with cardinality estimation? Hopefully the answer was sufficient for you. But looking at that got me a little curious about table variables and local variables, and I wanted to mess around with them just a little bit more. Some details you just never think to look at in detail. If you look down in my beautiful, wonderful video description, you’ll see all sorts of helpful links for you and me. You can hire me for consulting, buy my training, or become a paid, contributing member of the channel. If you don’t feel like doing any of that stuff, would you be so kind as to ask me an office hours question? I do like getting those. It’s a fantastic mailbag, and some of the questions in there have truly blown my mind. And of course, if you are titillated by this content, please do like, subscribe, and tell a friend or family member. I don’t know.

Something like that. By the time this gets published, I will be home from PASS on Tour Dallas, at least hopefully. All right. Fingers crossed. That means the remaining travel for me will be Utrecht, the Netherlands, October 1st and 2nd, and then the PASS Data Community Summit in Seattle, November 17th to 21st, where Kendra Little and I will be presenting two days of probably the best T-SQL pre-cons you’ll see for the rest of your life. Set the bar impossibly high for anyone else ever. But with that out of the way, let’s address the matters at hand. Now, I think I’ve covered the differences between temp tables and table variables to a sufficient degree. In tomorrow’s video, we’ll talk about table-valued parameters.

Actually, maybe we’ll do that today. Yeah, you know what, we’ll do that today. What I wanted to go over first is how table variable cardinality estimates compare to local variable estimates. So I wrote what I think is a reasonable series of tests, one using a temp table. We all know that temp tables get statistics on them, and because we know that, I want to make sure that I update my temp table statistics with a full scan.
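The temp table half of the setup looks roughly like this (a minimal sketch; the table and column names are my own stand-ins, not the exact demo script):

```sql
/* Temp table: gets a real statistics histogram */
CREATE TABLE #t
(
    id integer NOT NULL,
    INDEX c CLUSTERED (id)
);

/* Load 100 rows with ids 1 through 100 */
INSERT #t (id)
SELECT TOP (100)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.messages AS m;

/* Full scan update so the histogram accurately describes the data */
UPDATE STATISTICS #t WITH FULLSCAN;
```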

And I also want to use a table variable, because I believe at this point I have indelibly proven that even with an index on it, we do not get a statistics histogram for the index or for anything else on a table variable. If I were to try to run a statistics update for a table variable down here, I would get an error. It would just yell at me. I suppose it would be funny if I tried to do it in some error handling code and had it print out something ridiculous on failure, but you’ve got to save something for the paid material, I guess.
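The table variable half is the same idea, minus the statistics (again a sketch with stand-in names; the commented-out statement is the one that errors):

```sql
/* Table variable: even with a clustered index, no statistics histogram */
DECLARE @t table
(
    id integer NOT NULL,
    INDEX c CLUSTERED (id)
);

INSERT @t (id)
SELECT TOP (100)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.messages AS m;

/* Table variables can't be the target of UPDATE STATISTICS -- this just yells at you */
-- UPDATE STATISTICS @t;
```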

But then I’m going to run a series of queries. Series of queries. Why have I never thought about that phrase before? It’s great. Wow. It’s like the other day, someone said the word benchmarking to me and I thought, that’s fantastic. I want to find a way to say that somewhere.

So we’ll just run a couple of full counts from both of the tables. We’ll run a count with an equality predicate. The equality predicate will use a literal value for the table variable, and for the temp table, we’ll use a local variable that we declared up here and set to zero. That’s that thing.

And that pattern is going to repeat for the inequality and range predicates: not equal to, greater than or equal to, less than or equal to, greater than, and of course, less than down here.
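The battery of queries follows this shape (a sketch; @t, #t, and @id are my stand-ins for the demo objects):

```sql
DECLARE @id integer = 0;

/* Full counts from both tables */
SELECT COUNT_BIG(*) FROM @t AS t;
SELECT COUNT_BIG(*) FROM #t AS t;

/* Equality: literal for the table variable, local variable for the temp table */
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id = 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id = @id;

/* The same pattern repeats for the inequality and range predicates */
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id <> 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id <> @id;
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id >= 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id >= @id;
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id <= 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id <= @id;
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id > 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id > @id;
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id < 0;
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.id < @id;
```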

So this will be our first test, to see how the two compare. We’ll run all of this and get all these results back. Won’t life be grand? We’ll have a couple of uninteresting areas.

The uninteresting areas are, of course, the inserts. We don’t care so much about those. Then, once we get down here, we’ll have some vaguely interesting areas. Because I am on SQL Server 2022 using the 160 compatibility level, and I have not disabled the database scoped configuration for table variable deferred compilation, I will get a 100-row estimate from the table variable. But because the table variable has no statistics histogram on it and the temp table does, this is about where things might start to get a little interesting.
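For reference, that database scoped configuration can be toggled like so (real syntax, SQL Server 2019 and up; I’m leaving it on for these tests):

```sql
/* Table variable deferred compilation is on by default at compat level 150+ */
ALTER DATABASE SCOPED CONFIGURATION
SET DEFERRED_COMPILATION_TV = OFF;

/* And back on */
ALTER DATABASE SCOPED CONFIGURATION
SET DEFERRED_COMPILATION_TV = ON;
```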

So let’s zoom in and look at some of the more interesting stuff. Selecting from the table variable with a literal value of zero, saying WHERE id = 0: we get zero rows back, but SQL Server thought that we would get 10 rows back.

This is the stock 10% guess for an unknown equality predicate against the table variable. Cool.

Now, because the statistics histogram on the temp table accurately describes the data in it, we get a one-row estimate from the temp table when we use the local variable. SQL Server is like, hmm, I think we can only get one row here. Now, we could absolutely get a one-row estimate from the table variable as well, if we made the clustered index unique. SQL Server would look at that and say: oh, well, this is a unique index, so I know I’m only going to get one value back.
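A sketch of that unique variant, using the same stand-in names:

```sql
/* With a unique clustered index, an equality predicate gets a
   guaranteed one-row estimate, even without a histogram */
DECLARE @t table
(
    id integer NOT NULL PRIMARY KEY CLUSTERED
);
```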

I don’t know what that value is going to be, but I’m pretty sure that I’ll only get one back. So we could change that and get something a little bit different on the second one, but I’d rather just say that and move on. Now, if we come down to where the inequality predicates start, this is where I should probably start actually clicking on the plans, so SSMS stops resetting where I was before.

If we look at the inequality predicate for the table variable, we get an estimate of 51. What I did prior to recording this was try with larger row counts, because 51 out of a hundred could be a coincidence. If you did a thousand rows, you’d get 510, and if you did a million, you’d get 510,000: always 51 and however many zeros would make 51%. So for the table variable with an inequality predicate, we always get 51% using the default cardinality estimator. Remember, I’m in compatibility level 160.

So I’m using the new cardinality estimator. I say new, but it arrived in 2014, so whatever. I have a test with the legacy one in the next tab over, which we’ll look at. And then for this one, the temp table with the local variable, we get an estimate of 90.

This was the only estimate that really meaningfully changed between the legacy and default cardinality estimators; it will get a slightly different number in the other test. Everything else is just about the same.
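If you’d rather flip estimators per query than per tab, a documented hint does it (the hint name is real; the query itself is my stand-in):

```sql
DECLARE @id integer = 0;

SELECT COUNT_BIG(*)
FROM #t AS t
WHERE t.id <> @id
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```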

And if we scroll down through the range predicates, the greater than or equal to, less than or equal to, and all that other stuff, these will all get 30%. The 30% guess holds for both the table variable, where there’s no histogram, and the temp table with the local variable, where there is a histogram. The local variable stops the optimizer from using the temp table’s histogram, and the lack of statistics on the table variable does the same thing to the literal value.

So it’s kind of a dueling banjos of “ah, but why, why would you do that?” If we scroll down, it’s the same: 30, 30, 30, all the way through.

The next set of tests is where I really had to do some scrounging to make things work the same way. If you remember, in the last one, the temp table got an update statistics with a full scan. For this test, that was not getting me what I wanted to show you.

So I had to create statistics on the temp table, sampling zero rows, and tell SQL Server to never recompute them. A bit of an unlikely scenario in real life, I think. So let’s run this and… we got execution.
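The empty-statistics setup looks like this (a sketch; the statistics name is my stand-in):

```sql
/* A statistics object with an empty histogram:
   zero rows sampled, and never recomputed */
CREATE STATISTICS s
ON #t (id)
WITH SAMPLE 0 ROWS, NORECOMPUTE;
```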

We didn’t get execution plans. Let’s do that one more time. There we go. Execution plans. So, getting down to where things maybe start to get interesting: because table variable deferred compilation is still on, the guesses up here for things like the count are still going to say a hundred.

And I wanted that. I wanted to maintain that, because I want a table cardinality for the table variable to compare to the temp table. So let’s click on that so we don’t completely lose our spot.

And let’s come down here to the first point of interest. Here is where we have the literal value for the table variable, which under the legacy cardinality estimator now guesses 32 rows instead of 10. And when we use the local variable for the temp table with, essentially, no statistics on it, we also get 32 rows.

So a temp table with no statistics acts strangely like a table variable, huh? Weird, right? Bizarre.

It’s almost like I’ve said that before and I’m just repeating myself at this juncture. And then we get to the inequality predicates. Remember I said this is where it was different with the legacy cardinality estimator? With the default one it was 90, and here it’s 68.

This was the same for either the table variable with the literal value or the temp table with the local variable. So that was a little weird, right? Going from 90 to 68, just by changing the cardinality estimation model.

I don’t know that that’s a meaningfully different guess, and I’m not sure why it changed, but here we are. For the rest of these, all the range predicates, the greater than or equal to, less than or equal to, yada yada, these all get 30. Everything is 30 from there on out.

It doesn’t matter whether it’s the legacy or default cardinality estimator; they all get a guess of 30% of the table. Three zero, 30 out of a hundred, 30%. And again, when I cranked the row counts up to a thousand, or ten thousand, or a million, it was always just 30%.

Now, of course, this leads a little bit to the question: is there a difference between table variables and table-valued parameters?

And the answer is, of course, yes. But before we get to that, here’s something that I wondered about a little bit. And maybe you would also wonder this, because you’re a naturally curious person filled with wonder for the world around you: what would it look like if we mixed the local variable with the table variable?

So I did that. This test is all table variable and local variable. And the answer is that because the table variable has no statistics on it anyway, the local variable really doesn’t do anything interesting.

We get all the same guesses that we would have gotten between the two: for the equality predicate, it’s 10%; for the inequality predicate, it’s 90%; for everything else, it’s 30%.
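That mixed test looks roughly like this (stand-in names again; the percentage comments are the guesses observed in the video, not documented constants):

```sql
DECLARE @id integer = 0;

/* Local variable against the table variable: there's no histogram to use anyway */
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id =  @id; /* 10% guess */
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id <> @id; /* 90% guess */
SELECT COUNT_BIG(*) FROM @t AS t WHERE t.id >= @id; /* 30% guess */
```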

Okay. Well, great. All right. Wonderful. We’ve once again found a new way to say that table variables don’t get statistics on them. But what about table-valued parameters?

Now, table-valued parameters used to be a way to get table variables to behave more like parameters, right? Because with the table variable deferred compilation intelligent query processing feature, table variables act a bit like parameters anyway. But we’ll talk a little more about that tomorrow.

So, with table variables: if you’re on 2019 or later and you’re getting table variable deferred compilation, the first time your stored procedure compiles, you’ll get the actual table cardinality for the table variable, and further executions will reuse that. Okay.

This generally works by deferring compilation of the execution plans for queries that access the table variable until after the table variable has been populated. Now, when I wrote that sentence, I had some questions, and tomorrow’s video will answer those questions.

So here is the table type that I created to have a table-valued parameter: OwnerUserIds. We have an OwnerUserId column in here with a clustered index on it. We’re not going to make this one unique, even though we could; it wouldn’t really do much for us. I’ve already pre-run this query, because if I hadn’t, you and I would be sitting around for a while.
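A sketch of the type definition and the variable it gets assigned to (my reconstruction of what’s described, with a stand-in value for the single row):

```sql
/* Table type backing the table-valued parameter: clustered, but not unique */
CREATE TYPE dbo.OwnerUserIds AS TABLE
(
    OwnerUserId integer NOT NULL,
    INDEX c CLUSTERED (OwnerUserId)
);
GO

/* Declaring a table variable of the type, and loading one row */
DECLARE @OwnerUserIds dbo.OwnerUserIds;

INSERT @OwnerUserIds (OwnerUserId) VALUES (1);
```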

Now, I only put one row into our table-valued parameter. Or rather, you know, our table variable that is assigned this table type, so it’s a table-valued parameter magically for us.

And the execution plan for this does not really reflect a situation where cardinality estimation was done statistically. You might notice here that we get a one-row estimate coming from the clustered index scan of the table-valued parameter. And if I duck down a little bit here, this is going to show off the fact that I do, in fact, do squats in real life.

SQL Server estimated 12 rows would come out of the Posts table, but we get almost 28,000 rows. And then when we come over here, SQL Server is like, well, no, that’s still wrong. And this is obviously still wrong.

And then when we get over to Comments, well, we are just very, very wrong. SQL Server guessed 319, but really it’s a much larger number. And so this whole thing runs for two minutes and 20 seconds.

So, I don’t know. I thought that comparing the temp table and the table variable with the local variable, and looking at the cardinality estimation, would yield some really interesting stuff.

But all I really seem to have done is further prove that the lack of statistics on table variables, and by extension table-valued parameters, can really screw things up. All right. Well, hopefully tomorrow’s video will give us a little bit more to think about.

A little bit more to chew on in our lives. I’m going to talk a little bit about just how far table variable deferred compilation goes in a stored procedure. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.