Indexing SQL Server Queries For Performance: Computed Columns

Indexing SQL Server Queries For Performance: Computed Columns



Thanks for watching!

Video Summary

In this video, I delve into the world of computed columns in SQL Server and how they can be used to enhance query performance through indexing. After recently completing a series on writing efficient SQL queries and another on indexing for performance, I decided to bring these concepts together to explore computed columns in detail. I discuss common pitfalls when using computed columns, such as concatenating year and month fields without proper data type handling, and how converting the result to an integer can provide more predictable behavior. Additionally, I explain the importance of making computed columns deterministic for indexing purposes and demonstrate how to create indexes on these columns effectively. By sharing this knowledge, I hope you gain insights into optimizing your SQL Server queries and improving their performance through strategic use of computed columns.

Full Transcript

Erik Darling here, Darling Data, and you may notice that I look a little bit different than I did in my last video because I got a haircut. I don’t know exactly how I feel about the haircut at the moment. It’s not precisely what I requested, but we’re going to see what happens when this thing grows in. I think I need to sharpen myself up a little bit. Oh, that’s blurrier. There we go. I think that’s maybe, that’s a little bit better there. I look like I’m going to see what happens when this thing grows in. I think I need to sharpen myself up a little bit. Oh, that’s blurrier. There we go. I think that’s maybe, that’s a little bit better there. I think that’s like less of a weird roll on deodorant person with that properly adjusted there. So in this video, we are going to talk about sort of an intersection of concepts. I have, I’ve recently, most recently been recording a series of videos about indexing SQL Server queries for performance. And before this, I did a series on how to write SQL Server queries correctly. And I have another sort of ongoing, ever developing playlist called a little about, in which I go into some detail, a little detail about various concepts in SQL Server. And this video kind of meets, makes all of these things meet.

It’s a much happier thruple than the one that I described between like index views, computed columns, and filtered indexes. This one is just about computed columns. So I’m going to call this video, indexing for performance and writing computed columns query, computed column queries correctly. And then it wouldn’t fit on the slide, but in parentheses, a little about.

All right. So before we talk about this subject over here behind me, let’s talk a little bit about how you can subject me to $4 a month by becoming a member, the supporting monetary, monetary contributing member of the Darling Data family. Become one of my data darlings by clicking on the link in the video description right there. And that will take you to the Become a Member page where you can plug in whatever payment form you’ve currently, I don’t know, whatever you stole out of your mom’s wallet or something.

If you don’t have a mom or a wallet, you can like, you can comment, you can subscribe, and you can help me grow this channel in other ways, big and small, which I appreciate. I appreciate the big and the small. If you need help with SQL Server because you are running into problems with the type of stuff that I talk about in these videos, well, golly and gosh, are you in luck.

Look, I’m a consultant, and in exchange for money, I will solve performance problems on your SQL Server. Crazy, isn’t it? Any one of these things and more.

And as always, my rates, they are reasonable. And again, just because I just exude reasonableness from my being, from all of my pores, through all of my glands, just out every single hole, you can get all of my training for a very reasonable price, about $150. If you go to that URL up yonder and enter in that discount code right next to me there, you can get money off of that, and then you can just have it forever.

And, you know, unlike a haircut, my training will never go out of style, not even moments after you walk out of a barbershop. Upcoming events, of course, we have the one, oh, man, my favorite, my favorite, because I don’t have to go very far, SQL Saturday, New York City, 2025, coming May 10th. Boy, it’s going to knock your socks off.

Might even actually have to give you a pair of socks to knock off, because we’ll knock off both pairs of socks. This one, I have a feeling. But with all that out of the way, let’s party, I’m not sure I sold that one.

But anyway, let’s talk about computed columns here. Now, this does stem a little bit from a video that I made recently about how to search better for, like, dates and years in queries, where, you know, like, I talked about how you can use creation date. And the original query that I had written, because, like, I’ll be honest with you, that’s code that I had sort of, like, had sitting around sort of adopted from, like, something that I worked on a long time ago, where, you know, SQL Server 2012 wasn’t quite popular then.

And so, like, the end of month function wasn’t, like, readily available across every place where it would have been used. So, the end of month function, honestly, it just skipped my mind. So, thank you to COBOL7 for reminding me about the EOMonth function.

So, originally, in my query, I had a bunch of complicated crap to figure out when the, like, leap years and the end of the month and stuff. You can skip over all that if you just use the EOMonth function, like COBOL7 suggested there. Now, the original query that we started with was this one, where year column equals 2013 and month column equals 1.

And absolutely, sure, if you wanted to create computed columns and maintain an additional index on those computed columns, you could do something like this. And adding these computed columns in without the persisted keyword would add them in instantly. There would be no locking, no nothing on there.

They would just act as virtual columns in the table. And then, if you wanted to create a nonclustered index on both of those columns, even not persisted, you could do that very easily with an index that looks like this. Cool. Great.

What a lot of people end up trying to do, though, is trying to make sort of a calculated field from both of these things. This is where you can run into trouble. This is going to be the first bit of trouble that I’m going to show you.

This is where you can run into a column where if you were to try to directly concatenate things together. So, I have a few columns in the select list here. One where I’m just showing you the year and month that comes out of the creation date.

One where I’m showing you what happens when you just try to concatenate those things together. That’s a terrible arrow, but we’re just going to deal with it. And then one that kind of has an unexpected side effect when just left to its own devices.

And then one that is probably what you’re expecting, but takes a little bit more work by converting the result to an integer. So, if we look at the results of this thing, you’ll notice that I have the year and the month, which comes back as 2008 and 7. But if I try to just concatenate those together, like with this line here, the result is just going to say 2015.

Because what comes out of year and what comes out of month are both integers. And so, SQL Server says, oh, you just wanted to add those together. 2015.

Congratulations, dummy. Okay. Fair enough. Integers, you know, that’s what you do with the plus sign. You add one to the other. So, this isn’t what we want here at all. We could just do this.

Oops. We could just do this. And let me scroll down a little bit in here, actually. We could just do this and say, like, rtrim year plus right double zero plus rtrim month two so that we get the year portion and then a zero padded month portion here. And that could be fine.

You know, I’m okay with that. As long as, you know, the data type that comes out of that is what you’re okay with. And any queries that search on that use the correct data type thing.

But we can also convert that to an integer. And really, the only limitation on this is that the leading integer can’t be zero padded, right? So, like, if you’ve ever worked with zip codes, you’ll know that storing zip codes is an integer.

In America, where they’re all numbers, is a mistake because some of them start with zero. Yeah. And you can’t have an integer start with zero.

So, this, I think, would be pretty safe because we’re a long way off from any years started with zero. Where, you know, probably, I can’t imagine when we would have another year that starts with zero again. But, anyway, the 07 would be safe because, you know, months start with seven.

And most people, you know, I don’t know if they, I don’t know, whatever, really whatever search criteria works best for you, just go with it. Most people expect the zero padded month, but if you don’t, whatever. I’m not going to sit here and argue with you.

Where things get a little surprising for some is if you say, if you, if we were to run this store procedure, sp.describe.first.result.set, which is very handy for figuring out what data types are coming out of queries that you write or anything like that. So, you can figure out, like, oh, do I need to fix this somewhere?

Is that the one where we just have this as a string comes out as a varchar14. So, we do have to be a little bit careful there. The SQL Server does not, like, change that back to an integer or anything else unless we explicitly tell it to.

So, you know, we could do that. But what makes it tricky with the computed column stuff is that we would have to add the computed column like this. Right?

Convert integer, whatever. So, let’s do that. And let’s create two indexes on the votes table. One, we’re going to create on the year month column that we just added. And the other one, we’re just going to create straight on the creation date column that we just added.

The thing with SQL Server, and this goes for standard edition and enterprise edition, is that unless you write your query to explicitly reference the computed column that you created, you might be relying on something called expression matching for the optimizer to choose your computed column to use with the query that you write.

So, the sort of unfortunate thing is that if we were to just write our query like this, and we were to say, hey, give me the year and give me the month, like 2013, I mean, the old one is sort of unnecessary, but if we were to just write this query, the best that we would get from this currently is just a scan of the index we created on creation date.

This is not the index. This is V1. This is the index on creation date right here. This is not on the computed column that we created, or rather on the index that we created on the computed column.

We would have to write our query to very explicitly match the definition of the computed column, and have, say, where that equals 2013-01 in order to referent, in order for a SQL Server to pick up on the computed column and use it automatically, right?

Which, and I say automatically, but this is far from automatic. This is you taking a pretty big step towards, you know, creating the, rather, rewriting, like creating the computed column, rewriting queries to use it.

But at this point, like if you’re going to do all, like if your query is doing all this stuff, fine, already, fine. But if it’s, you know, if you have to rewrite it to do this, you’re better off just rewriting it to say where year month equals something.

Like, don’t go through all this trouble. Now, one place where adding computed columns can get tricky is, let’s say that we wanted to do this.

And I’m going to change this a little bit first, because I just, I want you to see where this can go wrong. So let’s put this on a new line.

And let’s put this on a new line. And let’s put this on a new line. So what I’m going to do is take out this and this to begin with.

So now we’re just going to have add to the post table, last edit date as is null, 1900, 0101, whatever. And if we do this and we try to create an index on the column like this, SQL Server is going to say it cannot be used in an index or statistics because it is non-deterministic.

And the reason why it’s non-deterministic is because we, SQL Server isn’t quite sure what to do with this. All right. We’re not telling it what to do. So it doesn’t really know.

I’m going to drop those computed columns. Actually, I’m not going to be able to drop the one on votes, am I? Because it’s dependent on that index. So we’ll just do that later. I’ll take care, fix it in post. So, but I got rid of that one.

Now let’s put these back in. All right. Let’s say convert date time, 1900, 0101. And we’re going to use the 121 culture with the convert there. And if we add that in with that in the right place, now we’ll be able to create an index on that computed column.

And the nice thing is that we don’t have to put the convert part of the computed column into any queries that we expect to use it. If we just say, hey, select count from posts where is null, last edit date, whatever is greater than this, then SQL Server will have no problem using and seeking into the computed column plus index that we created on it to support this query.

So the good news is that there is some finagling and work that you have to do to get, especially whenever you’re doing anything that involves an expression like this, you do need to really help the optimizer understand what to expect out of that column and make sure that, you know, you know, it’s like things are being converted to the proper, you know, date and all that other good stuff.

So it is possible to do, and it is possible to get expression, expression matching to happen without adding any additional logic into queries that you expect to use things in this case. But you do need to have the, you do need to make computed columns deterministic in order to index them. And if you don’t do that, SQL Server is going to say no index for you.

So there’s that. Anyway, just a little bit about indexing computed columns for performance and writing computed column queries correctly. I hope you enjoyed yourselves.

I hope you learned something and all that other good stuff. This is going to be my last recorded video. I mean, you’re not going to be able to tell because I have video scheduled out for about three or so weeks at this point. But this is, for me, exciting because this is the last video I’m recording for a bit.

I’m going to take some recording time off for the upcoming Christmas and New Year’s holidays. So I’m going to, I’m going to just shut this whole mess down and, and think about other things for a little bit. But when I come back from break, we’re going to, we’re going to talk, we, I have, I have a whole slate of things to talk about.

So, and hopefully there, there, there, there will be some, some news on SQL Server V next that I’ll be able to talk about as well. So that’s, that’s what’s upcoming. But in the meantime, I do wish everyone happy holidays, whatever you’re out there celebrating.

You know, we’re, we’re, we’re a Christmas bunch. So Merry Christmas. If you’re, if you’re feeling Christmassy too. Happy New Year.

Even if the new year you’re celebrating is a little bit after the new year that I’m celebrating. Again, it’s, it’s cool with me. Celebrate, celebrate your new years whenever it makes sense to you. You know, it’s, it’s fine.

It’s just more, more, more, more days on the calendar where it’s okay to drink, which means it’s okay with me too. But anyway, I will see you when I return from my, my, my bon voyage. And until then, you know, keep indexing SQL Server queries for performance and writing queries the correct way in SQL Server.

It’s still, you know, it’s a nice thing to do. Anyway, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.