Learn T-SQL With Erik: Filtered Indexes

Chapters

00:00:00 – Introduction
00:01:32 – Free SQL Monitoring
00:03:02 – Upcoming Events
00:05:03 – Filtered Indexes Overview
00:07:13 – Filtered Index Limitations
00:08:43 – Query Plan Analysis
00:11:04 – Variable Assignment Issues
00:12:32 – Temp Table Workaround

Full Transcript

Erik Darling here with Darling Data and in today’s video we are going to continue our T-SQL learning adventure where I show you little snippets, tidbits, whimsical little chunks of the full Learn T-SQL with Erik course material. The full course is of course available at my site training.erikdarling.com and there’s a link down in the video description for you, you special son of a gun you, to get $100 off the price of the course. Also down in the video description, all sorts of other helpful links for you to engage in consensual monetary exchanges with me.

You can hire me for consulting. You can buy Learn T-SQL with Erik or any of my other courses. You can become a subscribing, what do you call it, member of the channel.

And then if you want, now taking place every Tuesday, you can ask me office hours questions for free. And of course if you like this content, but perhaps not enough to give me any money for doing any of these songs and dances for you. I understand, you know, can’t all be daddy, rich daddy warbuckses.

You can always like, subscribe. And tell a friend. That is a fantastic way to make this channel grow and expand to something bigger and better and more beautiful than it was before.

If you are a real cheapskate, and I’m just kidding, I understand. Frugality is an important thing. I think Benjamin Franklin said some stuff about that.

I don’t know what any of it is though, because I dropped out of high school. But what I did do pretty well at was building a free SQL. Server monitoring tool.

All of the stuff that I care about monitoring for performance in a SQL Server. Totally free, totally open source. No weird email, phone home, telemetry, any of that stuff. You can just download it.

Run an executable. And start getting great performance feedback about your SQL servers. And then if you want to talk to, or have your robot friends talk to the performance data. And just the performance data.

And not go in there and start dropping tables and indexes. And writing wacky queries. And figuring their life out on your time. I have optional, opt-in, built-in MCP servers. So that you can start having your robots at least tell you what’s going on in the performance data.

I don’t know if I take their advice for fixing things. Maybe, maybe not. Depends, but you know.

What can I tell you? You’re not going to pay me to do it if that’s all you care about. Getting out into the world. Boy, I love going places. Doing things.

Seeing people. You know, amongst my adoring fans. Hopefully not getting shot. I’ll be in Jacksonville, Florida. May 1st and 2nd for Day of Data Jacksonville.

I’ve got an advanced T-SQL pre-con there. As you might be able to tell by the URL. So buy those tickets. Because they’re flying out the door fast.

And I would hate to see you lose your seat. Lose your seat to an alligator or a snake or a bottle of blue Gatorade or something. So, you know. Don’t, don’t, don’t wait.

Benjamin Franklin probably said something about waiting too. Hesitating or something. I don’t know.

Again, I’m unclear on a lot of things, historical. But then, after that. I will be in Chicago, Illinois for Pass On Tour. Again, advanced T-SQL pre-con. Buy your tickets.

I will be at SQL Day in Poland. 11th through 13th. Again, advanced T-SQL pre-con. Buy your tickets. And then, I don’t know. I come home. I take a shower.

I take a little nap. I do a whole crap load of work and don’t sleep for a month. And then I am back in Croatia for Data Saturday Croatia. Wow, a lot of Data Saturdays out there.

I’m still unclear on the difference between SQL Saturday and Data Saturday. But people, people like to fight about that. I don’t get involved with those things though. Just like Benjamin Franklin said not to do. So, I will be there.

Again, advanced T-SQL pre-con. Buy your tickets. And then I will be in Past Data Summit Seattle. Past Data Seattle Summit. In Seattle, Washington.

Not the other Seattle. In Zurbistan. November 9th through 11th. So, you know. That will be nice too.

I’m going to stay in a hotel far away from everyone else. But it is still April. What can I say? I’m getting ahead of myself.

All that talk about May and November and June and whatnot. We are still Apriling here. So, let’s April ourselves. And let’s talk a little bit about filtered indexes.

Now, you just can’t talk about indexing in SQL Server really without talking about filtered indexes. They are a very, very important thing. Conceptually, they are just not that hard to figure out.

It’s an index with a where clause. It only indexes some of the data. It qualifies for the where clause. I don’t know. Like the benefits of that just seem rather apparent to me.

But filtered indexes, of course, function best when they meaningfully reduce the total number of index rows compared to unfiltered indexes. Or when they function to improve statistical information about a segment of data that otherwise might get lost in the shuffle a little bit during the normal course of building a 200-step histogram. Filtered statistics can, of course, do the same thing without the extra index to maintain.

So, those are always, always on the table. But, you know, also having a smaller data source to read from is typically considered an advantage to most people. Especially people who are unnaturally obsessed with logical reads.

Curse you all. There are some rather well-known oddities and limitations with them. And perhaps some less known ones as well.

But, you know, like stuff you can’t do in filtered indexes. You can’t use an OR clause or any other sort of disjunction. You can’t use NOT IN. They can’t be used as a candidate foreign key index.

You can’t create them on index views, which I always found stupid. They can’t reference other tables, which I guess makes sense, right? How would that even look?

That syntax would be weird. You can’t filter on a computed column. Now, you can have a computed column in, like, the key of the index or the includes of a filtered index. But you cannot have it in the filter definition of the filtered index.

Which, again, always found that rather annoying. But, I guess, you know, whatever. They don’t play well with local variables and or formal parameters. And, by extension, they don’t play well with the forced parameterization database setting enabled.

That’s this one right here. You can’t use the IGNORE DUKE KEY creation option with them. And Merge has several issues with filter indexes that Microsoft insists are by design.

And not just out of pure laziness. You also, if you are going to use filtered indexes and you don’t want your data modifications to fail. You need to pay very careful attention to the ANSI set options that your client uses while connecting to SQL.

SSMS just happens to satisfy all these. But many ORMs and other client drivers do not subscribe to this set of core beliefs. So, you might just have a bad time there.

But, anyway, let’s create this index. Oh, I’m not even connected to SQL Server. Isn’t that wild? It gets dusty here sometimes.

And you would think that a filtered index like this would be very, very easy for SQL Server to use. But, alas, if we run this query. And we say, SQL Server, give me a count of comments where the user ID is null.

SQL Server says, you know what? Well, use a different index. Not our filtered index.

That’s not confusing. Not confusing at all. But, hey, anyway. The joys of overlapping demos. You can see that this index up here is called post ID filtered. And this is not.

So, it did not use our filtered index. It said, I got bigger ideas. But if you force SQL Server to use the filtered index, you start to get a sense of why it may have not used that index. Because SQL Server is like, oh, well, you know what?

I’m just not sure. I’m just not really sure what’s in there. So, we scan the entire filtered index. Right?

And then we do a key lookup. And why on earth would we do a key lookup here? Why on earth would we be doing this? Well, SQL Server applies this predicate in the key lookup. And it says, well, I got to go back to the clustered index and figure out if user ID is null.

Even though I feel like we were fairly explicit in our filtered index definition. This will only be indexed where user ID is null. But alas, alas, we do not have that.

So, another kind of funny thing you can do is run this query. Right? And, oh, well, we use our filtered index now because the cardinality was low enough.

Right? So, now SQL Server naturally was like, oh, I’ll do that. But it still has to go back to the clustered index and figure out if user ID is null.

So, the only stuff that could possibly be in here is where that is null. I don’t have a great explanation for that one. However, if we change our index definition.

And this is one thing that is in every single index analysis script that I’ve worked on. I’m not claiming credit for saying that I put it in there. Especially with SP Blitz Index.

But SP Index Cleanup also checks around this stuff. But… If you have a filtered index and the filter definition columns are not at least in the includes, they should yell at you about that. So, if we change our index and we say, you know what, we’re going to include user ID in there.

And all of a sudden SQL Server is like, oh, yeah, well, by the way, now that I can figure that all out in one index, I don’t really have to worry about it. Now, there’s fairly well documented stuff that I said before about filtered indexes. It’s not playing well with local variables or parameters or anything like that.

And there’s all sorts of ways to get around that, right? So, like, we create this index on the users table where reputation is greater than or equal to 100,000. And we run this query and we say, SQL Server, go use that index.

And SQL Server, well, it doesn’t use that index, right? Because it uses an index called reputation. But it didn’t use an index called reputation filtered.

So, that’s kind of a bummer. But if we throw option recompile on there, then SQL Server all of a sudden is like, oh, well, it’s not, look at that. It still didn’t use our index.

Well, you know what? Again, the joys of overlapping demos. There is an exception to this, though. And that is if you are assigning a variable, a value to a variable like this, even with an option recompile hint, SQL Server can’t use the filtered index.

And the reason why it can’t use it is because assigning a value to a variable like this disables the parameter embedding optimization. So, be careful with that. If you find yourself with a filter in this situation, this very situation, at home or at work or wherever you run your queries from, what you can do is stick that result into a temp table and then assign the variable from the temp table if using the filtered index.

The filtered index is important to you because the recompile hint won’t fix that. Another neat thing you can do if you want to play all sorts of funny tricks on SQL Server and you want to use a local variable or a parameter is you can hard code some stuff. So, remember our filtered index up here is on reputation being greater than or equal to 100,000.

You separate those zeros, it’s 100,000. And you can always write your query in a very funny way with some literal values mixed in. So, this first query.

It’s going to say where reputation is greater than or equal to reputation. And the reputation local variable is less than this hard coded 100,000 value. And then you can also say another union all because you need to be able to find both outcomes of this, right?

And you can say where reputation is greater than or equal to reputation. And reputation is greater than or equal to 100,000. And reputation, the local variable, is greater than or equal to 100,000.

So, you can get a very interesting query plan out of this with all sorts of startup predicates, right? These startup filters right here. Normally, I would see filters in a query plan and say, curse your eyes.

I hate you. But these ones are kind of interesting because they have startup expression predicates on them. Meaning that rows don’t pass any of this.

Rows don’t go here unless they pass this filter, right? So, you can see there’s zero action going on in this one. That is a zero and that is a zero. But down here, right?

This is where we’re saying, ah, is it greater than or equal to 100,000? And SQL Server says, yes. And it goes and gets one row from our filtered index down here. So, that is kind of a neat way to work around some of the limitations with filtered indexes.

I do apologize for the weird demo thing up there. Again, overlapping demos. I guess I neglected to drop indexes before starting off on this adventure.

But I promise. In the actual course material, I was way more prepared. So, I’ve got that going for me. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you not tomorrow, probably. But I will see you on Tuesday for office hours. All right.

I’ve got a four-day weekend. All right. Go me. Everything is coming up, Eric, for once. All right. Thank you for watching.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.