Indexing SQL Server Queries For Performance: Indexed Views And Non-SARGable Predicates
Thanks for watching!
Video Summary
In this video, I delve into the world of index views and how they can be used to address non-sargable predicates in SQL Server queries. We explore two different approaches using a single query as our test case, aiming to optimize performance by creating an index view that helps SQL Server process the query more efficiently. Along the way, we also discuss some quirks of query plans and the importance of considering the broader impact on other queries when introducing new indexes or index views into your database workload. If you’re interested in supporting my work, consider signing up for a membership starting at just $4 a month—every little bit helps! And remember, if you can’t afford that, sharing this video, liking it, and subscribing are all free ways to help the channel grow.
Full Transcript
Erik Darling here with Darling Data. Hooray! We did it! We survived another day. All together. All of us. Didn’t we now? Everyone except Intel, who still… Still can’t find those drivers. Don’t like it. Don’t like it one bit. In today’s video, we’re going to be talking about how you, can use index views, we’ve got some more index view stuff to talk about, I guess, to fix non-sargable predicates. Now, we’re going to talk about this in two different ways, but with one query. So we’re going to kill two birds with one query. Might be a record. You know, Randy Johnson gets to kill birds with baseballs, I get to kill birds with queries. That’s not fair, is it? Only I were six foot nine. Anyway, let’s do that. Before we get into all that good stuff, let’s talk about how you can make me nowhere near as rich as Randy Johnson. If you would like to support the work that I do on this channel, you can sign up for a membership for as little as $4 a month. Assuming that maybe you are also not as wealthy as Randy Johnson, but if you have an extra four bucks a month, you can do that.
If not, liking, commenting, subscribing, all free things you can do to help this channel grow. And maybe, maybe if you can’t afford the four bucks, maybe you doing this stuff can, can fool the algorithm into appealing to people who do have four bucks a month. Disposable income. It’s all sorts of ways you can help. If you would like assistance with your SQL Server, I have much more reasonable rates than Randy Johnson. And I can do all of this stuff. And, you know, aside from beer gut magazine making me exclude New Zealand from, from, from this sales pitch, I am the best SQL Server consultant in the world outside of New Zealand. If you would like probably the best SQL Server training on the internet, you can get all of mine for 150 US dollars. That is a fantastic deal. Especially now that probably all those other Black Friday sales are over and, you know, jacked up to thousands and thousands of dollars a year to subscribe. For me, you get everything all at once. It’s wonderful.
No upcoming events until later this year. That’s okay with me. With that out of the way, let’s talk about these things. So I’m in the Stack Overflow database, I hope. And I’ve created these three indexes, I also hope. Yes, look, we got three errors, which means I did everything right. Now, I’ve created these indexes in an attempt to assist this query in being faster. The thing is, there are, there are two things in this query that are going to, um, uh, aid and abet its slowing down. Aid and abet its slowing down, not abed.
Aid and abed, uh, I mean, you can abet a query, I guess, but, uh, a little bit, might be a little sad. But, uh, there are two things in here making life tough on this query. We have this, we have wrapped the score column in this coalesce because a lot of people make this foolish mistake, thinking that, uh, SQL Server can’t deal with nulls correctly. Uh, if, if you have a null in an integer column and you want to know where that integer column is greater than zero, nulls aren’t going to mess SQL Server up. So you don’t need to do this. Uh, and the other thing that’s going to be tough here is that we have a rather restrictive predicate on this thing. But, um, the thing is that like the SQL Server can’t calculate this until the query has run.
Right. So when we say having some is no, blah, blah, blah, blah, blah, blah, SQL Server has to calculate this thing, right? And it has to figure out what this is by joining posts to comments to votes. And, and, and then it can calculate this aggregate and then we can filter on this aggregate here. We cannot put this in the where clause here, right? We can’t because it doesn’t get calculated until the, until the select happens and projects this stuff out. And, and then, and then, then, and only then can we think about things in the having, but what, and I think what’s, what’s funny, what’s funny here too, is that, um, and I don’t know if this is a SQL Server quirk or if this is, um, or if this is, uh, a pure, like purely something for logical processing that would happen in every database.
I know that recently Oracle made it so that you can reference column aliases in the group by, uh, so we could say group by total score here. We can say order by total score in SQL Server. But if I tried to put total score in the having clause, uh, SQL, like, um, SQL Server will say, I don’t know what that is.
So we ha we do have to repeat this expression down in here, which is, which it’s like, just as a query writer, this is annoying. Like, like Microsoft, like do me a solid help, help, help a little in these things. You know, when people complain about Microsoft, you know, SQL Server and T SQL being like dusty and antiquated dinosaur languages, this is the kind of crap that like is just annoying to people who write queries.
So Microsoft could do more in this regard, but, um, we run this query and we, we have query plans turned on. So we don’t have to, we don’t have to be worried about running it twice. Uh, this will take about 10 seconds and we’ll, we’ll look at the plan briefly because quite frankly, the plan isn’t terribly interesting.
Uh, but here’s what the plan does. We scan the post table that only takes 900 milliseconds. We scan the comments table that takes a second and a half.
We scan, well, we don’t scan the table. We scan the nonclustered index and we scan the boats table. That takes 1.6. Ah, that takes 1.6 seconds, 1.4 seconds. Uh, and then like, like I said, the query goes on to do other things.
Uh, six seconds, nine seconds. Uh, I think maybe this, no, that’s still row mode. Um, so this is this, if, if you want to talk about other stuff that makes query plans real confusing, we could start here.
All right. And let’s, let’s look at this. So, uh, everything for operator times looks good enough going across to here.
All right. Uh, except, I mean, this doesn’t make any sense because 1.4 seconds and then 600 milliseconds. So that, that’s not great.
And then 1.6 seconds and then 1.1 seconds. But like I’ve said, I, I, I usually disregard the, the, the parallelism operators when I’m looking at query times because like the time’s often not going to make a lot of sense. Unless you go from like a really small number here to a really big number here, then, then I might pay attention to it.
But going from a big number to a small number, I’m just like, you’re on crack. Like this is, this is not working out for you. You should, you should think about something else, but let’s focus in over here.
Just this last portion. We have six seconds. And just to make sure, cause batch, like, like you, you might see some operators show up sometimes in batch mode and sometimes in, and in for batch mode operations, uh, SQL, SQL Server does this funny thing where, uh, it, like batch mode operators only track their execution time in wall clock time, not, not in CPU and wall clock time.
But row mode ones, it’s supposed to be cumulative going from right to left. So these numbers should add up, like, like just get higher across the board. So like that does happen here from nine to 1.1 to 1.2 to 2.8 to six.
So this, this top section is fine. This section’s a little cracky, right? Going from 1.4 down to one point, down to 599. And this section is a little cracky going from 1.6 to 1.1.
But, but like we can, we can at least sort of like add those numbers up and get round about six here. Where, where things get real good and cracky is, uh, we go from six to nine to 5.9 to 6.6, 6.6 again to 13.8 to 10.2. I realize that it’s winter time, January-ish.
But if, if Microsoft were, were perhaps looking for, uh, issues to work on in a few months when the next batch of summer interns come in to work on SQL Server, this, this might be a good portion of the code to address. Because this doesn’t make any sense. Right?
Especially, uh, this last repartition streams. Like, if you look down under this, this armpit, if I put this, this arm up a little bit, and let’s just zoom in way down here. Uh, this thing ran for 10 seconds.
Right? We, we can see the 10 right in here. And I had to draw a square on my own because this hand is what hits the button to make it draw rectangles. So we have 10 seconds over here.
But this gather streams might make it look like this query might have taken a sneaky, like 14 seconds. But then if we go and we look at the properties and we say, look at query time stats, we can see that like this agrees with the final operator in the plan. So things got real weird in, in here.
Like, uh, no arguments from me. Uh, so be careful when you’re reading these query plans. But anyway, this thing takes 10 seconds. And part of, you know, why this thing takes 10 seconds is we have this expression here.
Right? Expression 1 0 0 3, which matches up with where we’re looking for scores in the query down here. So like I said, SQL Server has to calculate this whole thing first.
Sometimes you’ll see filter expressions for having clauses. Sometimes you’ll see them if you’re filtering on like a windowing function, like row number, rank, dense rank, blah, blah, blah. Um, other times you’ll see filters if you do goofy things like left join where some column is null.
Uh, other times if you write really complicated predicates, SQL Server won’t be able to push those predicates down to when you touch tables. But like, this is like a legitimate need for having to calculate the whole result and then filter out what we don’t want. So let’s create an index view.
That takes care of our problems there. Right? We’re going to create this. And shockingly, creating this index is going to take just about the same amount of time that running that query took. That might be something you want to keep in mind if you’re going to create an index view.
Because creating the end, the query plan for creating the index view is going to look shockingly like the query plan for running the query. Right? Because we have to do all this stuff.
So you have to be real, real good and mindful when you’re, when you’re creating index views that, um, the query plans that create the query, the query plan to create the index view was pretty good. Because like we talked about in the last video, modifications are going to start getting painful. Right?
When we have to maintain that index view, modification queries that used to be real fast might get real slow. So you have to pay good attention so that when you add index views to your workload, the, the query that creates the, that maintains the index view is not awful because your modification queries are going to be awful. So that’s on you though.
This is the thing you have to know and understand. This is not the fault of the index view. This is the fault of you. You have to pay attention to this stuff. You are the query tuner. When you introduce things into the database to make queries faster, you have to make reason, take reasonable care and caution that they do not make other queries much slower.
Okay. That’s, that’s a you. So now when we run this query though, this query will be very fast.
Right? If we look at the execution plan for this, now hitting that index view and doing all that stuff takes 73 milliseconds. And of course, SQL Server chimes in and says, Hey, I think I can reduce the pain of running this by 50%.
Do we need this to be 50% faster? We just went from 10 seconds to 73 milliseconds. Now SQL Server chimes in and is like, Oh, I’ve got an idea.
Stick with me. Another index. Eh? Eh? Eh? Eh? Come on. So, uh, coming back to something I’ve said a lot. If, if you’re the type of nitwit who spends all of their time looking at missing index requests and just adding those in, uh, God should smite you.
That’s, that’s the end of it. All right. Cool. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Uh, I hope you had some, I hope you had some, some real fun in this one.
Because there was, there were a few fun moments, I think. There were some, there were some interesting things in here that, uh, uh, make, make my day to day life just a joy when working with SQL Server, just a real joy. Um, and, uh, let’s, let’s, the next video we’re going to talk about is, uh, uh, well, clearly, uh, you can see this is number nine.
Number 10 is going to be about, uh, using indexes to, to fix sorts and query plans. And we’re going to, we’re going to have some fun stuff to talk about there too. So, uh, thank you for watching and, and, and goodbye for now.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.