Trivially Yours
Video Summary
In this video, I dive into the world of trivial plans in SQL Server performance tuning, explaining why they matter and how they can impact query optimization. Starting off with an amuse-bouche-sized introduction to trivial plans, I explore their significance by discussing simple queries that don’t require cost-based decisions, such as selecting top 100 rows from a table without any indexes or filters. As the video progresses, I delve into more complex scenarios where SQL Server has choices to make, like joins and order by operations, highlighting how these can lead to fully optimized plans. I also touch on the implications of trivial plans for index tuning, parameterization, and constraint usage, emphasizing why understanding them is crucial for effective query optimization in real-world applications.
Full Transcript
Erik Darling here with the first, the very first, the firsteth video in the SQL Server Performance Tasting Menu series of videos. And no half-decent tasting menu starts off with anything other than an amuse-bouche. And of course, an amuse-bouche is French for a little delicious thing. Probably not. Not my finest moment. Duolingo just uninstalled itself after I said that. But it’s usually something about the size of this part of your palm, maybe. Alright? Very small and very delicious. So it is technically a small, delicious thing, but that is probably not how a amuse-bouche lines up directly to the French words that you see on your screen there. So anyway, we’re going to talk about trivial plans because they are a good way to whet the appetite. for more advanced things. And I always find them interesting because there’s some stuff that goes along with trivial plans that can kind of hamper query optimization. So let’s run this query right here. Select top 100 everything from users. And right now with no other indexes, I know there’s no choice of data source for this query.
And there’s also no cost-based decision for this query. There’s no join, where, group by, order by, nothing. There’s no cost-based choice for SQL Server to make here. So it just says, whatever. If I thought about this for 100 years, I would never come up with a better plan. It would never be anything better than what I did right here, right now, in just a few milliseconds. And if you go over to the select operator and either hit properties or F4, you will see this optimization level of trivial. And that is how you know SQL Server said, just whatever. Just run the thing. SQL Server’s way of saying, I just don’t care. It’s like when my wife gives me too many choices for something, I’m like, whatever. It doesn’t matter to me.
Everything is going to be within this area of the same. So it does not make a difference. But you can do some kind of funny things and still get a trivial plan. For instance, if I select the top 10,000 rows and I generate a row number here, this plan will still be trivial because we still didn’t have a cost-based choice to make.
The ID column is the primary key. It’s a clustered index. So SQL Server is still just like, it doesn’t matter what we do here. I don’t have to figure anything out. There’s nothing for me. I have no input. Do your best. Don’t worry about it. And the same is true if we tried to get a distinct top 1,000 values from the primary key.
SQL Server looks at the primary key and knows it’s already distinct. So we still have a trivial plan over here. Ooh la la. Fun stuff, right? Now, where we start getting past trivial plans is when we start running queries where SQL Server has choices to make.
Because those choices are all based on costs and those costs are all based on estimates. Those estimates are all based on hardware for a single shut up computer. Shut up from 1998. And it has nothing. Those numbers, those costing numbers have nothing to do with your hardware.
But if we run these two queries, now I want to point out that one of them is selecting the top 1,000 and the other one is selecting the top 5,000, we will get slightly different query plans. All right. Well, I hope this video doesn’t get DMCA’d because someone’s playing whatever that is in the background. Anyway, SQL Server, some choices to make here.
For the 1,000 row query, it shows a hash match flow distinct. And for the top 5,000 query, it shows a full-on hash match aggregate. And what does that mean?
Well, I’m not going to get into how those different operators work right here, right now, because that’s not really what this is about. But we can see how the number of rows that we select influences the operators that SQL Server might choose in different circumstances. So, let’s go on and let’s look at some other things that will get full optimization.
One of them is a join. If we need to join one table to another, SQL Server will have many cost-based decisions to make, like, which of the three joins do I pick?
Is there a particular join within that group that will be better? Do I need to sort data for a merge? Do I use a bitmap for a hash match? Do I want to do an optimize nested loops? Do I want to apply nested loops?
There’s all sorts of things for the optimizer to think about. Now, another thing that, of course, will get us a fully optimized plan is if we order by without a supporting index. So, if I select these, the top 100, SQL Server had choices to make.
And one of those choices for this query was, do I need to go parallel? If I go parallel, will this be faster? Sure. Something to keep in mind is that if you have a parallel plan, you do not have a trivial plan.
Going parallel requires full optimization. Another thing that will get you to a fully optimized plan is a subquery, because a subquery is effectively, for the most part, will effectively be optimized as a join.
Some really simple subqueries will also get you full optimization. Take these two queries here. We have a select top 100 ID from users, and then we have a select top 100 select u.id as ID from users.
And if we go look at these two plans, they’re going to be just about the same, with the exception of this kind of goofy compute scalar down here. But if we look at the details a little bit more closely, this top plan is going to be trivial.
And this bottom plan is going to be fully optimized. So we have a trivial top and a full bottom. What does it all mean?
What does it all mean? If only someone would tell me. Some even simpler subqueries will also get us full optimization. So there’s no logical difference between these two queries right here.
We have a count, and we have a count where 1 equals select 1. There is no mathematical way for 1 to not equal select 1. But because we have this kind of goofy subquery, SQL Server will, for the top query, give us a trivial plan.
And for the bottom query, give us a fully optimized plan. So that’s fun and interesting. Now that we understand that there are some very goofy rules to what gets you full optimization, let’s ask ourselves a question.
Why does trivial optimization matter? Why do trivial plans matter? Why is Erik Darling talking to me about trivial plans? Well, it’s because there’s some stuff that goes along with trivial plans.
And there are some optimization phases that are missed with trivial plans. For example, if I run these two queries, and, you know, again, not to say that missing index requests are the end-all, be-all of index tuning. Far from it.
But if I were a developer and I were, you know, not very well-versed in SQL Server, and if I, I don’t know, was designing queries for a system, I might run a simple query and say, well, do I need an index?
Would an index help me here? And for trivial plans, SQL Server does not ask for indexes. We do not get to the phase of query optimization, index matching, where SQL Server starts thinking about alternative indexes that might help.
For this top query, notice that there’s a little at one here, which means that SQL Server has introduced something called simple parameterization. Simple parameterization means that SQL Server thinks that no matter what value ends up here, no alternative plan is going to be better or worse within reason.
And we can see that there’s another difference between these two queries, and that is that the bottom one, SQL Server has asked for an index. Now, I don’t know that I agree with this index.
It is a very big index as far as indexes go. It’s every, it’s, it’s just on the reputation column, and it includes every single column in the user’s table. At this point, one might ask, gosh darn it, did I choose the wrong clustered index here?
What happened? Why would SQL Server do that to me? But this is one of those things that’s missed with a trivial plan. Another place where trivial plans are kind of a bummer is when we get into adding things like constraints over our data.
Now, SQL Server likes constraints. Not as much as I do. I’m kidding.
Don’t be gross. Anyway, if we are going to be dutiful custodians of our data, we might want to tell SQL Server what is a valid range of search values in here.
And so if I create this constraint on the user’s table that tells SQL Server that every value in here will be between one and two million, we want SQL Server to be able to use that information effectively during optimization phases.
Now, to make sure, because I don’t want you to think that I’m playing tricks on you here, this constraint is not not trusted. It is, in fact, trusted.
We have a double negative. Whoever named this column may have been hallucinating. They may have been on a lot of Robitussin or something, but we have a zero for is not trusted.
So it is not not trusted. It is trusted. Trust me. You’ll have to trust me on that. And because of that constraint, we might expect this query to bail out.
If we select some columns from the user’s table where a reputation equals zero, we want SQL Server to use that constraint, to use that metadata and say, there’s no zeros in here.
Why would we go? Why would we need to do anything? But that’s not what happens. SQL Server scans the entire clustered index. We do 44,440 logical reads against our clustered index.
And if we look at the execution plan, we can verify that we did, in fact, scan the clustered index. Of course, if we add our sort of silly one equals select one subquery over here and rerun that, over in the messages tab, we don’t have anything about logical reads or physical reads.
We don’t have that statistics IO output over here because we didn’t do any. We just have a constant scan. SQL Server looked at the available constraints on the table and said, we can’t possibly have a zero here.
So what happened? This wasn’t just the fault of the trivial plan. This was the fault of the simple parameterization. SQL Server needs to come up with an execution plan where whatever plan it uses for at one when it equals zero has to be safe for at one when it equals one through two million or beyond.
So SQL Server, for a lot of the same reason why filtered indexes don’t get used when you parameterize those queries is because SQL Server needs a plan that’s safe for anyone, for any one value that you might put in there.
So SQL Server says, I’m afraid I can’t do that. And that’s kind of weird. Anyway, trivial plans are fine if you don’t care.
If you don’t care, if the optimizer doesn’t care, no one cares. Trivial plans, probably fine. If you’re not planning on adding indexes, constraints, query tuning, if you’re, I don’t know, one of those people who just only uses AdventureWorks and that’s the only place you do any query tuning, it’s probably fine.
Don’t worry about it. Simple parameterization is, well, I suppose in a way it’s good for plan reuse, but boy, it sure does hurt when we want SQL Server to, you know, give us feedback about indexes or missing indexes or, you know, use constraints that we created.
We went through all that pain of creating constraints. Very sad when SQL Server doesn’t do what we want it to do. Anyway, that’s all I have to say about trivial plans.
Thank you for watching, and I will see you over in the next video where we will talk about something totally different. Ha, ha, ha, ha.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Related Posts
- The SQL Server Performance Tasting Menu: Performance Issues With Parallel Query Plans
- The SQL Server Performance Tasting Menu: How DISTINCT Queries Can Hurt Performance
- The SQL Server Performance Tasting Menu: How You Compare Date Columns Can Hurt Query Performance
- The SQL Server Performance Tasting Menu: How Multi-Statement Table Valued Functions Hurt Performance
It’s interesting that for a “SELECT COUNT(*) FROM table” query, SQL Server will use a TRIVIAL optimization level, even though it still has to make a decision. It has to decide which of the indexes on the table to scan – either the clustered index or one of many possible nonclustered indexes.
Sure, but as far as costing decisions go, you’re still maintaining the same plan shape, so it’s more a matter of picking the “smallest” index to read from. I’d consider that a trivial decision compared to other stuff.