Check Constraints And Query Performance In SQL Server

Check Constraints And Query Performance In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the world of database constraints and their impact on query performance, particularly focusing on how tweaking these constraints can lead to more efficient execution plans. While most viewers might not have constraints in their databases, I explore a scenario using Stack Overflow data to illustrate why constraints are important for maintaining data integrity and optimizing queries. By adding and modifying constraints, we see firsthand how SQL Server processes queries differently, leading to significant improvements in plan shape and performance. This video is part of my ongoing series on database optimization and query tuning, aimed at helping those who want to improve their skills in these areas. If you found this content amusing or informative, I’d love to hear your thoughts—thumbs up and helpful comments are always welcome!

Full Transcript

Erik Darling here with Darling Data. In today’s video, we’re going to talk about how you can tweak constraints in your database to improve query performance. Now, this video is admittedly a pretty good bit of narrowcasting. I know most of you out there, you do not have constraints in your database. If it’s a data warehouse, I’m with you, though. Like, check constraints, unique constraints, foreign keys, and data warehouses. The domain of morons. Do not mess with your database load times. Your data warehouse load times. And your OLTP databases, OLTT, not a thing. Oh, your OLTP databases, where, you know, referential integrity and things like that are pretty important, you should have constraints like uniqueness and check and foreign key because they’re good for your data. They’re good little data vitamins. You should have your data on those vitamins. They can get big and strong like Fred Flintstone. So, I mean, this is just kind of a funny thing. It amuses me. And if it amuses me, I record it. And hopefully it amuses you, too. That’s why at the end of every video, say, I hope you enjoyed yourselves. Because, oh, crap, I hope it’s funny to someone.

If it’s just me laughing at my own jokes, boy, that would be so depressing. Oh, dear. Oh, dear. I’m spiraling. Spiraling real hard. So, I’ve got my Stack Overflow database, and I’ve already created this index. And this index is on ZoomIt. I’m having real ZoomIt problems. I might have to, like, restart my computer or something because ZoomIt is just being acting a wild. Wild fool on me. This index is keyed on reputation and upvotes, and it includes display name. Now, what I want to point out about this index before we go any further is that this index does, in fact, have a standard, let’s see, is that, well, almost 200 row history. I’m 194 rows. And you can see all of the histogram stuff in here. This is our 194th row. Maybe if we, oh, I mean, we created the index, so that is a full scan.

There is our 194th row. This is our highest value, 1047863. 1047863. Oh, that’s seven digits. That’s John Skeet. And then we have various and assorted reputations up here. And the very first reputation in the table has a range high key of one, right? So SQL Server knows that the lowest range high key that it got from this data is the number one. The thing that this doesn’t guarantee in any way, shape, or form is that the lowest number that will always be in here will always be the number one, right?

SQL Server has no logical guarantees about that because we don’t have any constraints that enforce that. So what we’re going to do next is look at a query plan. And when we’re going to look at this query plan, we’re all going to gasp. We’re going to be shocked and awed and horrified at how disgusting this query plan is. I mean, it runs pretty quickly because let’s, like, look. They don’t all have to be awful. Some of them can just get the point across.

Some of them, you really just… See the pictures up there? There’s a seek and a sort and a parallel gather streams and a top. Sometimes you just need the pictures to tell the story. In a way, you do kind of miss the old days when there were no operator times in query plans because then you could just, like, take a picture of something and be like, trust me, it was awful. But this, you know, keeps you a little bit more honest. So me being honest with you, I just want you to pay attention to the operators in the plan and the plan shape.

Namely, we have a parallel plan. Our plan has a sort in it. Those are the two things that are important here. Now, if you’ve watched my other videos about indexes and indexing and how indexes put SQL Server data in order, you would think that, you know, having a less than or equal to predicate on reputation would make it fairly easy to… For SQL Server to know that, like, it doesn’t have to sort upvotes, right? We order by upvotes here, but we have, like, an equality predicate.

And one is the lowest in the table. So, SQL Server, you should know. But it doesn’t know because we don’t tell it that. We have to over-communicate to databases. You know, they’re very neat. Databases are like big neat balls.

Like, you’ve got to really spend a lot of time coaching and coaxing. It’s a whole thing. So, if we add this constraint, where we say reputation greater than zero, SQL Server knows right now that every single reputation in the users table is greater than zero.

And so, you would think that SQL Server would logically be able to say, when this query runs, well, everything’s greater than zero. So, that really means that the only number that could show up in here is the number one. I mean, yes, theoretically, technically, scientifically, there are numbers between zero and one.

But this is an integer column. But we don’t have that level of precision. But SQL Server, still not very confident. And we end up with the exact same execution plan with the seek and the sort, and it’s all parallel.

It’s very interesting, isn’t it? If we want to fix that, if we want to change that for good, we need to get rid of the constraint as it exists here. And we need to change the constraint. Constraint? Constraint? I don’t know what a constraint is.

I don’t think that’s a word. We need to change our constraint to be reputation is greater than or equal to one. And when we do that, now SQL Server changes its whole mind.

We have a single-threaded index seek with no sort and no need for a parallelism gather stream because there’s no parallelism. So, what’s kind of funny here is if you look at this index seek, this just says seek predicate reputation equals one. So, we have that equality predicate now on reputation.

And we have the SQL Server with that knows that upvotes is all in the order that we want it to be in. So, we don’t have to actually physically sort anything. Upvotes is all in the right order.

If we go back a step, oh, dear, what happened? That was a bit scary. That was like one of those end-of-the-world movies. If we go back a bit and we look at this constraint or look at the query plan with the greater than zero constraints in place, the index seek looks a little bit different.

Right? We still have the seek predicate, but it’s less than or equal to one. So, just having that zero in there, just having greater than zero doesn’t really help us. Having the greater than or equal to one gives us a much tidier query.

So, let’s be honest here. Very few of you have constraints. Very few of you pay this much attention to query plan details and maybe to the way constraints are written.

And how that can be changed to improve query performance. Very few of you would even probably pay attention to that query plan that we just saw. Why?

It runs in 100 milliseconds. Are you going to tune that? I don’t know. Do you need to? Maybe. Could you do it by changing a constraint that you probably don’t have on your table anyway? Yes.

That’s the magic of databases. These are the kind of little things that matter. I’ve said it before in other videos. You know, the further along you get in your database.

I mean, I gear myself towards the query tuners. The further you get along in your query tuning career, the more these small details and these little minutiaes make a difference. So if you want to improve, leave the query performance stuff aside for a little bit.

If you just want to improve yourself and you want to improve your query tuning abilities, these are the kind of things that you’re going to have to start paying attention to. Now, let’s say we’re starting from a place where you have no constraints on this table. That’s the first thing you’re going to want to go after, right?

Because if this is an OLTP environment, not a data warehouse. Data warehouse, constraints, no. Get out. You’re in an OLTP environment and there’s like specific domain rules and regulations around some of the columns in your database. Overcommunicate to SQL Server what can and should be in those columns.

It will help you in the long run. And then further on, when you start to have really insane problems in databases, pay attention to how those constraints are written. Because to SQL Server, which apparently does not trust integers, there’s a big difference between greater than zero and greater than equal to one.

So just stuff to keep in mind. Stuff like this I find all the time. Stuff like this I find and it surprises even me.

It just kind of takes a little bit of like the curiosity to tinker and, you know, toggle with stuff a little bit and try to figure out exactly what’s wrong with SQL Server today. Because it’s always something. Always something wrong with SQL Server.

It’s sort of like a Munchausen type thing. I’m not a doctor, but it sounds like a reasonable diagnosis. Anyway.

Yeah. One, get some constraints. Two, write your constraints right. Important lessons there.

Right? For you. Thank you for watching. I hope you enjoyed yourselves. I found this particular scenario funny. I hope you also found it funny.

If you didn’t, well, I don’t know. Maybe the next one will be better. I can’t make any promises, though.

If you like this video, thumbs ups and helpful, not hurtful comments are always appreciated. And, of course, if you like this sort of SQL Server content generally, and you would like to get notified whenever I talk about things that amuse me, well, you can subscribe to the channel, and you can get the notification along with, like, 4,000 other people.

And like we’ve said before, 4,000 people have never been wrong. Never once. All right.

Thank you for watching. I need to go do something else for a moment. What it is, you’ll never know. You can only guess. You can only dream. You can only speculate. You can only be conjecture.

But I’ll know. I’ll know.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.