All About SQL Server Stored Procedures: Conditional Logic
Video Summary
In this video, I delve into the challenges of conditional logic in SQL Server stored procedures, specifically focusing on two key areas: branching within store procedures based on different conditions and complex where clause or join predicate logic. These issues can lead to poor query performance due to bad cardinality estimations and suboptimal execution plans. While discussing these problems, I also touch upon the broader context of how SQL Server compiles and executes stored procedures, emphasizing that even unexecuted branches are considered during compilation. To make this concept more tangible, I walk through a practical example using a simple store procedure with conditional logic to illustrate the issues at hand. This video is part of an ongoing series aimed at providing comprehensive guidance on optimizing SQL Server stored procedures, and it sets the stage for future videos where we will explore solutions that address these challenges while also covering related topics like parameter sniffing and query sargability.
Full Transcript
Erik Darling here with Darling Data, and we are in this bang around video going to be talking about how conditional logic can be kind of lousy in SQL Server store procedures. Of course, this is the topic that I’ve been on about a few videos in the past, but again, since we’re putting together a playlist of all the stuff that pertains to store procedures, in one convenient place, rather than in some disparate videos that maybe you’ll get lucky and find on your hunt for relevant material, we are going to have this thematically integrated, synergistically, across all store procedure verticals. So, I hope that you enjoy this one. This is going to cover two aspects of conditional logic. One is, if I’m going to branching in store procedures, and the other is sort of conditional where clause logic. Just keep in mind that the conditional where clause logic is just as applicable to conditional join clause logic. We’re just not going to explicitly talk about it. The, as far as like solutions and fixes for this, those will be covered in a later video because the solution and fix to this is also useful in other scenarios that I would like to cover before that. So, I want to cover before that so that you see a more sort of complete version of things when we are done. But before we do that, oh boy, I think you know what time it is. It’s time to have a weird rib mark on my face. It’s great. It’s awesome. Hair and makeup is fired on this one. Let’s talk a little bit about you and I and our, the dance we do together. If you want to support this channel and the work that I do to bring you this amazing, long form video content, you can sign up for a membership down in the video description for as few as $4 a month. That is the cost threshold for an espresso here in New York City. If you have already spent all your money on various performance enhancing substances, I understand. I know what it’s like. I’ve been thinking, after I watch some of these videos, I’ve been thinking about getting some performance enhancing materials for my hair, which if you find this is upsetting as I do, maybe that $4 a month could go towards some sort of growth product in there. But we haven’t gotten quite there yet. When my wife tells me to shave my head, then maybe we’ll reassess that. You can like, you can comment, you can subscribe. Those are all free. And so is asking questions for my office hours videos, in which I take time out of my busy consulting schedule to answer five questions.
of yours for free. If you need the type of help with SQL Server that just can’t be can’t possibly be covered by the YouTubery that I that I that I put out here, you can of course, you can call me, you can say, Hey, Eric, I need some help with my SQL Server. And I’ll say no problem. My rates are reasonable. And we can do all of this stuff together. And we can we can we can we can check out your healths, we can check out your performances, we can fix your health healths and performances. If you’re having an emergency, we can fix that too. And we can also train your developers. So that you have fewer emergencies. And of course, by we I mean me, that is my responsibility. That’s what I am. That’s what you are paying me for. There is no we in that really. It’s just, just, just me and the chickens. If you want some training, perhaps you feel like it is time for you to just get better at SQL Server performance tuning. On your own time, that’s cool too. You can get all of my training content for about 150 US dollars for life.
Woo. And of course, you can get all that via the link in the also in the video description because I like to make things easy on you. I care about you. I realize life is hard enough. Without you having to figure more things out. So there we have that. SQL Saturday, New York City coming up in May seems very far away here in the the doldrums of February, but March, you know, it’s gonna be baseball weather. It’s gonna be nice to probably catch a Mets game. Pretty, pretty nice out there. Probably the I think city field is the nicest part of flushing. So, or do they call that something else now? Is that like Queens Meadows or something? Really, really try to obfuscate some of the some of the geography here in New York. It’s like, where do you live? East Williamsburg? No, you live in Flatbush or Bushwick or something. You do not live in there’s no such thing. Anyway, regional, regional humor doesn’t always cut over well to these international audiences. With that out of the way, let’s talk about conditional logic and store procedures. Now, there are two forms of conditional logic that I personally end up having to clean up quite a bit. And that is if branching to run different queries at different times depending on different circumstantial parameter, local variable gathering, setting, contextual things happening. And complicated join and where clause logic, which is most often like, if this parameter equals zero, then do this. Or if this parameter equals one, then do this.
And all that is a mess for SQL Server to untangle. The problems with both are fairly similar in terms of performance, both with bad cardinality estimation, and just general query query plan problems, usually stemming from cardinality estimation problems, but sometimes from other places. Now, keep in mind, all of this pertains specifically to queries executed, let’s just say natively, or naturally, or naturally within the course of a store procedure running. If you have a separate, if you have code separated out into anything else that requires its own execute, meaning its own execution context, whether it’s executing another store procedure, or executing dynamic SQL, that does not fall into this purview. This is a completely different thing. And doing stuff like that is often a solution for the problems I’m going to discuss here. But since those solutions work for a bunch of other things that cause problems in store procedures, I’m going to save the fully fleshed out solution for the video specifically about that. I want to cover the other stuff that that stuff fixes first along the way.
Now, of course, SQL is a declarative language, you declare what you want. And SQL Server goes off and tries to get that for you. But there are procedural elements for it. There are all sorts of control for logic things, there are loops, there are while’s, there are ifs, there are then’s, there are else’s, there are begins, there are ends. There’s all sorts of things. There’s all sorts of things. There’s all sorts of stuff you can do in there that makes SQL Server act sort of like a real programming language when it is a structured English query language. Strange, isn’t it? But when you’re developing store procedures, the thing that you really need to understand is how SQL Server builds query plans, which is, like I said, everything in the store procedure is like the batch, and that all gets compiled and planned out on execution, except for stuff that is tucked away under separate execution contexts. So by that I mean, you’re chugging along in your store procedure, and then you execute another store procedure. Executing that other store procedure is not part of the initial compilation.
That only happens if you actually execute that store procedure along the way. So, we already talked about that. Good, good, good. If any of this sounds familiar to you, which it just might, because you watch my videos, and you pay attention, and you care about the things that I say, you’ve probably heard me talk about similar problems around parameter sniffing, local variables, and even query sargability, which is SQL Server’s ability to take the search arguments you provide to your queries, and come up with a good way of using indexes based on those.
But, let’s see. It did a separate execution statement. We talked about that. All right. So, let’s talk about a good way to demonstrate this initial starting problem, right? So, I got this store procedure here called I live. I don’t know why I call it. I think that’s a Team Fortress 2 reference.
I believe that heavy weapons guy says, I live, when he does something in the game. But, it’s been a long time since I’ve had any time to play video games, so I forget most of these things. But, just keep in mind here that we have a parameter that is a bit.
By default, it is null. If the bit is true, we are going to run this query, where we select everything from a table called dinner plans. If that is false, then we will just select the top one.
Now, I’ve got this hidden away in there, because I want to show you an easy way to sort of understand just exactly how this stuff works. And, of course, if we pass in a null decider, our store procedure asks us nicely to just make a damn decision.
So, what I want to do first is unquote this, and I want to create or alter this procedure. And this happens just fine, mind you. But, if I try to run this branch for true, SQL Server says, well, we don’t have an index with ID2 on the table dinner plans.
It’s specified in the from clause, and it doesn’t exist. So, that means that even though we try to execute the store procedure in a way that would run this query, SQL Server attempted to create an execution plan for this query.
All right, so even though we only explored this branch, we didn’t explore this branch when we executed the store procedure. SQL Server was still like, well, hey there. We need to compile a plan for this.
So, just the if block is not enough to get us around various compilation things or compilation processes. So, let’s recreate this with that coded out so we don’t have to deal with it anymore. And what I want to show you is that right now we’ve got nothing in the plan cache for this store procedure.
And if I ask for an estimated plan for this, the estimated plan that we are going to see is going to include a query plan with a conditional for our first select with no top, our second select with a top, and our third select, which is a conditional. This is a select without a query.
All right, select without query. So, that’s what you see when you just say select some expression. There is no from that has a query.
So, the query operator we get is just that fine dandy little thing there. Now, keep in mind with just with that estimated plan, we still don’t have anything actually in the plan cache for this. But if I go and run this, and I say, hey, decider, are you true?
And decider says yes, then we get six rows back there. And now we have, of course, one execution of the store procedure. And if we run it for null, we’re going to get please just make a decision.
And now we have a second execution of the store procedure. And if we run it for false, we now see our top query plan. And now we have three executions of the store procedure.
But note the plans within this, right? This shows us all of the stuff that happened in there. And if SSMS would be nice and just cooperate with me resizing these things, I don’t understand why there is so much white space in these to begin with.
But hey, maybe there’s a good reason for it. But we’ll see all of the queries that we executed in the final query plan in the plan cache for this. All right.
So perhaps not the most intuitive thing in the world, perhaps not the best time. So if you want to understand a lot more about this, I don’t know why that’s out there being all giant and long and annoying.
There will be a link to a playlist that I actually created the playlist for this because there was just three videos hanging out before. And I thought, hey, Eric, you know, you often forget to add links.
Maybe if you only had one link to add, maybe if ZoomIt would work. Maybe if you only had one link to add, we’ll do a better job of remembering that. So I’ll do my best there for you, for you nice folks at home who just want to click on things and don’t want to do any work.
So that’s the first problem is the if branch, right? That’s one set of conditional logic that you’ll see in your, well, that I see in a lot of store procedures I have to fix.
What you might see in other places, and I, of course, see in quite a few store procedures that I end up having to fix, are conditional logic and predicates. And that goes for join and where clause predicates both.
This is not isolated to one or the other. And I know that it is a disappointing thing to hear because it is often quite fun to dunk on SQL Server for doing a bad job of this or that, or the optimizer doesn’t do this too well, doesn’t do that too well, blah, blah, blah, blah, blah.
Of course, there’s all sorts of stuff, but query optimization is a truly difficult and demanding task. It is hard to figure out a sane and rational way to handle all of these different things without being quite a burden or a nuisance to query compilation time.
You know, there are a lot of people out there who just seem to think, hey, this is a big, expensive piece of enterprise software. I should be able to do whatever I want, and it should be able to figure it out immediately.
The problem is you do a lot of real, just bad stuff. I can’t imagine having to design software to account for all the bad things that people do out there.
There’s just too many of them. At some point, some rules have to be established and followed for this piece of software to work up to its capacity. So there’s just certain things you shouldn’t be asking it to do.
There’s certain things that even though they exist in the product and you are capable of doing them, like they are valid syntax, doesn’t mean you should go do all of them.
The query optimizer is quite good at applying its craft to a wide variety of queries written in a wide variety of ways. But at the base of things, the query optimizer is still just a computer program written by people like you and me.
Like you and me, except much smarter. Much smarter than you or I. I’ve heard them talk. They’re really, really, really smart. I sometimes feel bad for them that they’re forced to deal with us.
We are their consumers of their work. But when you think carefully about the goal of a generalized query optimizer, not a specific, overly specific query optimizer, remember specialization, as a wise man once said, is for insects.
These things need to be generalized so that they can be applied to a wide swath of things and problems that need to be solved and queries that need to be answered.
But you need to, the optimizer needs to come up very quickly because no one wants to sit around forever waiting with a good enough execution plan regardless of the surrounding hardware while still respecting the logical semantics of the query and within the confines of the available indexes, constraints, hints, and settings that you have applied to your database and database server.
As you add more and more complexity, as you change the scope of things that the optimizer has to do, the harder that job gets. If you think of it sort of like planning a car trip, like logistics problems, finding the shortest set of routes between like 15 different destinations, the more destinations you add in, more things you add on top of it, right?
More difficult that becomes. Then you have to throw in all the unexpected stuff. There may be things that you’re unaware of. There may be traffic, there may be construction, there may be weather, there may be a bunch of knuckleheads randomly gluing their hands to the road, just whatever.
It’s all chaos. It is all chaos out there in the world. And, you know, there are, so, and this isn’t just like limited to people who sit there and write T-SQL.
You know, ORMs specifically don’t provide you any good facilities for breaking queries up into multiple pieces that make query optimization easier.
You know, like you as a SQL developer, like writing a query, hand crafting a query, might at some point look at the vast length and girth and breadth of a query and say, hey, I don’t think this is going anywhere good.
I might want to like use a temp table here. I might want to maybe not like have to, at some point in my life, try to debug the logic and the correctness of an 18 page query.
It’s not a good time. So like you might, you might have that option. You might have that control and that sort of power in your hands, but ORMs do not really give you that.
So let’s get over into the Stack Overflow database. Now I’ve already got a couple indexes created that I don’t want to spend time waiting to recreate. And I want to show you a few different anti-patterns.
Now, this first one is dealing with parameter nullness or not nullness in if branching. Like we talked about above, this has very, very few upsides.
This does have one upside compared to the next thing I’m going to show you. And that is that at least for when we do something with a query, SQL Server will be able to use indexes effectively for these branches, which is not so true with other ways that this query could be written.
But we do run into a problem with cardinality estimation because like I said before, when you compile, when you like actually compile a plan for these store procedures, cardinality estimation and all that other stuff is done based on the initial set of compile parameters.
If you throw a recompile hint here, either at the store procedure or statement level, like you can solve some of that, but then, you know, and that might be fine.
I’m not anti-recompile. I use it all the time. It’s great. It solves a lot of problems real quick. So, but if you look at the estimated plan for this, you will see all three of the query plans that we have available to us in this.
One of them hits P0. One of them hits P1. The other one hits both. I don’t know. Is that good? Is that bad? We’ll find out. So if we run this one, first compilation, we seek, everything’s good.
We get a good cardinality estimate. Everything is fine here. But if we run this one, we will start to see the cardinality estimation thing maybe be not so great because now SQL Server is guessing that we were getting one row back from this.
Why? Well, because if we look at the parameter list, we can see the compile and runtime values are two different things. It was initially compiled with a null.
That null came from this compilation right here. And this is not something that you can fix by inside of the store procedure changing the parameter value to something else because that is not what SQL Server will use.
So for example, if I say something like if at owner user ID is null, set, let’s do this, begin, end, we could spell end right.
That’d be a nice start, right? And we could say set owner user ID equals 22656. Let’s put this down a little bit lower though.
Let’s put that right here. Right? Because we might say, hey, we don’t necessarily want to do anything else.
But let’s just take this one out so we don’t end up doing anything too, too goofy. But if we rerun like this and we say run this one, right, we still get that one execution plan back, right?
And this one is going to be fine, right? If we look at the compile and run time values here, oops, let’s get that open, we get the same thing for this, right? But let’s say we go over here and we run for this one again.
The execution plan is still going to show the bad cardinality estimate and we are still going to show the compile time value of null here because we didn’t actually, like, like, doing that here didn’t actually affect anything, right?
And, like, I had to do a little bit of surgery here just to not, like, have weird duplicate stuff happen, but I just want you to be aware that, like, modifying a parameter or even, like, giving a default value for a parameter up here isn’t going to change anything for if someone, when someone passes in an actual value, right?
So it’s what you pass, like, what you compile the stored procedure with as a passed in value here that matters. Now, the other anti-pattern that comes along with this is, of course, you know, rather than making, like, like, an if branch for every single possible, like, thing that could happen in the stored procedure, this gets very long and unwieldy if you have lots of different parameters and lots of different combinations, but if you do something like this where you’re, like, oh, I’m just going to, you know, figure it out when I run the query, things get a lot worse here for different reasons.
And it doesn’t matter how you express this logic, it doesn’t matter if you use is null or coalesce instead of doing this, it doesn’t matter if you use a case expression to figure out what’s null or not, and it doesn’t matter if you use null or not null checks on the parameters.
That has, none of those are going to help you or change anything here. But when you do something like this, I think the situation gets a lot worse. Now, let’s run this query, and what we get is, well, I mean, this may be not great, right?
Clustered index scan that takes 1.5, or sorry, nonclustered index scan that takes 5, like, almost 1.6 seconds. Not so great, but, you know, we get the, we get what we sort of expected.
Our parameter list for these two things, we have a compile time value for owner user ID and a runtime value of null, right? That’s there. But then for parent ID, we got 184618, 184618.
All right. So, one other important thing here is, like I said, with the if branch, there is an important thing that makes us better, or make that better, is that we could at least seek into the index that we care about.
This one, notice our predicate looks like this, and we scan that index, right? So, we scan all 17 point something million rows of it here. That’s not, this is not a great setup.
Note also that we use the index on P0, and the index P0 coming back up here is the one that’s on parent ID, right?
Now, if we go and execute this for a null parent ID, an owner user ID 22656, we’re going to be sitting here and waiting for a little bit.
Why? Well, I mean, you can, so, coming, like talking about indexing a little bit while we wait for this to finish, you can partially alleviate performance problems with this, with compound indexes.
I would normally never sign up for indexes that look like this with just a single key column. If we had composite indexes on like parent ID, owner user ID, owner user ID, parent ID, maybe something like that, we could, we could partially alleviate the performance problem that we’re seeing now, but not any of the cardinality estimation problems.
So, if you have bad indexes, you’re going to make bad query, you’re going to make query anti-patterns even worse performance problems, right? You’re going to just magnify your issues.
So, if we come down here now, this thing all finished, if we look at what happened, SQL Server, well, we spent three seconds here now, rather than 1.6 seconds, we spent almost 30 seconds in total in this query plan.
That’s 27 seconds. Why? Well, we fully scanned this. We didn’t actually have a predicate to apply here because parent ID is null, right? We didn’t actually filter any rows out here, but we still use this index.
And then we do a key lookup back to the clustered index to evaluate this predicate. And this is where we finally filter out rows, right? You look down here, this is the 27,901 rows that we normally get back when we look at 22656 values in the post table.
So, that’s just a real, real bad time. So, that’s not good. The smarter pattern here is, of course, to use dynamic SQL, which we’re going to talk about in a future video.
But like I said before, it’s also dealing, it’s also very useful for dealing with some other problems that you might run into, like performance or just architecturally in-store procedures.
So, I want to make sure that we cover those and then I show you, like, what dynamic SQL is actually good for across a wide variety of problems. And again, since this playlist is a learning pathway, you will have to follow the pathway to get, to get, to get full knowledge.
So, to get your full learning done. Watch out for that. Anyway, this has gone on long enough. Thank you for watching.
I hope you enjoyed yourselves. I hope you learned something. And I hope I remember to put the links, the link to the, the other playlist in the, in the video description. We’ll see, we’ll see how I do there. And of course, I will see you in the next video where we will talk about more problems in SQL Server Store procedures.
All right. Goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.