Other Thing
There are two forms of conditional logic that I often have to fix in stored procedures:
- Branching to run different queries at different times
- Complicated join and where clause logic
The problems with both are similar in terms of performance. You see, when smart people tell you that SQL is a declarative language, and not a procedural language, they’re usually trying to get you to stop using cursors.
And that’s not always wrong or bad advice, trust me. But it also applies here.
When you’re developing stored procedures, the thing you need to understand is that SQL Server builds query plans for everything in them on the first compile, and then again after anything that causes a recompile.
It does not compile for just the branch of logic that gets explored on compilation. No no. That would be too procedural. Procedural we are not.
There are two exceptions to this rule:
- When the branches execute dynamic SQL
- When the branches execute stored procedures
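Here’s a rough sketch of the second exception. The procedure names are made up for illustration, and the table name is borrowed from the demo later in this post. The idea is that the wrapper only contains IF and EXEC statements, so the queries in each child procedure should only get compiled when that child procedure actually runs.

```sql
/* Hypothetical child procedures: each one gets its own plan,
   compiled the first time it is actually executed. */
CREATE OR ALTER PROCEDURE
    dbo.dinner_plans_all
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT
        dp.*
    FROM dbo.DinnerPlans AS dp;
END;
GO

CREATE OR ALTER PROCEDURE
    dbo.dinner_plans_count
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT
        records = COUNT_BIG(*)
    FROM dbo.DinnerPlans AS dp;
END;
GO

/* Hypothetical wrapper: it holds only the branching logic and the
   EXEC calls, not the queries themselves. */
CREATE OR ALTER PROCEDURE
    dbo.dinner_plans_wrapper
(
    @decider bit = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    IF @decider = 'true'
    BEGIN
        EXEC dbo.dinner_plans_all;
    END;

    IF @decider = 'false'
    BEGIN
        EXEC dbo.dinner_plans_count;
    END;
END;
GO
```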
If this sounds familiar to you, you’ve probably heard me talk about parameter sniffing, local variables, SARGability, and… well, more things dealing with SQL Server performance.
Hm.
Problem 1: IF Branching
Like I mentioned above, the only way to get IF branching to compile plans for just the explored branches is to tuck the queries away in dynamic SQL or in separate stored procedures.
Probably the easiest way to demonstrate this is to create a stored procedure with logical branching that accesses an object that doesn’t even pretend to exist.
CREATE OR ALTER PROCEDURE
    dbo.i_live
(
    @decider bit = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    IF @decider = 'true'
    BEGIN
        SELECT
            dp.*
        FROM dbo.DinnerPlans AS dp;
    END;

    IF @decider = 'false'
    BEGIN
        /* This index doesn't exist */
        SELECT
            dp.*
        FROM dbo.DinnerPlans AS dp WITH(INDEX = 2);
    END;

    IF @decider IS NULL
    BEGIN
        SELECT
            result = 'please make a decision.';
    END;
END;
GO

EXEC dbo.i_live
    @decider = 'true';
If you run this, you’ll get an error saying the index doesn’t exist, even though the code branch doesn’t run.
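For contrast, here’s a minimal sketch of the same procedure with the hinted query tucked into dynamic SQL. The name dbo.i_live_dynamic is hypothetical, and it assumes the same setup as above: dbo.DinnerPlans exists, and it has no index with an ID of 2.

```sql
CREATE OR ALTER PROCEDURE
    dbo.i_live_dynamic /* hypothetical name */
(
    @decider bit = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE
        @sql nvarchar(MAX) = N'';

    IF @decider = 'true'
    BEGIN
        SELECT
            dp.*
        FROM dbo.DinnerPlans AS dp;
    END;

    IF @decider = 'false'
    BEGIN
        /* The hinted query is only a string at this point, so it
           isn't compiled until sys.sp_executesql runs it. */
        SET @sql = N'
        SELECT
            dp.*
        FROM dbo.DinnerPlans AS dp WITH(INDEX = 2);';

        EXEC sys.sp_executesql
            @sql;
    END;

    IF @decider IS NULL
    BEGIN
        SELECT
            result = 'please make a decision.';
    END;
END;
GO

EXEC dbo.i_live_dynamic
    @decider = 'true';
```

Under those assumptions, running this with @decider = 'true' should no longer hit the missing index error, because the hinted query only gets compiled if the 'false' branch actually executes. Running it with @decider = 'false' would still fail, as it should.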
Where things get even weirder, though it’s well beside the point of this post: if you execute a stored procedure that references a table that doesn’t exist, but not in the branch that executes, no plan will be cached for it.
You can see a stupid demo of that here. It’s probably not something you’ll run into a whole lot, but it’s probably worth noting.
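If you want to check for yourself whether a procedure has a plan in the cache, one way is to look at sys.dm_exec_procedure_stats. This is just a sketch: it uses dbo.i_live as the example name, assumes you run it in the same database as the procedure, and needs permission to view server state.

```sql
/* Returns a row only if the procedure currently has a cached plan */
SELECT
    procedure_name =
        OBJECT_NAME(deps.object_id, deps.database_id),
    deps.cached_time,
    deps.last_execution_time,
    deps.execution_count
FROM sys.dm_exec_procedure_stats AS deps
WHERE deps.database_id = DB_ID()
AND   deps.object_id = OBJECT_ID(N'dbo.i_live');
```

No rows back means there’s no cached plan for that procedure right now, which could also just mean it hasn’t run yet or the plan has been evicted.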
If you’d like to read more about the performance problems IF branching doesn’t solve, you can click on these links:
- SQL Server’s Not Very Mighty IF Branch
- SQL Server IF Branches And Query Performance Part 1: The Problem
- SQL Server IF Branches And Query Performance Part 2: Trying To Fix Parameters Doesn’t Work
- SQL Server IF Branches And Query Performance Part 3: Local Variables Still Suck
- SQL Server IF Branches And Query Performance Part 5: Does Dynamic SQL Work?
Problem 2: Conditional Logic In Your Predicates
You may love a big, messy, sloppy join or where clause, but SQL Server’s query optimizer hates it. This may be disappointing to hear, but query optimization is a really hard job.
Feeding in a bunch of runtime complexity and expecting consistently good results is a shamefully naive approach to query writing.
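To make that concrete, here’s a sketch of the kind of catch-all predicate I mean. The procedure name, parameters, and the restaurant and plan_date columns are all hypothetical, so the table would need those columns for this to compile.

```sql
CREATE OR ALTER PROCEDURE
    dbo.dinner_search_catch_all /* hypothetical name */
(
    @restaurant nvarchar(100) = NULL,
    @plan_date date = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    /* One query has to be correct for every combination of
       NULL and non-NULL parameters, so the cached plan can't be
       tailored to the filters that are actually in use. */
    SELECT
        dp.*
    FROM dbo.DinnerPlans AS dp
    WHERE (dp.restaurant = @restaurant OR @restaurant IS NULL)
    AND   (dp.plan_date = @plan_date OR @plan_date IS NULL);
END;
GO
```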
The query optimizer is quite good at applying its craft to a variety of queries. At the base of things, though, it is a computer program written by people. When you think carefully about the goal of a generalized query optimizer, it has to:
- Come up very quickly with a good enough execution plan
- Regardless of the surrounding hardware
- Respecting the logic of the query
- Within the confines of available indexes, constraints, hints, and settings
As you add complexity to queries, various things become far harder to forecast and plan for in a generalized way.
Think of it like planning a car trip. The more stops you add, the harder it is to find the fastest route. Then throw in all the unexpecteds — traffic, construction, weather, people randomly gluing themselves to the road, breakdowns — and what have you got?
Chaos. Pure chaos.
While the idealism of writing everything in one big query seems attractive to SQL developers (stacking common table expressions, nesting view upon view and subquery upon subquery, and adding in all the OR logic one can possibly surmise), it only opens the optimizer up to errors, misestimates, and even missed opportunities.
The reality is that query optimizers across all database platforms have plenty of issues, blind spots, and shortcomings. Sometimes you need to write queries in a way that is less convenient to you in order to avoid them.
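As one example of less convenient but friendlier code, an optional-parameter search written with predicates like (column = @parameter OR @parameter IS NULL) can be rebuilt as parameterized dynamic SQL that only includes the filters actually supplied. This is a sketch of one common approach, not the only fix. The procedure name, parameters, and the restaurant and plan_date columns are hypothetical.

```sql
CREATE OR ALTER PROCEDURE
    dbo.dinner_search_dynamic /* hypothetical name */
(
    @restaurant nvarchar(100) = NULL,
    @plan_date date = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE
        @sql nvarchar(MAX) = N'
    SELECT
        dp.*
    FROM dbo.DinnerPlans AS dp
    WHERE 1 = 1';

    /* Only add the predicates that were actually asked for */
    IF @restaurant IS NOT NULL
    BEGIN
        SET @sql += N'
    AND   dp.restaurant = @restaurant';
    END;

    IF @plan_date IS NOT NULL
    BEGIN
        SET @sql += N'
    AND   dp.plan_date = @plan_date';
    END;

    SET @sql += N';';

    /* Values are passed as parameters, never concatenated into
       the string, so this stays safe from SQL injection. */
    EXEC sys.sp_executesql
        @sql,
        N'@restaurant nvarchar(100), @plan_date date',
        @restaurant,
        @plan_date;
END;
GO
```

Each distinct combination of filters produces its own query text, so each one can get its own plan instead of sharing a single plan that has to cover every case.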
Thanks for reading!