Off And On
I spend a lot of time fixing queries like this, so I figured that I’d write about it in hopes that I can save someone some time and pain later on.
Obviously, this extends to join clauses as well. A case expression there has just as much chance of causing performance issues.
This pattern crops up in some ORM queries that I’ve seen as well, but I haven’t been able to figure out the code that causes it.
Let’s look at why this causes problems!
Tractor Trailer
To give our query the best possible chance of not sucking, let’s create some indexes.
CREATE NONCLUSTERED INDEX p ON dbo.Posts ( PostTypeId, Score, OwnerUserId ); CREATE NONCLUSTERED INDEX u ON dbo.Users ( Reputation ) INCLUDE ( DisplayName );
With those in place, let’s look at a simple example query.
SELECT u.Id, u.DisplayName, s = SUM(p.Score) FROM dbo.Posts AS p JOIN dbo.Users AS u ON p.OwnerUserId = u.Id WHERE 1 = CASE WHEN p.PostTypeId = 1 AND p.Score >= 10 THEN 1 ELSE 0 END GROUP BY u.Id, u.DisplayName HAVING SUM(p.Score) > 10000 ORDER BY s DESC;
The plan reveals textbook symptoms of a lack of SARGability: an index scan with a predicate, despite a perfectly seekable index being in place:
Shame about that! But we can make things worse, too.
The Worser
If we involve a new column in the case expression, this time from the Users table, the predicate will be applied at a really unfortunate place in the query plan.
SELECT u.Id, u.DisplayName, s = SUM(p.Score) FROM dbo.Posts AS p JOIN dbo.Users AS u ON p.OwnerUserId = u.Id WHERE 1 = CASE WHEN p.PostTypeId = 1 AND p.Score >= 10 AND u.Reputation > 5000 THEN 1 ELSE 0 END GROUP BY u.Id, u.DisplayName HAVING SUM(p.Score) > 10000 ORDER BY s DESC;
Now all the filtering happens at the join, and the query goes from taking about 1 second to taking about 5 seconds.
If you write queries like this, you’re asking for trouble.
Why Can’t You Just Be Normal?
If we express that logic without a case expression, performance turns out much better. Shocking, right?
SELECT u.Id, u.DisplayName, s = SUM(p.Score) FROM dbo.Posts AS p JOIN dbo.Users AS u ON p.OwnerUserId = u.Id WHERE p.PostTypeId = 1 AND p.Score >= 10 AND u.Reputation > 5000 GROUP BY u.Id, u.DisplayName HAVING SUM(p.Score) > 10000 ORDER BY s DESC;
This query takes about 240 milliseconds, which is a nice improvement.
We’re able to seek into our Super Helpful Index™ on the Posts table. Now I know what you’re thinking — we’re hitting the clustered index on the Users table — that must be horrible.
But no; because the Nested Loops Join is of the Apply variety, it makes more sense to use it to seek to a single Id, and evaluate the predicate on Reputation second.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I’m apparently missing something. It seems pretty obvious, perhaps glaringly so, that a WHERE clause is the proper solution. And I totally agree that using CASE logic to accomplish that is bringing the wrong weapon to the fight.
I just can’t fathom what would cause someone to create this Rube Goldberg apparatus to solve such a simple filter requirement. Did anyone ever explain to you how they arrived at such an interesting use of T-SQL? I ask because I frequently encounter those whose only experience with SQL is via an ORM. They are learning to write SQL from the most basic starting point.
That’s the funny thing — I’ve seen ORMs generate this type of where clause.
Typically the logic is a bit more complicated than in the examples I’ve used here, but the general idea is the same.
Thanks!
I’ve seen EF generate SELECT TOP 1… FROM (SELECT TOP 1 …). That one is a classic joke on our Engineering team. But I haven’t been lucky enough to discover something like you example. Maybe time to do some extra QueryStore spelunking.
Weirdo T-SQL like this can arise as a result of copy ‘n’ paste coding, wherein the correct result wins, and the correct result returned in a timely manner loses.
I have seen it a few times in horribly convoluted code in horribly convoluted SPs which call horribly convoluted SPs that use functions that use views that use functions, etc.
In such cases, it’s likely going to be legacy code that someone gets landed with. Rather than spend the majority of their life refactoring, they take an easy shortcut and find a solution in other code then plant it, without examining too carefully, in the middle of the code equivalent of Hampton Court Maze, then tippy-toe away.