International
This post is especially interesting because it highlights the oddball performance situations you can run into when you write the type of All-In-One™ queries that the optimizer is typically not very good at handling, and also rely on expressions calculated at runtime as predicates.
I mean, it’s especially interesting if you’re into this sort of thing. If you’re not into this sort of thing, you’ll probably find it as interesting as I find posts about financial responsibility or home cooking.
I’ve seen query patterns like this while working with clients, and they’ve always ended poorly.
Anyway, on with the post!
Skeletons
To make sure we have a good starting point, and you can’t tell me that “scanning the clustered index is bad”, let’s create an index:
CREATE INDEX p ON dbo.Posts (OwnerUserId, Score DESC) INCLUDE (PostTypeId) WHERE PostTypeId IN (1, 2);
Now let’s take a look at this query, and what gets weird with it.
WITH top_questions AS ( SELECT p.OwnerUserId, QuestionScore = p.Score, tq = ROW_NUMBER() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.Score > 1 ), top_answers AS ( SELECT p.OwnerUserId, AnswerScore = p.Score, ta = ROW_NUMBER() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) FROM dbo.Posts AS p WHERE p.PostTypeId = 2 AND p.Score > 1 ) SELECT TOP (1000) tq.OwnerUserId, tq.QuestionScore, ta.AnswerScore FROM top_questions AS tq JOIN top_answers AS ta ON tq.OwnerUserId = ta.OwnerUserId AND tq.tq = ta.ta ORDER BY tq.QuestionScore DESC, ta.AnswerScore DESC;
The non-SARGable portion is, of course, generating and joining on the row_number function. Since it’s an expression that gets calculated at runtime, we have to do quite a bit of work to execute this query.
Community Board
The query plan for this is all over the place, and also bad. Parallel merge joins were a mistake.
The portions of the query plan that are particularly interesting — again, if you’re into this sort of thing — is that there are four Repartition Streams operators, and all of them spill. Like I said above, this is the sort of thing you open yourself up to when you write queries like this.
In all, the query runs for about 50 seconds. This can be avoided by hinting a hash join, of course, for reasons explained here.
But good luck figuring out why this thing runs for 50 seconds looking at a cached, or estimated execution plan, which doesn’t show you spills or operator times.
Breakup
One way to avoid the situation is to materialize the results of each CTE in a #temp table, and join those together.
WITH top_questions AS ( SELECT p.OwnerUserId, QuestionScore = p.Score, tq = ROW_NUMBER() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.Score > 1 ) SELECT * INTO #top_questions FROM top_questions; WITH top_answers AS ( SELECT p.OwnerUserId, AnswerScore = p.Score, ta = ROW_NUMBER() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) FROM dbo.Posts AS p WHERE p.PostTypeId = 2 AND p.Score > 1 ) SELECT * INTO #top_answers FROM top_answers; SELECT TOP (1000) tq.OwnerUserId, tq.QuestionScore, ta.AnswerScore FROM #top_questions AS tq JOIN #top_answers AS ta ON tq.OwnerUserId = ta.OwnerUserId AND tq.tq = ta.ta ORDER BY tq.QuestionScore DESC, ta.AnswerScore DESC;
Breakdown
The end result takes about 10 seconds, and has no exchange spills.
Infinito
For completeness, hinting the query with a hash join results in just about the same execution time as the temp table rewrite at 10 seconds. There are also very strong benefits to using Batch Mode. The query as originally written, and with no hints, finishes in about two seconds with no exchange spills, and I absolutely love that.
In tomorrow’s post, we’ll look at how we can sometimes adjust index key column order to solve SARGability issues.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.