Mistakes!
Strings cause nothing but problems in databases.
- Standard vs. Unicode
- Odd collations
- Inflated memory grants
- Oversized columns because of truncation
- String splitting
And of course, sometimes they can cause plans to be costed strangely.
Strong Tradition
Working with the queries we’ve been tinkering with in all the posts this week, let’s look at a slightly different oddity.
/*Q1*/ SELECT u.Id, u.DisplayName, u.Reputation, ca.Id, ca.Type, ca.CreationDate, ca.Text --Text in the select list FROM dbo.Users AS u OUTER APPLY ( SELECT c.Id, DENSE_RANK() OVER ( PARTITION BY c.PostId ORDER BY c.Score DESC ) AS Type, c.CreationDate, c.Text FROM dbo.Comments AS c WHERE c.UserId = u.Id ) AS ca WHERE ca.Type = 0; /*Q2*/ SELECT u.Id, u.DisplayName, u.Reputation, ca.Id, ca.Type, ca.CreationDate -- No Text in the select list FROM dbo.Users AS u OUTER APPLY ( SELECT c.Id, DENSE_RANK() OVER ( PARTITION BY c.PostId ORDER BY c.Score DESC ) AS Type, c.CreationDate, c.Text FROM dbo.Comments AS c WHERE c.UserId = u.Id ) AS ca WHERE ca.Type = 0;
The first query has the Text column in the outer select list, and the second query doesn’t. Please read the comments for additional clarity.
Big Plans
The plan without Text in the outer project goes parallel, and the one with it does not.
But why?
Forcing The Issue
Let’s add a third query into the mix to force the query to go parallel.
/*Q3*/ SELECT u.Id, u.DisplayName, u.Reputation, ca.Id, ca.Type, ca.CreationDate, ca.Text --Text in the select list FROM dbo.Users AS u OUTER APPLY ( SELECT c.Id, DENSE_RANK() OVER ( PARTITION BY c.PostId ORDER BY c.Score DESC ) AS Type, c.CreationDate, c.Text FROM dbo.Comments AS c WHERE c.UserId = u.Id ) AS ca WHERE ca.Type = 0 OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
Things are pretty interesting, here.
The parallel plan is actually costed cheaper up through the Filter operator. In the serial plan, the entire subtree costs 35,954. In the parallel plan, it’s at 35,719.
At 200 query bucks cheaper, we’re in good shape! And then… We Gather Streams ☹
Mortem
The Gather Streams pushes the final plan cost for the parallel plan up higher than the serial plan.
Even though the parallel plan finishes ~26 seconds faster, the optimizer doesn’t choose it naturally because it is a cheapskate.
Bummer, huh?
An important point to keep in mind is that in nested loops join plans, the inner side of the query doesn’t receive any cost adjustments for parallel vs. serial versions. All of the costing differences will exist on the outside.
That’s why only the last few operators in the plan here are what makes a difference.
And that’s what we’ll finish out the week with!
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I’ve tried to change the text column;
left(c.Text,1) as text
It helps, but not by much.
When you say that it helps but not by much, what do you mean?
Putting that, or
CONVERT(NVARCHAR(1), c.Text) AS Text
into the select list of the outer apply results in a parallel plan.