A Little About Why Parallelism Doesn’t Make Every SQL Server Query Faster


Video Summary

In this video, I delve into an intriguing scenario where parallelism in SQL Server doesn’t always yield the expected performance benefits. Starting off with a view called “parallel but serial,” which gets amusingly abbreviated as PBS by Redgate’s SQL Prompt tool (a nod to Sesame Street), I run through its execution plans side-by-side to highlight the nuances of parallel query processing. The key takeaway is the concept of skewed parallelism, where all rows end up on a single thread despite running in parallel, leading to suboptimal performance. This video serves as both an educational exploration and a reminder that while SQL Server’s parallelism can be powerful, it’s crucial to monitor row distribution across threads to ensure optimal query execution.

Full Transcript

Erik Darling here with Darling Data. Exciting news: on top of Scotch Sniffer Magazine wanting to partner up with me, apparently Wine Wanker Magazine also watches my YouTube channel, and they've put in a competing offer to partner with me on future endeavors. This video was of course sponsored by Fiji Water. Well, not really, because I don't know if this is actual Fiji Water or the Fiji Water bottle that I refilled with New York tap water. Maybe it says something about my palate that I can't tell the difference between a fresh bottle of Fiji Water and New York tap water, but quite frankly, water doesn't get me drunk, so I'm not going to invest that much time in figuring it out. This video is a little bit about why parallelism does not help every query in SQL Server. What I have set up over here is a view called parallel but serial, which, oddly enough, gets aliased by Redgate's SQL Prompt as PBS, which I find amusing because this is supposed to be a very informative and entertaining Sesame Street video.

But what I'm going to do is actually run these. I've got query plans turned on, so I won't waste anyone's time. While that happens, it's going to take around 45 or so seconds, so I've got a lot of dead space to kill here. Maybe I should have talked about the magazines and whatnot while these things were running. Might have been smart of me. Poor time management skills, I guess you could put in my yearly review. But while these things are running, this is the definition of the view. I'm going to take that off for a moment so that the code is a little bit clearer and easier to read; reading a wall of green text is no easier than reading a wall of red text. Anyway, what we're doing is flattening out dates into years and months, because that's all we care about, and we're joining the Posts table to itself to find accepted answers and whatnot. We're also doing a third, kind of weird cross join to find the max creation date of all posts where the score is greater than zero, and we've got a little bit of filtering at the end to find dates that are less than the max creation date in the table. I know that looks a little bit silly, but I deal with a lot of reporting queries that hit date dimension tables, calendar tables, that kind of thing, that do things like this and end up with weird execution plans. And then of course we have to group by, because we're getting a count, so we have to aggregate this expression as well.

Anyway, our queries finished; it took about 41 seconds. My guess was between 40 and 45, so I was close on the low end. That's a nice change of pace for me. If we look at the execution plans, we have the parallel plan up top, which takes 21.525 seconds, and the serial plan down at the bottom, which takes 20.170 seconds. So the parallel plan takes about a second longer, for some reason. I'm not going to invest too much time in figuring out where that extra second goes. But what I do want to point out is that the plans have the same basic shape, right?
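The view definition itself isn't reproduced in the transcript. As a rough sketch of the shape being described, with table and column names assumed from the Stack Overflow demo database rather than taken from the video:

```sql
/* A sketch of the shape described above, not the exact view from the
   video. dbo.Posts, AcceptedAnswerId, CreationDate, and Score are
   assumed from the Stack Overflow demo database. */
CREATE OR ALTER VIEW dbo.parallel_but_serial
AS
SELECT
    /* flatten dates down to years and months */
    post_year = YEAR(p.CreationDate),
    post_month = MONTH(p.CreationDate),
    /* the count that forces the GROUP BY on the expressions above */
    answers = COUNT_BIG(*)
FROM dbo.Posts AS p
/* self join to find accepted answers */
JOIN dbo.Posts AS a
  ON a.Id = p.AcceptedAnswerId
/* the weird third cross join for the max creation date */
CROSS JOIN
(
    SELECT max_creation_date = MAX(p2.CreationDate)
    FROM dbo.Posts AS p2
    WHERE p2.Score > 0
) AS m
/* filter to dates less than the max creation date in the table */
WHERE p.CreationDate < m.max_creation_date
GROUP BY
    YEAR(p.CreationDate),
    MONTH(p.CreationDate);
```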

They both start with an index seek into Posts down here. We have our index scan of Posts that goes into a loop join to seek into Posts, which is the same thing that happens here. We have a key lookup, we have a key lookup, we have another nested loops join back to Posts again. The only things that are really different are some of the Repartition Streams operators that come along with a parallel plan. Now, this is an important thing that you need to check when you have a parallel plan and you're thinking, wow, this parallel plan is running a whole lot more slowly than it should. What the heck is wrong with this parallel plan? One thing you want to check and make sure is not happening to you is skewed parallelism. If we right-click on this nested loops join and look at the actual number of rows, we'll see that absolutely all of the rows in this query end up on a single thread. And this will continue throughout the query.
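Besides right-clicking operators in the actual plan, per-thread row counts can be watched while the query is still running. A sketch using sys.dm_exec_query_profiles (the session id is a placeholder, and the target session needs an actual plan or lightweight profiling enabled):

```sql
/* Per-thread row counts for a running query. Session id 53 is a
   placeholder; substitute the session running the slow query. */
SELECT
    qp.node_id,
    qp.physical_operator_name,
    qp.thread_id,
    qp.row_count
FROM sys.dm_exec_query_profiles AS qp
WHERE qp.session_id = 53
ORDER BY
    qp.node_id,
    qp.thread_id;
/* Skewed parallelism shows up as one thread_id carrying nearly all
   of the row_count for a node while the other threads sit near zero. */
```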

If we click on different operators, all of the rows will be on one thread throughout. Clearly, the worst of it is in here, where we're dealing with a lot of rows. The story does get better as rows get filtered out, but it doesn't get any better as far as row distribution goes: even after the number of rows gets narrowed down, they're still all on one single thread.

Now, the way that SQL Server assigns pages to parallel threads in a query is via a mechanism called the parallel page supplier, which uses a sort of hashing function, something like a modulus of DOP, modulus eight for this query, since we ran it at MAXDOP 8. It starts assigning pages using that, saying: this row goes to this thread, this row goes to this thread, this row goes to that thread. But in this case, we get incredibly unlucky, and all our rows end up on a single thread.
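As a toy illustration only, not SQL Server's actual page supplier, a modulus-style assignment piles everything onto one bucket whenever the values being distributed mostly land on the same remainder:

```sql
/* Toy illustration: eight rows, seven of which share the same value.
   With a modulus-of-8 assignment, bucket 0 gets seven of the eight
   rows and bucket 3 gets one, i.e. skew. */
WITH rows_to_distribute AS
(
    SELECT v.some_value
    FROM (VALUES (8), (8), (8), (8), (8), (8), (8), (3)) AS v(some_value)
)
SELECT
    thread_bucket = r.some_value % 8,
    rows_assigned = COUNT_BIG(*)
FROM rows_to_distribute AS r
GROUP BY r.some_value % 8;
```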

Now, there are various ways to solve this. Sometimes forcing loop joins can be a good way to solve it. Sometimes forcing different join types, like hash or merge joins, can solve it, especially by changing the plan shape, and all that other good stuff. But a lot of the time, using something like a temp table to pre-stage some of the data would have been useful. Maybe instead of cross joining to this, if we just stuck the max creation date into a temp table and then used that to filter things, that would have been more appropriate.
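A hedged sketch of that temp table rewrite, again assuming Stack Overflow demo database names rather than the exact code from the video:

```sql
/* Pre-stage the max creation date instead of cross joining to it
   inside the view. Names are assumed from the Stack Overflow demo
   database. */
SELECT max_creation_date = MAX(p.CreationDate)
INTO #max_date
FROM dbo.Posts AS p
WHERE p.Score > 0;

/* Then filter against the single pre-staged row instead of letting
   the optimizer weave the MAX into the main plan. */
SELECT
    post_year = YEAR(p.CreationDate),
    answers = COUNT_BIG(*)
FROM dbo.Posts AS p
JOIN dbo.Posts AS a
  ON a.Id = p.AcceptedAnswerId
CROSS JOIN #max_date AS md
WHERE p.CreationDate < md.max_creation_date
GROUP BY YEAR(p.CreationDate);
```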

But this is something that I actually ran across twice in the last... well, not this month, because it's the first of the month. I ran across this exact thing a couple of times last month, that would be April for anyone counting. And the stuff that I just talked about was kind of how I ended up solving the problems in those cases.

But yeah, that’s just sort of an unfortunate side effect of parallelism that not all row distributions are created equally across your parallel threads. There have been a couple of times in the past where using an odd DOP number like seven or three or something has helped with this a little bit. But in other cases, you do have to use different methods of solving the problem.
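Trying an odd DOP is a one-line hint. A minimal example, where the query itself is a stand-in and not one from the video:

```sql
/* Example only: OPTION (MAXDOP 7) caps this statement at seven
   parallel threads, which can shuffle how rows land across threads. */
SELECT
    p.OwnerUserId,
    posts = COUNT_BIG(*)
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (MAXDOP 7);
```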

Anyway, this is a short video because I just need today to be done with so I can start resting my brain for tomorrow’s recording. So I’m going to say goodnight here. I’m going to say thank you for watching.

I hope you learned something. I hope you enjoyed yourselves. And of course, a reminder to like and subscribe. Like this video and subscribe to my YouTube channel so that you can keep learning all sorts of fun and interesting things about SQL Server. Anyway, that’s about it.

Again, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.