Batch Mode Sorts and Row Mode Indexes in SQL Server


Video Summary

In this video, I dive into the intricacies of batch mode sorts and rowstore indexes in SQL Server, sharing insights from a recent client-facing issue that highlighted the challenges and nuances involved. We explore how forcing parallel plans can sometimes lead to unexpected results, such as the need for additional sorting even when an index perfectly aligns with query requirements. I also discuss the trade-offs between optimizing for speed versus memory usage, providing practical advice on when to use the `MAXDOP` hint to achieve better performance in specific scenarios.

Full Transcript

Erik Darling here with Darling Data. In today’s video, we’re going to talk about batch mode sorts and rowstore indexes. And hopefully I said that right as I’m reading it from the teleprompter in front of me. Great. So this was an issue that a client of mine had: for various parts of their workload, row mode execution just doesn’t cut it. You need to get batch mode involved. But sometimes, especially batch mode on rowstore, it’s a tricky thing to get happening in the way that you want it to happen. So we’re going to talk about a little bit of that. And we’re also going to talk about, of course, the little caveat with batch mode sorts and rowstore indexes. So we have much on our plate. Now, I did mention that this was a client-facing problem. And, you know, I do have clients. I do have nice people who pay me for my time to do things for them, and I help them with things. If you would like to be one of those lucky people, there is a link down in the video description where you can hire me to do SQL Server stuff. It’s a crazy scheme, I know, but it works. Trust me. You can also buy training, become a channel member, and ask me office hours questions. And of course, if you find this channel content just so groovy and ghoulish that you want to make sure that it spreads far and wide like the herpes simplex virus, you can, of course, like, subscribe, and tell a friend. Get me in that sweet YouTube algorithm or whatever.

PASS Data Community Summit. Oh, boy. It’s so close. Two days of T-SQL pre-cons from me and Kendra Little. The entire event is in Seattle, November 17th to 21st. So you should be there. Hopefully you bought tickets and all that stuff at this point, because we’re only a few weeks away, and sometimes planning this sort of travel takes some time. You’ve got to start early, right? My wife will tell you: when she books travel, it’s like a million years in advance. So, right now it’s still October, but by the time this video publishes, it will no longer be October. So I had asked ChatGPT: my birthday is in November, can you give me a birthday-themed drawing? Just iterate on the one that we’ve been using. And it left in some Halloween stuff, right? We still have a ghost and a vampire thing, but it is a festive birthday, and I am kind of a spooky, spooky person. So we’ve got a ghost and a vampire showing up to my party. Anyway, let’s go talk about the stuff that we wanted to talk about. See, I’m so spooky. I have a black background sometimes. All right. So I think I already did this, but let’s just make sure. Yeah. Wonderful. All right.

We’ve got an index. So what I want to show you first is — and I don’t want to actually run this in real time, because it takes a long time — really, what I just want to show you is the estimated execution plan for this. Now, this is a query where I am forcing a parallel plan. And if you watched my video about what Microsoft is doing to create their vector DiskANN indexes, you should know that the ENABLE_PARALLEL_PLAN_PREFERENCE hint is used in that code.

So it’s safe for production now, right? Because Microsoft uses it in production, so you can use it in production. So now it’s safe. They’ve blessed it. What I want to show you specifically here is that when this query executes in row mode — and if we buy ourselves a little bit more query plan real estate up here — we can generally visually infer that this plan ran in row mode.
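The demo scripts themselves aren’t shown, but the forced-parallel shape being described looks something like this sketch — the table and column names here (a Stack Overflow-style `Comments` table, a window function partitioned by user) are my assumptions, not the actual demo code:

```sql
/* Hypothetical sketch: nudging the optimizer toward a parallel plan
   with the undocumented hint discussed above. Table and column names
   are illustrative, not from the demo scripts. */
SELECT
    c.UserId,
    ROW_NUMBER() OVER
    (
        PARTITION BY c.UserId
        ORDER BY c.Score DESC
    ) AS n
FROM dbo.Comments AS c
OPTION (USE HINT ('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```

`USE HINT` with `ENABLE_PARALLEL_PLAN_PREFERENCE` biases costing toward a parallel plan without changing the cost threshold for parallelism server-wide, which is why it’s handy for demos like this.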

Because we have repartition streams operators there, and none of the parallel exchange operators support batch mode. Segment and sequence project aren’t used in batch mode either — we get a window aggregate operator for windowing functions when we execute in batch mode.

The filter, of course, could be in batch mode. Actually, it probably would be if I didn’t disallow it. And then the gather streams is, of course, another parallel exchange operator, so that can’t be batch mode either. So this is a fully row mode plan. But the thing that I want to point out here is that we have an index — and if we scroll back up here, I should probably sell a little bit of my query plan real estate — we have a nonclustered rowstore index that not only fully covers our query, but supports the window function specification exactly right.
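An index keyed to match a window specification like that might look something like this — again a sketch with assumed column names (partition column first, then the ORDER BY column descending), since the actual index definition isn’t shown:

```sql
/* Hypothetical index sketch: keys match the window specification
   (partition column first, then the ORDER BY column, descending),
   so a row mode plan needs no Sort operator to feed the
   window function. Column names are assumptions. */
CREATE INDEX c
ON dbo.Comments
    (UserId, Score DESC)
WITH (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
```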

So: partitioned by user ID and sorted descending, which is what we’re asking for in here. Because we have this, we don’t have a sort operator in here where SQL Server has to rearrange data from the way it’s stored somewhere to the way the window function needs it to create its row number. So we don’t have to do that there. Keep that in mind — the thing about the sort is going to come in handy later. Now I’m like, okay, well, we want batch mode, right? So we’re going to use our auxiliary columnstore helper table: we’re going to do a left join to it, it’s got a clustered columnstore index, it’s got no rows in it, and we’re going to do this thing so that we get batch mode. And we’re going to use this now-safe-for-production query hint in there. The thing is, though, if we get the estimated plan for this, it’s the same as last time. And if we zoom around a little bit, none of this stuff can be in batch mode — repartition streams, segment, sequence project, gather streams — and our filter operator is also still in row mode.
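The auxiliary columnstore trick being described is the well-known empty-table nudge for batch mode on rowstore-era servers. A sketch, with all names invented for illustration:

```sql
/* Sketch of the batch mode nudge described above: an empty table with
   a clustered columnstore index, left joined on a condition that can
   never be true — it contributes no rows, but its presence makes
   batch mode available to the plan. All names are illustrative. */
CREATE TABLE dbo.BatchModeHelper
(
    i int NOT NULL,
    INDEX bmh CLUSTERED COLUMNSTORE
);

SELECT
    c.UserId,
    ROW_NUMBER() OVER
    (
        PARTITION BY c.UserId
        ORDER BY c.Score DESC
    ) AS n
FROM dbo.Comments AS c
LEFT JOIN dbo.BatchModeHelper AS bmh
    ON 1 = 0
OPTION (USE HINT ('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```

Because the join predicate is always false and the join is outer, the helper table can’t change the query’s results — it exists purely to put a columnstore index in the plan.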

Now this was kind of an interesting one. What it comes down to for batch mode on rowstore in this query plan specifically is: batch mode on rowstore takes one look at the text column in the Comments table and says, hell no. The Text column is an nvarchar(700), right? So if we remove that from the query select list — we specifically name our columns in here, and we say no text column, we’ve removed it from existence — and we get an estimated plan for this, now we see a much more batch mode plan. Right? But now we have a sort. We’re using that same nonclustered index, so the data from the index is in order, but when we read from it, we’re reading from it in batch mode. Okay. So let’s run the query and look at how this thing fares. Let’s start over here on the left. We get a memory grant of 1,418 megabytes. So that’s about 1.4 gigs — depending on whether you divide by 1,000 or 1,024, you might have a slightly different take on exactly what that is, but let’s just call it 1.4 gigs. All right, cool, we did the math. This query finishes in about 1.3 seconds. But the thing is: why is this sort here? It’s kind of an interesting thing, because this only happens in parallel plans. So if we come down here, we’re going to do almost the same thing, except we’re going to force this query to run at MAXDOP 1. And just keep in mind, for this query, we didn’t even need that hint to get batch mode on rowstore.

If we check the estimated plan here, we’ll see that we have a very similar plan, but without the sort. Reading from the row mode index with a MAXDOP 1 plan, we don’t have to sort data here — we can actually rely on the order that this data came out of the index in.

But if we run this at MAXDOP 1 — okay, so this whole thing is a fully batch mode plan, so each operator only shows the operator time for itself. We have 5.2 seconds, then another 137 milliseconds, then another 35 milliseconds. So let’s just say it was about five and a half seconds total. But the memory grant for this is 1,024 KB, which is like one meg. So this is an interesting thing. And this is specifically just a product limitation in SQL Server’s implementation of batch mode. In a parallel plan where batch mode is used like this, it cannot trust the order that things come out of the index. It’s not that the index is out of order — it’s that the order stuff gets read in, in batches, might not stay true to the sort order. It might not maintain things correctly. So we still have to sort data here. So the question for you becomes: what do you want to optimize for? If you want to optimize purely for query speed, well, it’s probably fine to have this thing run in parallel, have a sort in it, and use like 1.4 gigs of memory.

If you want to optimize for memory usage, and you’re willing to sacrifice some time for that, you can of course force the plan to run at MAXDOP 1, not sort data, and use less memory. But this was a very interesting thing, and I had to do a little bit of research on the batch mode sorting to make sure that I was correct about it. So I talked to some nice people who work on the product, and they said, yeah, that’s true. And I said, great. So yeah: batch mode on rowstore, using windowing functions, when you get a parallel plan — even if you have an index that perfectly puts your data in order for the windowing specification, the parallel plan sort of negates all that, and you will still have to sort things. With a serial plan, you do not — SQL Server can trust the order it gets from the index, and you do not have to sort things.
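The serial variant of the trade-off above might be sketched like this, reusing the same invented names as before — at MAXDOP 1, batch mode can trust the index order, so the sort and its memory grant disappear, at the cost of a single-threaded scan:

```sql
/* Sketch of the serial variant: MAXDOP 1 lets batch mode trust the
   index order, removing the Sort operator and its memory grant,
   in exchange for single-threaded scan time. Names are illustrative. */
SELECT
    c.UserId,
    ROW_NUMBER() OVER
    (
        PARTITION BY c.UserId
        ORDER BY c.Score DESC
    ) AS n
FROM dbo.Comments AS c
LEFT JOIN dbo.BatchModeHelper AS bmh
    ON 1 = 0
OPTION (MAXDOP 1);
```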

Reading from big tables with a single thread, even in batch mode, does take more time: 5.2 seconds versus 829 milliseconds. So depending on what you want to optimize for, you might choose to hint MAXDOP 1, get rid of the memory grant, and take a little bit longer — or say, screw it, use the memory, use the extra threads, and finish as fast as you can.

Anyway, that was about it. Thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I’ll see you in tomorrow’s video, where we will talk about — I don’t know what yet. Maybe I’ll just make it up as I go, as usual. All right. Thanks for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.