Performance Regression With Batch Mode Sorts When Spilling To Disk

Gress and Regress


To vote for this issue, please click here.

Video Summary

In this video, I delve into an intriguing issue that caught my attention recently—specifically, how sort operators spill in row mode versus batch mode plans within SQL Server. This isn’t something you can easily fix; it’s more of a Microsoft problem they need to address. The video explores two queries with different memory grant hints, highlighting the surprising performance difference between sorts that spill and those that don’t. I also touch on some practical tips, like using Ctrl+R in SQL Server Management Studio to toggle result set visibility, which can be quite handy for keeping your workspace organized. Additionally, I discuss how batch mode sort spills can significantly impact query performance, making it crucial to always examine actual operator timings rather than relying solely on percentage-based metrics.

Full Transcript

Erik Darling here with Darling Data, coming to you on this crisp, clean, beautiful Friday. And in this video, I wanted to talk about something that I came across sort of recently, and there’s an issue open with Microsoft about it. It’s not something you can really do anything about. It’s an issue they have to fix. It has to do with when sort operators spill in in row mode versus batch mode plans. So I didn’t mean to spoil things there too much for you, but come on, grab the thing, do the thing, do it right. So there are two interesting things about this demo when we look at the row mode execution plans. First one is that when you look at how operators time leading up to this, you can see that we spent a little under a second, a little under a second, well, not exactly five seconds because repartition streams is weird. And then about 11, well, actually, let’s call it 12, 11.8. That’s, that’s close enough to 12 for me. About 12 seconds spilling out to disk. And then the entire query finishes in just about 12 and a half seconds. Now, when you compare that to a query down here, it doesn’t spill at all.

We still hit, we hit about 20 seconds here. Now this is weird, right? This is a strange circumstance. There’s a query that spills to disk that takes almost half as long as a query that doesn’t spill at all. It gets, in fact, if you look at the queries, I’m just going to get rid of that screen real quick. If you, if you use SQL Server Management Studio, and you ever get annoyed that, uh, like the results set is in the way, like the results set is in the way, just hit Ctrl and R. It’ll bring the results set window down and you can, you can just hit Ctrl and R again to bring it back up. It’s pretty, pretty useful, at least for me anyway. So, uh, this first query is hinted to have a max grant percent of one. And this second query is hinted to get 100% of the memory grant it wants. Now, the reason why the first query finishes so much faster is because I have really good I.O. hiding underneath my laptop at some Lenovo, Opel, SSD deal. Uh, and I.O. there is really fast. So when the sort starts writing out to disk, it can do, it can do two things. It can write to disk in small chunks and it can, it can build the sort in small chunks.

And doing a whole bunch of small things is sometimes a lot faster than doing one big thing. Um, the first time that I ever was, uh, sort of introduced to that within the context of databases was not with a loop or a cursor. It was, uh, an Adam mechanic session on cross apply where he was using cross apply and there was a row number thing inside it. And you still had to sort data, but you only had to sort data for a specific thing inside the cross apply rather than doing one big sort and bringing the whole thing over along.

So in this case, it is faster. I know this is going to sound crazy to you. It is faster for a sort to spill to disk and build the sort up in tiny chunks reading from disk than it is to do the entire sort in memory. Now you are probably not as lucky as I am. You are probably using some sand catastrophe where, uh, I mean, the disks might be good, but you know, the, the storage network from the, the server to the, to the disks might not be great. Uh, you might be on cloud storage, which is, uh, uh, pathologically wretched and the prices that you get charged for it should, there should be, there should be an SEC violation or something attached to those.

Uh, but where things get even more interesting is with batch mode. Now you might think, wow, batch mode super fast, right? Cause it’s good for all those big, crazy data warehousey, uh, insane aggregation queries.

Those like, like, like, like far edge of the world workloads where, you know, you have to aggregate the Kajagoogoo billion rows. But it turns out batch mode sort spills. Uh, so let’s take a look at these two queries, right?

Now, in row mode, the batch mode, I’m sorry, in row mode, the sort that spilled was wicked fast. And the sort that didn’t spill, well, not awful, was still 20 seconds. It was still almost twice as slow as the sort that spilled.

With the batch mode plan, have a gander at this. This sort spilled for a full four minutes and 11 seconds. That’s, that’s wall clock time.

That’s time on, that’s time out of your life. That’s time out of whoever ran this query, which is me. It’s time out of my life. And then the, the sort that didn’t spill only took about 3.2 seconds. So there are a couple messages here.

One, uh, sort spills in row mode might not be the end of the world for query performance. Remember to always get those actual execution plans and always look at operator times. If you are the sort of foolish person who looks at percentages, God help you.

You’re, you’re living in a far distant past of irrelevant query metrics. Sorry to say it. Second message is, if, uh, you are the type of futuristic forward-looking person who has embraced batch mode, either via fancy tricks, like, like empty filtered columnstore indexes or empty columnstore objects that you do funny little joins to, or you’re on, uh, 2019 and up and you’re using batch mode on rowstore, you might have to be very, very careful when you see batch mode spills.

Now, this, you should still always, of course, be getting the actual execution plan and looking at the actual operator timings on these things. Uh, there’s a very big difference in what’s actually, what’s actually good and bad in this query, percentage-wise, when you, when you look at, look at the actual operator times.

Uh, there’s something else profound and prolific that I was gonna say. Uh, so yeah, this is a, this is a Microsoft problem. Uh, this is something that, uh, they’re going to have to address.

Uh, the reason why the batch mode sort spills, uh, take as long as they do, apparently has to do with, uh, well, a couple issues. One is the, the way that, uh, some, uh, string data, deep data is stored in batch mode or dealt with during the sort, something like that. And the other is, uh, to do with, uh, how the, the batch mode operation, uh, reads the sort from disk, which is one big write, one big read, which apparently is, is not good for performance, I hear.

So, uh, there’s that. Well, anyway, uh, I’ve got other videos to record. And, uh, since I switched over to Camtasia and I’ve got this, this crazy partner app to Camtasia called Audiate.

Uh, after I, after I stop recording this, I get to go, uh, edit the audio portion of this so that you don’t have to hear any of my biology or anything else, which you probably don’t want to hear, which I don’t even want to hear. So, anyway, if you enjoyed this video, in the least, there was a slightest bit of joy or interest brought to you by this video, give me the old, that old thumbs up.

Uh, if you like this sort of SQL Server content, uh, you could pretty please subscribe to my channel. It’s a great channel. It’s a great channel.

Especially now that I have reliable recording software. Even if I have to do more work for you on the back end, it’s, it’s a reliable product that I get to, I get to share with you. Uh, I hope you enjoyed yourselves.

I hope you learned something. I hope you laughed. I hope you loved. I hope you lived. I hope you don’t buy home decorations from Pier 1. Uh, I don’t know.

Every once in a while they actually have some decent Halloween stuff though. I think we got some real spooky Halloween stuff in years past that I think we still, I think, didn’t break, which is amazing for Halloween stuff. Usually, on November 1st, that stuff just starts to like crumble.

It’s got a timer on it. Anyway, thank you for watching and, uh, we got another one to record. Actually, maybe, maybe one or two to record after this. We’ll see, we’ll see how motivated I am.

It is after all, Friday. Alright. Uh, I’m gonna very smoothly and suavely hit the stop button here. I’m not even looking at it.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.