A Little About Row & Range Window Functions vs Batch Mode in SQL Server

A Little About Row & Range Window Functions vs Batch Mode in SQL Server


Video Summary

In this video, I delve into the nuances of window functions and how they interact with rows and range modes in SQL Server. I explore the differences between using `ROWS UNBOUNDED PRECEDING` versus `RANGE UNBOUNDED PRECEDING`, explaining why the former was historically slower due to on-disk spooling, while the latter could be more efficient. By comparing these methods through practical examples and query plans, I demonstrate how modern SQL Server versions and features like batch mode can significantly improve performance for window function queries. Whether you’re working with large datasets or just looking to optimize your T-SQL code, this video offers valuable insights into leveraging these powerful tools effectively.

Full Transcript

Hey, it’s Erik Darling, Darling Data, your number one source for SQL Server content on the internet. I hope. If you can go anywhere else, I mean, I can’t say what you’ll get will be correct or high quality. So, careful out there. Anyway, today’s video, we’re going to talk about rows and range versus batch mode. Now, this all sounds terribly foreign to you. You may not work with window functions very much. But back to the end, I’m going to talk about rows and range versus batch mode. Now, if this all sounds terribly foreign to you, you may not work with window functions very much. Back in the, let’s say, pre-batch mode days of window functions, SQL Server has two different ways of sort of giving you a partition clause or giving you a window of things to look at. You can look and specify a certain set of row elements to be in there, or you can specify a range of row elements to be in there. Now, back in the bad old days, you had to be very careful about specifying the rows. Because rows was a lot faster. If you use range, you got this terrible on disk window spool. There was very, very, there was not very, very, sometimes very, very slow. Let’s just say it was much slower than range.

But now, with batch mode, all that kind of goes away. So we’re going to look at examples of that in this video here today. But before we do, if you would like to sign up for a channel membership to support the best SQL Server content on the internet, there is a link in the video description below. And if you want to ask me questions privately that I will answer publicly on my Office Hours episodes, there is a link right up there for you to do that exact thing with. Otherwise, you can just do the usual YouTube stuff and keep me happy in different ways. If you need SQL Server consulting help, I am here for you, baby. I got you no matter what you need. You know, within reason. I mean, I don’t want to like carry a server somewhere for you. You got to do what you got to do. I’ll rent a dolly if I have to. And as always, my rates are reasonable.

My performance tuning training is about $150 for life. If you go to that link and use that discount code, that’s the everything bundle. And if you would like to pick up my new T-SQL course, all of the beginner content is recorded and published and available. It is currently on sale for the pre-sale price of $250, which will double in value to $500 once the advanced material is out about there in the world. And some other fun summer stuff coming up for me. Redgate is taking me on tour with the PASS mini events. That’s going to be New York City, Dallas, and Utrecht.

And of course, all that leads up to PASS Data Community Summit, November 17th to the 21st. So I look forward to seeing you at as many of those events as humanly possible. But with that out of the way, let’s talk about this window function stuff. Now, what I’m talking about when I say like rows versus range is specifically this part of the window function, where you say rows unbounded proceeding.

Now, I don’t technically have to say and current row because this is it’s by default when you do that. So if you just say rows unbounded proceeding, then you will just get like a running sum of all the prior rows up to the current row. Now, I’ve got this query hinted.

Well, let’s make sure. Query plans are turned on. I’ve got this query hinted to you as optimizer compatibility level 140. The reason for this is because my current database, which is on SQL Server 2022, is in database compatibility level 160. Being on developer edition, which is an enterprise equivalent SKU, and being on SQL Server 2022 in database compatibility level 160 means that I get the batch mode on rowstore feature automatically without having to do things, do any strange things there.

If my database compatibility level 140, I would have to use this hint to say 160 in order to show it off. So we’re just doing this in reverse this time. But when I run this query and we look at the rows unbounded proceeding, we are going to get back some results.

And they take a second because there’s a lot of rows. There’s about a million rows that come out of this. But this all takes just about a second to run.

You can see over here, just one second. It makes life easy. And, you know, things in here are not too terrible. If we run that same query, but we use the range unbounded proceeding, we’ve changed this from rows to range, then things are going to look a little bit different.

And if we look at the query plan, you can maybe felt that that took a little bit longer. That took twice as long at two seconds. Now, part of the reason why does require us running both of these at the same time.

I’ve turned statistics.io on for this because it’s a slightly easier way of seeing where that other second of time went. So now, if we look at the query plans next to each other, we will see just about that same pattern here, right, where it’s one second versus two seconds. And so the range one is twice as slow.

And if we look at the statistics.io results, notice that there’s this work table. Let’s frame that up a little bit better. There we go. And for the rows query, there is a work table with a scan count of zero and a logical reads of zero.

But for the range, there is a work table with a scan count of 1058840 and a reads of 5430748. So in the first query, we use rows, that work table didn’t actually end up getting like it was there, but nothing really happened to it. It’s almost like with a many to many merge, when you actually have a many, many, many, and you end up like there’s a work table, like when you’ll see a lot of activity there.

But when SQL Server is like, I think this is a many to many merge, but then there’s not actually a many to many. The work table doesn’t actually see anything. Now, let’s contrast this situation with database compatibility level 160, where we get the batch mode on rowstore feature to kick in.

I’m going to run this one, right? We’re going to take a look at how long this takes. And this, I mean, like we still have to display a million rows, so it may not feel quite as snappy as you’d hope, but it still runs very quickly.

We’re at just under a second in total for this. So we improved from the non-batch mode version by about, I don’t know, a couple hundred milliseconds or something. That’s not really the thing that is my big deal for this query.

The bigger deal for these queries is that the range unbounded proceeding version does not suffer and take twice as long when things run in batch mode. Now, this one takes just about the same amount of time as the previous one. So to talk a little bit about why, the main difference between these, and what I want to do is just get the estimated plan for this.

And if you look at what happens in here, we have the clustered index scan, we have a repartition streams, we have a sort, we have a segment and another segment, and then a window spool. The window spool over there, let’s clear these out so it’s just one thing in the middle that we care about. The window spool is where that work table occurs.

When this is on disk with range and using like a row mode execution, then this is a bad time, right? This is where that extra second in that work table lives. When we run these in batch mode, right?

And if I run these together, we’re going to get a different query plan. It’s a lot more compact and it has different operators in it. So if we run both of these, they’re both going to take around 800 milliseconds to run. Again, there’s like a million rows that come out.

So things are, you know, not quite perfect. But if you look at these, we still have the clustered index scan, and then we have a sort to put data in order. But then we have this window aggregate.

And this window aggregate doesn’t really have like the on disk thing for it when you, like, depending on if you use the default, which is range, or if you like go out of your way to specify rows. And if we look at the messages tab, and we look at like the work table for both of these, we’re going to see a scan count of zero in here and a logical reads of zero from the work table in here. So neither one of these did any of that like on disk window spool work, because in the query plan, we have the window aggregate rather than the window spool operator.

So this is like, you know, this is like a sign that your window function query ran in batch mode. You see the window aggregate and a lack of like the usual, like, you know, like, you know, segment sequence project, stuff like that. And then the window spool operator, which is where that hidden work table occurred in the other queries.

So if you are using, you know, a SKU of SQL Server and a database compatibility level where you can get batch mode, or if you are allowed to play some tricks on your queries, which I’ve explained in many other videos about how to get batch mode to happen for your queries without that, then you have a little bit less to fear with window functions if you need the range behavior, because the range behavior is slightly different from the rows behavior. So if you’re getting batch mode, you can be slightly less afraid of your queries, your window function queries that may require the range behavior over the row behavior.

If you are using, if you are getting batch mode execution, because you no longer have to deal with that pesky on disk window spool. So some, some, some light at the end of the tunnel out there for the, the big data analysis folks. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video where we’ll talk about something else interesting. I promise. All right. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.