Considerations For Paging Queries In SQL Server With Batch Mode (Don’t Use OFFSET/FETCH)

First Things First

The first SQL Server blog posts that I ever read while trying to solve a specific problem here these two:

They sort of changed my life a little, despite the author’s aversion to the letter Z. So that’s cool. Can’t have everything.

To this day, though, I see people screw up paging queries in numerous ways.

Selecting all the columns in one go
Adding in joins when exists will do
Sticking a DISTINCT on there just because
Thinking a view will solve some problem
Piles and piles of UDFs
Local variables for TOP or OFFSET/FETCH
Not paying attention to indexing

It’s sort of like every other query I see, except with additional complications.

Especially cute for a query slathered in NOLOCK hints is the oft-accompanying concern that “data might change and people might see something weird when they query for the next page”.

Okay, pal. Now you’re concerned.

Modern Love

A while back I recorded a video about using nonclustered column store indexes to improve the performance of paging queries:

While a lot of the details in there are still true, I want to talk about something slightly different today. While nonclustered column store indexes make great data sources for queries with unpredictable search predicates, they’re not strictly necessary to get batch mode anymore.

With SQL Server 2019, you can get batch mode on row store indexes, as long as you’re on Enterprise Edition, and in compatibility level 150.

Deal with it.

The thing is, how you structure your paging queries can definitely hurt your chances of getting that optimization.

Saddened Face

The bummer here is that the paging technique that I learned from Paul’s articles (linked above) doesn’t seem to qualify for batch mode on row store without a column store index in place, so they don’t make the demo cut here.

The good news is that if you’re going to approach this with any degree of hope for performance, you’re gonna be using a column store index anyway.

The two methods we’re going to look at are OFFSET/FETCH and a more traditional ROW_NUMBER query.

As you may have picked up from the title, one will turn out better, and it’s not the OFFSET/FETCH variety. Especially as you get larger, or go deeper into results, it becomes a real boat anchor.

Anyway, let’s examine, as they say in France.

Barfset Wretch

This is the best way of writing this query that I can come up with.

DECLARE 
    @page_number int = 1,
    @page_size int = 1000;

WITH 
    paging AS
(
    SELECT 
        p.Id
    FROM dbo.Posts AS p
    ORDER BY 
        p.LastActivityDate, 
        p.Id 
    OFFSET ((@page_number - 1) * @page_size) 
    ROW FETCH NEXT (@page_size) ROWS ONLY
)
SELECT 
    p.*
FROM paging AS pg
JOIN dbo.Posts AS p
    ON pg.id = p.Id
ORDER BY 
    p.LastActivityDate,
    p.Id
OPTION (RECOMPILE);

Note that the local variables don’t come into play so much here because of the recompile hint.

Still, just to grab 1000 rows, this query takes just about 4 seconds.

SQL Server Query Plan — what took you so long?

This is not so good.

Examine!

Hero Number

The better-performing query here with the batch mode on row store enhancement(!) is using a single filtered ROW_NUMBER to grab the rows we care about.

DECLARE 
    @page_number int = 1,
    @page_size int = 1000;

WITH 
    fetching AS
(
    SELECT 
        p.Id, 
        n = 
            ROW_NUMBER() OVER 
            ( 
                ORDER BY
                    p.LastActivityDate, 
                    p.Id 
            )
    FROM dbo.Posts AS p
)
SELECT 
    p.*
FROM fetching AS f
JOIN dbo.Posts AS p
    ON f.Id = p.Id
WHERE f.n > ((@page_number - 1) * @page_size)
AND   f.n < ((@page_number * @page_size) + 1)
ORDER BY 
    p.LastActivityDate,
    p.Id
OPTION (RECOMPILE);

Again, this is about the best I can write the query. Maybe you have a better way. Maybe you don’t.

Mine takes a shade under 2 seconds. Twice as fast. Examine!

I’ll take twice as fast any day of the week.

Compare/Contrast

The OFFSET/FETCH query plan is all in row mode, while the ROW_NUMBER query has batch mode elements.

You can see this by eyeballing the plan: it has a window aggregate operator, and an adaptive join. There are other batch mode operators here, but none have visual cues in the graphical elements of the plan.

This is part of what makes things faster, of course. The differences can be even more profound when you add in the “real life” stuff that paging queries usually require. Filtering, joining, other sorting elements, etc.

Anyway, the point here is that how you write your paging queries from the start can make a big difference in how they end up, performance-wise.

Newer versions of SQL Server where certain behaviors are locked behind heuristics (absent column store indexes being present in some manner) can be especially fickle.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

3 thoughts on “Considerations For Paging Queries In SQL Server With Batch Mode (Don’t Use OFFSET/FETCH)”

Pingback: Another Knock against OFFSET/FETCH – Curated SQL

Have a go with:

WITH OnePage AS
(
    SELECT
        Pages.*
    FROM 
    (
        SELECT
            P.Id, 
            P.LastActivityDate
        FROM dbo.Posts AS P
        ORDER BY 
            P.LastActivityDate, 
            P.Id
        OFFSET 0 ROWS
        FETCH NEXT @page_size * @page_number ROWS ONLY
    ) AS Pages
    ORDER BY 
        Pages.LastActivityDate DESC, 
        Pages.Id DESC
    OFFSET 0 ROWS
    FETCH NEXT @page_size ROWS ONLY
)
SELECT 
    P.* 
FROM OnePage AS OP
JOIN dbo.Posts AS P
    ON P.Id = OP.Id
ORDER BY 
    OP.LastActivityDate, 
    OP.Id
OPTION (RECOMPILE, MAXDOP 8);

Erik Darling says:

July 12, 2022 at 11:54 am

Not sure I trust that code formatting, but I’ll try it out.

Comments are closed.