Query Tuning SQL Server 2019 Part 1: Changing Databases

Teeth To Grit


I’ve always had trouble standing still on SQL Server versions, but most companies don’t. Hardly anyone I talk to is on SQL Server 2017, though these days SQL Server 2016 seems more common than SQL Server 2012, so at least there’s that. Mostly I’m happy to not see SQL Server 2014. God I hate SQL Server 2014.

Despite the lack of adoption, I’ve been moving all my training material to SQL Server 2019. Heck, in a few years, my old posts might come in handy for you.

But during that process, I kept running into the same problem: The demos generally still worked for the OLTP-ish queries, but for the report-ish queries Batch Mode On Rowstore (BMOR, from here) was kicking butt (most of the time anyway, we’re gonna look at some hi-jinks this week).

The problem, so far as I could tell, was that the Stack Overflow 2013 database just wasn’t enough database for SQL Server 2019 (at least with my hardware). My laptop is quad core (8 with HT) @2.9GHz, with 64GB of RAM, and max server memory set to 50GB. The SO2013 database is… just about 50GB.
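
If you want to run the same napkin math on your own setup, a quick sketch like this will do it, comparing data file size to the max server memory setting (swap in whatever your copy of the database is called):

/* Total size of the data files for one database, in GB */
SELECT SUM(mf.size) * 8. / 1024. / 1024. AS database_size_gb
FROM sys.master_files AS mf
WHERE mf.database_id = DB_ID('StackOverflow2013')
AND   mf.type_desc = 'ROWS';

/* What max server memory is currently set to, in MB */
SELECT c.name, c.value_in_use
FROM sys.configurations AS c
WHERE c.name = 'max server memory (MB)';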

While it’s fun to be able to create performance problems even with the whole database in memory, it doesn’t match what a lot of people are dealing with in real life.

Especially you poor saps on Standard Edition.

My options seemed to be:

  • Drop max server memory down
  • Use a VM with lower memory
  • Use the full size Stack Overflow database

Flipping and Flopping


Each of these has problems, though.

Dropping max server memory down is okay for the buffer pool, but SQL Server (especially, it seems, with columnstore and batch mode) is keen to use memory above that for other things, like memory grants.

A lot of the interesting struggle I see on client servers between the buffer pool and query memory grants didn’t happen when I did that.
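
If you do want to play with that option anyway, it’s a quick sp_configure change. The 16384 here is just an example number, in MB:

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sys.sp_configure 'max server memory (MB)', 16384;
RECONFIGURE;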

Using a VM with lower memory, while convenient, just didn’t seem as fun. Plus, part of the problem is that, while I make fun of other sample databases for being unrealistically tiny, at least some of them have relatively modern dates.

I was starting to feel really goofy having time stop on December 31st, 2013.

I suppose I could have updated all the CreationDate columns to modernize things, but who knows what that would have thrown off.

Plus, here’s a dirty little secret: all the date columns that start with “Last” that track stuff like when someone last logged in, or when a post was last active/edited, they don’t stop at 2013-12-31. They extend up to when the database was originally chopped down to size, in 2017 or so. I always found that a bit jarring, and I’d have to go and add time to them, too, to preserve the gaps.
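
To put a number on the chore, modernizing would have meant something along these lines for every table and every date column, keeping the gap between CreationDate and the “Last” columns intact. The seven year bump is just for illustration:

/* Hypothetical sketch: shift the Users table forward seven years */
UPDATE u
SET u.CreationDate   = DATEADD(YEAR, 7, u.CreationDate),
    u.LastAccessDate = DATEADD(YEAR, 7, u.LastAccessDate)
FROM dbo.Users AS u;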

It all starts to feel a bit like revisionist history.

The End Is Thigh


In the end, I settled on using the most recent version available here, but with a couple of the tables I don’t regularly use in demos cut out: PostHistory and PostLinks. Once you drop those, a 360GB database shrinks down to a much more manageable 150GB or so.
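
If you’d rather do the trimming on your own copy of the full database, it’s nothing fancier than this. The shrink at the end is optional, and the database name is a stand-in for whatever yours is called:

DROP TABLE IF EXISTS dbo.PostHistory;
DROP TABLE IF EXISTS dbo.PostLinks;

/* Optional, if you want the files to give the space back */
DBCC SHRINKDATABASE (StackOverflow);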

If you’d like to get a copy, here’s the magnet link.

[Image: SQL Server database properties, captioned "Four users, huh?"]

The nice thing is that the general cadence of the data is the same in many ways and places, so it doesn’t take a lot to adjust demos to work here. Certain Post and Vote Types, User Ids, Reputations, etc. remain skewed, and outliers are easy to find. Plus, at 3:1 data to memory, it’s a lot harder to keep everything safely in the buffer pool.
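
If you want to eyeball that skew for yourself, a quick GROUP BY tells the story. The same idea works for VoteTypeId, OwnerUserId, or Reputation:

SELECT p.PostTypeId, COUNT_BIG(*) AS records
FROM dbo.Posts AS p
GROUP BY p.PostTypeId
ORDER BY records DESC;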

This does present different challenges, like how long it takes to create indexes when setting up for demos, getting the database distributed, and so on.

But if I can give you better demos, that seems worth it.

Plus, I hear everything is in the cloud now anyway.

Alluding To


In the process of taking old demos and seeing how they work with the new database, I discovered some interesting stuff that I want to highlight a little bit. So far as I can tell, they’re not terribly common (yet), but that’s what makes them interesting.

If you’re the kind of person who’s looking forward to SQL Server 2019’s performance features solving some problems for you auto-magick-ally, these may be things you need to watch out for, and depending on your workload they may end up being quite a bit more common than I perceive.

I’m going to be specifically focusing on how BMOR (and to some extent Adaptive Joins) can end up not solving performance issues, and how you may end up having to do some good ol’ fashion query tuning on your own.

In the next post, we’ll look at how one of my favorite demos continues to keep on giving.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server 2019: A Heuristic Evening With The Optimizer

The Land Of Do-Overs


Of the new things in SQL Server 2019 that I plan on presenting about, the Batch Mode on Row Store (BMOR, from here) enhancements are probably the most interesting from a query tuning point of view.

Things like Accelerated Database Recovery, Optimize For Sequential Key, and In-Memory Tempdb Metadata are cool, but they’re server tuning. I love ’em, but they’re more helpful for tuning an entire workload than a specific query.

The thing with BMOR is that it’s not just one thing. Getting Batch Mode also allows Adaptive Joins and Memory Grant Feedback to kick in.

But they’re all separate heuristics.

Getting Batch Mode


For Batch Mode to kick in on a Row Store query, the query has to pass a certain set of heuristics, which can be viewed in Extended Events.

SELECT dxoc.object_name, dxoc.name, dxoc.description
FROM sys.dm_xe_object_columns AS dxoc
JOIN sys.dm_xe_objects AS dxo
    ON  dxo.package_guid = dxoc.object_package_guid
    AND dxo.name = dxoc.object_name
WHERE dxo.name = 'batch_mode_heuristics' AND dxoc.column_type = 'data'
ORDER BY dxoc.name;

[Image: SQL Server Extended Events]

Once we’ve got Batch Mode, we can use the other stuff. But they have their own jim-jams.

Getting Adaptive Joins


To get Adaptive Joins, you need to pass these heuristics.

SELECT dxmv.name,
       dxmv.map_value,
       dxo.description
FROM sys.dm_xe_map_values AS dxmv
    JOIN sys.dm_xe_objects AS dxo
        ON dxmv.object_package_guid = dxo.package_guid
           AND dxmv.name = dxo.name
WHERE dxmv.name = 'adaptive_join_skipped_reason';

[Image: SQL Server Extended Events, captioned "Try refreshing"]

No, those aren’t plain English, but you can decode most of them. They mostly deal with index matching, and with whether the cardinality estimates make an Adaptive Join worth considering.

Getting Memory Grant Feedback


There isn’t a lot in Extended Events to tell you when this will happen, but it is documented. And written about.

For excessive grants, if the granted memory is more than two times the size of the actual used memory, memory grant feedback will recalculate the memory grant and update the cached plan.  Plans with memory grants under 1MB will not be recalculated for overages.

For insufficiently sized memory grants that result in a spill to disk for batch mode operators, memory grant feedback will trigger a recalculation of the memory grant. Spill events are reported to memory grant feedback and can be surfaced via the spilling_report_to_memory_grant_feedback XEvent event. This event returns the node id from the plan and spilled data size of that node.
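
If you want a rough idea of which cached plans land on the wrong side of that two times rule, the grant columns in sys.dm_exec_query_stats get you close. A sketch:

SELECT TOP (10)
       deqs.last_grant_kb,
       deqs.last_used_grant_kb,
       deqs.execution_count,
       dest.text
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE deqs.last_grant_kb > 2 * deqs.last_used_grant_kb
AND   deqs.last_grant_kb > 1024 /* the 1MB floor from the docs */
ORDER BY deqs.last_grant_kb DESC;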

We can still see some stuff, though.

SELECT dxoc.object_name,
       dxoc.name,
       dxoc.description
FROM sys.dm_xe_object_columns AS dxoc
JOIN sys.dm_xe_objects AS dxo
ON dxo.package_guid = dxoc.object_package_guid
AND dxo.name = dxoc.object_name
WHERE dxo.name = 'memory_grant_feedback_loop_disabled'
AND dxoc.column_type = 'data'

UNION ALL 

SELECT dxoc.object_name,
       dxoc.name,
       dxoc.description
FROM sys.dm_xe_object_columns AS dxoc
JOIN sys.dm_xe_objects AS dxo
ON dxo.package_guid = dxoc.object_package_guid
AND dxo.name = dxoc.object_name
WHERE dxo.name = 'memory_grant_updated_by_feedback'
AND dxoc.column_type = 'data'
ORDER BY dxoc.name;

[Image: SQL Server Extended Events, captioned "Leeches"]

Getting All Three


In SQL Server 2019, you may see plans with Batch Mode operators happening for Row Store indexes, but you may not get an Adaptive Join, or Memory Grant Feedback. If you have a lot of single-use plans, you’ll never see them getting Memory Grant Feedback (I mean, they might, but it won’t matter because there won’t be a second execution, ha ha ha).

It’s important to remember that this isn’t all just one feature, but a family of them for improving query performance for specific scenarios.

On Enterprise Edition.

In compat level 150.
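
If you need to double check where a database stands on those last two, it’s about this much T-SQL. StackOverflow below is a stand-in for whatever your database is called:

SELECT SERVERPROPERTY('Edition') AS edition;

SELECT d.name, d.compatibility_level
FROM sys.databases AS d;

ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 150;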

Say, where’d I put that scotch…

Thanks for reading!


Using Column Store Indexes To Improve Unpredictable User Search Queries

And Cough



SQL Server’s Batch Mode On Row Store Is Fickle

Thanks Though


I’m excited about this feature. I’m not being negative, here. I just want you, dear reader, to have reasonable expectations about it.

This isn’t a post about it making a query slower, but I do have some demos of that happening. I want to show you an example of it not kicking in when it probably should. I’m going to use an Extended Events session that I first read about on Dmitry Pilugin’s blog here. It’ll look something like this.

CREATE EVENT SESSION heristix
    ON SERVER
    ADD EVENT sqlserver.batch_mode_heuristics
        ( ACTION( sqlserver.sql_text ))
    ADD TARGET package0.event_file
        ( SET filename = N'c:\temp\heristix' )
    WITH
        ( MAX_MEMORY = 4096KB,
          EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
          MAX_DISPATCH_LATENCY = 1 SECONDS,
          MAX_EVENT_SIZE = 0KB,
          MEMORY_PARTITION_MODE = NONE,
          TRACK_CAUSALITY = OFF,
          STARTUP_STATE = ON );
GO
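
One note on that session: STARTUP_STATE = ON only means it comes back after a restart. To get it collecting right away, you still have to start it:

ALTER EVENT SESSION heristix ON SERVER STATE = START;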

The Setup


Let’s start with some familiar indexes and a familiar query from other posts the last couple weeks.

CREATE INDEX something ON dbo.Posts(PostTypeId, Id, CreationDate);
CREATE INDEX stuffy ON dbo.Comments(PostId, Score, CreationDate);
SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0;

The query plan is unimportant. It just doesn’t use any Batch Mode, and takes right about 2 seconds.

[Image: SQL Server query plan, captioned "Blech Mode"]

If we look at the entry for this query in our XE session, we can see that the optimizer considered the heck out of Batch Mode, but decided against it.

[Image: SQL Server Extended Events, captioned "All The Heck"]
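
If you’d rather query the file target than click around the data viewer, something like this will pull the raw event XML back out. The path matches the session definition above:

SELECT CAST(fx.event_data AS XML) AS event_xml
FROM sys.fn_xe_file_target_read_file('c:\temp\heristix*.xel', NULL, NULL, NULL) AS fx;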

Curiouser


If we add a hash join hint to the query, it finishes in about 800ms.

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0
OPTION(HASH JOIN);

[Image: SQL Server query plan, captioned "Every Time"]

All the operators in this plan except Gather Streams are run in Batch Mode. Clearly it was helpful.

[Image: SQL Server Extended Events, captioned "See Ghosts"]

And according to the XE session, we can see that decision in writing. Yay.

Alt Roq


If we modify our indexes slightly, we can get an Adaptive Join plan.

CREATE INDEX something_alt ON dbo.Posts(PostTypeId, CreationDate, Id);
CREATE INDEX stuffy_alt ON dbo.Comments(Score, CreationDate, PostId);

And, yes, this is about twice as fast now (compared to the last Batch Mode query), mostly because of the better indexing.

[Image: SQL Server query plan, captioned "Montage"]

Is There A Moral So Far?


Yes, don’t count on Batch Mode to kick in for every query where it would be helpful.

If you want queries to consistently use Batch Mode, you’ll need to do something like this.

SELECT COUNT(*)
FROM   dbo.Posts AS p
JOIN   dbo.Comments AS c
    ON  p.Id = c.PostId
LEFT JOIN dbo.t ON 1 = 0
WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1
AND p.PostTypeId = 1
AND c.Score > 0;
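
For that trick to work, dbo.t has to exist and has to have a columnstore index on it. An empty dummy table is enough, something like this minimal sketch. The join on 1 = 0 guarantees it never contributes any rows:

CREATE TABLE dbo.t
(
    id INT NULL,
    INDEX c CLUSTERED COLUMNSTORE
);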

But you have to be careful there too.

[Image: SQL Server query plan, captioned "Mad Mad Mad Mad"]

You might lose your nice parallel plan and end up with a slower query.

Huh.

Thanks for reading!


In SQL Server 2019 Memory Grants Are Higher With Batch Mode For Row Store Queries

Because I Got



The Fastest Ways To Get The Highest Value In SQL Server Part 3

Silent and Grey


In yesterday’s post, we looked at plans with a good index. The row number queries were unfortunate, but the MAX and TOP 1 queries did really well.

Today, I wanna see what the future holds. I’m gonna test stuff out on SQL Server 2019 CTP 3.1 and see how things go.

I’m only going to hit the interesting points. If plans don’t change, I’m not gonna go into them.

Query #1

With no indexes, this query positively RIPS. Batch Mode For Row Store kicks in, and this finishes in 2 seconds.

SELECT u.Id,
       u.DisplayName,
       u.Reputation,
       ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT MAX(Score) AS Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

[Image: SQL Server query plan, captioned "So many illustrations"]

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

Query #2


Our TOP 1 query should be BOTTOM 1 here. It goes back to its index spooling ways, and runs for a minute.

SELECT u.Id,
       u.DisplayName,
       u.Reputation,
       ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT TOP (1) p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
    ORDER BY p.Score DESC
) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Query #3

This is our first attempt at row number. It’s particularly disappointing when we see the next query plan.

SELECT u.Id,
       u.DisplayName,
       u.Reputation,
       ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT p.Score,
           ROW_NUMBER() OVER (ORDER BY p.Score DESC) AS n
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
) AS ca
WHERE u.Reputation >= 100000
AND ca.n = 1
ORDER BY u.Id;

On its own, it’s just regular disappointing.

[Image: SQL Server query plan, captioned "Forget Me"]

Serial. Spool. 57 seconds.

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Myself.

Query #4

What makes this plan cool, and what makes the previous plans look so disappointing, is that we get a Batch Mode Window Aggregate.

SELECT u.Id,
       u.DisplayName,
       u.Reputation,
       ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT *
    FROM
    (
        SELECT p.OwnerUserId,
               p.Score,
               ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId
                                  ORDER BY p.Score DESC) AS n
        FROM dbo.Posts AS p
    ) AS p
    WHERE p.OwnerUserId = u.Id
    AND p.n = 1
) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

[Image: SQL Server query plans, captioned "Guillotine"]

It finishes in 1.7 seconds. This is nice. Good job, 2019.

With the index we get a serial Batch Mode plan, which finishes in about 1.4 seconds.

[Image: SQL Server query plans, captioned "Confused."]

If you’re confused about where the 1.4 seconds comes from, watch this video.

Why Aren’t You Out Yet?


SQL Server 2019 did some interesting things, here.

In some cases, it made fast queries faster.

In other cases, queries stayed… exactly the same.

When Batch Mode kicks in, you may find queries like this speeding up. But when it doesn’t, you may find yourself having to do some good ol’ fashion query and index tuning.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Myself.

Myself.


SQL Server 2019: If You Want Adaptive Joins, You Need Wider Indexes

What You Make Of It


We’ve got this query. Handsome devil of a query.

You can pretend it’s in a stored procedure, and that the date filter is a parameter if you want.

SELECT u.DisplayName, p.Score
FROM dbo.Users AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
WHERE u.CreationDate >= '20131201';

A long time ago, when we migrated Stack Overflow from Access to SQL Server 2000, we created indexes.

This one has worked alright.

CREATE INDEX ix_whatever 
    ON dbo.Posts(OwnerUserId);

But Now We’re On 2019


And we’ve, like, read a lot about Adaptive Joins, and we think this’ll be cool to see in action.

Unfortunately, our query doesn’t seem to qualify.

[Image: SQL Server query plan, captioned "Shame shame shame"]

Now, there’s an Extended Event that… Used to work.

These days it just stares blankly at me. But since I’ve worked with this before, I know the problem.

It’s that Key Lookup — I’ll explain more in a minute.

Index Upgrade


First, let’s get rid of the Lookup so we can see the Adaptive Join happen.

CREATE INDEX ix_adaptathy 
    ON dbo.Posts(OwnerUserId, Score);

[Image: SQL Server query plan, captioned "New new new"]

As We Proceed


Let’s think about what Adaptive Joins need:

  • An index on the column(s) you’re joining

This gives us a realistic choice between using a Nested Loops join to do efficient Seeks, or an easy scan for a Hash Join.

  • That index has to cover the query

Without a covering index, there’s too much for the optimizer to think about.

It’s not just making a choice between a Nested Loops or Hash Join, it’s also factoring in the cost of a Lookup.

This used to show up in that Extended Event as eajsrUnMatchedOuter, meaning the outer table didn’t have an index that matched the query.
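
If you want to go spelunking for plans that did get one, the operator shows up in plan XML with its own physical op name, so a rough (and not exactly cheap) plan cache query like this can dig them up:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (10)
       dest.text,
       deqp.query_plan
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE deqp.query_plan.exist('//RelOp[@PhysicalOp="Adaptive Join"]') = 1;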

Why Revisit This?


When SQL Server 2019 comes out, people are gonna have really high hopes for their workloads automagickally getting faster.

While there are lots of things that it’ll likely help, it’s going to take a lot of work on your part to make sure your queries and indexes allow for the automagick to kick in.

Thanks for reading!


Operator Time Oddities In SQL Server Query Plans

SILENCE YOUR CELL PHONES



Thanks for watching!


How Bad Cardinality Estimates Lead To Bad Query Plan Choices: Does SQL Server 2019 Batch Mode On Rowstore Help?

Bad Fit


In the last post, I tried to play some tricks on the optimizer to have Batch Mode make a query faster.

It didn’t quite work out the way I wanted.

That isn’t to knock Batch Mode. It’s not supposed to fix everything, and I played a silly trick to get some operators to use it.

But it did make me curious if 2019’s Batch Mode for Row Store would improve anything.

After all, no dirty tricks are involved, which means more operators can potentially use Batch Mode, and maybe we’ll avoid that whole skewed parallelism problem.

I mean, what are the odds that I’ll get the exact same plans and everything?

Spoiler


[Image: SQL Server query plan, captioned "50%"]

The Merge Join plan is still wack as hell.

The forced hash join plan got better, though.

Still Skewy?


The skewed parallelism isn’t as contagious going across without the separate Filter/Bitmap operator.

[Image: SQL Server query plan, captioned "Uggo"]

With both the Hash Join and the Scan of Votes occurring in Batch Mode, the Bitmap is able to be pushed to the storage engine.

Note that there’s no explicit Bitmap filter anywhere in the plan, but one still shows up in the predicate of the Scan of the Votes table.

[Image: SQL Server query plan, captioned "Something new"]

With 3 out of 4 threads doing… vaguely even work, we end up in a better place, especially compared to only one thread doing any work.

At least now I can show you the Batch Mode Sort lying about being parallel.

[Image: SQL Server query plan, captioned "Crud"]

Three threads with rows at the Hash Join go into the Sort on a single thread.

You Know…


As much as I love these new bells and whistles, I’m gonna go back to good ol’ fashion query tuning next.

The problem here is still that a bad estimate causes a Merge Join to be chosen in the first place.

When we force the Hash Join, query runtime is reduced at the expense of a quite large memory grant.

This is likely preferable for most people (especially with Batch Mode Memory Grant Feedback), but it doesn’t address the root cause.

Thanks for reading!
