Thanks Though
I’m excited about this feature. I’m not being negative, here. I just want you, dear reader, to have reasonable expectations about it.
This isn’t a post about it making a query slower, but I do have some demos of that happening. I want to show you an example of it not kicking in when it probably should. I’m going to use an Extended Events session that I first read about on Dmitry Pilugin’s blog here. It’ll look something like this.
CREATE EVENT SESSION heristix ON SERVER ADD EVENT sqlserver.batch_mode_heuristics ( ACTION( sqlserver.sql_text )) ADD TARGET package0.event_file ( SET filename = N'c:\temp\heristix' ) WITH ( MAX_MEMORY = 4096KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 1 SECONDS, MAX_EVENT_SIZE = 0KB, MEMORY_PARTITION_MODE = NONE, TRACK_CAUSALITY = OFF, STARTUP_STATE = ON ); GO
The Setup
Let’s start with some familiar indexes and a familiar query from other posts the last couple weeks.
CREATE INDEX something ON dbo.Posts(PostTypeId, Id, CreationDate); CREATE INDEX stuffy ON dbo.Comments(PostId, Score, CreationDate);
SELECT COUNT(*) FROM dbo.Posts AS p JOIN dbo.Comments AS c ON p.Id = c.PostId WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1 AND p.PostTypeId = 1 AND c.Score > 0;
The query plan is unimportant. It just doesn’t use any Batch Mode, and takes right about 2 seconds.
If we look at the entry for this query in our XE session, we can see that the optimizer considered the heck out of Batch Mode, but decided against it.
Curiouser
If we add a hash join hint to the query, it finishes in about 800ms.
SELECT COUNT(*) FROM dbo.Posts AS p JOIN dbo.Comments AS c ON p.Id = c.PostId WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1 AND p.PostTypeId = 1 AND c.Score > 0 OPTION(HASH JOIN);
All the operators in this plan except Gather Streams are run in Batch Mode. Clearly it was helpful.
And according to the XE session, we can see that decision in writing. Yay.
Alt Roq
If we modify our indexes slightly, we can get an Adaptive Join plan.
CREATE INDEX something_alt ON dbo.Posts(PostTypeId, CreationDate, Id); CREATE INDEX stuffy_alt ON dbo.Comments(Score, CreationDate, PostId);
And, yes, this is about twice as fast now (compared to the last Batch Mode query), mostly because of the better indexing.
Is There A Moral So Far?
Yes, don’t count on Batch Mode to kick in for every query where it would be helpful.
If you want queries to consistently use Batch Mode, you’ll need to do something like this.
SELECT COUNT(*) FROM dbo.Posts AS p JOIN dbo.Comments AS c ON p.Id = c.PostId LEFT JOIN dbo.t ON 1 = 0 WHERE DATEDIFF(YEAR, p.CreationDate, c.CreationDate) > 1 AND p.PostTypeId = 1 AND c.Score > 0;
But you have to be careful there too.
You might lose your nice parallel plan and end up with a slower query.
Huh.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
Related Posts
- Getting The Top Value Per Group With Multiple Conditions In SQL Server: Row Number vs. Cross Apply With MAX
- Getting The Top Value Per Group In SQL Server: Row Number vs. Cross Apply Performance
- Signs You Need Batch Mode To Make Your SQL Server Queries Faster
- Multiple Distinct Aggregates: Still Harm Performance Without Batch Mode In SQL Server
One thought on “SQL Server’s Batch Mode On Row Store Is Fickle”
Comments are closed.