My Name Is
I speak with a lot of DBAs and developers who have either heard nothing about column store and batch mode, or they’ve only heard the bare minimum and aren’t sure where it can help them.
Here’s a short list of reasons I usually talk through with people.
Your Reporting Queries Hit Big Tables
The bigger your tables get, the more likely you are to benefit, especially if the queries are unpredictable in nature. If you let people write their own, or design their own reports, nonclustered column store can be a good replacement for nonclustered row store indexes that were created specifically for reporting queries.
In row store indexes, index key column order matters quite a bit. That’s not so much the case with column store. That makes them an ideal data source for queries, since they can scan and select from column independently.
Your Predicates Aren’t Always Very Selective
Picture the opposite of OLTP. Picture queries that collect large quantities of data and (usually) aggregate it down. Those are the ones that get along well with column store indexes and batch mode.
If most of your queries grab and pass around a couple thousand rows, you’re not likely to see a lot of benefit, here. You wanna target the ones with the big arrows in query plans.
Your Main Waits Are I/O And CPU
If you have a bunch of waits on blocking or something, this isn’t going to be your solve.
When your main waits are CPU, it could indicate that queries are overall CPU-bound. Batch mode is useful here, because for those “big” queries, you’re passing millions of rows around and making SQL Server send each one to CPU registers. Under batch mode, you can send up to 900 at a time. Just not in Standard Edition.
When your main waits are on I/O — reading pages from disk specifically — column store can be useful because of the compression they offer. It’s easy to visualize reading more compact structures being faster, especially when you throw in segment and column elimination.
Your Query Plans Have Some Bad Choices In Them
SQL Server 2019 (Enterprise Edition) introduced Batch Mode On Row Store, which let the optimizer heuristically select queries for Batch Mode execution. With that, you get some cool unlocks that you used to have to trick the optimizer into before 2019, like adaptive joins, memory grant feedback, etc.
While those things don’t solve every single performance issue, they can certainly help by letting SQL Server be a little more flexible with plan choices and attributes.
The Optimizer Keeps Choosing Cockamamie Parallel Sort Merge Plans That Make No Sense And Force You To Use Hash Join Hints All The Time
🤦‍♂️
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.