SQL Server Performance Office Hours Episode 52
To ask your questions, head over here.
Summary
In this video, I answer a series of viewer-submitted office hours questions, covering topics ranging from lowering MAXDOP to reduce CXPACKET waits, to why batch mode can slow down OLTP queries even while it makes analytical queries faster, to why heavy query tuning doesn’t automatically translate into cloud savings. I also dig into why statistics updated with a full scan can still produce poor estimates, and the architectural trade-offs to expect when adding columnstore indexes to hot OLTP tables. If you have any questions or topics you’d like to see covered in future office hours sessions, feel free to ask them through the link provided in the video description.
Chapters
- *00:00:00* – Introduction
- *00:01:00* – Office Hours Overview
- *00:02:02* – Upcoming Events
- *00:03:23* – Answering Questions
- *00:04:16* – Query Performance Discussion
- *00:08:12* – Cloud Savings Explanation
- *00:10:19* – Statistics and Estimates
- *00:12:01* – Columnstore Trade-offs
- *00:15:15* – In-Memory OLTP Usage
Full Transcript
Erik Darling here with Darling Data, and you may have guessed, based on various temporal elements, that it is Monday, and Monday means it is time for office hours, where I answer five, as long as I count correctly, user-submitted questions to my highly advanced Google Sheets document, and do my best to give good, thoughtful answers to those. You can find the link to ask those questions down in the video description. While you’re on your way to that link, perhaps, to ask a question, you will see all sorts of other mutually beneficial links down there, in which you can hire me for consulting, which is a great idea. You can buy my training, which is, likewise, a fantastic idea. You can become a supporting member of the channel. For as few as $4 a month, you can say, good job, Erik. Thank you for spending hundreds of hours a month recording YouTube videos and coming up with content for us. And then, of course, you can ask that office hours question for free. Amazing. Is there no limit to my generosity? Yes. No. Not sure. And, of course, if you enjoy this content, if you think that someone else in your life may also enjoy this content, please do like, subscribe, and tell a friend, because that’s the nice thing to do. And that is free for all of us. So, I will be out in the world. You know, it is February, currently, at least, I mean, it’s not February when I’m recording this, but it’s going to be February by the time you see this. So, I don’t know, you can maybe guess when I’m recording this based on that, but I’m not leaving the house for a bit. It just snowed like 52 feet here in New York. It looks disgusting already. It is various shades of not white out there in the snow.
But as soon as that snow abates a bit, I will be back out joining the world. Baseball will be in the swing of things. It’ll be a marvelous time for all of us. DataTune, Nashville, March 6th and 7th. Data Saturday, Chicago, March 13th and 14th. SQLDay, Poland, May 11th and 13th. Look, I missed something on Poland there. I gotta fix that. And of course, Data Saturday, Croatia, June 12th and 13th. At all of these events, I will be teaching my advanced T-SQL pre-con. Because I only have roughly 6.5 hours to teach my advanced T-SQL pre-con, and there is much more to say about advanced T-SQL, all attendees will get free access to my learn T-SQL with Erik course, which encompasses both beginner and advanced material and is far, far longer than the 6.5 hours these training days allow.
So, if you want to get a tremendous amount of bang for your buck or whatever currency these other things use, you should come to that and have a great time. I don’t know. Maybe we can both learn something from each other. But with that out of the way, we have flipped the calendar unofficially to February here. Still winter. Sucks.
I asked ChatGPT to come up with a good, like, Shining-y, you know, I’ve-been-stuck-in-the-house-all-winter vibe. And, you know, aside from this one up here, which has, like, some extra fingers, and that one over there that has a backwards axe, I think the face in the doorway over here is particularly good. I like that one. So, I don’t know. I have some time to think about March. I don’t know what March is going to be yet.
We’ll find out when we get there. But, anyway, let’s answer some questions, right? Let’s do that thing that I promised you I would do. We lowered MAXDOP. We! Like, it took more than one of you? How many people does it take to change MAXDOP? I guess there’s a joke in there. How many DBAs does it take to change MAXDOP? Anyway, to reduce CXPACKET waits.
All right. That’s a reason to do it, I guess. And queries now run slower, but look more stable. Is that a net win? Well, of course, your queries run slower. You reduced MAXDOP.
Let’s just theoretically say you reduce MAXDOP from 8 to 4. You now have four fewer cores with which to process your queries. There is some chance that they would get slower by, like, half, right?
So, if your query runs in two seconds at MAXDOP 8, and you run that query at MAXDOP 4, it might run for, like, four seconds instead of two. So, what I’m struggling with a little bit here is when you say “look more stable.” I don’t know what “look more stable” means.
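Just to put the knobs on the table, here is a minimal sketch of the two levers involved, assuming the change was made at the server level; the table in the per-query example is made up.

```sql
/* Server-wide MAXDOP change (presumably what was lowered to reduce CXPACKET waits) */
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'max degree of parallelism', 4;
RECONFIGURE;

/* Per-query override: a query that genuinely benefits from parallelism
   can still be allowed to go wider (dbo.Posts is a hypothetical table) */
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
OPTION (MAXDOP 8);
```

Either lever can be appropriate; the point is just that whichever one you pull, fewer cores per query means longer runtimes for the queries that used to use them.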
Is this like a tortoise versus the hare situation? Is this slow and steady wins the race? I don’t know what more stable means. You’re going to have to give me some more information about the stability you speak of, because I have no insight into what more stable means to you.
Is it a net win? I don’t know. Are slower queries ever a net win? I suppose if you are hitting THREADPOOL waits or, you know, like, very, very high CPU percentages constantly, then perhaps it is a net win.
Perhaps your server is just a bit less stressed out, worn out, running out of resources. I don’t know, but golly and gosh, I don’t know if I can call that a net win, because I don’t know what you’re measuring stability with.
I don’t know what criteria you find these queries to be more stable by. So, yada, yada, yada. Batch mode made our analytical queries much faster, but OLTP queries slowed down.
Why does batch mode sometimes hurt mixed workloads? Well, a lot of it comes down to one thing, and I’m assuming batch mode on rowstore is what you’re talking about here, because, you know, OLTP, like, why else would that be a thing?
The primary thing that I see is that the optimizer decides, perhaps, that some facet of one of your OLTP-ish queries would benefit from batch mode. So, it could be an aggregate, but more commonly, it is a join.
And most commonly, what happens with joins is that SQL Server starts choosing adaptive joins, which aren’t a bad thing on the face of them. I’d rather appreciate SQL Server no longer just YOLOing with a hash join or a loop join, regardless of how many rows actually start showing up.
But what it can do is take what is typically a nice, efficient, usually single-threaded nested loops join plan and throw a little bit of a monkey in that wrench by starting the plan off with a batch mode hash join and switching over to a row mode nested loops join if enough rows are not emitted from the outer part of the join.
So, and that is primarily what I see happening, is that, you know, what used to be just a very quick, simple navigational query all of a sudden starts getting weird. And, you know, I’m not saying that you should be scared of batch mode on rowstore, but there may be certain queries that you might have to intervene with and say, please don’t do that.
Like, the easiest way is just to say OPTION (LOOP JOIN) for those. But typically, that is what I see. All right.
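To make those interventions concrete, here is a rough sketch, assuming a hypothetical pair of dbo.Orders and dbo.OrderLines tables, going from the most targeted option to the most heavy-handed one.

```sql
/* Most targeted: pin a navigational query to a nested loops plan */
SELECT TOP (10) o.OrderId, ol.ProductId
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS ol
  ON ol.OrderId = o.OrderId
WHERE o.CustomerId = 42
ORDER BY o.OrderDate DESC
OPTION (LOOP JOIN);

/* Still per-query: keep the whole plan in row mode (SQL Server 2019+) */
SELECT TOP (10) o.OrderId, ol.ProductId
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS ol
  ON ol.OrderId = o.OrderId
WHERE o.CustomerId = 42
OPTION (USE HINT('DISALLOW_BATCH_MODE'));

/* Heavy-handed: turn batch mode on rowstore off for the whole database,
   which also throws away the analytical-query wins */
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;
```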
We tuned queries heavily. I mean, I wasn’t there, so I don’t believe you. But didn’t save much money in Azure. Well, no kidding.
An overpriced resort, that is. Why doesn’t query tuning translate directly into cloud savings? Well, let’s think about that for a moment here. You tuned queries heavily.
And let’s say in the course of your heavy query tuning, you reduce CPU from like 80% to like 30%. Typically, where one starts to realize cost savings in the cloud is when one reduces the size of their cloud hardware. So if you, let’s say, you know, tuned all your queries and made things better in that way, it would have to coincide with some reduction in cloud resources to see money get saved.
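To make that concrete for Azure SQL Database specifically, here is a minimal sketch; the database name and target tier are placeholders, and for a managed instance or a VM the equivalent step is resizing through the portal, CLI, or PowerShell.

```sql
/* Where the bill actually comes from: the current service objective */
SELECT DB_NAME(database_id) AS database_name,
       edition,
       service_objective
FROM sys.database_service_objectives;

/* Query tuning only turns into savings once something like this happens
   (hypothetical database name and target tier) */
ALTER DATABASE [YourDatabase]
MODIFY (SERVICE_OBJECTIVE = 'S3');
```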
It kind of reminds me of the question where it’s like, I got rid of all this data in my database, but the database file is still huge. What happened? It’s like, well, that database doesn’t shrink itself, right?
That Azure instance doesn’t shrink itself. So, you know, what are we going to do here? But anyway, my rates are reasonable and I’m happy to help you save money in the cloud.
Aren’t I nice? We run full scan stats nightly, but estimates are still awful. Why does perfect sampling still produce terrible plans?
Well, it may be a full scan sample, but let’s consider a couple of things about statistics, shall we? First is that, no matter how big of a scan you use, whether it’s full or not, SQL Server still populates the same 200-step histogram with whatever values it finds during that scan, whether it be sampled or full.
It’s not like you produce bigger histograms just because you do a full scan of the stats. You might already have a 200-step histogram. The full scan stat might describe certain portions of that histogram more accurately, but there might still be a lot of information that is not well captured by it.
If that’s the case, you might consider filtered indexes or statistics in order to give the optimizer more information about portions of your table data that are not well represented by the histogram.
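A minimal sketch of both options, assuming a hypothetical dbo.Orders table where the recent rows are the poorly represented slice:

```sql
/* Filtered statistics: a separate histogram just for the skewed slice of the table */
CREATE STATISTICS s_Orders_Recent
ON dbo.Orders (OrderDate)
WHERE OrderDate >= '20250101'
WITH FULLSCAN;

/* Filtered index: same idea, but queries against that slice can also use it */
CREATE NONCLUSTERED INDEX ix_Orders_Recent
ON dbo.Orders (OrderDate, CustomerId)
WHERE OrderDate >= '20250101';
```

Filtered objects do have some sharp edges with parameterized queries, so test them against your real workload before leaning on them.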
The other thing to consider is that you might be writing your queries in stupid ways. You might be doing all sorts of things that may inhibit SQL Server from accurately guessing cardinality for things.
Local variables, table variables, non-SARGable predicates: those are some things that immediately sort of spike the ball on that. There are all sorts of other things you might be doing. You might just have terribly complex queries, which would benefit from being broken down into smaller chunks.
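For example, sticking with that hypothetical dbo.Orders table, these are the kinds of patterns that keep a perfectly good histogram from getting used:

```sql
/* Local variable: the estimate comes from a fixed guess, not a histogram lookup */
DECLARE @d datetime2(0) = '20250101';
SELECT COUNT_BIG(*) AS records FROM dbo.Orders WHERE OrderDate >= @d;

/* Literal (or a sniffed parameter): the histogram actually gets used */
SELECT COUNT_BIG(*) AS records FROM dbo.Orders WHERE OrderDate >= '20250101';

/* Non-SARGable: wrapping the column in a function hides it from the histogram */
SELECT COUNT_BIG(*) AS records FROM dbo.Orders WHERE YEAR(OrderDate) = 2025;

/* SARGable rewrite of the same predicate */
SELECT COUNT_BIG(*) AS records
FROM dbo.Orders
WHERE OrderDate >= '20250101'
  AND OrderDate <  '20260101';
```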
So you cannot rely on statistics updates for everything. They are one part of the picture. There are many other parts of the picture for you to consider.
All right. I wonder if this is the same person. It sounds pretty similar. All right.
Let’s see what we got here. I don’t know why this square is so much bigger. Columnstore fixed our reporting queries but wrecked inserts. Really? Columnstore wrecked inserts. That’s interesting.
What architectural trade-offs should we expect when adding columnstore indexes to hot OLTP tables? So my initial surprise here is because I typically don’t see columnstore doing too terribly with inserts.
And the reason for that is that, you know, since you’re talking about OLTP, I would assume that you are talking about the OLTP inserts, which are usually a very, very low number of rows.
Generally one, but, you know, I would say maybe up to 100 or something would be reasonable for an OLTP workload before it starts getting into different families of queries, like analytical stuff.
So I’m a little surprised at that because those inserts typically won’t touch the columnstore itself. Inserts that are too small to qualify for instant compression into the columnstore just get inserted into the delta store, which is sort of a clustered B-tree rowstore brain leech on your columnstore.
So I’m a little surprised at that. If you had told me that updates or deletes were having a tougher time because of the columnstore, that I would believe, because updates with a columnstore are pretty much a delete of the old row and an insert of the new row, and then the new row gets inserted into the columnstore, but you still have to, like, decompress the old row and all the other stuff.
So there’s a lot more going on behind the scenes with updates. Even deletes usually aren’t that beat up by it. You just sort of use the deleted bitmap in there and mark the row as a tombstone row or a ghost record or whatever of the one million different things Microsoft has called it.
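If you want to see where those small inserts are actually landing, and how much deleted-bitmap baggage the compressed rowgroups are carrying, this DMV is a reasonable place to start:

```sql
/* Rowgroup-level view of columnstore indexes: OPEN rowgroups are the delta store,
   and deleted_rows counts the tombstoned rows still sitting in compressed rowgroups */
SELECT OBJECT_NAME(rg.object_id) AS table_name,
       rg.index_id,
       rg.row_group_id,
       rg.state_desc,
       rg.total_rows,
       rg.deleted_rows,
       rg.trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
ORDER BY table_name, rg.index_id, rg.row_group_id;
```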
So I’m not sure why your inserts were hurt so badly by that. As far as architectural trade-offs go, well, you know, mostly it’s the stuff that I just talked about.
You get either your reporting-style queries or your kitchen-sink-style queries, where you don’t know what the WHERE clause and the SELECT list and the ORDER BY are going to be.
Nonclustered columnstores are marvelous for getting those types of things sorted out. But as far as architectural trade-offs go, what some people will do, and what I am an occasional proponent of, is to separate your data out a little bit: have your hot data that might be seeing frequent modifications sit in a normal rowstore table.
And then once something is past the point of frequent modification, you might move it to sort of an archival table that has a columnstore index on it.
That’s probably the most common of them. In extreme cases, some people will even use like an in-memory table for the really hot data. It’s sort of like a shock absorber table.
For a few of my clients in the online gambling space and in the high-demand consumer shopping goods space (spaces? I don’t know how many that actually spans when I think about it; I think maybe it’s just all one space where people buy too much crap), the in-memory OLTP thing is very useful.
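As a sketch of that shock absorber idea, with a completely made-up table, and noting that the database needs a memory-optimized filegroup before any of this will run:

```sql
/* Hypothetical in-memory table that absorbs the very hottest writes;
   rows get swept into regular rowstore or columnstore tables later on */
CREATE TABLE dbo.OrdersHot_InMem
(
    OrderId    bigint IDENTITY(1, 1) NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    CustomerId int            NOT NULL,
    OrderDate  datetime2(0)   NOT NULL,
    Amount     decimal(18, 2) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```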
But if you just have a generically hot workload for recent data, like the last week or month or something, you could keep that in a rowstore table.
You could move that off in a big batch to a columnstore table, and see most of the benefit of having columnstore for your more historical data and rowstore, which has less insert, update, and delete overhead, for the hot data.
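Here is a minimal sketch of that split, with hypothetical table names: a columnstore archive table for the historical data, and a batched move that peels cooled-off rows out of the hot rowstore table.

```sql
/* Archive table with a clustered columnstore index for historical data */
CREATE TABLE dbo.OrdersArchive
(
    OrderId    bigint         NOT NULL,
    CustomerId int            NOT NULL,
    OrderDate  datetime2(0)   NOT NULL,
    Amount     decimal(18, 2) NOT NULL,
    INDEX cci_OrdersArchive CLUSTERED COLUMNSTORE
);

/* Batched move: delete a chunk of cooled-off rows from the hot rowstore table
   and land them in the archive in one statement */
DELETE TOP (50000)
FROM dbo.OrdersHot
OUTPUT Deleted.OrderId, Deleted.CustomerId, Deleted.OrderDate, Deleted.Amount
INTO dbo.OrdersArchive (OrderId, CustomerId, OrderDate, Amount)
WHERE OrderDate < DATEADD(MONTH, -1, SYSDATETIME());
```

In practice you’d run that delete in a loop until it stops finding rows, and pick a batch size that keeps locking and log growth tolerable.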
It’s just sort of up to you to figure out how long something is useful for. For a lot of people, it’s not a very long stretch. You place an order for something, and what happens later? You get a ship date, right?
And then you get maybe some tracking notifications, although that’s not even in the table itself. And then you get a delivery date, which probably updates the order or something.
But for the most part, most stuff is done within an hour to days of its lifetime there. Even things like stock trades, you can only do that once.
I mean, you can place a whole bunch of trades, but it’s not like after you make a trade, you’re like, oh, wait a minute, I meant three, no, 12. And then the day goes by and you’re like, ah, I meant 15.
There’s a lot of stuff where its hot time, its moment in the sun, is fairly short. But even an Amazon order, which is kind of what I was just talking through, unless someone cancels it, even that’s going to have to be pretty quick, right?
It’s not like months later someone’s going to be like, I’m still waiting on that cancel, right? It’s kind of silly. Anyway, that is the sort of architectural stuff that immediately comes to mind.
I don’t know more about your situation. So it would be pointless and fruitless to, you know, try to think of sillier situations you might be in.
So if you want more details, well, my rates are reasonable. Anyway, thank you for watching. I hope you enjoyed yourselves.
I hope you learned something and I will see you in tomorrow’s video where we will do some other SQL Server stuff. All right, thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.