SQL Server Performance Office Hours Episode 56
To ask your questions, head over here.
Summary
In this video, I dive into a variety of SQL Server performance questions submitted by viewers, offering insights and advice on topics like query store discrepancies, parameter sniffing with Dynamic SQL, and identifying performance issues in a complex environment. I also share my free SQL Server performance monitoring tool and discuss upcoming events, including SQL Day Poland and Data Saturday Croatia, where I’ll be sharing my knowledge and engaging with the SQL community. Whether you’re a seasoned DBA or just starting out, there’s always something new to learn in the world of SQL Server performance.
Chapters
- *00:00:00* – Introduction to Performance Issues
- *00:02:25* – Backup Tool Impact on IO
- *00:07:51* – Error Log Analysis
- *00:10:58* – Max Stop Parameter Discussion
- *00:14:02* – Batch Mode for OLTP Queries <– Thanks, AI
Full Transcript
Erik Darling here with Darling Data and we are going to get back into the normal swing of things with Office Hours. Last week we had not one but two bit obscene episodes back to back. They were bitterly obscene. And this week I’m going to get back to answering your user submitted questions. Now this was sort of a funny one because I can’t tell if I’ve already answered these or if they just look familiar because I’ve been staring at them for so long. So, if I’ve already answered these, I apologize. If I haven’t, great. I’ll figure it out. Usually I mark answered next to them. These ones, I don’t know. I don’t know what I did. Anyway, down in the video description, you will find all sorts of helpful links. You can hire me for consulting, buy my training, become a supporting member of the channel, ask me Office Hours questions, all the stuff that one would care to do with links in a YouTube video description. And, of course, if you enjoy this content, I do encourage you to like, subscribe, tell a friend, tell a loved one, spouse, I don’t know.
Really, anyone who will listen. Anyone with a YouTube account, you can just subscribe and, you know, turn off notifications, make it sneaky. It doesn’t matter to me. Anyway, I have a free performance monitoring tool. It’s awesome. It’s open source. You don’t owe me anything for it. It’s just all the stuff that I would care about monitoring for performance in SQL Server that I’ve built a dashboard for and I’m giving it away for free.
And I think it’s pretty neat. There’s also built, if you, if you choose, if you, if you are enjoying the robots that have infested our world and lives, there are opt-in MCP servers. So, with well-defined tools so that your robot friends can chat directly with your performance data and they can get a pretty good idea of what’s going on on your servers and robot format. And they can hand you a nice little summary and say, this is where stuff got wonky. Go fix it.
Or you can say, can you fix it? Maybe they can. I don’t know. Who knows what those robots will get up to. I’ve got three events that I can publicly talk about so far this year. I will be adding more to this. You’ll notice that, what’s it called? DataTune Nashville has just dropped off because I have done that one.
Data Saturday Chicago, depending on when this gets published, may have already happened. If it did already happen, it was a fantastic event. I’m glad I went. I’m so happy that I saw you and talked to you. And I will see you next year.
Aside from that, I have SQL Day Poland coming up May 11th and 13th and Data Saturday Croatia coming up June 12th and 13th. So, I look forward to seeing all new friends in Poland and Croatia. I’ve never been to Croatia before. I’m quite excited to go someplace new.
Come to think of it, I’ve never been to Poland either. So, I’m doubly excited to go to two new places. It should be absolutely fantastic. But, it is March, it is baseball, it is beautiful outside, it is actually hot in my office recording today.
You may detect a slight sheen on my face. Not to be confused with the clothing label. Hopefully, I don’t have a sheen face.
Anyway, let’s answer some questions. You’ve got a face for sheen. Anyway, here we go.
So, these are the ones, I can’t, again, I can’t recall if I’ve already answered them, but here we go. So, query store says everything is fine, but users keep complaining. What should I trust more?
Well, I mean, your users are sitting there experiencing things, and who knows how you’re looking, choosing to look at query store, and how you’re, and how query, you’re, you are interpreting what query store is telling you, is everything being fine. That being said, there are a number of things that maybe don’t show up so well in query store that will be worth investigating.
You happen to be a very lucky young person in the world because you can download my free SQL Server performance monitoring tool, and you can start figuring out who’s right and who’s wrong. Is it the users?
Is it query store? You can start to make all sorts of much, much better judgments and determinations and be far less in the dark about all these things than you currently are. If you are unable to use that, I would highly recommend using my free store procedure, SP Quickie Store.
It’ll allow you to sort of slice dice and search through your query store a little bit better, a little slightly smarter ways than the query store GUI makes available to you, so that you can, you can start figuring some additional things out in there.
But either way, if your users are still complaining, you’ve still got a problem to solve. Maybe your job is to figure out if it’s SQL Server at all, right? Things can be very, very fast in SQL Server, very, very slow in the application, right?
And I don’t mean that in an Erlen Somerskog way. I mean that there are all sorts of things that happen between a SQL Server and an application. They are not SQL Server being slow, they are something either application side or application latency that can make things feel terrible for users but are not actually a database issue.
All right. Next, we have here, I fixed parameter sniffing using Dynamic SQL. I would pay up to $5 to see that, see that work, up to and including $5 to see that Dynamic SQL.
I really would. And now CPU is up 40%. That would be worth the cost and price of admission right there.
Did I really fix anything or just trade one problem for another? Curious how you usually decide if this trade-off is worth it. So it sounds like you fixed parameter sniffing, or parameter sensitivity more specifically, which may be preventing quite wild swings in execution plans.
But it sounds like you did not fix the underlying performance problem. So while your queries may not be parameter sensitive anymore, you may have a bunch of queries that are just hauling off using giant parallel plans and pushing CPU up much higher.
My suggestion from here would be to start analyzing the plans that your Dynamic SQL is generating and figuring out which ones are using the most CPU and how you might go about addressing those. Depending on how you’ve written the Dynamic SQL, for example, if you have not written fully parameterized Dynamic SQL, you may be spending a lot of time just coming up with the same execution plans over and over again, passing in literal values, SQL Server having no idea that it’s already come up with an execution plan for basically that same query and, you know, burning CPU away.
So what I would suggest is treat this like a normal database performance problem. Look at high CPU consumers and start tuning those. And again, a free monitoring tool that can be very helpful for that.
SP who is active shows almost no blocking at all. Oh, we’re blaming tools again here. But during business hours, users complain. users keep complaining.
Users complain. Users complain that the app is constantly hanging. We see random pauses of 5 to 15 seconds.
Where would you start looking? So I’m going to tell you to start looking someplace weird. I’m going to tell you to look in the error log. And I want you to look for two things in the error log.
I want you to look for signs that you have a backup tool that is freezing IO, potentially for 5 to 15 seconds. That’s the first thing I want you to look at. What you’ll notice is that if you have a backup tool that’s taking database snapshot backups, you will see messages like IO frozen and IO resume.
I just so happen to have a free store procedure called SP log hunter, which will go through your error log and find these troublesome, meddlesome messages and allow you to look at the time. The last time that a database was frozen and the first time that a database started resuming. If you see a 5 to 15 second pause in there, then there is a good chance that during that pause, all your write queries, since drive activity is quiesced or frozen, as they say, your write queries got blocked by that write freeze.
And they may be blocking other queries, and they may be blocking other queries up behind it. So that’s the first thing I want to look for, want you to look for, not me. Unless you pay me, I’m not going to go look for it.
The other thing that I would look for is 15 second IO warnings. SP log hunter will also surface those, and they will tell you when SQL Server is waiting on IO requests for greater than or equal to 15 seconds. Those are the first two places that I would look, because they will tell you if you’ve got a query problem or if you’ve got an issue with your externalities, the things surrounding the database.
The database is not, the perimeter of performance is no longer just drawn around the database server. There are all sorts of other things that may come in and make everything go pear-shaped. So that depends on where you’re starting a little bit.
If your max stop is zero, or if your max stop is very high, like 16 or 24 or 32 or something, lowering it might actually be fairly useful for you. But I think mostly lowering max stop is a stability band-aid in that, you know, your queries will no longer, the lower you make max stop, the fewer worker threads per parallel branch your queries can consume. And so you might find that you don’t hit absolutely disastrous situations around your CPUs as often.
So you might find that things like thread pool weights calm down, you might find that SOS scheduler yield calms down a bit, perhaps you won’t see as many queries that are sitting there in a runnable state, waiting to get CPU attention, things like that. So that’s my take on it. So mostly a stability band-aid, but depending on what your starting point is, you may not find it to be as useful.
Wow, ZoomIt is just absolutely going wild on me. Batch mode was amazing for reporting, but kind of wrecked our OLTP latency. Is there any middle ground or is this just life?
Well, it’s a good question. I suppose it depends a bit on how you have chosen to implement batch mode. Batch mode, not columnstore, batch mode.
Huh. I kind of read the wrong thing there. I figured you would be like, oh, yeah, columnstore. So, you know, there are various times when batch mode may kick in for queries unnecessarily. But here’s the sort of red flag for me on this, and you can stop me if the situation I’m describing is exactly as it’s happening.
If you have OLTP queries where batch mode kicks in and SQL Server is making the choice to have batch mode do certain things throughout your query plan, your OLTP queries may not be so finely tuned for OLTP work. They may be hitting a lot more rows or being estimated to be hitting a lot more rows than they actually are.
And that’s probably where I’d want to take a look at things there. If you have, you know, like most, you know, you’re talking about OLTP. So I’m thinking, you know, tiny insert updates and deletes, right?
And if you said batch mode, not columnstore, so I’m going to just maybe kind of hedge things a little bit and say that columnstore is not slowing down your itty bitty modification queries. But it sounds like batch mode is causing you some fits when it comes to your normal OLTP read queries. And to me, that says that your OLTP read queries are in need of some attention.
So I would take a look at those and I would start to ask why SQL Server is choosing to get batch mode involved at all in those. You know, batch mode only kicks in when certain heuristics are met. And I’d be curious precisely which heuristics are making batch mode an attractive arrangement for your OLTP queries.
Anyway, if I answered those before, I don’t know if I did a better or worse job this time, but maybe I did better. I don’t know. Either way, I hope you enjoyed yourselves.
I hope you learned something. And I will see you in tomorrow’s video where we will talk about, I don’t know, something else. Be fun. Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.