SQL Server Performance Office Hours Episode 17

I work for a small startup (7 people, 3 on SQL Server, including the boss). Our software is a complex ERP system with business logic heavily implemented on SQL Server. We are planning to move to Postgres. Is Postgres a good platform for complex business logic and performance? Can a small team achieve this migration for complex software? Your expert opinion?
Hey Erik. What deciding factors should be considered when marking an entire table for Update Statistics? Is it based on how fast the data changes, or is it okay to run the Update Statistics overnight, irrespective?
Hi Erik, I have top wait type SOS_SCHEDULER_YIELD, which is way higher than the second one: SOS_SCHEDULER_YIELD is 75h of a 230-hour sample, 1.5ms on avg. The second, PAGELATCH_SH, is 5h total. Should I focus mostly on the most CPU-intensive queries (in cache), or can this also be connected with high CPU ready time on the VM side, or a lack of proper indexes that causes too many index/table scans from memory? What's strange is that I see this wait (I'm logging sp_Blitz every 15 min) as top also on weekends when there is no heavy load in the DB. Should I start logging sp_PressureDetector to see the number of worker threads used and try to correlate it with the high SOS_SCHEDULER_YIELD wait? Thank you!
I see Brent is going to put AI in his procedures. Are you planning on doing that too?
Why would I use dynamic SQL to defeat parameter sensitivity, when I can just use OPTIMIZE FOR?

Video Summary

In this video, I dive into some of the most pressing questions submitted by our community during my office hours session. We tackled topics ranging from migrating to PostgreSQL for complex business logic and performance considerations, to optimizing SQL Server statistics updates and identifying CPU-intensive queries. It was a lively discussion with plenty of insights and advice shared among us. Whether you’re considering a database switch or looking to fine-tune your SQL Server environment, there’s something here for everyone. Additionally, I share my thoughts on the future of AI in SQL Server procedures, explaining why I believe it’s more practical to leverage existing tools like Copilot rather than building custom solutions. The video is packed with valuable information and actionable steps that can help you improve your database management skills.

Full Transcript

Erik Darling here with Darling Data and, well, been huffin’ again, which means it’s time for some office hours. I will answer five of your questions that you submit through this link right here, this beauty, which is down in the video description. Speaking of the video description, if you would like to become a member, a paying member of this channel to support my efforts to bring you all this work, wonderful, high-quality SQL Server content, you can sign up for a membership in the exact same place. I don’t know why this thing has me turning so pink right now. It’s a little unsettling. I am not normally radiation poisoning red, but I don’t know. Maybe it’s just how I feel inside. If you do not care that much about me with money, you can like, you can comment, you can subscribe, and you can ask me questions for these here episodes of Office Hours, which is always, always a highlight. If you need help from a SQL Server consultant, I am one of them. One of too many, I think. When I look around LinkedIn, it’s absolutely saturated with idiots who have experts in their LinkedIn profile, and then you see the things that they say and they write about, and you’re like, you wish that you could bring back firing squads.

None of them would make it. If you need help with SQL Server, health checks, performance analysis, hands-on tuning of your SQL Server workloads, responding to SQL Server performance emergencies, and of course, training your developers so that you don’t have those anymore, and you can go back to drinking or huffing compressed air in peace, I’m available. And as always, my rates are reasonable. If you would like some performance tuning content from me, I have 24 hours of it available. With the discount code, it’s about $150, and that’s pretty good. That $150 will last you the rest of your life. So good for you if you choose to look that far into the future. Bet you have a 401k or something. A real whiz kid.

I, of course, have a new course coming out. Learn T-SQL with Erik. That’s me. That’s Erik right there. This fella. All this content is being recorded leading up to Kendra and I doing our PASS pre-cons, so attendees to those will get access to all of this as part of their admission. You buy tickets. You come, you get access. The course is available now for everyone else for a pre-sale price of $250 that will go up to $500 when everything is said and done. The videos will start dropping shortly. Well, actually, this is going out after May 10th, so videos will have already started dropping by the time this thing sees the light of day.

I record the office hours ahead of time and schedule them for every Monday, so that’s why sometimes the dates are a little funky on me. So I apologize, but it would be a lot of work to fix that, so we’re not going to do it. I did manage to fix this slide, though, because New York’s SQL Saturday will have already happened, so we no longer need to talk about that.

But we still do need to talk about PASS being on tour this summer. New York City, August 18th to 20th. Dallas, September 15th to 17th. And Amsterdam, October 1st to 3rd.

Of course, all of that is leading up to just warming us up, getting us nice and loose and limber and getting the blood flowing and muscles in order for PASS Data Community Summit in Seattle, November 17th to 21st. So we’ll be doing as much as we humanly possibly can to do all that.

But with that out of the way, let’s answer some questions here on this office hours episode. Oh, we got some long ones in here. Boy, oh boy, oh boy.

Y’all like typing so much. All right. I work for a small startup, seven people, three on SQL Server, including the boss. That’s a mistake.

Never let your boss use SQL Server. Our software is a complex ERP system with business logic heavily implemented on SQL Server. We are planning to move to Postgres. Is Postgres a good platform for complex business logic and performance?

I don’t know. I don’t use Postgres. But what I can tell you is the number of people who have said the exact same thing to me. We’re going to move to Postgres.

Postgres. One of my oldest clients, and I started this consultancy in 2019, they’ve been a client since April, no, March of 2019, to today, who, back in 2019, said, we’re going to be on Postgres in six to eight months. You know where they’re not?

Postgres. Postgres. If you want to dip your toes in it, try moving some little pieces of it over to Postgres. See how it goes. See how it handles whatever tiny little bits of logic you have. Maybe it’ll go well.

Maybe it’ll go not well. But if your idea is to just move the whole thing over, I will see you in six to eight months. You goofballs.

Where do you get this stuff from? Boy, oh boy. Oh, that’s the wrong. That’s the wrong ZoomIt option. Hey, Erik.

Hey. How you doing? What deciding factors should be considered when marking an entire table for update statistics? Is it based on how fast the data changes or is it okay to run the update statistics overnight, irrespective?

Oh, gosh. Are you… So, like, when… Reasons to update statistics, you know? You got a table that, you know, maybe sees a lot of modifications.

Maybe the statistics fall out of date pretty quickly. You know, the big thing that I need to tell everyone about update stats is, you know, the update stats isn’t generally there to help you with, like, data that already was in the stats histogram that’s changed.

Usually, the beauty of update stats is to get new data into the histogram that is not currently represented in it. So, if you are constantly adding new data to the table, then I think it is a good idea to update your statistics for that table. So, you know, you don’t have to deal with any off-histogram searches against your table that might not have any representation.

It might get you a pretty bad cardinality estimate. So, yes, update your statistics. But, you know, overnight is fine.

But I’ve worked in some environments where we had to update stats way more often than that. And some other environments where not only did we have to update stats way more frequently than overnight, like every hour, half hour or so, we also had to do it at a very specific sampling percent in order to get a good histogram for things.

So, you know, you might find that the overnight… Just the overnight stats update is not enough. Don’t be afraid to update your stats more often.
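The targeted stats maintenance described above can be sketched in T-SQL. This is a minimal example with a hypothetical table name; `sys.dm_db_stats_properties` is the standard way to see how many modifications a statistic has accumulated since its last update.

```sql
/* Check how stale each statistic on a (hypothetical) table is:
   modification_counter vs. rows tells you how much has changed
   since last_updated. */
SELECT
    s.name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Orders');

/* Update all statistics on the table with a fixed sampling percent,
   for cases where the default sample produces a bad histogram */
UPDATE STATISTICS dbo.Orders WITH SAMPLE 25 PERCENT;

/* Or go all the way, when the table is small enough to take it */
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```

The sampling percent and the schedule are the two knobs being discussed here: a busy table with lots of new rows landing at the end of the histogram may need both a higher sample and a much more frequent schedule than the default overnight job.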

All right. It’s hard to tell where this one ends. That’s a lot of writing. All right.

Hi, Erik. Hello. I have wait type SOS_SCHEDULER_YIELD, which is way higher than the second one. SOS_SCHEDULER_YIELD is 75 hours of 230 hours. PAGELATCH_SH is five hours total.

Should I focus mostly on CPU-intensive queries? Or should… Or this can be connected also with high CPU ready time on VM side? So you’re going in a few different directions here.

And it seems like you’re doing a lot of analysis, but not a lot of actual fixing of things. I think you are generally right that you should be tuning CPU-intensive queries. What I’m concerned about here, though, are a couple things.

One, you have that much SOS_SCHEDULER_YIELD, which is 75 hours. And then the next wait down is PAGELATCH_SH at five hours. There’s no real mention of, like, CX waits, like CXPACKET, CXCONSUMER, like any of those.

So I’m concerned that you may have MAXDOP set to one, either at the server or database level. Perhaps you have cost threshold for parallelism set to whatever the high value for that is. Or maybe you have just a crap ton of scalar UDFs that force your entire query to run single-threaded.

But so if it were me looking at your server, first I would be trying to figure out where’s all the parallelism? Because it doesn’t seem like you have any, at least not that you mentioned. Maybe there are some, you know, like background tasks that are eligible for parallelism that don’t bring it up past the five-hour mark in 230 hours of uptime, which would be interesting.
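Checking for the settings that suppress parallelism is a couple of quick catalog queries. This is a minimal sketch; the second query only applies on SQL Server 2016 and later, where database-scoped configurations exist.

```sql
/* Server-level parallelism settings */
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN
(
    N'max degree of parallelism',
    N'cost threshold for parallelism'
);

/* Database-scoped MAXDOP override (SQL Server 2016+),
   run in the database you care about */
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = N'MAXDOP';
```

If MAXDOP comes back as 1, or cost threshold is set absurdly high, that alone explains a wait profile with no CX waits in it.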

But that would be my main concern, is why you’re not getting any parallel queries on this. I don’t even know if that’s a thing for SharePoint anymore, to be honest with you. But, like, besides that, yeah, I mean, like, generally, yes, I would go after CPU.

I would go after queries that have the highest CPU on here. And I would also look at queries that have a very high execution frequency. So you could use sp_QuickieStore to look both by average CPU or by executions.

That would give you some pretty good insight into both the stuff that takes a long time when it runs and the stuff that, like, uses a lot of, like, very choppy CPU, right? Things like scalar UDFs would be especially prone to that, especially because, you know, like, they don’t run once per query. They run once per row that they have to process, which leads to a lot of additional scheduling.
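The sp_QuickieStore approach above looks roughly like this. Parameter names are from memory, so treat them as assumptions and run the procedure with `@help = 1` to confirm what your installed version accepts; the database name is hypothetical.

```sql
/* Queries that burn the most CPU per execution */
EXEC dbo.sp_QuickieStore
    @database_name = N'YourDatabase',
    @sort_order = 'cpu';

/* Queries that run most often: the choppy, high-frequency CPU users */
EXEC dbo.sp_QuickieStore
    @database_name = N'YourDatabase',
    @sort_order = 'executions';
```

Looking at both sort orders is the point: one surfaces the single big offenders, the other surfaces the death-by-a-thousand-cuts queries like scalar UDFs called once per row.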

I would also potentially, maybe not, like, very, very concerned, but I would also maybe be a little worried that you don’t have a lot of CPUs on this thing. And maybe queries might be waiting a long time to get CPU attention. That would tie into, like, the high CPU ready time on the VM side, potentially.

But, you know, that’s more than I can dig into with the text you’ve given me. In general, I wouldn’t add more observer-ness to this. I wouldn’t start logging more stuff.

I would just start going after meaningful queries that, like, either, like, run and take a long time or run a lot. They, you know, like, again, scalar UDFs, loops, cursors, like, loops and cursors in scalar UDFs, stuff like that. There’s a lot of things that I would go after there.

All right. Next question. I see Brent is going to put AI in his procedures. Are you planning on doing that too? So, no.

And so, like, you know, AI is, or LLMs, however you wish to refer to it. They’re always an interesting topic. You know, you don’t really want to watch where the ball is. You want to watch where the ball is going, right?

It’s not, like, just because things are, like, not great now doesn’t mean that they won’t be great down the line. You know, AI and LLMs, they are constantly improving. But we as humans tend to stay relatively the same.

We don’t make, like, big leaps and bounds in our abilities very quickly. And also, you know, when you think about, like, a lot of stuff that has, like, really fundamentally changed the way that humans interact with the world around them. Like, just, you know, technology-wise, technology side, you know, cars were invented, but it was a long time before cars were widespread.

You know, PCs were invented, but it was a long time before everyone had a PC. Never mind a PC sitting in their pocket, right? Like, cell phones.

Then, you know, the internet was around for a long, long time before everyone’s house was wired with internet. And we have satellite space internet now. So, like, AI, like, for whatever it is now, like, maybe, like, down the line, like, there’s going to be a lapse between, like, you know, the current hype cycle of everything AI all the time.

And, you know, like, what it actually ends up, like, how it actually ends up becoming a part of our lives, who knows. But, like, for me personally, putting AI into this stuff doesn’t make a lot of sense. Like, it would be highly speculative for me to say this, but, like, I saw Brent’s post about him doing it.

But it would only make sense to do that if I were planning on releasing my own, like, Erik Darling SQL Server agent that would, like, you would pay me to answer questions for you from these things, right? Like, that’s the only way it would make sense for me to do it. I don’t know if Brent’s doing that.

I don’t know if that’s his plan. Just because, like, right now, like, think about, like, most people’s usage of LLMs, right? Like, Copilot’s going to be in SSMS, right? So, like, whatever results you get from an SSMS query, it’s not a far trip to Copilot whatever results come back, whether it’s, you know, query plans or indexes or whatever.

Most people will have, like, have access to an LLM sitting next to them, right? So it’s like, you know, you could, like, you could copy something from SSMS, put it in a browser tab. So, like, it doesn’t make sense, like, for me to build anything into it that seems like if someone wants to take those results and do something with them in an LLM, it’s not a very long trip for them to do that on their own.

So, like, I don’t have any plans of, like, creating or introducing my own AI agent yakky mouth thing. So, like, I wouldn’t build anything into mine unless there was a way for that to, like, there’s a lot of work, right? Like, the security and figuring all that out.

A lot of work on that side. So I wouldn’t necessarily do that unless it were going to be, like, a profitable venture for me. So maybe Brent has something figured out that I don’t, if it has to do with marketing, he probably does.

There’s a 99% chance that he’s far ahead on that. But for me, I can’t see a lot of upside to putting that into my procedures in, at least as things currently stand with me and my life. All right.

So last question here. Why would I use dynamic SQL to defeat parameter sensitivity when I can just use OPTIMIZE FOR? Well, I mean, I’m not sure why you’re asking me this. It’s a bit silly.

If you can figure out a way to get your parameter sensitivity problems fixed with just using OPTIMIZE FOR, go for it. So the reason why I present the options that I do is because perhaps there are situations where OPTIMIZE FOR does not solve your parameter sensitivity issues. Perhaps it just gives you one big plan that happens to make you happy in the moment.

I don’t know. You also left off what you’re optimizing for. If you’re going to say OPTIMIZE FOR UNKNOWN, I’m going to come to your house and I’m going to perch on your bedpost and stare at you while you sleep.

Because that’s a nightmare. And you deserve nightmares for what you do to SQL Server. If you’re optimizing for specific values, that can certainly work.

I’ve done it in the past. But using dynamic SQL to fix parameter sensitivity issues is a good tool to have around just in case any of that stuff doesn’t work. And if you’re talking about my defeating parameter sensitivity with dynamic SQL video, one of the methods that I use in there actually does use a dynamic OPTIMIZE FOR.

So you can combine both of those into one big happy family of stuff to do. So if you can get away with optimizing for some values and fixing parameter sensitivity, great. You can stop right there.
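Both options can be sketched side by side. This is a minimal, hypothetical example, not the exact code from the video: the table, procedure names, and the big-customer lookup are all made up for illustration.

```sql
/* Plain OPTIMIZE FOR: one plan, compiled for a representative value */
CREATE OR ALTER PROCEDURE dbo.GetOrders
    @CustomerID integer
AS
BEGIN
    SELECT o.*
    FROM dbo.Orders AS o
    WHERE o.CustomerID = @CustomerID
    OPTION (OPTIMIZE FOR (@CustomerID = 12345));
END;
GO

/* Dynamic SQL version: the statement text differs per branch, so big
   and small customers get separately compiled plans, each with its
   own OPTIMIZE FOR value, combining the two techniques */
CREATE OR ALTER PROCEDURE dbo.GetOrders_Dynamic
    @CustomerID integer
AS
BEGIN
    DECLARE @sql nvarchar(MAX) = N'
    SELECT o.*
    FROM dbo.Orders AS o
    WHERE o.CustomerID = @CustomerID ';

    IF EXISTS (SELECT 1/0 FROM dbo.BigCustomers AS c WHERE c.CustomerID = @CustomerID)
        SET @sql += N'OPTION (OPTIMIZE FOR (@CustomerID = 12345));';
    ELSE
        SET @sql += N'OPTION (OPTIMIZE FOR (@CustomerID = 99999));';

    EXEC sys.sp_executesql
        @sql,
        N'@CustomerID integer',
        @CustomerID;
END;
```

The design point is that each distinct dynamic SQL string gets its own cached plan, which is what plain OPTIMIZE FOR alone can’t give you.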

If you run into a problem that OPTIMIZE FOR doesn’t solve, then you might need to reach a little deeper into your tool bag. Maybe even down into the dynamic SQL pocket and do a little bit more typing. But anyway, that’s it for today.

That was all five questions. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next office hours. And also the next video and so on and so forth into eternity.

All right. Thank you for watching. Goodbye. There’s the button. There we go. All right.

Now, now, now. All right. We’ll do that one more time so I stick the landing. Thank you for watching. Goodbye. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.