SQL Server Performance Office Hours Episode 60

Chapters

*00:00:00* – Introduction
*00:02:15* – Office Hours Questions
*00:04:08* – Free Tools Discussion
*00:06:43* – Tuning Terrible Data Models
*00:09:47* – Complex Query Breakdown

Full Transcript

Erik Darling here with Darling Data, and in today’s video, Monday has thrust itself upon us, and it is time to answer not one, not two, five office hours questions submitted by you, the greater community. Before we get into that, down in the video description, there are all sorts of wonderfully curated, helpful links for you to interact with me in ways that we will both enjoy immensely. You can hire me for consulting, you can buy my training, you can become a supporting, paid member of the channel for as few as $4 a month. And if you don’t find that you get $4 a month worth of joy or information or entertainment out of this channel, you can do some other stuff. Like, I don’t know, you can ask me an office hours question like we’re doing today, and you can, I don’t know, maybe that is worth $4 a month to you.

You can also like, subscribe, and tell a friend. If your enjoyment of this channel is somewhere between one of whatever the smallest unit of your local currency is and $3.99 of your local currency, you can just do something like that. You can say, well, you can say, well, you know, it’s worth like $0.99 whatever’s to me, like cents of whatever your variety you have and say, oh, I’ll send this link to someone who might like it. You can also, for free, download, you can buy it for free, my free SQL Server monitoring tool. Totally free, totally open source, no email, no phone home, no nothing weird.

It just runs a bunch of stuff and gathers a bunch of metrics that I would look at if I were a performance monitor, which now I am, and it puts them in pretty charts and graphs. And you can click through those pretty charts and graphs, and you can get to like the root cause of performance problems on your SQL Server. And if you prefer to have your robot friends do the heavy lifting, then there are optional built-in MCP servers you can use in order to have them do the work for you and give you some analysis on your performance monitoring data.

Not just them going out and running crazy DMV queries on your production server. I have a new surprise. Hey, look at that. A new surprise pre-con. Day of Data Jacksonville had one of their presenters drop out, and so I am taking their place.

Silver Metal Eric, coming your way. I’ll be doing a pre-con on advanced T-SQL performance tuning. And so you can come see me there if you are in the greater Jacksonville area. So that’s May 1st and 2nd. And then I’m home. Oh God, my kids are going to grow so much.

I’ll be in Chicago May 7th and 8th for Pass on Tour. I will be in Poland May 11th through 13th for SQL Day Poland. Crazy. Then I’ll be home for a little bit and I will be at Data Saturday, Croatia June 12th and 13th.

And then at least as far as I know, finally at Pass Data Summit, Seattle. It’s a community summit. Comumity. November 9th through 11th.

So all that going on. But for now, let’s answer some champion baseball office hours questions here. All right. Zoom it. Are we going to do it? Oh, we did it. There we go. Where can I download your awesome database AI art?

You can’t. That’s for me and me alone. You can take a screenshot of it if you want, but I’ll be in it sort of. But I don’t know. Maybe maybe that makes it more memorable.

All right. Old darling data hunting your screenshots. But, you know, I don’t know. I’d make a calendar, except they’d all be a year late. And I know how you people are with money.

Anyway, come on. Zoom it. And, you know, free tools. What can you say about them? Jeez Louise. On a hundreds of million row table, why does update statistics sometimes run multi-threaded and sometimes single-threaded, even though the cost is way above the cost threshold for parallelism?

Same reason that happens for a query. The optimizer says, this one’s going to go parallel. This one’s not.

Cost threshold be damned. You might induce it to, you know, what do you call it? Use a parallel plan by maybe choosing a higher sampling rate. That might work sometimes, depending on how the optimizer feels that day.

Unfortunately, there just aren’t as many tricks for this as there are with, like, you know, inducing a query to go parallel. Right? There are things you can do there, but this, you know, kind of stuck a little bit. All right.

This is a long… I had to start charging by the sentence here. When I use the legacy cardinality estimator for a query, the estimates only change a tiny bit. But the query goes parallel instead of serial.

Same stats, parameters, etc. The only significant cost-related difference is that the estimated number of rows without a row goal for the most IO heavy node of the plan goes down. Does any of that match your experience with the legacy cardinality estimator or trying to encourage parallelism?

I’ve got to be honest with you, it’s a very specific situation. I’m not sure that I have encountered your very specific situation. I’m not sure what other local factors apply to your very specific local situation.

Does that match my experiences? No, not exactly. Not in either case. But if it’s working for you and that’s what you want to happen, I’m very happy for you.

Like, have you ever tasted my mom’s cooking? Can you realistically tune your way out of a terrible data model or is that just delaying the inevitable? Yes, you can tune your way out of a terrible data model.

It is just annoying. You know, but it really does depend on in what direction your data model is terrible. You know, like, you know, the most common one that I see is like the overly wide table.

It’s like 200 columns and you’re like, oh, we’re going to need some wider indexes if we want to deal with this. But, you know, usually it comes down to indexing when you when you need to do that. You know, if you ever suggest changing a data model to people, they just start crying because of the amount of things in the application that would have to change and get tested in the whole works.

You know, it is usually incumbent upon us, the data query performance tuning community to deal with the malfeasance of whomever designed these terrible data models and find some way to accommodate them with our our talents and skills. You know, you can, but it sucks. When do you decide a query is too complex and needs to be broken into multiple steps?

Well, usually about the time that it starts not performing well and usually about the time that it starts not performing well because SQL Server does not fasten the nuts and bolts together in the right places. However, there are different ways of approaching this, of course, you know, temp tables are a very nice materialized mechanism for, you know, material like like sticking a portion of some query into an object that SQL Server can then derive statistics on and you can even index it. And so, you know, like my decision is basically, is this query too slow?

Yes. Is it too slow because SQL Server is not figuring something out correctly because it’s complex? Yes.

Then it is time to do it. Usually the appearance of more than one CTE, especially when those CTE start getting joined together is a good sign. Sometimes it is, you know, if there’s a bunch of derived joins and, you know, stuff like that going on with it.

Other times it might just be, you know, some arbitrary number of sub queries in the select list or in the where clause or even in a join clause. Right. But the decision always starts with, is this query performing acceptably?

And if the query is not performing acceptably, then it could be due to complexity, but it could be some underlying pathological issue as well. Right. So sometimes, sometimes it is a break, breaking the query up is the best possible option.

Other times there are, there are other things, other steps that you’re going to have to take regardless of whatever small steps you, you, whatever, whatever multiple, however many multiple steps you choose to break it up in. And I think, you know, at least the nice thing about breaking a complex query up into multiple steps at some point is that it gives you very specific information and feedback about which portions of the query require your attention. Right. Because there might, there might be some portions that, you know, run very quickly on their own.

There might be some portions that run very slowly on their own. But sometimes it really does come down to like putting those two portions or however many portions together is what causes an issue. Right. So for me, it’s really just, you know, it starts with is performance acceptable?

If not, what steps do I need to take to make it perform acceptably? Right. So that’s, that’s pretty much it for me there. All right. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something and I will see you in tomorrow’s video where I’m going to answer one very special office hours question on its own. All right. Thank you for watching.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.