SQL Server Performance Office Hours Episode 31

Questions:

I have a very slow delete statement (44 mins for 2000 records). Execution plan shows painful Clustered Index Delete is culprit. Parent and child foreign keys are indexed on two associated tables. Where else to look? Merge join is feeding problematic operator but associated sort isn’t causing pain. What would you check next?
What topics will be covered in the advanced material of your T-SQL course?
I’ve been thinking about career goals. Becoming an MVP sounds good. I already do a lot for the community. Good idea or bad idea?
You often mention pessimistic read committed holding locks for too long because of key lookups WITH UNORDERED PREFETCH. Does Read Committed always show WITH UNORDERED PREFETCH in plans where it hold locks for too long?
In theory, with the Microsoft push of all things to the cloud, query tuning and optimization should be a major priority for companies because CPU time (CUs) is money. In your experience, are companies actually lowering costs through query tuning or are they more focused on improving speed and reducing loads? I rarely hear stories of companies reducing Azure capacities or lowering licensing costs due to query tuning but I don’t have much exposure to it.

Video Summary

In this video, I dive into some interesting questions and discussions from our recent Office Hours session. We tackled a slow delete statement that was taking 44 minutes to process just 2,000 records, which led us to explore the execution plan details and suggested looking at actual rather than estimated plans for more accurate performance metrics. Additionally, I shared insights on my upcoming advanced course content, emphasizing its in-depth nature and wide breadth of advanced topics, from T-SQL to data storage strategies. The conversation also touched on career goals like becoming an MVP and the realities of query tuning in cloud environments, highlighting both the benefits and challenges companies face when optimizing their SQL Server workloads for cost savings or performance improvements.

Full Transcript

Erik Darling here with Darling Data and we have a very, very exciting episode of Office Hours today. We are joined by absolutely no one. Who would want to stand here? Have a bad time. So, if you look down in the video description, you’ll see all sorts of helpful links where you can hire me for consulting by my training, become a Supporting With Money member of the channel, and even ask me Office Hours questions. If you enjoy this content, and you want to support me in some other way that is not giving me money, you can always like, subscribe, and of course tell a friend if you think that this channel would help them and whatever they’re dealing with in their life. Usually it’s SQL Server, but of course people find lots of solace in this channel for other reasons. If you want to see me live out in the world, the nice folks at Redgate are dragging me, well, to a few different places, at least through the end of the year. We got Dallas in September the 15th to 16th. We got Utrecht, that’s the one in the Netherlands from October 1st to 2nd. And we have Past Data Community Summit in Seattle from November 17th to the 21st, where I will be doing two wonderful days of two T-SQL pre-cons with Kendra Little.

And we do look forward to seeing you there. All right. Anyway, let’s talk about Office Hours, because we’re going to have some stuff to go through here, don’t we? We have to answer five questions, not an accidental four like last time. So that’s one, two, three, four, five. I didn’t mess that up. I can count. Wonderful. We have passed the first test of Office Hours.

All right. I have a very slow delete statement. 44 minutes for 2,000 records. All right. The execution plan shows a painful clustered index delete is the culprit. Okay. Parent and child foreign keys are indexed on two associated tables. Where else to look? Merge join is feeding problematic operator, but associated sort isn’t causing pain. What would you check next? Well, I’m going to go out on a limb here, and I’m going to say that you are looking at the estimated execution plan and not the actual execution plan.

If you’re looking at the actual execution plan, you would see helpful things like operator times, unless you’re using like SQL Server 2012 or something, in which case, well, you have bigger problems than this delete statement. So what I would suggest is to either find a place, a development server, where you can run this query and get the actual execution plan, or profile your production server in a way where you can catch this slow delete running. So, you know, just between you, me and the wall, or the camera, or the thing behind me, there are some things that you should look at first.

First, make sure that this delete statement was not just blocked for 43 minutes and 59 seconds and then finished in one second, because 2,000 records is not really where I’m going to think, oh, this makes total sense that this would take 44 minutes. So, you could look at the underlying table. Who knows?

There could be all sorts of things there, like it could be way over indexed, have like 999 indexes on it or something. I suppose it could be like a, there could be a columnstore index involved. That might slow things down a bit, but not like 44 minutes slow it down.

So, my suggestion would be to, you know, get some contextual information, make sure, like, you know, maybe fire up the block process report and make sure that this delete isn’t just getting blocked for a very long time. And then, absent a development server with sufficient data to reproduce a long-running delete, I would profile the production server in a way where I can collect the actual execution plan and not rely on very misleading cached or estimated execution plans, either from the plan cache or query store. There are various ways of doing that.

That SP human events, my store procedure, does give you that ability. So, I would maybe do that. Otherwise, you know, really just stop looking at estimated execution plans and looking at operator costs as being a good source of durable performance metrics because they are not.

All right. Next question. What topics will be covered in the advanced material of your course?

A lot of them, and they’ll all be advanced. So, there’s a lot of stuff in the beginner content, which, of course, also has an advanced component. And then there’s also sort of maybe not directly T-SQL things, but things, you know, around performance, maybe some things around various, you know, ways of storing data that has upsides and downsides, things like that.

But there’s going to be a lot in there. It’s probably too much to really list and go through here, but it’s going to be a very, very in-depth and have a wide breadth of material of an advanced nature. So, keep your eyes out for that.

I am currently, I’ve written a lot of the starting, well, not starting. I’ve written a lot of the, you know, like early modules in it. And as soon as I have like a little bit more of that done, I’m going to get that tech reviewed and start recording it.

I’m going to release the material for the advanced course a little bit differently than I did for the beginner course. I think I’m going to piecemeal it out in some smaller batches. Whereas with the beginner stuff, I really like put out a lot discontinuously over like two or three weeks until it was done.

This one, I’m going to release some sort of smaller chunks, I think. Next question. I’ve been thinking about career goals.

Really? How old are you? You’re just thinking about them? All right. Becoming an MVP sounds good. Well, if it sounds good to you, don’t, why are you asking me? I already do a lot for the community.

Good idea or bad idea? I don’t know. I would personally never do it for various reasons that I will not go into here. But it would not be something that I would consider.

But if you already do a lot for the community and you feel that your contributions are ones that Microsoft would see sufficient free marketing labor in, then you should go ahead and become an MVP. You might make some good connections.

You might meet some nice people. You might, you know, get around the world a little bit in that way. But, you know, whether it’s a good idea or not sort of depends on, you know, how humbled and honored you are to carry marketing water for a, what is it, like $250 billion a year corporation? I don’t know.

I don’t know. I don’t know. Who knows? Maybe it would have some benefit for you. Don’t let me stop you. Anyway, let’s go on to the next question. You often mention pessimistic read committed locking holding locks for too long because of key lookups with unordered prefetch.

Well, it could be an ordered prefetch too. It could just, or it could just be an optimized nested loop. Does read committed always show with unordered prefetch and plans where it holds locks for too long?

Well, it’s not holding locks for too long. It’s just holding locks for longer than most people expect and even escalating those locks in a way that causes read queries to block write queries. So, like I said, you will either see unordered prefetch, ordered prefetch, or maybe even optimized nested loops in those query plans.

But more importantly, you should generally see the queries that generated those executable plans in a block process or deadlock report. So, don’t be afraid to use those to track things down as well. All right.

This is a long one. Oh, boy. I’m going to have to think about this one while I’m reading it, which means I might read it wrong. In theory, we do love theories here at Darling Data. With the Microsoft push of all things to the cloud, it’s been going for like a solid decade now.

Not just Microsoft. Lots of people who have cloud to sell want everyone to be in the cloud. Query tuning and optimization should be a major priority for companies because CPU time is money.

It sure is. A big component of my consulting. A little free marketing for me.

Not quite a $250 billion a year corporation, but I’ll do some free work for me. A big part of my consulting is tuning stuff in the cloud so that people need less cloud and spend less money on it. In your experience, hey, there we go.

Well, I led right into that one. Are companies actually lowering costs through query tuning? Well, they are when they work with me. I don’t know about anyone else, but whatever, you know. Or are they more focused on improving speed and reducing loads?

Well, query tuning is improving speed and reducing loads, at least in my estimation. I rarely hear stories of companies reducing Azure capacities or lowering licensing costs due to query tuning, but I don’t have much exposure to it. Well, it does happen, usually shortly after, you know, hiring me.

But, you know, if you’re not really trafficking in that world, you might not see a lot of success stories around it. Now, there is a flip side to that coin, because there are people I work with who went with the absolute cheapest things they could get in the cloud and have terrible performance problems because the hardware is just so, so underprescribed for the workload they’re expecting it to run. But, like, there are Microsoft salespeople out there who don’t know this thing from the thing behind them who will just try to do anything to get people started in the cloud, and they will forecast these terribly small resources and be like, oh, but it’s only going to be this much a month.

And then people get there, and they’re like, our workload doesn’t run anymore. What happened? And they’re like, oh, no, you might have to spend more money.

So, like, but, you know, I end up in some situations where, like, you know, you do have to, you know, hard truth people and be like, look, you, like, the hardware you came from was, like, let’s say, this big, and the hardware you’re on now is, like, this big, and it’s not going well. Like, you, like, there’s no automatic performance benefit just by right of being in the cloud. You still need to, like, do things to make things faster.

So, there are some cases where you do have to, you know, tell people, like, hey, you made a bad choice with this hardware, or some Microsoft sales rep gave you bad advice on this hardware. You really do need more, at least in the meantime, until we can tune stuff, until you can maybe shrink stuff down. But a lot of people have, you know, either perhaps, you know, oversubscribed a bit, or perhaps they’re just unhappy with the cloud build generally.

And they’re like, there’s got to be some stuff we can do to fix this. And that’s where, you know, I come in a lot, and I get queries and indexes and workload various bits and bobs tuned. And then they can reduce the size of those servers without causing any major workload regressions.

So, I hope that answers your question. And perhaps even gives me a nice sales pitch. So, thank you for that.

I appreciate you. Anyway, that’s about it for me. I’m going to go do something else. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you tomorrow where we will talk about some more T-SQL learning. We’re going to talk about, I think, quote name and dynamic SQL, and maybe some other little dynamic SQL tricks that I haven’t talked about 7 million times here already.

So, all right. Cool. Thank you for watching.

Goodbye.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.