SQL Server Performance Office Hours Episode 18

To ask your questions, head over here.

What are the pros and cons of using clustered indexes on #temp tables (as opposed to non-clustered)?
I am a junior DBA and I have been reading about performance tuning online. Can you please let me know how I can start looking for the worst-performing queries (the ones timing out), and how do I see their actual execution plans?
What’s the best way to deal with ascending key issues? Would using OPTIMIZE FOR UNKNOWN or local variables be a good approach?
Hi Erik! Azure SQL Database vs SQL Server on Azure VMs. What do you think are the pros and cons of Azure SQL Database?
I’ve fallen in love with adaptive joins. They’ve solved problems with parameter sensitivity for me. I’ve never seen you mention them in that context. Any reason why?

Video Summary

In this video, I delve into the world of temporary tables and indexes, specifically addressing the pros and cons of using clustered versus non-clustered indexes on temp tables. I also share tips for identifying poorly performing queries through Query Store and offer advice on dealing with ascending key issues in SQL Server. Additionally, I provide a candid opinion on Azure SQL Database, expressing my strong preference for SQL Server on an Azure VM over Azure SQL Database due to its numerous shortcomings. The video wraps up with a Q&A session where I answer five questions submitted by viewers, covering topics from performance tuning and query optimization to the nuances of adaptive joins and parameter sniffing. Whether you’re a seasoned DBA or just starting out, there’s plenty of valuable information here to help improve your SQL Server skills.

Full Transcript

Erik Darling here with Darling Data. And you might be able to, I don’t know, there’s a background today. So, we’re doing Office Hours. This is where I answer five entire whole questions that you submit. Where do you submit them? That is by itself a great Office Hours question. If you go to this link, which is down in the video description, you can submit questions that I will answer here. Okay? So that’s exactly how it works. No doubt. It’s very different any other time. Usual stuff. If you’d like to support my channel, you can sign up for membership. Also down in the video description. If you like the content, but not like, you’re not like, wow, I would totally pay this guy to keep talking. Proposition there. You can like, you can comment, you could subscribe and all that other good stuff. If you need help with your SQL Server beyond, far beyond what a YouTube video can do, then I am available for consulting help, health checks, performance analysis, hands-on tuning of your SQL Server malfeasance, dealing with SQL Server performance emergencies, and of course, training your malfeasant developers so that you have fewer SQL Server emergencies. If you would like to buy my performance tuning training content, you can do that for 75% off. That’s that combination of link and code there, which is hopefully assembled for you down in the video description.

And if you would like to get in on my new T-SQL course, you can buy that at the pre-sale price. Videos for that will start dropping. Actually, by the time this publishes, they will already have started trickling out into the internet. If you’re going to attend, well, I mean, that information is obviously wrong. Things got held up a little bit with tech review, and there’s a funny little quirk with Podia, where when you put a release date on something, you can’t release before the release date. So ignore that, because that’s wrong. Obviously, obviously. If you’re attending Kendra’s and my PASS pre-cons in Seattle, you will get access to all this material. This material is considered companion material to that. So, you know, get in on all that stuff while supplies last. If you want to see me live and in person, I will be at all three PASS On Tour events. That’s New York City, Dallas, and Amsterdam. You can see the dates right there. If you can’t, for some reason, send me an email, and I’ll call you and read them to you.

And, of course, PASS Data Community Summit in Seattle, November 17th to 21st. That one’s important enough to say the dates for, I guess. But with that out of the way, let’s answer some questions from you, my lovely users. So here’s the first one.

What are the pros and cons of using clustered indexes on #temp tables, as opposed to non-clustered? So if I have decided that I am going to test indexing a temp table, perhaps because that temp table gets loaded with a significant enough amount of data, or the way that the temp table is queried tells me that an index would be appropriate here, I am only going to start with a clustered index.

Why? Well, great question. I’m glad you asked. When you create a nonclustered index on a temp table that does not have a clustered index, you have a heap, and you have a nonclustered index. Typically, your nonclustered index is going to be on a rather narrow array of columns.

If that is completely satisfactory to your query, well, maybe that’s good enough for you. But if it’s not, if your nonclustered index is not what we call a covering index that covers all the columns that your query requires, then SQL Server is completely free, as far as the optimizer is concerned, to ignore your nonclustered index and scan the heap anyway.

So if I was going to offer any advice here, it would be rather than starting with a nonclustered index, start with a clustered index. And if you’re going to start with a clustered index, my preference is to create the clustered index after populating the temp table because it’s generally easier to get a fully parallel insert into the temp table, which can be meaningful if you are loading a lot of data into it.

That’s a lot easier when you just have the heap. And when you create the clustered index afterwards, there’s an additional benefit that you create the index with full scan statistics of the temp table rather than just getting sampled statistics for the temp table when a query finally hits it in a way that SQL Server needs to do cardinality estimation.
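In T-SQL, that pattern looks something like this. This is a sketch with made-up table and column names: load the heap first (SELECT … INTO can get a fully parallel insert), then create the clustered index afterwards, which also builds its statistics with a full scan.

```sql
/* Load the #temp table as a plain heap first; SELECT ... INTO
   is eligible for a fully parallel insert. Names are hypothetical. */
SELECT
    p.OwnerUserId,
    p.Score,
    p.CreationDate
INTO #top_posts
FROM dbo.Posts AS p
WHERE p.Score > 100;

/* Create the clustered index after the load, not inline in the
   table definition, so it's built with fullscan statistics. */
CREATE CLUSTERED INDEX c
ON #top_posts (OwnerUserId);
```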

So there we go. Next question. I am a junior DBA.

I’ve been reading about performance tuning online. Well, that’s your first mistake. There’s nothing but bad advice online. Can you please let me know? How do I start looking for the queries that are worst performing? And how do I see the actual execution plan of them?

Well, two separate questions there. My big preference for finding poorly performing queries is to use Query Store. I’ve got my stored procedure sp_QuickieStore available at code.erikdarling.com.

That is great for mining through Query Store data. By default, it gives you the top 10 queries that have used the most average CPU over the past seven days. There are, of course, all sorts of parameters that you can use with my stored procedure to search through Query Store data that are not available in the Query Store GUI, because Microsoft hates you and I love you.
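A couple of example calls, along the lines of what’s described above. The parameter names beyond @database_name are from memory, so check the documentation at code.erikdarling.com for the full, current list:

```sql
/* Default behavior: top 10 queries by average CPU over the
   last seven days in the target database. */
EXECUTE dbo.sp_QuickieStore
    @database_name = 'StackOverflow';

/* With a different sort order and more results; these parameter
   names may differ slightly from the current release. */
EXECUTE dbo.sp_QuickieStore
    @database_name = 'StackOverflow',
    @sort_order = 'duration',
    @top = 20;
```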

So, that’s the first part. I prefer to look in Query Store for that. And then how do I see the actual execution plan for them?

Well, Query Store is about on par with the plan cache in terms of which query plan it shows you, which is the cached plan, the estimated, pre-execution plan, which doesn’t have all the actual runtime metrics in there. So, if you want those, you still have to run the query and get the actual execution plan. Now, if the query is so poorly performing that you can’t get it to finish, you can still cheat a little bit here.

So, what you’d want to do is start running the query, but turn on actual execution plans for the query. Don’t turn on live query plans. Live query plans are a buggy pain in the butt.

Turn on actual execution plans and start running the query. Then, in another window, use sp_WhoIsActive with the @get_plans parameter set. And then what that’ll do is look at, like, the in-flight query plan.

And sometimes, like, you can catch enough of the bad stuff that’s happening, like, while the query is running. Like, it’ll start updating with metrics on, like, wait stats and, like, operator times and stuff like that. So, you can get, like, a partial view of the actual execution plan while the query is running.
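Concretely, with the slow query running in one window and actual plans turned on there, the other window just runs sp_WhoIsActive (from whoisactive.com) like so:

```sql
/* @get_plans = 1 attaches the in-flight query plan for each
   running request, which picks up partial runtime metrics
   (waits, operator times) while the query executes. */
EXECUTE dbo.sp_WhoIsActive
    @get_plans = 1;
```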

You, of course, could do other stuff if you know what queries you want to catch. It sounds like you don’t, but if you knew what queries you wanted to capture, then I would recommend sort of targeted extended events to capture specific post-execution plans for specific stored procedures or queries or something like that. But it sounds like if you’re just on the hunt for ones that are bad, then I would use Query Store to find the worst-performing ones.
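A sketch of that targeted extended events approach, capturing post-execution plans for one specific procedure only. The query_post_execution_showplan event is expensive, so keep the predicate tight; the session, database, and procedure names here are hypothetical, and you should double-check the predicate columns against your version’s event fields:

```sql
/* Capture actual (post-execution) plans for a single procedure. */
CREATE EVENT SESSION capture_bad_plans
ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan
(
    WHERE sqlserver.database_name = N'StackOverflow'
    AND   object_name = N'SlowProcedure'
)
ADD TARGET package0.event_file
(
    SET filename = N'capture_bad_plans'
);

ALTER EVENT SESSION capture_bad_plans ON SERVER STATE = START;
```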

All right. Another one here. What’s the best way to deal with ascending key issues? Would optimize for unknown or local variables be a good approach?

I mean, like, you hate to say no because, like, clearly if you’re dealing… Well, actually, this is an interesting one. So, like, the local variable or optimize for unknown thing would not help specifically if you are using the new cardinality estimator or, as Microsoft so presumptuously calls it, the default cardinality estimator.

Because for ascending key issues where… I assume you’re talking about, like, index key values that are off the histogram, right? So, like, let’s say your histogram goes, like, it’s 200 steps, so let’s just pretend it goes from 1 to 200.

You’re worried about numbers that are over 200 that are not in the histogram, right? So if you’re using the new cardinality estimator, you are already getting the density vector guess, which is the sort of, like, nerdy name for optimize for unknown and local variables. So you’re already getting that.

If you’re using the legacy cardinality estimator, you usually get a guess of one row for those off-histogram values. With optimize for unknown and local variables, you would get the density vector guess regardless of what the off-histogram value is. But I don’t know if that’s going to be a good guess for you either, right?

So, like, I hear that and I’m immediately like, well, I’ve just seen optimize for unknown and local variables cause too many bad cardinality estimates. They still might be better than the guess of one that the legacy cardinality estimator gives you for those things. But I still don’t think they’re going to be great all around for all the other queries that are actually hitting the histogram, right?
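For reference, both of these forms produce the density vector guess (the “all density” from the statistics multiplied by the table’s row count) instead of a histogram-step estimate. Table and column names here are hypothetical:

```sql
/* Inside a parameterized procedure, with the hint: */
CREATE OR ALTER PROCEDURE dbo.CountPostsByOwner
    @owner_user_id integer
AS
BEGIN
    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @owner_user_id
    OPTION (OPTIMIZE FOR UNKNOWN);
END;
GO

/* With a local variable, the value isn't sniffed at compile
   time, so you get the same density vector guess: */
DECLARE @o integer = 22656;
SELECT COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @o;
```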

Like, those you’re like, well, use the histogram, buddy. So my first approach would probably be more frequent stats updates, or there’s some database scoped configuration stuff and some trace flag stuff you can do to help SQL Server with the ascending key stuff. I have clients where I’ve set up stats update jobs specifically on some statistics, like, every 20, 30 minutes, because data is in flux so much and bad query plans happen frequently for those.

Just updating stats to keep up to date with a few specific ones was the best option. So, like, don’t overlook that as a possibility. All right.
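The targeted stats refresh described above is a one-liner you can drop into an Agent job step on whatever schedule the data churn demands. Table and statistics names here are hypothetical:

```sql
/* Refresh just the hot statistic, with a full scan, every
   20-30 minutes via a scheduled job. (Trace flags 2389/2390,
   which brand ascending keys under the legacy CE, are likely
   the trace flag option alluded to above.) */
UPDATE STATISTICS dbo.Orders (s_Orders_OrderDate)
WITH FULLSCAN;
```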

Next question. Let’s see what we got here. Hi, Erik. Azure SQL Database versus SQL Server on Azure VMs. What do you think are the pros and cons of Azure SQL Database? What’s that old saying?

I’d rather have a sister in a whorehouse than a brother in the Navy? Well, that’s how I feel about Azure SQL Database. I would rather not have a brother or a sister using Azure SQL Database.

I hate that thing. I mean, it’s hard to tell if I hate it more or less than managed instance. Maybe less only because I had such high hopes for managed instance and Microsoft screwed that one so badly.

Like, oh, God. Like, whoever is in charge of that. I don’t know. It’s, I don’t know.

Bad stuff. Anyway, no. I would much prefer SQL Server on an Azure VM to Azure SQL Database. Anytime someone tells me they’re using Azure SQL Database. Like, aside from the fact that, like, Microsoft was like, hey, the default isolation level here is read committed snapshot isolation because we don’t want to deal with your stupid blocking problems.

Like, that’s nice, but, oh, the rest of it. Oh, the rest of it. Mm.

Mm. Have you ever seen a company, like, screw up an offering for their own product as badly as Microsoft has screwed up their platform as a service for SQL Server? Like, what happened?

Like, how is AWS so much better at this than you are? It’s embarrassing. Like, your pants left you. Like, how?

How did that happen? Oh, it’s a joke. Anyway, I’ve fallen in love with adaptive joins. Well, there’s someone out there for everyone. In the case of adaptive joins, there’s two someones apparently out there for you. There’s hash joins and nested loops joins.

They’ve solved problems with parameter sensitivity for me. Well, that’s nice. I’ve never seen you mention them in that context. Any reason why? Well, there are a few reasons why.

And to start with the reasons why, you kind of have to start with where adaptive joins really kick in, which is, of course, either something where columnstore is involved, because you do need something batch mode-y for adaptive joins to happen, or where batch mode on rowstore kicks in, which is compat level 150 plus, Enterprise Edition only. And only when SQL Server’s internal heuristics say, hey, I think some batch mode on rowstore might be good here.

And, hey, we could use an adaptive join here because we’re using batch mode on rowstore. So, like, there’s times when it might kick in and be very helpful. Even in some of my demos about parameter sensitivity.

You don’t have to use the old compat level to have them still work. It’s just that they work differently, right? So, the thing with the heuristic thing, like, if you create some sort of columnstore index or object so that you sort of force the optimizer down the batch mode path, that’s one thing.
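One common version of that “columnstore object” idea is an empty, filtered nonclustered columnstore index whose predicate can never be true, which makes the optimizer consider batch mode (and therefore adaptive joins) for queries on the table without actually storing any columnstore data. This needs SQL Server 2016 or later for filtered nonclustered columnstore; the names here are hypothetical:

```sql
/* The filter predicate is a contradiction, so the index stays
   empty, but its presence opens the batch mode code path for
   queries against the table. */
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode_enabler
ON dbo.Posts (Id)
WHERE Id = -1 AND Id = -2;
```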

If you don’t and you’re relying purely on batch mode on rowstore, my experience with it is that when you use the little, when you compile for the little plan first, you don’t get batch mode on rowstore very often, right? Because SQL Server’s like, this is a tiny amount of rows. We don’t need batch mode anything.

We’re just going to do some nested loops, join some key lookups, and move on with our life. So, like, my experience with it is that, like, if it happens to kick in and solve a problem for you, that’s great. But you have to consider situations where the query might run and batch mode on rowstore wouldn’t kick in and you wouldn’t get an adaptive join where all of a sudden you’d be using, like, the old, like, just row mode, no adaptivity type execution plan, and it wouldn’t work for you.

So, like, if you’re going to consider that, like, a solution for parameter sensitivity, you better think real hard about, like, what luck you ran into or what you did to get SQL Server to consistently use a query plan where adaptive joins happen. So, like, when they kick in, they can absolutely be useful because SQL Server’s like, well, I’m going to start with this hash join, and if I don’t get enough rows, I’m going to switch to nested loops, which is great because you have that choice in there, right? Like, SQL Server’s like, oh, do this.

But if that doesn’t kick in, like, reliably for you, which I find is the case for a lot of the stuff I do, that, like, the heuristic-based stuff doesn’t kick in reliably enough for me. And then when I suggest a non-clustered columnstore index, people are like, oh, but I’ve read 17 million blog posts about why I shouldn’t use non-clustered columnstore indexes because I do an update. Oh, dear God.

I updated some data, and now I can’t use a columnstore. People make things much harder on themselves than they need to be most of the time. All right.

Anyway, that’s five questions, right? One, two, three, four, five. Cool. I feel like I’ve done my good deed for the day. I’m going to call this one here. Thank you for watching.

Thank you for sending in questions. I hope you enjoyed yourselves. I hope you learned something. And I will see you in another Office Hours video probably, probably, like, next week. All right.

Cool. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.