SQL Server Performance Office Hours Episode 30

SQL Server Performance Office Hours Episode 30



Questions:

  • Hi Erik, Thank you for the great content. Why is it better to have a higher sample rate for update statistics?
  • Hi Erik! I had a procedure with a chain of CTEs and UNIONs. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues. I went in desperation and popped an OPTION (RECOMPILE,HASH JOIN) in the end of the CTEs where they were inserting into a #Table. This only took it to less than 2 seconds! How on Earth can this has such a performance effect and SQL Server could not find out by itself? Thanx
  • I recently purchased the Everything Bundle and am enjoying each module. Is there a recommended viewing order? For example, the Index Tuning module seems to reference the Hardware modules.
  • If you were curating a Paul White’s Greatest Hits compilation what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL Servers. Remember: All of Paul’s posts are at https://www.sql.kiwi/

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five of your most pressing SQL Server questions during one of our live Office Hours sessions. Whether it’s about the benefits of a higher sample rate for update statistics, optimizing performance in complex queries using OPTION(RECOMPILE) and HASH JOIN, navigating through the Everything Bundle modules, or delving into Paul White’s invaluable blog posts, I cover it all. I also share some upcoming events where you can interact with me in person, including the PASS On Tour event in Dallas and Utrecht, as well as the Pass Data Community Summit in Seattle. If you have any questions or topics you’d like to discuss, make sure to leave them in the video description or reach out directly—your feedback helps shape future content!

Full Transcript

Erik Darling here with Darling Data, and it’s time to do one of those there Office Hours episodes where I answer five of your most burning SQL Server questions, or whatever you ask me about. Anyway, five. If you want to interact with me in any way, you know, virtually, physically, whatever, look down in the video description. All sorts of helpful links. You can hire me for consulting, buy my training, become a member of this channel, and support my efforts to bring you high-quality SQL Server content. Ask me Office Hours questions on episodes just like this. You could be one of the lucky five people who gets a question answered. And of course, if you enjoy the things that I do here and you think someone else might enjoy them as well, you know, like, subscribe, tell a friend, all that good stuff. I will be live and in person a few more places through the end of the year. Dallas, September, September 15th to 16th for the Pass On Tour event there. Utrecht, same event, different place, October 1st and 2nd. And of course, Pass Data Community Summit in Seattle from November 17th to 21st, where Kendra Little and I have two days of T-SQL pre-cons in which you should go to both so that you can spend, you can maximize your time with us. And with that out of the way, let’s go answer these questions. These have been sitting around for a little bit because I pre-recorded a lot of stuff while I was on vacation.

And getting back into things. And boy, did you ask a lot of questions. So first up, we’re going to do this one right here. Let’s make sure Zuma’s working. So hi, Eric. Thank you for the great content. Hello, you. You’re welcome. Why is it better to have a higher sample rate for update statistics? Well, the good news is it’s not always better. Sometimes the default sample rate is good enough to pick up on a good general description of what data, lives in your indexes and describes that in the histogram. Other times you do need to up the sample rate in order to catch more granular things. If you are using a higher sample rate for statistics and getting better representation of your data in the histogram, there’s no reason to, you know, fear or think that you need to do that for all of them. But that is just sort of how that works. If you are in a situation where even a full scan update of statistics is not doing a good job of describing your statistics, you might try filtered indexes or filtered statistics so that you have a smaller range of data.

You might create a filtered index or statistic on a particularly interesting data point for you and have a more accurate description of just that interesting data point. So again, not always better. Sometimes you do have to increase that. You know, I’ve worked on a few different systems where a regular default sampled statistics update did not do a good job.

And there was like a I think we hit a sweet spot for some of those between about 10 and 20 percent. But I have a few demos that I do in my performance tuning stuff where I have to do 25 or 30 percent. And then there’s even one where I have to do a full scan in order for SQL Server to get things right.

So it is not a sort of across the board thing. It’s a very specific thing. And I wouldn’t recommend doing it for every index because or every statistic because that may be counterproductive to your maintenance time. But if you find that there are some statistics where it does work better, you did a good job and you should keep doing it.

All right. Next up here. Hi, Eric. Hi, you. How are you doing? Oh, there’s even a thanks on this one with an X. That’s a spicy thanks. This thing blurry. Is it just my eyeballs?

Let’s see. Can I get it? Oh, there we go. I’m a little bit more. I’m a little bit more clear now. Hi, Eric. I had a procedure with a chain of CTE and unions. You’re off to a good start. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues.

Oh, you have a high pressure job, my friend. I went in desperation and popped an option recompile hash join in the end of the CTE where they were inserting into a temp table. This only took this only took it to less than two seconds.

How on earth can this has such a performance effect and SQL Server cannot find out by itself? Well, you left a few details out of here. You know, I mean, obviously, I’m not asking to see the full definition of the query because chain of CTE and unions.

That’s that’s that’s that’s that’s pay me money territory. And the fact that you have a recompile on there makes me wonder if there are any parameters or local variables involved. It could be perhaps influencing the optimizers estimation process.

But there are lots of cases where SQL Server may misunderstand the number of rows that might, you know, come out of certain operations. You know, I get by your where and join clauses, survive group by having type stuff and make progress through the plan. And you might end up with well, I mean, you know, I’m going to say in general, you might end up with a lot of nested loops joins that execute way, way more than the optimizer anticipated them than them executing.

So without seeing the query plan, my guess would be that you had a lot of nested loops joins in there in that plan and that SQL Server was getting stuck in them, perhaps even serial nested loops joins. I guess there’s also the possibility that you had some untimely merge joins, especially if you had to sort data before going into those merge joins. But the reason why the optimizer didn’t find it?

Well, I mean, you know, it could be a lot of things. It could be that you have statistically inaccurate information in your database. It could be that you had a parameter sensitivity issue.

It could be that you were using local variables and SQL Server was not estimating things well. You could be using a table variable in there, too, and getting a bad estimation. But also SQL Server could just be doing a poor job of cardinality estimation without any of those things.

And you ended up with some nested loops joins that were completely inappropriate for the number of rows that you were managing. As for why SQL Server couldn’t find out by itself, well, you know, that’s a query complexity thing. The optimizer only has so much time that it’s willing to spend on a query figuring stuff out.

And sometimes it does not have a chance to explore or accurately or correctly cost all of the alternatives. So, you know, this is where stuff like query hints really do come into play. You could try simplifying the query a bit more, maybe removing any elements from the query that would harm estimation, local variables, table variables.

Maybe, you know, if there’s any sort of parameter sensitivity, you could find ways of dealing with that. But if you’ve got things solved with recompile hash join, I don’t see a reason for you to keep tinkering with things. Sounds like you’ve got it under control.

So, you know, the optimizer is a big piece of software and it’s just like, you know, it’s made by humans, just like most software. You know, I’m going to say, I’m going to draw the line a little bit at LLM generated software because that’s not generated by humans and that’s full of mistakes too. But, you know, the optimizer can’t do everything.

It can only make a good enough guess most of the time at a good enough plan. All right. Next question here. I have, I recently purchased the everything bundle.

Woohoo. Thank you for your support. And I’m enjoying each module. Is there a recommended viewing order? For example, the index tuning module seems to reference the hardware modules. So I think as long as you do the starting modules first and I think it’s, what’s it called, the essentials modules first, you should be in pretty good shape.

If there’s anything in the other modules that reference stuff back, you can always go watch those other modules afterwards and sort of like, you know, figure out where, where things, you know, maybe connect. But I don’t really have a recommended viewing order aside from like starting in essentials and then kind of whatever you find interesting and appealing or sort of whatever applies to problems that you’re trying to solve. So that’s about all the, that’s about all the advice I have there.

All right. Last question. Gonna finish on a strong one. Oh, my dear friend, Paul. All right.

If you were curating a Paul White’s greatest hits compilation, what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL servers? All right.

So I’m going to answer the second question first. Paul White would be an excellent sandwich maker. Paul White would make the best sandwiches you’ve ever tasted. They would leave you with a sense of peace and well-being, just like his blog posts do, that you didn’t know was humanly achievable. So that, that, that’s, that answers that question.

As far as curating Paul White’s greatest hits, Paul has a number of like, just absolutely, I mean, like the SQL.Kiwi site where all of his posts live now. This is actually something that a lot of people miss. Like a lot of people will Google like Paul White and a topic and they’ll go to like the old SQL performance site, or they might even find like a dead SQL blog link.

But all of Paul’s posts these days are on SQL.Kiwi. And I’m going to put this in the, the, the, the video description because it’s that important. All of Paul’s posts are there.

You can find lots of one-off posts that are amazingly long, detailed, full of, you know, great information. But I think if you were to, if I were to curate Paul White’s greatest hits, it would be any post that is part of a series of posts. Because I know when Paul White writes a series of posts, that man has got his head around something.

And that’s where I think the best stuff comes in because, you know, you have sort of like this learning pathway of, you know, like, like, like, like great, like a series of things about a particular topic that wouldn’t all fit into one post. So I think whenever Paul White writes a series, that’s the stuff that I would put in like the greatest hits. Like if I were going to make a great post-Paul book in tribute to Paul White’s writing, I would start with all of the posts that are a series.

You know, as far as stuff that has, I think the series of posts that has influenced me the most from him is certainly the Isolation Level series. I have spent a lot of time with that one, especially as I’ve been presenting more and writing more about Isolation Levels. So that one has had the, I think, the most recent big impact on me.

But there are many posts of his that, you know, I have read at different points in my career. Paul White has, of course, been blogging for quite a while. He was actually the first blogger I ever read when I was trying to figure out how to write fast performing paging queries.

He had a couple of posts on SQL Server Central. But yeah, like there are posts that I’ve had to read at like various stages in my career because there were things in each of them that I didn’t fully understand until I understood a lot of other stuff first. And so like it took a lot of puzzle pieces for me to actually get some of the points that he was making, even from many years ago.

So I don’t know. Of course, I always thank you to Paul for everything you’ve done over the years. You are just an amazing human being and teacher.

So thank you for that. But yeah, it’s, you know, I would start with anything that’s a series. That’s the really, like, that’s a really crazy, like, you know, you have to sit down and concentrate. You got to like spend some time with those in order to fully get them.

But anyway, that’s five questions. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in another video another time. Oh, okay.

Let’s just, we know it’s tomorrow at noon Eastern. So let’s just stop pretending. Anyway, cool. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.