2/3
Video Summary
In this video, I delve into a fascinating SQL Server query optimization scenario where we encountered an “eager index spool” in the execution plan—a notorious performance bottleneck. To tackle it, I explored alternative ways to structure our query and improved statistics on key columns to refine aggregate estimates. By doing so, we managed to reduce the query’s runtime from several minutes down to just seven seconds, showcasing the power of better data estimation and query reordering. This video is part of a series focusing on optimizing complex queries with aggregates, and I hope you find it helpful as you navigate your own SQL challenges!
Full Transcript
I ain’t got nobody. Just kidding. Just avoiding going out and doing things. Sometimes David Lee Roth karaoke is the best way to do that. Just kidding. It’s recording SQL Server videos on Saturday. So in the last video we talked about aggregates with bad guesses, and I want to show you another aggregate with a bad guess. This one comes to you live and direct from a client-inspired event, where they had this big, crazy query, and at the heart of it were two counts that we had to filter on. My big idea was that we were going to do one count and filter on that, rather than two separate counts with two separate filters, because we could use math to our advantage. Who would use math in a database? Insanity. Certainly not the optimizer. Just kidding. The optimizer is lovely.
So the query that we were working on was much, much bigger, but this was sort of at the heart of it. Let’s say we had an index on the Comments table on the PostId column, and I wanted to do my count in here like this. When I rewrote this query, what happened was not good. I ended up with a query plan that looked a bit like this. You can see what’s going on here, and you can see where my least favorite thing to ever see in a query plan popped up: the old eager index spool. I hate this thing. I hate this thing for so many reasons.
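The client’s actual query isn’t shown, but a minimal sketch of the shape being described might look something like this, assuming a Stack Overflow-style schema (`dbo.Comments` and `dbo.Votes`, each with a `PostId` column) and a made-up filter threshold. The idea is one pass that counts rows from both sources together, then filters the aggregate once, instead of two separate counts each with their own filter:

```sql
/* Hypothetical sketch, not the client's actual query.
   Count rows from two sources in a single pass, then
   filter on the combined aggregate, instead of running
   two separate COUNTs and filtering each one. */
SELECT
    x.PostId,
    ActivityCount = COUNT_BIG(*)
FROM
(
    SELECT c.PostId FROM dbo.Comments AS c
    UNION ALL
    SELECT v.PostId FROM dbo.Votes AS v
) AS x
GROUP BY
    x.PostId
HAVING
    COUNT_BIG(*) > 100; /* illustrative threshold */
```

The threshold and table names are assumptions for illustration; the point is the single-aggregate shape, which is what ended up tempting the optimizer into the eager index spool.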
And you can see why I hate it here. We spent nearly three minutes building this index. The reason it takes so long is that while the plan says it’s happening in parallel, if we go into the properties and look over here, we can see that, oh boy, just about all 53 million rows from the Votes table end up on a single thread to build that index. And this will happen each and every time. We don’t have a missing index request either, in the plan XML or in the missing index DMVs. We got nothing.
So even though there isn’t physically a lot of distance between here and here, time-wise there is a very large amount of distance between them. The query in full runs for about two minutes and 56 seconds, so we can figure out pretty well that, guess what, we spent most of the time building that stupid spool. So let’s figure out what happened, what went wrong, and how we can fix it.
Now, the thing is, I was confounded by this. And often when I am confounded by things, my first reaction is to do something stupid. So I did what I normally do, which is something stupid: I just flipped the order that we were UNION ALL-ing things in. I went from Comments on top and Votes on the bottom, to Votes on top and Comments on the bottom.
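Using the same hypothetical sketch of the query shape, the change was nothing more than swapping the branches. Flipping UNION ALL branch order is logically a no-op, but it can nudge the optimizer into a different plan shape, which is exactly what happened here:

```sql
/* Same hypothetical sketch with the UNION ALL branches
   flipped: Votes on top, Comments on the bottom.
   Logically identical results, but the optimizer chose
   a different plan without the eager index spool. */
SELECT
    x.PostId,
    ActivityCount = COUNT_BIG(*)
FROM
(
    SELECT v.PostId FROM dbo.Votes AS v
    UNION ALL
    SELECT c.PostId FROM dbo.Comments AS c
) AS x
GROUP BY
    x.PostId
HAVING
    COUNT_BIG(*) > 100; /* illustrative threshold */
```

To be clear, this is not a reliable tuning technique, just the “do something stupid” experiment described here; the real fix comes later in the video.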
And when I did that and ran the query, it was a little bit faster. It was faster because the execution plan changed and we no longer had the index spool. The query was overall faster, but I was still annoyed with a few things in it. When we look at the plan, we’ll start to see why. We can see some more rather large percentages in here, in places where perhaps there shouldn’t be.
We’ll start with this one, where the optimizer thought that it could squish our results down by a significant amount: from the 17 million rows that came out of here down to about 141,000. Its guess was that we could get down to that number, but we could only get down to 5.6 million.
So we stopped early. And along with that bad guess, we get a bad guess here, bad guesses over here, and some bad guesses over here. Now, you’re probably wondering why we have such bad guesses, and I’m going to show you. In a normal row mode plan, or rather in a parallel row mode plan where we have a hash join, we may see a bitmap get created. In batch mode plans, we don’t see that anymore.
But what we will see, in the properties of the Hash Match operator, is Bitmap Creator set to true, and we can see that it created optimized bitmap 1085. Very special bitmap. Just kidding. It’s just optimized bitmap 1085. There’s really nothing all that special about it at all.
And if we look down here, where that bitmap hopefully gets applied, we can see in the clustered index scan of Votes that optimized bitmap 1085 was pushed over here, and if we look at the scan of Comments, that optimized bitmap 1085 was in fact pushed down here. So we made some guesses based on this bitmap, and those guesses turned out to be very, very, very wrong.
Mm-hmm. Yes. I mean, we still did a lot better than, what was it, three minutes, but still not fast enough. Eric, you’re a query tuner. You need to make things faster.
So again, let’s just try to improve the estimate here. Let’s not even create another index; we just need to make a better guess. We need to make a better guess on the Posts table, on the ParentId column, because that’s what we were trying to squish here: ParentId.
If we have better statistics on what’s going on in ParentId, then we can make better choices elsewhere in the plan. So let’s create those with a sample of 25%. Again, this isn’t even pushing the boundary. 25% is not a whole lot, and it finishes in about four seconds.
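The statement described here is a plain `CREATE STATISTICS` with a sampling clause. This is a sketch assuming the Stack Overflow-style `dbo.Posts` table; the statistics name is made up:

```sql
/* Create a sampled statistics object on Posts.ParentId so the
   optimizer gets a better picture of that column's distribution.
   The name is hypothetical; 25 percent sample as in the video. */
CREATE STATISTICS s_Posts_ParentId
ON dbo.Posts (ParentId)
WITH SAMPLE 25 PERCENT;
```

A higher sample than the default gives the optimizer a more accurate histogram on ParentId, which is what feeds the better aggregate estimates shown next.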
It’s something that even people with the most demanding maintenance windows can get away with. And now when we rerun that query with those statistics in place, things will improve a little bit more. At least they do without Camtasia running, so we’ll find out what happens with Camtasia running.
So, seven seconds. We cut the time about in half again, and we can see that we started making quite reasonable guesses in many places. Not fully reasonable guesses, but much better guesses, I think.
Not thousands-of-times-wrong guesses. And I don’t know, things look better to me here. Don’t they look better to you? They look much better to me. I like it.
So just by improving the estimate on the Posts table, on that ParentId column, we were able to make better guesses about what rows are going to come out of things. Oh, we got optimized bitmap 1077 that time.
A wonderful vintage of that bitmap. Wonderful vintage. Yeah. We were able to improve some guesses, get faster queries, and, I think, better query plans.
And sure, just like in the other video, we could go the next step and add indexes to tune this further. But I think getting a query down from several minutes to seven seconds is a pretty good start. So we’re going to pause there, and actually, we’re going to stop there completely.
And we are going to go record another video. Oh, but I wanted to say: with those statistics in place, we even get a good plan with things in the original order. We still get the good plan there; we didn’t even have to change the order anymore.
So lucky for us. Isn’t that lucky for us? We can write this in any order we want now and things don’t get wacky. That’s amazing. Anyway.
Thanks for watching, and I will see you in the third and final installment of the Angry Aggregates. All right. Goodbye. All right.
Bye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.