The SQL Server Performance Tasting Menu: Useful vs Useless Bitmap Operators In Query Plans

Bitwise, Filter Foolish


Video Summary

In this video, I delve into the fascinating world of bitmaps in SQL Server query execution plans. I explore when these magical, filter-y creatures are truly beneficial and when they might just be a waste of space. By walking through various execution plans, I show you how bitmaps can significantly reduce the number of rows processed early on, leading to faster queries. However, I also highlight scenarios where bitmaps don’t offer any real advantage, leaving us with an understanding of their true value and when they might not be worth the overhead. This video aims to demystify these often-overlooked elements in query plans, helping you make more informed decisions about your database optimizations.

Full Transcript

Bitmaps. This one’s about bitmaps. You ever seen a bitmap? I’ve seen some bitmaps. You ever seen a batch mode bitmap? You ever seen a batch mode bitmap be good? Yeah, me either. Anyway, let’s look at some row mode bitmaps. Those magical, wonderful, bloom-filtery, sometimes making your query faster-y things. I’ve spent a lot of time reading research and literature about bitmaps, and that’s pretty much word for word what all of them say. Not a whole lot else. So it’s good that they’re so easy to convey to the layman. So let’s look at times when bitmaps are useful. Let’s look at times when bitmaps are useless. And let’s look at times where bitmaps don’t end up where you might expect them.

Bum, bum, bum, bum, bum, bum, bum, bum. Bitmaps. Bitmaps. So let’s see, when is a bitmap useful? Well, they’re generally useful when they, they, they, they reduce the number of rows that come out of a scan. All right, so let’s, let’s go look at this execution plan. And let’s, let’s do ourselves a little thinking. Let’s do, use our, use a little bit of that gray matter that’s been atrophying away. This extended Netflix and chill session. So in this execution plan, we have a bitmap that gets created here. And this is where bitmaps get created. Generally, sometimes in parallel merge join plans, they might be on the inner side of a join, but we’re not going to look at those because parallel merge join plans were a mistake.

They should never happen. So a bitmap gets created here. It sort of keeps track of values that we’ve seen. We create a little, little bitmap filter that we can apply elsewhere in the query plan. This is where it gets created, but where it gets used is on the inner side of a join. If we hover over this clustered index scan right here, and we look at the tooltip, we have a predicate where we are probing in row for data. Ooh, in row probe. Ooh, la la. Sounds fancy. Sounds like a fun time. Hmm. But we can see that probe is on bitmap 1004, which is the bitmap that got created up yonder. And what it’s doing is before rows even pass through various storage engine-y things, we are, we are filtering them out.

So there are rows that are not going to really even enter the scan. We are at the storage engine layer, filtering out things that we don’t need. And you can see that reflected a bit. Now it’s easy to see here because there’s no additional predicates. If we had like another predicate on the users, filtering something on the users table, you might see an additional predicate applied here. And it might be a little bit better. And it might be a little bit harder to ascertain if the bitmap was useful or useless.

But in this case, we have 39,789 of about 2.4 million rows leaving the scan. Normally, one might think that this was just a bad estimate. But in this case, it is because we have this bitmap that is going in and filtering rows out. Isn’t that lucky for us? If we look at the number of rows that we estimated to read, which is all of them, and the number of rows that we actually read, which is significantly less than 83,000.

And then the actual number of rows that passed out, which is about 39,000. This bitmap did a significant amount of filtering for us. And we did not. And this query did less work because of it, which is fantastic. A fantastical time. Now, let’s look at when a bitmap is useless. Let’s scroll on down here. My mouse wheel has been so weird lately.

It’s like sometimes it goes and other times it just jumps around. And I don’t know anymore if it’s mice or if it’s my laptop or laptop dock. Because it seems like I get a new mouse every three months and then out of nowhere the scroll wheel starts just not being helpful for me. So, anyway, useless bitmap is this one. And it’s not because the query is so very slow or anything and whatever.

It’s more about what the bitmap actually does. Here, again, we’ll look at the index scan on the user’s table. Let’s hover on over that. We have our predicate in row bitmap again. Our hero has arrived here to save the day, here to make this query amazingly fast. Except it’s utterly, absolutely, pointlessly useless. There’s no point to this bitmap whatsoever.

If we look at things in here, we might think, wow, we got such a great estimate. This is the best query plan ever. Good job, optimizer. Way to guess. Way to guess way better than you did on that other plan. But we’d be overlooking the fact that we have a useless bitmap involved here.

So, again, the bitmap gets created here. And we start keeping track of user IDs that we’ve seen and not seen. We’ve got you making a filter. And then we apply it down here. But this time, rather than reducing the number of rows or the number of rows read, that doesn’t happen.

Every single row passes the filter. So if we look at the number of rows that we read, it’s every single row in the user’s table. And if we look at the number of rows that passed out of the scan, it’s like every single row in the user’s table.

So this bitmap did effectively nothing. This bitmap, I mean, it got created and it did stuff, but it didn’t help us filter out any rows. Now, this is sort of, I don’t know, I guess, the best possible scenario for a bitmap to get applied over at the index scan. Right? That happened in the last two queries we looked at.

Sometimes, bitmaps get stuck. So let’s look at a case where a bitmap can get stuck in a funny, funky, awkward place. Now, if we look at this query plan, we again, we have a bitmap. It gets created here.

Same as the last two. First verse, same as the last verse, same as the first little… I forget how this one goes. But if we hover over the index scan here, what do we have? Nothing.

Just a scan. Just a regular old scan. And if we look over here, we have a hash match aggregate. We are building a residual. Building that.

And then if we… There’s no probe here either. But then if we finally go over to this repartition streams, we will find our probe. Look at that.

How did that happen? I don’t know. Magic. Magic. Crazy, crazy magic. Witchcraft. So at the repartition streams operator, that is where we start applying the bitmap filter.

I don’t know that we really… I don’t know that this is really helpful. I mean, the estimate might be partially a little bit wonky because of this aggregate, which is completely wrong.

Right? This user ID aggregate, we are off by 1600 or so percent. And then the number of rows that pass through here is all amok. But it’s hard to know if we can blame the bitmap on that or if we should blame the bad estimate on the aggregate.

I would imagine it’s some confluence of the two, but I’m not really a math guy. So I tend to avoid that. So bitmaps can be great.

If you can get some early… Early… Early… I’m gonna have a son and name him early. If you can get some early semi-row…

semi-join row reductions, you are in good shape with a bitmap. Generally though, if you get a bitmap and it doesn’t… isn’t really helpful, you will probably not notice.

It probably is not the end of the world for your query plans. But this is an interesting subject and it’s one that I don’t see a lot of people talking about. So it’s one that I wanted to teach you a little bit about because it’s one of those sort of like weird things in a query plan that a lot of people overlook and don’t fully understand.

Like where they actually get used, what they actually do, how to tell if they’re useful or not. So there you go. There’s a little bit of knowledge for you.

A little bit of wisdom on this brisk Saturday afternoon. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.