A Little About Bitmaps in SQL Server
Video Summary
In this video, I delve into the fascinating world of bitmaps in SQL Server query plans, explaining their utility and behavior through a series of examples. Starting off by introducing what bitmaps are and when they’re useful, I walk you through how these bitmap operators function within query execution plans. You’ll learn that while bitmaps get created early on in the plan, they’re actually used later during table scans, significantly reducing the number of rows processed. The video also explores the impact of degree of parallelism (DOP) on bitmap effectiveness and why sometimes a bitmap might not be applied to every operator as expected. By the end, you’ll have a better understanding of how bitmaps can optimize query performance and when they might not be as effective, especially at different DOP settings.
Full Transcript
Erik Darling here with Darling Data, casting big shadows today. Oh, look at me, I’m huge, I’m like the Hulk. Gigantic, huge person. Anyway, we’re going to talk about bitmaps because I find advanced science fascinating, even if I don’t fully understand absolutely all of it. I couldn’t really explain to you in great detail what a bloom filter is. But we can talk a little bit about bitmaps anyway. So let’s do that. Now, we’ve already got query plans turned on because I actually took the time to mentally rehearse this a little bit. And I’m glad that I did because I was sort of reminded of something weird that happens later on in the script that didn’t used to happen. So we’re going to we’re going to talk about all that stuff. So, first, let’s talk about what bitmaps are a little bit and when they are useful to us in SQL Server query plans. Now, the sort of confusing thing about bitmaps is that we see in the query plan, where they get created, where SQL Server decides, we need a bitmap here. So bitmaps get created here. But that’s not really where they get used.
Where they get used is down here. When we touch the table that the bitmap is applied to. Now, quite often, you’ll see something in your query plan that looks about like this. We’ll see a probe bitmap something in row. Batch mode on rowstore might look a little bit different. Sometimes this gets put into a filter operator, you know, sometimes we won’t see a bitmap operator in the plan. A hash operation will have a bitmap creator true. attribute to it if you look in the properties of the the the hash doodad. So if you pretend we pretend that this was in batch mode, and we get the order the properties of the hash join operator, we would see bitmap creator equals true somewhere. Oh, I don’t know about here, maybe somewhere painting with broad strokes there. Now, where when bitmaps are useful, you will see things like this. Now it’s gonna look like SQL Server just made an absolutely terrible estimate here. You’re gonna stomp your feet and you’re gonna say SQL Server, why did you not know that only 40,000 rows were gonna come out of here? Why are you so stupid? Who made you? Who is your creator? Who is the patron saint of your existence?
Who continues to deal with you? Put up with that?
Anyway, it is not SQL Server being wrong, wrong. The cardinality estimate is figured out well before the query optimizer decides that a is it’s going to the cost based optimizer, sorry, decides that a bitmap is going to be created and used some some leave the fancy term for that is post optimization rewrite or something like that. I love all these phrases.
Yeah, just just rewrite it after you’ve optimized it. Why not? Anyway, it’s that’s not what it is. That is the bitmap at work. Note that if we look at the properties of while doing this, the tool tip for this thing, we do read all the rows.
But as we read the rows, we apply that bitmap to filter them out as rows are coming out of the scan, like, it’s just like a sort of like a residual predicate in other in other in other senses, but we’re just ruling rows out way ahead of time. So that’s nice, right? The storage engine is like nope, nope, nope, nope, nope, nope, nope, nope, nope, nope.
And we get down to 40,000 rows out of the 2.4 million rows that we read. So that’s a pretty good reduction from the bitmap. So thank you, bitmap. You’re a great bitmap. The best bitmap that ever bitmapped.
Love you. Love you to death. So bitmaps are good. On the inner side of the scan, pay attention to estimated rows, number of rows read, the actual number of rows that come out. When you see a bitmap, you’re going to see that probe thingy in there, in the tooltip, and you will get some sense of if the bitmap was effective or not.
Now, this is really only useful information for actual execution plans and estimated plans. You just see all the usual dumb things that SQL Server thought would happen. You know, quite the optimist SQL Server is, except for the default database isolation level.
So where they’re not so useful. And this is kind of funny. So this is where, this was the weirdness that happened to me along the way.
Is I used to have a smaller VM that I did demos on. The demo VM was set to have a max stop of 4. To this many.
And if we run this query at max stop 4. And look at the execution plan. Every single row.
Well, I guess just about every single row. Right? Close enough to every single row. 99% of the rows pass the bitmap filter. Right? Because the bitmap filter kind of blooms things based on DOP.
You get that sort of like parallel page supplier type modulus thing of things. And because of that 99% of the rows pass the bitmap filter. This clustered index scan is going to look nearly identical to the last one that we looked at.
Except this one was far less effective here. Right? We still have that probe bitmap in row. But this time 99% of the rows come out rather than taking it from 24 million out of 34,000 or whatever.
So, you know, obviously that’s not very good. But if we run this demo and we don’t restrain the correct. Let’s just explicitly say max stop 8.
If we run this at DOP 8 and we get sort of a larger hash bucket for the bitmap filter. We do actually start reducing rows a little bit more efficiently down here. Granted, you know, 1, 8, 5, 1, 5, 6, 1.
Granted, you know, we’re still at 1.8 million out of 2.4 million. But at least we were able to get some more out of there. So sometimes bitmap effectiveness is dependent on a degree of parallelism.
Now, if we look at the properties of my server, I have eight cores in there. So I can’t really go higher and show you more effectiveness at higher DOPs. But there is at least some evidence that if we were to double this up to like DOP 16, we might even get to rule some more rows out down there in the scan.
Maybe. Maybe. Just maybe.
So, you know, if you suddenly find all of your bitmap operators become completely ineffective because of SQL Server 2000 2022’s degree of parallelism feedback feature, you know who to blame. You know who to send a letter to.
Actually, I don’t even know who to send a letter to. So whatever. So we’ve looked at a couple of cases where the bitmap was applied directly to when we read the index, when things were the storage engine was just like pow, pow, pow, pow, pow, pow, get out of here. Right.
The storage engine was kicking rows out left and right. Well, at least a decent DOP. And that first one was really good. The second one is still still a little iffy on that one. Let’s be honest with you. But not every bitmap gets pushed down that far. And I’ve never really been able to figure out why.
And I’ve always been too shy to ask. Just kidding. No, it’s just one of those things that I have not spent a whole lot of time trying to figure out because it’s never been a terribly big pain point for me. But if we look at this query plan, if we look at this tooltip here, we do not see.
Oh, wait, I should probably get that to a place where you can see it, too. Right. There we go.
So if we look at this tooltip, there is no bitmap at all mentioned here. If you look at this hash aggregate, there is also no bitmap listed here. We have a build residual, but no bitmapping.
And if we look at this repartitioned streams operator, we finally see the bitmap. Isn’t that strange? For some reason, the bitmap got stuck.
The parallel exchange said, oh, I don’t want to go any further than that. Things get weird down there. I would imagine that this is some sort of optimization that why apply a bitmap to all these rows when you can apply a bitmap to just the rows that come out of the aggregation at the repartitioned streams operator. And, you know, that’s actually sensible.
I actually don’t disagree with that. But if I had to put a best guess stamp on a reason why, SQL Server is like, well, you know, I’m going to aggregate all these rows anyway. We can just apply a bitmap to a smaller result set, you know, rather than, let’s see, that’s the comments table.
That’s 2, 4, 5, 3, 4, 7, 3. Oh, that’s an eight digit number. And after the aggregate, we have a 4, 4, 0, 0, 7, 5, 5.
That’s only a seven digit number. So applying that bitmap to far fewer rows seems like actually a pretty sensible optimization choice. So good job there, SQL Server.
I appreciate you. Appreciate you at DOP 8. I don’t appreciate you at any other DOP. DOP 8 or BUST. So, again, bitmaps, they get created in the plan, not where they get used in the plan.
You should pay attention to where the bitmap gets used. If you’re looking at an actual execution plan, you may want to look at how effective the bitmap is. If you’re able to run the query at higher DOPs, you may see bitmaps increase in effectiveness because you would have larger hash buckets to start filling up and filtering things.
So, I don’t know. It’s always worth a shot, right? Let’s throw some more DOP at it.
Don’t downgrade the DOP. Upgrade the DOP. Up with DOP. Down with downgrades. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.
I hope you really dig my Hulk shadows behind me on the green screen. Sensitive sily, that thing. And if you feel so inclined, if you’re feeling generous, if you’re feeling kind today, you want to like and subscribe to my channel.
Like this video and subscribe to my channel. You can’t subscribe to this video. It’s asinine. But if you feel like doing that, I would appreciate it. If not, I still hope you have a great weekend. That’s about it.
Anyway, thank you for watching. I’m going to hit the record button one more time. Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.
I think your hulk shadows are a result of bit mapping lol
Eh yeah modern cartography