In this video, I delve into the intricacies of SQL Server’s output clause and its impact on query performance. Specifically, I explore how choosing an inappropriate target for the output data can force a query to run in a single-threaded manner, which is something I strongly dislike—parallelism is usually preferable unless there’s a compelling reason not to use it. By using the Stack Overflow database as an example, I demonstrate this issue with a table called `high question scores`, showing how returning output directly to the client or using a table variable can lead to a single-threaded execution plan. To avoid such performance pitfalls, I recommend using a temporary table (prefixed with a # symbol) instead, which allows for parallel inserts and maintains better query performance.
Full Transcript
Erik Darling here with Darling Data. And today we’re going to talk about the output clause in SQL Server in one respect. And that is how if you choose incorrectly or unwisely what you are outputting data to when you use the output clause, you can end up with a query plan that is forced to run single-threaded. If you know anything about me, there is nothing that I hate more than a query that runs single-threaded when it doesn’t have to. It’s not a good thing. Like, I’m okay with parallel plans, I’m okay with serial plans, but I’m not okay with a plan that is artificially forced to run single-threaded for no good reason. So, what I have in my stack overflow database is a table called high question scores. And all it is is a worker table that I use for dumb demos that I can modify all over the place and nothing matters with it. So the whole point of this table, right, is to populate it with the display name and highest score.
That is, for each user, the highest score of a question they've asked. So the select query that does that work looks like this. And I'm just going to find it for one user. All I care about right now is finding it for Jon Skeet. And if I run this query... now, I have accelerated database recovery turned on, so all my rollbacks are instantaneous. None of the query foibles and follies for this query are going to be due to the rollback. If you're, well, I'm sort of standing in front of it, but I promise you this thing only ran for one second. And I just want to actually take some time to talk about how amazing it is that I nailed that to the second. All right. One second flat. All right. Look at that. It was just like 999. Is it going to make it? Can I do it? I did it. All right. And that's cool. Most of the work in this plan is done over here. We did a clustered index thing on this side. All right. Had some parallelism in the query plan. Things went fairly well. All right.
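A rough sketch of what that demo setup might look like, assuming the usual Stack Overflow schema (Users, Posts) and a worker table named dbo.HighQuestionScores; the names and the question filter are my assumptions, not the exact script from the video:

```sql
BEGIN TRANSACTION;

/* Populate the worker table with a user's display name and their
   highest question score. 22656 is Jon Skeet's user Id in most
   Stack Overflow database exports. */
INSERT dbo.HighQuestionScores (DisplayName, Score)
SELECT u.DisplayName, MAX(p.Score) AS Score
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1 /* questions */
AND   u.Id = 22656
GROUP BY u.DisplayName;

/* Accelerated database recovery makes this rollback effectively instant. */
ROLLBACK TRANSACTION;
```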
We also rolled all that back. We don’t have data in the table. All right. This table has been written into and rolled back from a million times. So it doesn’t really matter. Now, what I’m going to show you is what happens to that query if we return output results directly to the client. So let’s run this. You’ll get the sense that this is not going to run for exactly, precisely one second.
All right. This runs for five seconds. And now we have a query that runs for 5.453 seconds single threaded. Now SQL Server is changing its tune about an index for that thing. But if we look at the properties of the plan (now, this is if you're on SQL Server 2022 or you are using some fancy Azure-ish version of SQL that you pay Microsoft a premium for the privilege of using), you will get this non-parallel plan reason: DML query returns output to client. Very descriptive. Thank you.
Thanks for letting us know about that. I appreciate you. It’s a very good reason, I guess. I’m not sure. A little… Quite frankly, my brain’s a little tarnished on that one.
So that’s not great. The other thing that will cause that is if we output to a table variable. So right now I’m going to declare a table called high question scores, which is a table variable because it starts with an at sign. Everything that starts with an at sign is a variable. I don’t care who you are, where you come from, what you do with your life.
If we run this… oh, this also runs for five and a half seconds. Let me zoom in so it's in focus.
And similar to the last query that we looked at, if we go to the properties of the root operator, we will get quite a verbose non-parallel plan reason. There we go: table variable transactions do not support parallel nested transactions.
All right. So if you’re using the output clause and you are moving around large chunks of data, or if the query you are using to generate the output data does a significant amount of work, you may want to consider outputting into a temp table. That is the pound sign table, right? Not the at sign table, the pound sign table.
Because temp tables do not have the same restrictions on parallel inserts that table variables do. Table variables screw you up in many ways. That’s one of them. Temp tables do not have that restriction or limitation on them.
So if you are going to run some big, you know, query that grabs a bunch of data across a bunch of tables and aggregates it and, I don't know, row numbers and who knows what else, and you want to return that data to a client, even if it's only like five or ten rows that come back, if the work you do to get to those five or ten rows is significant, then you should dump it into a temp table first and just select out of the temp table at the end, because you will be able to do a parallel insert into the temp table where possible.
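Here's a minimal sketch of the two OUTPUT targets being compared, reusing the assumed names from the earlier sketch; treat it as an illustration of the pattern rather than the exact demo script:

```sql
/* Table variable target: forces a serial plan
   ("table variable transactions do not support parallel nested transactions"). */
DECLARE @hqs table (DisplayName nvarchar(40), Score integer);

INSERT dbo.HighQuestionScores (DisplayName, Score)
OUTPUT Inserted.DisplayName, Inserted.Score
    INTO @hqs (DisplayName, Score)
SELECT u.DisplayName, MAX(p.Score)
FROM dbo.Users AS u
JOIN dbo.Posts AS p ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1 AND u.Id = 22656
GROUP BY u.DisplayName;

/* Temp table target: no forced-serial restriction, so the rest of the
   plan is still eligible for parallelism. Select from #hqs at the end. */
CREATE TABLE #hqs (DisplayName nvarchar(40), Score integer);

INSERT dbo.HighQuestionScores (DisplayName, Score)
OUTPUT Inserted.DisplayName, Inserted.Score
    INTO #hqs (DisplayName, Score)
SELECT u.DisplayName, MAX(p.Score)
FROM dbo.Users AS u
JOIN dbo.Posts AS p ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1 AND u.Id = 22656
GROUP BY u.DisplayName;

SELECT * FROM #hqs;
```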
But at the very least, you’ll be able to have a parallel plan to get that data into the temp table. So, please do mind your output clause targets in SQL Server, because there are, as usual, landmines in everything you do. All right. Cool.
Thank you for watching. I hope you learned something. I hope you enjoyed yourselves at least a little bit. Please be kind. Like and subscribe.
I’m not even asking for a comment, just a like and a subscribe. Just a little piece of your life. Like I’ve given you a little piece of my brain and heart, liver, lungs, kidney, large intestine, small intestine, colon, pancreas, appendix. I don’t know. Whatever. I’m through with this.
I only, my copy of Operation when I was a kid was broken. It was missing most of the pieces. So human anatomy is largely a mystery to me.
Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into an intriguing bug that cropped up recently while working with a client’s codebase. The issue revolves around a peculiar update statement that initially baffled me due to its seemingly incorrect syntax. I explain how the query managed to compile and run despite updating the wrong table, thanks to an unintentional aliasing mistake. By correcting the aliases in the query, we observed a significant improvement in the generated query plan, highlighting the importance of proper alias usage, especially in modification queries. This experience serves as a reminder to always double-check our SQL syntax, particularly when dealing with updates and other modifications, to avoid unintended data changes or performance issues.
Full Transcript
Erik Darling here with Darling Data Enterprise for Business Platinum 365. And in this intentionally short video, where I am totally screwing up my green screen by having extra windows open that are bright (look, everything magically got better), I want to talk about a weird bug that I found in some client code recently that I have simplified down to this. Now I've got two tables created in my database. I've got a table called T, and you can see there is already an object named T in the database, and a table called TT, and there is already an object named TT in the database. It's a little funny that IntelliSense is freaking out about this one, but this one is like, nah, you're cool. But anyway, I got an email from a client, and they were like, something weird is going on with this update statement. We can't figure it out. Please take a look at it. And so they sent me the update statement. And of course, I was up late writing demos the night before. And I looked at the query and I was like, well, how does that even work? How does that do anything? I mean, you're updating T, right? You have an alias, and you're setting T dot ID equals one from a table called TT, aliased as TT. How is this even doing anything? What is happening? Send me the query plan. And so they sent me the query plan. And I'm going to show you what this query plan looks like. Let me go over here and look. And I mean, obviously, the object called T in my database is there with a columnstore index on it. That was my little cheat code for various things in some demos. But I looked at the query plan and I was like, well, you've got two tables in there. And they're like, well, is it foreign keys? A trigger? What's going on? I was like, no, no, there's a table named the thing that you're updating, but you're still selecting from the wrong table. You didn't alias the second table correctly. I guess they had some unnamed code completion and formatting tool that had aliased the table automatically for them, and it was unintentional. So I looked at the query plan and, well, I mean, it's weird that this even works, right? Like, it's astounding to me that this compiles and SQL Server's like, cool, no problem.
Got it. All the reasonable things that I feel like I've written where SQL Server's like, whoa, whoa, whoa, whoa, back it up, we can't possibly do that. And yet this thing compiles and runs. So I sent them back a copy and paste of the query done correctly, where we update TT and we set TT dot ID equals one. And miraculously, we got a much more sensible query plan that didn't involve any other tables in the database.
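A minimal sketch of the shape of that mix-up, assuming two trivial one-column tables; this is an illustration of the aliasing mistake, not the client's actual code:

```sql
CREATE TABLE dbo.t  (id integer NULL);
CREATE TABLE dbo.tt (id integer NULL);

/* The accidental version: the UPDATE target "t" doesn't match any alias
   in the FROM clause, so SQL Server resolves it as the table dbo.t.
   It compiles and runs, and modifies the wrong table entirely. */
UPDATE t
    SET t.id = 1
FROM dbo.tt AS tt;

/* The intended version: the target matches the alias, and only dbo.tt
   gets touched, with a much more sensible query plan. */
UPDATE tt
    SET tt.id = 1
FROM dbo.tt AS tt;
```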
So anyway, if there's a lesson here, please pay attention to your aliases, especially in modification queries. You may find at various points that if you alias things incorrectly, SQL Server will throw an error. You may find in other circumstances that if you alias things incorrectly, you will muck up a lot of data that you shouldn't have actually touched. Because the way this thing used to be written, it would have actually updated the T table and, you know, changed a row to something. It wouldn't have changed the whole table, because there is, what do you call it... let's actually step back. There's some automatic phase separation in this query plan.
And we've talked about this in prior videos. I talked about it with a hash join. But here we have a sort that actually does the work that a spool would have done in another plan. So you wouldn't update the entire table, you would just update one row out of there. So isn't that nice? Anyway, like I said, intentionally short video, blog fodder, felt like recording it, didn't feel like writing it. There are times when I love sitting down and writing blog posts, typing things. Right? Taking screenshots and uploading them to WordPress. And there are other times when, gosh darn it, it's just so much better to do a video. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you'll be very cautious with your table aliasing, especially in modification queries.
And if you're feeling generous today, if you're feeling like I really earned your time and attention, you can like and subscribe to my channel. Right? Thumbs up and the bell thing. Anyway, thank you again for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the fascinating world of bitmaps in SQL Server query plans, explaining their utility and behavior through a series of examples. Starting off by introducing what bitmaps are and when they’re useful, I walk you through how these bitmap operators function within query execution plans. You’ll learn that while bitmaps get created early on in the plan, they’re actually used later during table scans, significantly reducing the number of rows processed. The video also explores the impact of degree of parallelism (DOP) on bitmap effectiveness and why sometimes a bitmap might not be applied to every operator as expected. By the end, you’ll have a better understanding of how bitmaps can optimize query performance and when they might not be as effective, especially at different DOP settings.
Full Transcript
Erik Darling here with Darling Data, casting big shadows today. Oh, look at me, I’m huge, I’m like the Hulk. Gigantic, huge person. Anyway, we’re going to talk about bitmaps because I find advanced science fascinating, even if I don’t fully understand absolutely all of it. I couldn’t really explain to you in great detail what a bloom filter is. But we can talk a little bit about bitmaps anyway. So let’s do that. Now, we’ve already got query plans turned on because I actually took the time to mentally rehearse this a little bit. And I’m glad that I did because I was sort of reminded of something weird that happens later on in the script that didn’t used to happen. So we’re going to we’re going to talk about all that stuff. So, first, let’s talk about what bitmaps are a little bit and when they are useful to us in SQL Server query plans. Now, the sort of confusing thing about bitmaps is that we see in the query plan, where they get created, where SQL Server decides, we need a bitmap here. So bitmaps get created here. But that’s not really where they get used.
Where they get used is down here, when we touch the table that the bitmap is applied to. Now, quite often, you'll see something in your query plan that looks about like this: we'll see a probe bitmap something, in-row. Batch mode on rowstore might look a little bit different. Sometimes this gets put into a filter operator, and sometimes we won't see a bitmap operator in the plan at all. A hash operation will have a Bitmap Creator = True attribute on it if you look in the properties of the hash doodad. So if we pretend that this was in batch mode, and we open the properties of the hash join operator, we would see Bitmap Creator equals True somewhere. Oh, I don't know about here, maybe somewhere; painting with broad strokes there. Now, where bitmaps are useful, you will see things like this. Now it's gonna look like SQL Server just made an absolutely terrible estimate here. You're gonna stomp your feet and you're gonna say, SQL Server, why did you not know that only 40,000 rows were gonna come out of here? Why are you so stupid? Who made you? Who is your creator? Who is the patron saint of your existence?
Who continues to deal with you? Put up with that?
Anyway, it is not SQL Server being wrong. The cardinality estimate is figured out well before the cost-based optimizer decides that a bitmap is going to be created and used. The fancy term for that is a post-optimization rewrite, or something like that. I love all these phrases.
Yeah, just rewrite it after you've optimized it. Why not? Anyway, that's not what it is. That is the bitmap at work. Note that if we look at the tooltip for this thing, we do read all the rows.
But as we read the rows, we apply that bitmap to filter them out as they're coming out of the scan. It's sort of like a residual predicate in other senses, but we're just ruling rows out way ahead of time. So that's nice, right? The storage engine is like nope, nope, nope, nope, nope, nope, nope, nope, nope, nope.
And we get down to 40,000 rows out of the 2.4 million rows that we read. So that’s a pretty good reduction from the bitmap. So thank you, bitmap. You’re a great bitmap. The best bitmap that ever bitmapped.
Love you. Love you to death. So bitmaps are good. On the inner side of the scan, pay attention to estimated rows, number of rows read, the actual number of rows that come out. When you see a bitmap, you’re going to see that probe thingy in there, in the tooltip, and you will get some sense of if the bitmap was effective or not.
Now, this is really only useful information for actual execution plans. In estimated plans, you just see all the usual dumb things that SQL Server thought would happen. You know, quite the optimist SQL Server is, except for the default database isolation level.
So where they’re not so useful. And this is kind of funny. So this is where, this was the weirdness that happened to me along the way.
It's that I used to have a smaller VM that I did demos on. The demo VM was set to have a MAXDOP of 4. This many.
And if we run this query at MAXDOP 4 and look at the execution plan: every single row.
Well, I guess just about every single row. Right? Close enough to every single row. 99% of the rows pass the bitmap filter. Right? Because the bitmap filter kind of blooms things based on DOP.
You get that sort of like parallel page supplier type modulus thing of things. And because of that 99% of the rows pass the bitmap filter. This clustered index scan is going to look nearly identical to the last one that we looked at.
Except this one was far less effective here. Right? We still have that probe bitmap in-row. But this time 99% of the rows come out, rather than taking it from 2.4 million down to 40,000 or whatever.
So, you know, obviously that's not very good. But if we run this demo and we don't restrain the DOP, let's just explicitly say MAXDOP 8.
If we run this at DOP 8, we get sort of a larger hash bucket for the bitmap filter, and we do actually start reducing rows a little bit more efficiently down here. Granted, you know, 1,851,561 rows.
Granted, you know, we’re still at 1.8 million out of 2.4 million. But at least we were able to get some more out of there. So sometimes bitmap effectiveness is dependent on a degree of parallelism.
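If you want to poke at this yourself, here's a rough repro shape for the kind of query involved, with hedges: it assumes the Stack Overflow Users and Comments tables, and the reputation filter is just there to keep the build side of the hash join small so a bitmap gets created. Run it once per DOP and compare the Comments scan's Number of Rows Read against its Actual Number of Rows.

```sql
SELECT c.UserId, COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Comments AS c
    ON c.UserId = u.Id
WHERE u.Reputation >= 500000 /* small build side feeds the bitmap */
GROUP BY c.UserId
ORDER BY records DESC
OPTION (MAXDOP 4); /* then try MAXDOP 8 and compare the scan's
                      Number of Rows Read vs. Actual Number of Rows */
```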
Now, if we look at the properties of my server, I have eight cores in there. So I can’t really go higher and show you more effectiveness at higher DOPs. But there is at least some evidence that if we were to double this up to like DOP 16, we might even get to rule some more rows out down there in the scan.
Maybe. Maybe. Just maybe.
So, you know, if you suddenly find all of your bitmap operators become completely ineffective because of SQL Server 2022's degree of parallelism feedback feature, you know who to blame. You know who to send a letter to.
Actually, I don't even know who to send a letter to. So whatever. So we've looked at a couple of cases where the bitmap was applied directly when we read the index, where the storage engine was just like pow, pow, pow, pow, pow, pow, get out of here. Right.
The storage engine was kicking rows out left and right. Well, at least at a decent DOP. That first one was really good. The second one is still a little iffy, let's be honest. But not every bitmap gets pushed down that far. And I've never really been able to figure out why.
And I’ve always been too shy to ask. Just kidding. No, it’s just one of those things that I have not spent a whole lot of time trying to figure out because it’s never been a terribly big pain point for me. But if we look at this query plan, if we look at this tooltip here, we do not see.
Oh, wait, I should probably get that to a place where you can see it, too. Right. There we go.
So if we look at this tooltip, there is no bitmap at all mentioned here. If you look at this hash aggregate, there is also no bitmap listed here. We have a build residual, but no bitmapping.
And if we look at this repartitioned streams operator, we finally see the bitmap. Isn’t that strange? For some reason, the bitmap got stuck.
The parallel exchange said, oh, I don't want to go any further than that. Things get weird down there. I would imagine that this is some sort of optimization: why apply a bitmap to all these rows when you can apply it to just the rows that come out of the aggregation, at the repartition streams operator? And, you know, that's actually sensible.
I actually don’t disagree with that. But if I had to put a best guess stamp on a reason why, SQL Server is like, well, you know, I’m going to aggregate all these rows anyway. We can just apply a bitmap to a smaller result set, you know, rather than, let’s see, that’s the comments table.
That's 2, 4, 5, 3, 4, 7, 3... oh, that's an eight-digit number. And after the aggregate, we have 4,400,755.
That's only a seven-digit number. So applying that bitmap to far fewer rows seems like actually a pretty sensible optimization choice. So good job there, SQL Server.
I appreciate you. Appreciate you at DOP 8. I don’t appreciate you at any other DOP. DOP 8 or BUST. So, again, bitmaps, they get created in the plan, not where they get used in the plan.
You should pay attention to where the bitmap gets used. If you’re looking at an actual execution plan, you may want to look at how effective the bitmap is. If you’re able to run the query at higher DOPs, you may see bitmaps increase in effectiveness because you would have larger hash buckets to start filling up and filtering things.
So, I don’t know. It’s always worth a shot, right? Let’s throw some more DOP at it.
Don’t downgrade the DOP. Upgrade the DOP. Up with DOP. Down with downgrades. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.
I hope you really dig my Hulk shadows behind me on the green screen. Sensitive, silly thing. And if you feel so inclined, if you're feeling generous, if you're feeling kind today, you want to like and subscribe to my channel.
Like this video and subscribe to my channel. You can’t subscribe to this video. It’s asinine. But if you feel like doing that, I would appreciate it. If not, I still hope you have a great weekend. That’s about it.
Anyway, thank you for watching. I’m going to hit the record button one more time. Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the nuances of local variables in SQL Server and how they differ from parameters and literal values. I explore why using a local variable to fix parameter sniffing is not only ineffective but also counterproductive, as it leads to poor cardinality estimates and suboptimal query plans. Through practical examples, including dynamic SQL and stored procedures, I demonstrate that relying on local variables can exacerbate the very problem they are meant to solve. I also discuss alternative strategies, such as using recompile hints or parameterized dynamic SQL, which can help mitigate issues with parameter sniffing while maintaining performance and accuracy.
Full Transcript
Erik Darling here with Darling Data. And in this video, we are going to talk about how local variables are different and how they work in SQL Server. What are they different from? Well, parameters for one, literal values for another, things like that. And if you want maximum detail on this, I would suggest you go here. This is a post on my site. You can follow the trustworthy bit.ly link here. That's bit.ly slash uppercase L, local, uppercase V, variables. If you go to this URL but you neglect to put the proper capitalization in, you will end up at a place that I cannot vouch for or verify. I have no idea what's going to show up on your screen. Anyway, the statistics histogram that you get is built on the leading key column of the index.
There are no 3D or 4D chess statistics objects in SQL Server, for better or for worse. Maybe someday we’ll get vectorized statistics objects or something.
But when we create indexes and the statistics that get created alongside them show up magically in our databases, in our statistics views in our databases, this is roundabout the data that SQL Server stores about what is in our statistics.
You have some basic information up here. You have some density information down here. And then you have the histogram down here, which has 1, 2, 3, 4, 5 columns in it that helps SQL Server make cardinality estimates based on what you choose to filter on.
The one that I want to filter on here, or the one that I want to call attention to, is RANGE_HI_KEY 22656, which has 27,901 equality rows in it.
And SQL Server knows it. SQL Server can tell that from the statistics objects. So if we do an equality search on 22656, we should be able to get a correct guess for the number of rows that exist for it.
But if we do something like this, let’s turn on query plans too. We might need those down the line. And we get the correct number of records back. We declare a local variable.
It’s an integer. We filter to that integer. And we look at the execution plan. SQL Server had no idea what the hell was going on. Guess 12. 12 rows.
That’s not a very good guess compared to 27,901. Might make for some interesting plan choices if we were joining this off to other tables and consider this to be part of some critical piece of code. Now, we get that guess.
If we look at the statistics, we get that guess from looking at the all density portion of the statistics object and multiplying that by the number of rows in the table.
So if we take those two numbers here and we look at what comes out, we get 11.9-something digits, and SQL Server rounds that up to 12.
So it guesses 12 rows come out based on that density vector guess. Density is calculated by looking at the number of distinct values assumed for that column in the table.
So this number here, I promise you, is equivalent to the all density number in the statistics object that we just looked at.
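To see the ingredients of that guess yourself, something along these lines works; the table and statistics names here are assumptions, so point DBCC SHOW_STATISTICS at whichever index covers the column you're filtering on, and the numbers in the arithmetic are purely illustrative.

```sql
/* Density vector and histogram for an index.
   Replace the object and statistics names with your own. */
DBCC SHOW_STATISTICS ('dbo.Posts', 'ix_posts_owneruserid') WITH DENSITY_VECTOR;
DBCC SHOW_STATISTICS ('dbo.Posts', 'ix_posts_owneruserid') WITH HISTOGRAM;

/* The local variable guess is roughly:
   "All density" of the leading key column * rows in the table.
   With made-up numbers that mirror the demo: */
SELECT 0.0000064 * 1876219.0 AS density_vector_guess; /* about 12 rows */
```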
Now, I still hear every time when I look at client code that they used a local variable to fix parameter sniffing. However, I’m not sure in most cases how that can be a valid use case.
It’s a valid use case about once every four or five years. I’m not sure how that can be a generally valid use case considering the guess is so bad that you get that guess for everyone.
You get 12 for everybody no matter what. It's a uniform guess. If you truly have a parameter sniffing problem, because some values are responsible for millions of rows in your table and some values are responsible for thousands or hundreds of rows, and you're getting bad plan choices because of that, it's hard for me to imagine how you fixed parameter sniffing when everyone's getting that crappy guess.
It’s just not sane nor rational to me that anyone still believes that when they say it out loud. I don’t know how anyone says that with a straight face.
I don't know how anyone says that with any face, really. Anyway, let's take a stored procedure, since stored procedures are usually where people are afraid of getting whacked with that old parameter sniffing problem.
We're going to run this query in three different ways. Now, inside our stored procedure, and this is close to what I'll usually see, we're passing in a parameter called ID, and inside the stored procedure, we have a local variable called ID fix.
Now, let's just pretend that once upon a time, someone passed in some bad data to the stored procedure, it got a bad execution plan, and then things went all haywire.
So we create a local variable and we say if the ID is less than zero, then we set it to one, else we use the ID. And we’re going to use this in three different ways. We’re going to do this once using dynamic SQL, but we’re going to pass the ID fix local variable to dynamic SQL.
The trick here is that the dynamic SQL is parameterized so that we will magically, through alchemy, change our local variable into a parameter.
And I disagree with this capitalization here. I don’t know where that came from. I’m terribly embarrassed. I don’t know where that came from.
Aaron Bertrand’s going to, I don’t know, find some way to kill me. I don’t think he owns guns. But we’re also going to look at the results of this query, which is using the ID parameter rather than the local variable.
And then finally, we're going to use this one where we use the local variable outside of dynamic SQL. And so we already have this created. Now, all we have to do is execute this stored procedure.
We get the correct number of rows back for each and every one of these queries. But of these, this one gets a correct guess of 27,901.
This one also gets a correct guess of 27,901. And this one, of course, gets the crappy local variable guess of 12.
So again, using a local variable does not fix parameter sniffing. It merely changes the cardinality estimation model used when SQL Server goes and does all that fancy relational algebra of figuring out how tables get assembled using various letters in the Greek alphabet.
So as much as possible, I do urge you to avoid using local variables like this. If you’re dead set on using local variables for anything, I would suggest if the code is not frequently called, throwing a recompile hint on there.
If the code is frequently called, and you need to figure out some local variable values, please either pass those local variables to parameterized dynamic SQL or to a stored procedure, which can, again, through the magic and wonder of alchemy, convert them over to parameter values.
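Here's a minimal sketch of that pattern, assuming a hypothetical procedure and the Stack Overflow Posts table; the point is just that the local variable gets handed to sp_executesql as a proper parameter.

```sql
CREATE OR ALTER PROCEDURE dbo.GetPostsByOwner (@id integer)
AS
BEGIN
    /* The "fix" local variable people usually reach for */
    DECLARE @id_fix integer = CASE WHEN @id < 0 THEN 1 ELSE @id END;

    DECLARE @sql nvarchar(max) = N'
    SELECT p.Id, p.Score, p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @id;';

    /* @id_fix is a local variable out here, but inside the dynamic SQL
       it becomes a real parameter, so it can be sniffed and estimated
       from the histogram rather than the density vector. */
    EXECUTE sys.sp_executesql
        @sql,
        N'@id integer',
        @id = @id_fix;
END;
```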
And we don’t have to worry about getting crappy guesses over and over again. Now, this isn’t to trivialize issues with parameter sniffing. I do talk a lot about that and how to fix that in various ways we can deal with it.
Aside from Microsoft SQL Server 2022’s parameter-sensitive plan optimization, which is, I don’t know. You know when Goldilocks shows up to the three bears’ house and Mama Bear’s porridge is cold and Mama Bear’s bed is too squishy and I sort of feel like that’s what we got out of the box with that.
Hopefully some improvements get made there, but I do wonder. And speaking of the parameter-sensitive plan optimization, if we look at the text for these, we will see that that optimization did kick in for these first two queries, which used a parameter value, but not for this third query, which used a local variable.
So using local variables breaks the parameter-sensitive plan optimization, at least as of today’s recording. Anyway, that’s enough about this. I’ve said too much already.
I don’t know. Thank you for watching. Hope you enjoyed yourselves. Hope you learned something. I do hope you’ll choose to like and subscribe and stick around a little bit.
And, you know, maybe learn a thing or two from an old man about SQL Server. An old man yells at SQL Server. Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
A Little About How Overindexing Can Hurt SQL Server Performance
Video Summary
In this video, I delve into three unexpected ways that over-indexing can negatively impact SQL Server performance—far beyond just slowing down inserts and updates. I explore how having numerous unused or redundant indexes can lead to inefficient buffer pool utilization, causing more frequent disk I/O operations as the same data is repeatedly cached in memory. This not only wastes space but also introduces variable query performance due to constant swapping of data pages between memory and disk. Additionally, I demonstrate how extra indexes can trigger earlier lock escalation attempts during transactions, leading to increased locking overhead on both tables and indexes. Lastly, I showcase how transaction logging can be unnecessarily burdened by redundant indexes, resulting in more extensive log activity for the same operations. With practical demos, I aim to provide clear insights into these often-overlooked performance pitfalls.
Full Transcript
Erik Darling here with Darling Data, of course. And today we are going to talk about three ways that over-indexing can hurt SQL Server performance. Now, I don’t mean it in the sense that a lot of folks think about it. We’re like, oh, I added too many indexes. Now my inserts are slow. Or now my updates are slow. Or now my deletes are slow. Because sure, that can happen. But that’s the first meme that everyone throws out whenever they think about over-indexing. Now, by over-indexing, I don’t necessarily mean the quantity of indexes. What I mean is the indexing strategies that I see a lot while working with clients, where you’ll either have, you know, like a single column index on every column in the table. Or a bunch of indexes that have been sort of ad hoc over the years that no one has ever sort of reanalyzed to see if they are still of any use or validity. Some indexes could have been created to support parts of the application that aren’t parts of the application anymore.
Or someone could have added another index later on that SQL Server started choosing and using over another index. And you just see either index usage patterns, like drop off for some index, like have totally like zero read indexes. Like these indexes just aren’t helping queries go any faster.
And you still have to keep those indexes in sync whenever you modify the base table, because if you didn't do that, your data would be corrupt. And also creating indexes on sort of the same sets of columns over and over again. You know, like column A, column B, include D, then on column A, B, D, and then column A, B, C, and stuff like that, where you just see the same columns indexed over and over again.
And, you know, like index on column A, index on column A, B, C, index on column A, B, C, include column F, G, H. I know all the letters of the alphabet. I can do this all day, all day, standing on my head. But three ways that are sort of unexpected when I talk about them with the folks I work with are how having lots of unused and duplicative indexes can hurt the buffer pool.
How it can lead to earlier lock escalation attempts. Remember that lock escalation is something that is attempted but not always granted. And, of course, transaction logging.
And I have demos to prove it all. I brought receipts. So, the first thing we’re going to look at is how indexes, how indexing, rather, can have you make less efficient use of your buffer pool in SQL Server. The buffer pool, of course, being where SQL Server stores all those lovely data pages that queries want to work with.
SQL Server being a mildly intelligent database. It doesn’t work with pages directly on disk. Any query that wants to read data or modify data, those data pages have to go in memory before SQL Server will start dishing them out to queries to work with.
So, and this goes even for unused indexes... well, this one is more about indexes that are read from. If you're indexing sort of the same columns over and over again with slightly different includes, stuff like that.
Or even the same key columns with a little extra just sort of tacked on at the end over and over again. You will end up with a lot of the same data in memory over and over. Now, this is a problem because most of the servers I look at, and I'm talking specifically to you, I'm pointing at you, I'm saying you, probably do not have enough memory in your SQL Server to cache the data relevant to your workload in memory.
And having the same data effectively, remember, every index is a completely separate structure on disk, and so it’s a separate structure when you read it up into memory. Having the same data indexed over and over again means that you have the same data up in the buffer pool, which means you have less room for other things in the buffer pool. And that can generally lead to a pretty bad situation.
We are constantly going out to, you know, those, well, I’m sure your SAN disks are great, but maybe your SAN network isn’t so hot. But you’re just constantly sort of, like, hot-swapping data for one table or index and for data in other table and indexes constantly. And that just introduces weird variable performance issues because you may have a query that runs very quickly when everything is in memory and then a query that takes, you know, a lot longer when you have to go out to disk and read a bunch of pages in.
So the first thing that I do for all of these demos, because sometimes I don’t always do them in the same order, is rebuild the user’s table so that all the data pages for it are nice and densely packed. I do some updates and other stuff across these that sometimes leads to the user’s table being in a strange state when I go to run the actual demos. So the first thing I’m going to do is tell you a little bit about the scripts that I’m using here.
So let’s actually create these indexes while I talk through those. I get those off and running. This shouldn’t take too, too long to create.
The user’s table is rather small, and I have a great home computer for this sort of stuff. So I have two helper scripts that I’m using for this one. One is called What’s Up Indexes.
It is not a terribly expansive piece of code. It just gets some general information about tables and indexes for a specific database. These are available on my GitHub repo.
The link to that will be in the description of the video. So if you feel like also clicking on that link, as well as clicking on like and clicking on subscribe, you can also go get the scripts to do this stuff at home. So if I run this, we’re going to get some very basic information about the indexes in the database that I just created here.
So we are in the Stack Overflow database. We’re looking only at the user’s table. And we have four indexes currently on the table, one being the clustered primary key, which is 348 megs and about 44,000 pages.
And then we have the three nonclustered indexes that we created, which, since they’re only on a subset of the columns from the clustered index, they are all smaller than the clustered index. But they add up to around like 212 carry the thing, maybe 213 megs total. Again, not huge, but, you know, I’m trying to do some quick demos here.
In real life, we would be looking at much, much larger things. But about 212 megs for all of the indexes. Now, the indexes that I have sort of display a pattern that I see quite a bit in client work, where we have one on reputation include display name website URL, one on reputation creation date include display name location, and one on reputation creation last access date include display name upvotes downvotes.
So three indexes that, I mean, they have slightly different includes, but the key columns all share some commonalities as far as column order and all that stuff goes. So another helper view that I have in here is called What's Up Memory, which looks at pages in the buffer pool and which objects are responsible for them. And since I just created these three indexes, the entirety of each index is going to be in memory.
And two, since I had to read from the clustered primary key to create those indexes, like that was a data source for the indexes to create from, that whole thing is currently in memory. At the start of this test, what I’m going to do is run a checkpoint and drop clean buffers. I do it twice because I am a superstitious man.
But if we run that twice and we look at what’s currently in memory, there is absolutely nothing, which is great and wonderful. And boy, I just couldn’t ask for a better demo so far. So I’m going to run this.
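Not his actual helper script, but a rough equivalent you could run at home; it clears the buffer pool (dev/test only, please) and then groups buffer pool pages by object and index.

```sql
CHECKPOINT;
CHECKPOINT;
DBCC DROPCLEANBUFFERS; /* dev/test only: empties the buffer pool */

/* Which objects and indexes own pages in the buffer pool right now */
SELECT o.name AS table_name,
       i.name AS index_name,
       COUNT_BIG(*) AS pages_in_memory,
       COUNT_BIG(*) * 8 / 1024 AS approx_mb
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
    ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p
    ON p.hobt_id = au.container_id /* IN_ROW_DATA and ROW_OVERFLOW_DATA */
JOIN sys.objects AS o
    ON o.object_id = p.object_id
JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id = p.index_id
WHERE bd.database_id = DB_ID()
AND   o.is_ms_shipped = 0
GROUP BY o.name, i.name
ORDER BY pages_in_memory DESC;
```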
I’m going to look at what’s in memory before I run my queries. Then I’m going to run these three queries that all hit the user’s table in a different way. And then I’m going to look at what’s in memory afterwards.
And the execution plans for this are not terribly important because you’ll see at the very end that we ended up reading good chunks of all three of those indexes up into memory. So the first one uses the index U1 right there. I guess I did win, didn’t I?
I won the SQL Server lottery. The second query uses U2, which is a band that I absolutely hate. And they are not New Wave, no matter what anyone says.
And the third query will, of course, use U3 right there. And note that even though we seeked into all of these indexes, we still had to read a pretty significant amount of the pages in. So this is what’s currently in memory.
We got, let’s see, a 53 plus a 47. So that’s 60, 100, about 130, 130 something megs of the indexes are currently in memory of the 212 megs that the indexes make up altogether. So we have three different indexes, nearly the same data in there.
And we have, you know, what I would consider to be sort of a wasted buffer pool space because of that. Now, when I see this when working with clients, usually I consider it part of my job to start consolidating indexes. I mean, like obviously dropping off unused indexes if the server’s been up long enough to make a sort of confident call on that.
But then, you know, consolidating duplicate and, you know, close by duplicate indexes so that we have fewer objects competing for space in the buffer pool. So how I would do that here, I mean, I wouldn’t run drop indexes on a client database unless they were really mean to me, didn’t pay their bills. But what we can do is we can consolidate those three indexes into one index that has the key columns that we care about and then the included columns that we, that sort of made, that all the three different indexes had in there.
Again, include column order doesn’t matter. They’re just window dressing for the indexes. We can put them in any old way we want.
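Roughly what that consolidation might look like, using the three index definitions described earlier; the index names u1 through u4 come from the video, but the exact options are my own choices rather than the demo script.

```sql
/* One index whose key columns and includes cover all three of the
   earlier overlapping patterns on the Stack Overflow Users table. */
CREATE INDEX u4
    ON dbo.Users (Reputation, CreationDate, LastAccessDate)
    INCLUDE (DisplayName, WebsiteUrl, Location, UpVotes, DownVotes);

/* Once you're confident nothing still needs them, the three
   overlapping indexes can go. */
DROP INDEX u1 ON dbo.Users;
DROP INDEX u2 ON dbo.Users;
DROP INDEX u3 ON dbo.Users;
```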
And let’s run that same experiment now with just the single index and see how things turn out. So we start off with nothing in the buffer pool because we cleared everything out. All three queries run and use the u4 index, which makes me euphoric.
Kill me now. And all three are going to use that index because that index has all of the same stuff in there. Note that we did a seek into this index for all of these, starting with the first one.
Let’s just zoom back in on this. All three of these do a seek into the u4 index. That joke was u4 icky.
But now, rather than having, you know, 100-something megs from three different indexes sitting in the buffer pool, we have one index in the buffer pool that takes up about 70 megs. So obviously, we’re doing much better buffer pool utilization because of that.
Now, if you go look at this, remember that the three indexes combined were 213-ish megs altogether. This index all consolidated is 101 megs altogether. And so we have one much smaller data source that can service all three of the queries that we care about.
So, yeah, I don’t know. I guess that’s about that there, right? Cool.
So now we know that consolidating indexes, and that would include getting rid of unused indexes, can help us save space both on disk and in the buffer pool, right? Those precious disks that your SAN administrator gets paid billions of dollars to manage.
Billions. It could be an Oracle DBA making that kind of money. All right.
So the second thing that overindexing, having too many unused and duplicative indexes around, can hurt is lock escalation. So we're going to start off.
We’re just going to rebuild the user’s table real quick. And the first thing we’re going to do is in the transaction here, we’re going to update this table. I don’t even know why I keep that in there.
We don't even look at the execution plan of this one. It's a habit, I guess. And then we're going to use a different helper object called What's Up Locks. Since we're not doing anything right now, there are no locks to assess in here. But once we're in a transaction with this update, boy, howdy, there'll be fireworks.
So if we run this with just the clustered index, and we look at what What's Up Locks gives us, we can see that SQL Server has one intent exclusive lock on the object, which isn't like a locking-locking thing.
It's more of a pre-lock kind of thing, there to negotiate with other locks, like an object-level shared lock on the table, on the object. But the real thing that we care about here are the granted exclusive locks on pages in the primary key.
So that’s just with one index. We end up with 3,394 pages locked. If we add in this index, and we run through the same thing again, we will see that SQL Server now has a new object to lock, and it takes out more locks, not like the time machine villains, but more locks as in like more locking on the index.
And we end up with some thousands of locks taken and granted, both still on the primary key and still on, come on, mouse cursor, do what I want. And now also on the non-cluster index that we just created.
Those are 4,002 key locks there. So we have key locks here and key locks there. Now, if we add in just a second index, which is, again, a situation I run into with clients a lot, just the same columns in a slightly different order.
And like they both kind of get used, but, you know, no one really knows which queries, you know, hit what, which queries are more important, all that stuff. You know, and this, I mean, this takes a little bit more analysis and domain knowledge to figure out, like if you can get rid of one of these.
But if we add in that third index, and then we run this, rather than locking a whole bunch of stuff separately, we now have one exclusive lock on the entire user’s object, and that has been granted.
So, again, lock escalation is only attempted. It is not guaranteed to happen. If other competing locks get in there, you may not see lock escalation happen. There’s some counters in some of the index DMVs with lock escalation attempts in there.
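One place to watch those counters is sys.dm_db_index_operational_stats; a sketch like this shows lock escalation attempts versus successes per index in the current database.

```sql
SELECT OBJECT_NAME(ios.object_id) AS table_name,
       i.name AS index_name,
       ios.index_lock_promotion_attempt_count,
       ios.index_lock_promotion_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
JOIN sys.indexes AS i
    ON i.object_id = ios.object_id
   AND i.index_id = ios.index_id
WHERE ios.index_lock_promotion_attempt_count > 0
ORDER BY ios.index_lock_promotion_attempt_count DESC;
```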
So those are, you know, interesting things to keep an eye on. But this is one way that, you know, having, you know, additional duplicative or even unused indexes on your table can hurt performance.
Because even, like I said before, even unused indexes need to end up in the buffer pool and need to be locked to modify when you modify the base table. SQL Server can’t play favorites and be like, well, you haven’t been read in a while.
We’ll just update you later. We’ll defer that update to some other time. It doesn’t, that doesn’t happen. It does not happen. And so that can end up being sort of a bad time. All right, cool.
So the third and final way that, you know, you know, over-indexing can hurt SQL Server is with transaction logging.
So what I’m going to do is just create all three of these. I’m going to do all this stuff in one big swoop. And then I’m going to show you what an update looks like from the transaction logs point of view.
So two things I want to show you. One is that there’s a reason for the checkpoint here. And the reason for the checkpoint is because after we create indexes, we have data about creating those indexes in the transaction log.
If I checkpoint everything in there (go away, SQL Prompt) and I go look back at the transaction log, it's going to be empty, right?
So we don't have anything in there for this table. If I go and do this, so again, we're doing the update in a transaction so that I don't have to worry about undoing any evils that I've done.
If we run this, we’re going to see all three of the nonclustered indexes that I created end up with a bunch of locks in the transaction log and also with a bunch of additional records in the transaction log.
Now, this will be true for, since this is an update, only indexes that have the columns that are being updated will end up in the transaction log.
If this was an insert or a delete, then every index would be affected, because a delete removes the whole row from every index and an insert adds a new record to every index, even if some of the columns are null or something, so you would see records in the transaction log for all of them.
Now, of course, if you have filtered indexes and the insert or delete didn't touch data covered by the filter, then you wouldn't see anything there, but that's a little bit more of an edge case.
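If you want to see the extra log activity yourself, a sketch along these lines works; sys.fn_dblog is undocumented, the update here is purely illustrative (it touches a column that's in the nonclustered indexes), and this belongs on a test box, not production.

```sql
BEGIN TRANSACTION;

/* Touch a column the nonclustered indexes contain, so they all
   have to be logged too. */
UPDATE u
    SET u.Reputation = u.Reputation + 1
FROM dbo.Users AS u
WHERE u.Reputation = 1;

/* Log records for this database, grouped by the allocation unit
   (table or index) they belong to. */
SELECT fd.AllocUnitName,
       COUNT_BIG(*) AS log_records,
       SUM(fd.[Log Record Length]) AS log_bytes
FROM sys.fn_dblog(NULL, NULL) AS fd
GROUP BY fd.AllocUnitName
ORDER BY log_bytes DESC;

ROLLBACK TRANSACTION;
```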
So, you know, we have all that. Anyway, those are three ways that I find people are rather surprised by that can hurt SQL Server performance by having too many unused and overlapping indexes hanging around in your database.
If you want to analyze your indexes, I, of course, recommend sp_BlitzIndex. It's an open source tool that I contribute to. Well, still contribute to once in a while. That one's sort of set in stone at this point, but I contributed a lot of stuff to it in the past that I'm rather proud of.
Of course, originally written by the lovely and talented Kendra Little, wrote a great post today about SQL Server Management Studio being the best SQL Server monitoring tool on the planet, which I heartily agree with because the rest of them are a bit iffy.
But anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And again, please like and subscribe and all that good stuff.
And I will see you in another video where we will talk about, well, I guess more SQL Server stuff. It’s the way this seems to go, isn’t it? Wish I knew more about something else so I could talk about that.
Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
A Little About Views, Parameters, and Local Variables in SQL Server
Video Summary
In this video, I delve into the complexities of parameters and local variables in SQL Server views that contain windowing functions. Exploring how these elements interact with query plans under different compatibility levels and settings, I provide a comprehensive overview of what to expect and how to optimize performance. I cover the nuances of simple parameterization, the challenges it poses with windowing functions, and the impact of various trace flags and database configurations. By walking through practical examples and discussing the limitations of local variables, I aim to equip viewers with the knowledge needed to navigate these tricky scenarios effectively.
Full Transcript
There are some levels of exhaustion that are just unfathomable. I'm Erik Darling with Darling Data, and in this video I'm going to talk about how parameters, and actually to some extent a tiny little bit about how local variables, behave with views that contain windowing functions. This is sort of a wrap-up video because I've talked about some of these things separately, but with the advent of SQL Server 2022 and a brand new compatibility level of 160 available, there is a little bit more to talk about. And boy is my heart tired. So let's dig right in and have some fun. Now, words of wisdom about windowing functions: they are generally aided and assisted from a performance perspective, and all those P's are really going to test if my audio is synced up with any degree of accuracy, if we create indexes for them. Well, let's look at our windowing function.
Our query, generally: we are generating a dense rank. We are partitioning by owner user ID and ordering by score descending. Aside from those two columns, which are also in the select list, we also have creation date and last activity date. And so our index for this windowing function, and the query more generally, is going to be keyed on owner user ID and score descending, and include creation date and last activity date. I like to include, when reasonably possible, all of the columns in the select list to cover the query, so that we don't have to deal with SQL Server thinking about key lookups, because SQL Server often seems to make counterproductive choices when it comes to doing key lookups.
Either not doing them when it would be a great idea, or doing them when it is a pretty terrible idea. So we are just going to make sure that we don't have too many choices to worry about here.
We are not going to worry our pretty little heads about this one. So I am going to create this view, called Pushy Paul Error Free Professional Presenter, over here. And we have got query plans turned on because, believe it or not, I have had this script file open for a while, making sure that I have a reasonable number of bases covered when I give you all of this information that you are going to forget about mere moments from now.
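A sketch of the index and view shapes being described, with caveats: the video doesn't name the underlying table, so dbo.Posts, the index name, and the view name below are all placeholders rather than the demo code.

```sql
CREATE INDEX p
    ON dbo.Posts (OwnerUserId, Score DESC)
    INCLUDE (CreationDate, LastActivityDate);
GO

CREATE OR ALTER VIEW dbo.HighScoreRanks
AS
SELECT p.OwnerUserId,
       p.Score,
       p.CreationDate,
       p.LastActivityDate,
       DENSE_RANK() OVER
           (PARTITION BY p.OwnerUserId ORDER BY p.Score DESC) AS score_rank
FROM dbo.Posts AS p;
GO

/* Filtering the view with a literal value */
SELECT hsr.*
FROM dbo.HighScoreRanks AS hsr
WHERE hsr.OwnerUserId = 22656;
```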
But let’s run this query. Let’s select from our view and we are going to use a literal value here. Now various things could trip this up.
Simple parameterization would be one of them. And the database level setting.
Well, let's backtrack a tiny little bit. One of the query optimizer's choices is to add simple parameterization to a query.
That could potentially trip this up. Another is the database level setting, forced parameterization. That could also trip this up. But when we use a literal value and we run this query, everything returns nice and fast.
Right now we are in compat level 160. And even though this looks like we got a simple parameterized query, we didn’t. It was attempted, but it was not actually used.
If we look at the index seek over here, we have a literal value. I don’t know what a value is. It’s not a town in Colorado.
But we have a literal value down here. We do not see a parameter down here. So simple parameterization was not successfully applied to this query, though it was attempted. Surprise, surprise.
There are weird things in query plans that no one should ever have to explain to anyone. But we do get a quick index seek. We do get an accurate cardinality estimate.
And why wouldn't we? It's a literal value, right? So the estimate is easy. Now, let's move on to the next demo.
There are some differences in the query plan under compatibility level 150. So let's run these, and while these run, we'll say that if you were to use the global trace flag 4199, if you were to use trace flag 4199 as a query hint in here, or you were to use the database scoped configuration QUERY_OPTIMIZER_HOTFIXES, you would not run into this in compat level 150. Now, you would also not run into this in compat level 140. I'll tell you exactly which cumulative updates that applies to towards the end of the video. But if we look at what happens here: the top query no longer features an index seek. The top query now features an index scan, then a window aggregate, and then a filter, whereas the bottom query just features the index seek and then the window aggregate. The filter is of course going to be the parameter for user ID, because that is the only filtering element in the query.

The main message here is that when you have a windowing function in your view and you use parameters, under certain conditions that I will tediously explain over and over again (I'm going to feed that dead horse a full meal), the local variables and parameters can't be pushed past either the Window Aggregate, which is just a batch mode Sequence Project, or the row mode Sequence Project plan operator. Fun, right? It's exciting. It's great stuff.

Under compatibility level 160, that story changes a bit, because there was a more recent optimizer hotfix that allows for that to happen. So if we switch this to compat level 160, we will see both queries use, well, slightly different execution plans, but here is our Sequence Project and, in parentheses, Compute Scalar. Thank goodness it's not just a Compute Scalar, so that we could tell that we are projecting a sequence here, because everyone who looks at a Sequence Project operator immediately understands that they are projecting a sequence of some sort, not just computing a scalar. But both of these queries now get index seeks, even though one of them, this top one, has a parameter, and this one here is still lying to us about getting a simple parameterized plan. Well, there's not much you can do there, is there? I could throw a 1 = (SELECT 1) on there to clear things up, but then y'all would accuse me of cheating or something.

So let's switch back to compat level 160 and talk a little bit about why this happens. Views, of course, cannot accept a parameter. When you create a view, there's no parameter list for passing into the view. You just create the view, and then you call that view from somewhere where you decide what you want to filter things on. The optimizer, except under conditions that we will discuss quite a bit in this video, cannot push parameters past Sequence Project, Compute Scalar, or Window Aggregate operators, which, in my earpiece, I'm being told are really just batch mode sequence... well, I forgot a word in there: project. There we go. It only works with constant values, like literals, right? Unless you are in compat level 160, or compat level 140 or 150 with certain patch levels and database level settings: query optimizer hotfixes, trace flag 4199 enabled globally, or trace flag 4199 enabled at the query level. Recompiling would also allow for this, because it would reduce your parameter or local variable to a literal value. Another way of getting around this limitation is to create an inline table valued function in place of a view, because inline table valued functions can accept parameters.
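A hedged sketch of that inline table valued function workaround, plus the other switches mentioned; again, the object names and the underlying table are placeholders rather than the exact demo code.

```sql
CREATE OR ALTER FUNCTION dbo.HighScoreRanksByOwner (@owner_user_id integer)
RETURNS table
AS
RETURN
SELECT p.OwnerUserId,
       p.Score,
       p.CreationDate,
       p.LastActivityDate,
       DENSE_RANK() OVER
           (PARTITION BY p.OwnerUserId ORDER BY p.Score DESC) AS score_rank
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @owner_user_id;
GO

/* The parameter belongs to the function, so it can reach an index seek
   even on older compatibility levels. */
SELECT hsr.*
FROM dbo.HighScoreRanksByOwner(22656) AS hsr;

/* The other routes: optimizer hotfixes database-wide... */
ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = ON;

/* ...or per query, with trace flag 4199 (needs elevated permissions):
   OPTION (QUERYTRACEON 4199); */
```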
Inline table valued functions also prevent simple parameterization, and even under compat level 150 this works out pretty well. All right, there’s our inline table valued function. And even if we stick that inline table valued function inside of a stored procedure, even in compat level 150, we will get the execution plan that we’re after, where the parameter value is pushed past the sequence project and compute scalar, and we get an index seek rather than a full index scan that takes six or seven seconds like we saw in the other demos run during the video. Our seek predicate here is not on a literal value; it is on the parameter value passed into the inline table valued function.

So, SQL Server 2017 cumulative update 30 (SQL Server 2017, of course, maxes out at compatibility level 140) and SQL Server 2019 CU17 (which, of course, maxes out at compat level 150) both have this query optimizer hotfix available, which, again, you get with trace flag 4199 or the database scoped configuration QUERY_OPTIMIZER_HOTFIXES. And, of course, SQL Server 2022 RTM and beyond fixes the issue with parameters, local variables, and views without making either of those changes and without a recompile hint. Fantastic. You don’t need to go rewrite all your views as inline table valued functions, though. I don’t know, maybe I don’t think that’s such a terrible idea. Maybe I think that’s actually an okay idea. One, it fixes a lot of weird problems like this, and two, it gets you in the habit of writing inline table valued functions instead of crappy functions like scalar UDFs or multi-statement UDFs. But you could fix this with trace flag 4199 as well; do this, and trace flag 4199 would address the issue. Good for us.

So one thing that I do want to talk about a little bit is local variables. Because even though the query optimizer hotfixes will address the issue of not being able to push the predicate past the sequence project, compute scalar, or the batch mode window aggregate (which is just the batch mode version of the sequence project), they do not help improve cardinality estimates at all. So even if we’re in compat level 160 and we run this... no, did I switch that to 160? Oh no, I see now, I’m forcing compat level 150 in here. Silly me. Like I said, it’s all so tiresome. So if we look at this query plan, we are forced to again scan the whole index and apply the filter later for that local variable. Now, if we still use compat level 150 but we enable trace flag 4199, we’ll get the seek plan that we’re after, but even doing that, we get the really crappy local variable density vector guess there of 12 rows out of 2,727,901. I apologize for causing you pain, but I just want to make sure that you all caught that on this one as well. It’s only another six seconds of your life. What were you doing anyway? You can’t even say “I love you” in six seconds. But we continue to get the crappy estimate here, right? The estimates are over here, because we have to grab every row; the estimates only come into play at the filter, where there’s actual cardinality estimation to be applied to a filtering element in the query. And that also goes for if we hint to use compatibility level 160: we run this, and even though we get the seek plan that we’re after, we still get the terrible guess.
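To make the shape of that concrete, here’s a rough sketch of swapping a view for an inline table valued function. The object and column names are invented for illustration (loosely Stack Overflow flavored), not the exact code from the demo.

    /* The view version has nowhere to put a parameter; any filtering happens
       from the outside, after the windowing function is defined. */
    CREATE OR ALTER VIEW dbo.TopScorePerUser
    AS
    SELECT
        p.OwnerUserId,
        p.Score,
        n = ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId ORDER BY p.Score DESC)
    FROM dbo.Posts AS p;
    GO

    /* The inline table valued function version accepts the parameter directly. */
    CREATE OR ALTER FUNCTION dbo.TopScorePerUserById
    (
        @OwnerUserId integer
    )
    RETURNS TABLE
    AS
    RETURN
    SELECT
        p.OwnerUserId,
        p.Score,
        n = ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId ORDER BY p.Score DESC)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId;
    GO

    /* Called like a table, parameter and all. */
    SELECT t.* FROM dbo.TopScorePerUserById(22656) AS t WHERE t.n = 1;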
So even though some effort was made by the nice folks who brought you DOP feedback to fix this issue, we still do not get anything resembling accurate cardinality estimates from the local variable. So I would advise you, except under closely monitored circumstances, to avoid local variables where possible. Pain, swelling, itching, redness, nausea, vomiting, diarrhea, and possibly even death have occurred when using local variables.

So anyway, the larger point here is that Microsoft does do some valuable things across versions of SQL Server, and sometimes in cumulative updates. Of course, we don’t really have much choice anymore, since we will never see another service pack again; we will only ever see cumulative updates from now until forever for SQL Server. Maybe there’ll be a different word for it at some point later. But anyway, the main point here is that there are some things that can be fixed by upgrading, and there are some things that can be fixed with trace flags, and this is one of them. This is a pretty good example of it. I’ve seen performance issues related to this crop up, oh, I don’t know, probably a dozen or so times in my time consulting, so this is a good thing to keep an eye out for in query plans.

Of course, I do generally recommend keeping an eye out for filter operators in query plans, because they often indicate that you have done something monstrous. Not always, of course, but they are a good thing to keep track of, because they usually mean that there was some predicate that possibly could have been pushed down closer to where you touched a table or index, and it wasn’t, and you could have been dealing with a lot fewer rows traveling throughout the course of your query plan if you had been able to apply those predicates earlier on. I think probably the best example of that is when someone does a left join to a table to find rows that don’t exist, and they filter on where the left joined table is null. Those tables have to be fully joined before that filter can be applied to rule out null values, even when you’re joining on a not nullable primary key. That situation is usually better dealt with by using not exists, which can eliminate those rows at the anti semi join, rather than fully joining the tables and looking for nulls after the fact. Some things, like windowing functions, you know, if you want to run a query and look for where row number equals one, I mean, that’s an expression that you just have to gin up at runtime. That’s not something you can really help, so something like that would also result in a later filter operation, because that value wasn’t stored anywhere where you could persist it, index it, and filter on it earlier in a query plan than after you have generated the row number.

So anyway, I think that’s about it for this one. I think that’s also about it for me today. I’m going to, I don’t know, I guess eat a dinner and try to forget that computers exist for a little while. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that this wasn’t too tedious for you. And remember to do the old like and subscribe one-two shuffle for me. Anyway, thank you again for watching.
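As a side note, the left join versus not exists rewrite mentioned a couple of paragraphs up looks roughly like this. The tables here are generic Stack Overflow style stand-ins, not the demo tables from the video.

    /* Late filter: the tables are fully joined, then the null rows are filtered out. */
    SELECT
        u.Id,
        u.DisplayName
    FROM dbo.Users AS u
    LEFT JOIN dbo.Posts AS p
        ON p.OwnerUserId = u.Id
    WHERE p.Id IS NULL;

    /* Anti semi join: non-matching rows can be eliminated at the join itself. */
    SELECT
        u.Id,
        u.DisplayName
    FROM dbo.Users AS u
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = u.Id
    );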
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
A Little About Improving Modification Query Performance In SQL Server
Video Summary
In this video, I delve into a common performance issue that can arise during modification queries, particularly focusing on how certain query plan patterns might impact overall performance. I share practical examples using the Stack Overflow 2013 database and demonstrate how to mitigate these issues by employing techniques such as temp tables for manual phase separation and leveraging automatic phase separation through hash joins. By walking you through these scenarios, I highlight the importance of understanding query plans and optimizing them to enhance efficiency without compromising data integrity or concurrency.
Full Transcript
Erik Darling here with Darling Data. And in this video, returning to my normal form after a nice haircut. Round of applause for my haircut. Feeling extra cute in my Adidas T-shirt today. Returning to my normal form and talking about a little bit of this and that, we’re going to talk about sort of a general query plan pattern that you can watch out for in modification queries that can have an impact on query performance. I don’t remember why I called it this. It just seemed funny at the time, I guess. I don’t know what to tell you. Thanks, SQL prompt. I needed that. We have that healthy list of things. So we have, let’s make sure query plans are turned on. Stick to the script, pal. And that should be good there. Now, make sure everything’s nice and saved and pretty. All right. So what I’m going to start with is sort of a silly looking update. And this update is going to use an auxiliary table that I create in the Stack Overflow 2013 database just to do some dirty work for me.
I end up doing stuff like this a lot because I really get sort of nervous modifying the user queries. I mean, the user tables. I have like a backup stage copy of the Stack Overflow 2013 database where I’ve never run any sort of modification, so that if I need to emergency change some data back, I can do that. But I do like these auxiliary tables, because I don’t really care what happens to them all that much. I can just, you know, drop and repopulate them and, you know, move on with my day. Whereas if I screw up a modification to one of the user tables, it can have a, let’s call it a profound impact on other demos that I might run. So let’s update, let’s run this update query. And I only have that one equals one there so that Redgate SQL Prompt doesn’t keep throwing messages at me while I’m running the demos to make sure that I have a where clause.
I do find that warning generally helpful. So I don’t want to turn it off. But I also just don’t need to hear any guff. But apparently the old where one equals one gives SQL prompt enough of a, enough of a warm, fuzzy feeling that it doesn’t say, hey, you don’t have a where clause. Which is funny because I do have two where clauses elsewhere in there. But I understand that linting T-SQL is a difficult task. So if you look at this query plan, a few things stick out. One, it is single threaded. That is probably obvious from the lack of parallelism operators or parallelism icons that show up in the query plan. But looking at this, right up until we hit this merge join, our whole query only takes about 1.2 seconds.
And then we hit this table spool. And let’s see, 3.5 minus 1.2. That’s, let’s call it, 2.3 seconds inside this eager table spool. And then we have another two and a half seconds doing the update. Which, you know, doing the update, like I do expect that to take time. You know, we are updating 1,434,726 rows, call it 1.4 million rows. So like, I expect that to, you know, take a little bit of time to do the update. But what I don’t love is that eager table spool. Many times spools are useful, or at least SQL Server is trying to be useful and helpful by caching data. In this case, it’s there for Halloween protection, which is a very useful thing. Strange that Halloween protection was discovered on Halloween, right?
But the spool is there to keep track of rows that have been modified so that we don’t, you know, end up in some awful endless modification loop or some other weird, you know, edge condition where we keep trying to modify rows that have already been modified. And, you know, the old pipelined execution thing kind of gets, kind of gets icky in that regard. Now, everything’s a cursor behind the scenes. So if you’re going to keep complaining about cursors, you should probably just stop using databases.
Anyway, spools, despite their best intentions and best efforts, are not really highly optimized data structures. Rows get fit into them one at a time. You know, they don’t have any of the optimizations that temp tables and stuff have gotten, even though spools also live in tempdb. So even though spools live in tempdb, they don’t get a lot of the optimizations that temp tables and other things have gotten over the years to make putting data into them any more efficient, which is kind of a bummer.
So we can see where the time really starts to accumulate in this plan, right, across these three operators. We have the 1.2 seconds, jump up to 3.5 seconds, jump up to 7 seconds. So I’m not in love with this.
And the reason I’m not in love with this is because there are ways that we could get the same sort of effect as the table spool with a whole lot more, a whole lot better performance. Now, this is a sort of small update, so the performance gains aren’t going to be like, like 7 seconds to like 1 second, but we can shave a bunch of time off this. And in larger scenarios, you could likely shave more time off of this sort of thing.
So the first way that we can introduce this sort of phase separation here is to use a temp table. And the reason why that works is because if we come back and look at the query itself, the total score by user table is the source of the data that we are selecting in order to update the total score by user table. So we want to find a user’s highest score across questions and answers.
And so to do that, we find the question score and we find the answer score and we get the max here. Now, I get it. There’s probably a way you could use a computed column or something else to get this. In case when question score greater than answer score, then question score, case when answer score greater than question score, whatever.
If they’re tied, you can stick that in there too. But in this case, we’re going to pretend that we’re doing this update for very good reasons that we cannot possibly overcome by other means. So one way that we can do this is with a temp table.
Because if we put everything in the temp table that we care to use for the source of the update, then we are no longer using the table itself to update itself. Stick with me on that. The thing is, if we’re going to do this and there’s any sort of concurrency in the database that might be changing data underneath, we need a transaction and we need to make sure that the total score by user table doesn’t change while we’re doing this.
Because anything that we put in the update in the temp table is going to be a snapshot of what the total score by user table looked like when we got the data. But then if something comes along and inserts, updates, or deletes data afterwards, then we could have some wonky results. So in this case, I’ve opted to use the hold lock hint.
And I’m going to partially run this to show you why. So if we grab this chunk of the query, let me say drop the temp table if it exists. We don’t have to do this in a stored procedure, where you’d most likely be doing this sort of thing.
But we do have to do this in the script itself. If we run this, we are going to utilize an inline table value function that I wrote. And it’s in my GitHub repo.
And what we’re going to do is use the hold lock hint. Alternately, we could also set the transaction isolation level to serializable for this. And what we get back from WhatsUpLocks there is we see that our query has been granted 29,830 page locks with a shared lock on them.
Which, if you’ve watched other of my videos, you would know that the shared lock prevents modification queries. As the shared locks are not compatible with modification locks. So if we come over here and we try to insert a row into total score by user, this thing will be hung up, getting blocked by the…
Well, it says set statistics XML off, but I promise you that’s our user session 67. That’s 67 up there. That’s the queries with the hold lock.
So we can’t insert into the table. We see that insert waiting on an intent exclusive lock. If we cancel that and we get rid of useless prompts, we will see that we can’t update the table either. All right.
We come over here and look at this. We’ll see our update. Also going to be looking for an intent exclusive lock, but the update is stuck on waiting on those hold lock hints. Likewise, we will not be able to delete any data from the table because of those locks.
So this delete would also be blocked while this is running. Now, of course, we wouldn’t really want to have a serializable lock on the table for that long, preventing other updates, but we would need that for data correctness. Of course, we are allowed to select from the table.
So if we run this, we can still… Select queries can still run against the table. The only locks that are taken and held are there to prevent modifications. So let’s commit this because it’s just some selects.
And now let’s run the whole thing in one go. All right. So we’re going to drop our temp table. We’re going to insert data into the temp table that we care about. And then we are going to run the update from the temp table.
The update from the temp table is just what happens down here. Right. So I am indexing the temp table in this case because I did find it useful in my performance testing. I do suggest you also test indexing temp tables in those scenarios or in any scenario, really, where there’s performance as a thing.
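Pieced together from the description above, the overall pattern looks something like this. The table and column names are assumptions based on the demo, not the exact script, and the serializable behavior comes from the HOLDLOCK hint on the source table.

    BEGIN TRANSACTION;

    DROP TABLE IF EXISTS #total_score;

    /* Snapshot the source rows, holding serializable locks on the base table
       so the data can't change underneath us before the update runs. */
    SELECT
        tsbu.UserId,
        TotalScore =
            CASE
                WHEN tsbu.QuestionScore >= tsbu.AnswerScore
                THEN tsbu.QuestionScore
                ELSE tsbu.AnswerScore
            END
    INTO #total_score
    FROM dbo.TotalScoreByUser AS tsbu WITH (HOLDLOCK);

    /* Indexing the temp table helped in testing here; test it in your scenario too. */
    CREATE CLUSTERED INDEX c ON #total_score (UserId);

    /* Update from the temp table instead of from the table itself. */
    UPDATE tsbu
        SET tsbu.TotalScore = ts.TotalScore
    FROM dbo.TotalScoreByUser AS tsbu
    JOIN #total_score AS ts
        ON ts.UserId = tsbu.UserId;

    COMMIT TRANSACTION;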
So what we get back here is 418 milliseconds to dump data into the temp table. This query down here is WhatsUpLocks running. So we don’t have to care about that time, because that wouldn’t be part of our normal transaction.
So we have 418 milliseconds there, about 328 milliseconds to index the temp table, and then about 2.7 seconds to do the update. And in this part, in this query plan down here, the majority of the query time just is the update. Right.
Like we don’t spend a lot of time up until then, 643 milliseconds total. So, you know, about two seconds doing the actual update, about 643 milliseconds leading up to it. But the update is the majority of the work in there.
And this is, you know, better performance than we got from doing the straight update, like updating the table using itself as a source. So 418 plus 328, that’s 700 and change. And then 2.7 on top of that, that’s about 3.4.
So that’s three and a half, about three and a half seconds down from seven seconds. We cut that time in half. Good, good, good. These are all positive things, right? Breaking things up a little bit is a grand way of doing things.
Another way that SQL Server can perform what my dear friend Paul refers to as phase separation. What we did up there is manual phase separation. What we’re going to see in the next example is called automatic phase separation.
And what we’re going to do is, instead of doing the update and, you know, setting the new value with that sort of sub query, instead of doing this, we’re actually going to join to the result of that. And what I had to do in this case was to hint for a hash join.
The reason for that, the reason why this gets a little bit tricky, is because the ID column that we’re joining on is the clustered primary key of the table. And when SQL Server has two clustered primary keys to join on, it generally tends to favor either nested loops or merge joins rather than hash joins. Because we have ordered sets, we have, you know, well-defined indexes.
So SQL Server generally tends to lean towards those. There are plenty of circumstances where it might choose hash joins on its own. Otherwise, in this case, I just used a hint rather than fiddle and faddle with a bunch of other stuff.
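The rewrite being described looks roughly like this; the table and column names are stand-ins based on the demo, and the hash join is forced with a join hint here (a query-level OPTION (HASH JOIN) hint would be another way to do it).

    UPDATE tsbu
        SET tsbu.TotalScore = x.TotalScore
    FROM dbo.TotalScoreByUser AS tsbu
    INNER HASH JOIN
    (
        /* Same derived result as before, joined to instead of used as a sub query. */
        SELECT
            t.UserId,
            TotalScore =
                CASE
                    WHEN t.QuestionScore >= t.AnswerScore
                    THEN t.QuestionScore
                    ELSE t.AnswerScore
                END
        FROM dbo.TotalScoreByUser AS t
    ) AS x
        ON x.UserId = tsbu.UserId;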
But if we run this query and we use this update instead, this will finish in, I forget now. I did this late last night, so my mind’s a little less useful than it used to be. But this finishes in about 4.6 seconds total.
And again, most of the time in there is spent in the update, right? So we had 503 milliseconds up until the update, and then 4.6 seconds at the end of it, so the update itself was about 4.1 seconds.
That’s only about 400 milliseconds difference from the 3.7 when we did the manual phase separation thing, but still better than the 7 seconds total there. And this is, you know, without having to, you know, do a transaction or set serializable, because all the locking is performed, right, just within this query by itself.
But notice in this query that, I mean, A, we get a parallel plan, which we did in the original one. So that helps things along. But also there’s no spools in here.
The reason why we don’t need a spool is because we have a few stop-and-go operators that pause output for rows to be kept track of. We have a hash join here.
The hash join is a stop-and-go operator. People call it like an internally blocking operator because all the rows have to get to the hash join. The hash table has to get built and other stuff has to happen.
We also have a sort in the query plan. The sort is another one of those stop-and-go internally blocking operators. All the rows have to arrive before they can come out. So in this case, because we have operators like this in the query plan, we don’t explicitly need a spool to keep track of which rows have been hit or not because all the rows that come out of that are going to be rows for the entire query, right?
All the results have to hit operators like that and then be passed along. We don’t have just an entirely streaming plan that can just pass rows along as it goes and may end up rereading rows on the other side, right?
Because in the case of the first query plan, and I’ll just get the estimated plan for that again so we can talk a little bit about why that’s a thing.
Part of the reason for that is, like when I talked about how read committed is kind of not my favorite isolation level, none of these operators aside from the table spool will cache all the rows from the query.
So that’s why this gets inserted here. SQL Server could have chosen a sort or a hash or something else to do it. It just didn’t. Cost-based optimization chose a spool. So what could happen is, as we’re doing the update here, other operators could end up rereading rows in here.
But because we have this spool as a caching mechanism, because we use a temp table as a caching mechanism in the second example, and because we had hashes and sorts in the other query plan that can be used as caching mechanisms, the spool becomes not needed explicitly to do this because we have other things that keep track of the rows.
So anyway, if you’re looking at modification queries and you’re trying to tune those, and you find that performance is subpar, you now have a few, or actually you have a couple of things that you can try to improve the performance of the updates.
Of course, if most of the time is spent in the update operator, like if, like, you know, let’s say the entire query runs for 10 seconds, and like nine and a half seconds is spent in the update, you may have a different problem to solve, either like the number of indexes that you’re modifying, you know, stuff like that.
You know, sometimes, you know, putting locking hints on there to skip like row or page level locking, jump right to a table level lock can be useful. But this is just kind of a good example of how you can rewrite queries in different ways to improve performance of the entire operation.
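For reference, the table-level lock idea is just a hint on the modification target; the table here is a made-up example, not anything from the demo.

    /* Take a table-level lock up front rather than thousands of row or page locks. */
    UPDATE t
        SET t.SomeColumn = 0
    FROM dbo.SomeBigTable AS t WITH (TABLOCK)
    WHERE t.SomeColumn IS NULL;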
Anyway, that’s about it here. Thanks for watching. As usual, I hope you learned something. I hope you enjoyed yourselves thoroughly, thoroughly enjoyed yourselves.
And remember, do the old like and subscribe dance for me, and I will see you in another video probably pretty shortly. Thanks again for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
SQL Server 2022 Degree of Parallelism Feedback Is Wack
Video Summary
In this video, I dive into my thoughts on a feature in SQL Server 2022 that I find personally disappointing: degree of parallelism feedback. I explain why I believe it’s not very useful for most users and discuss how it only tests downgrades to DOP 2, which is often unnecessary given that many servers have their DOP configured correctly. Additionally, I suggest a better approach would be to allow SQL Server to explore parallel plans for queries that do not break the cost threshold for parallelism setting, helping to mitigate risks associated with adjusting this setting. The video covers four main reasons why I find this feature lacking and concludes by recommending improvements or alternative settings that could make query optimization more effective.
Full Transcript
Erik Darling here with Darling Data, and I’m taking a little break from my recording about various community tools and my little tidbit series, which, I don’t know, I was going to call Tip Jar SQL because it’s sort of like small little tidbits about things, but I got lazy. Just sort of forgot to go down that path. Anyway, I’m here to talk a little bit about, I think, a really disappointing feature in SQL Server 2022, and that is degree of parallelism feedback. I’m going to talk about the four reasons why I find it personally disappointing, and then I’m going to talk about some things that I think would make better additions to the intelligent query processing set of features that have permeated SQL Server since 2016 or so.
It’s a really interesting set of features. Maybe 2017, who knows? One of those years.
Sometimes it’s hard to tell the difference between 2016 and 2017 in retrospect. So, the first of the things that I dislike about the feature is that it only tests downgrades to your degree of parallelism. So, you know, that’s useless at best for most people. Most people have DOP moderately correctly configured on their server, and testing minor downgrades to that is quite a waste of SQL Server’s time and of your money.
You’re paying $7,000 a core for SQL Server to figure out if a query can run at the same speed at a lower DOP by like 2 or 4 or something. It only tests downgrades to DOP 2, which means that you will have at best a minorly parallel plan. SQL Server will not test if going down to a single-threaded plan would not detract from query performance at all.
So that’s kind of silly. And I’ve been given, I think, some reasoning on that, and that would require a recompile, which is just nonsense because when SQL Server comes under sufficient CPU pressure, you will see a parallel plan, but the query itself will only be running at DOP 1.
I have plenty of demos that show that. They do not require a recompile. You still see a parallel plan for it, but behind the scenes, a query only gets one CPU thread. Its DOP is indeed one.
It doesn’t test DOP upgrades, meaning that if you have, let’s just to make numbers easy because I am not particularly good at math. Let’s say that you have a query running at DOP 4 that’s hitting a 100 million row table. Let’s go to 100 million rows.
Let’s really scale this thing up. You have four threads suffering with 25 million rows apiece. It does stand to reason, at least in my experience with query parallelism, that if we were to alleviate some of the stress from those four cores and maybe bump it up to eight cores, and we would have roughly 12.5 million rows per thread, each of those threads would be under far less stress and strain.
We could probably have a more efficient query, but degree of parallelism feedback will not test that for you. You don’t get that. You get the same dumb DOP 4 plan without any hope of a parallel upgrade, which is pretty goofy.
Want my honest feedback? That’s just plum goofy. It also won’t test upgrades from a serial plan to a parallel plan. That doesn’t happen.
Probably get some more malarkey about needing to recompile. But we all know that’s not particularly true. Yeah.
And so, like, if you have queries that are running for, I don’t know, 5, 10, 20, 30 seconds, and there is nothing specifically inhibiting them from going parallel, then why not test and upgrade to a higher DOP to bring the duration of those queries down?
Since this is something that is not permanent, it is impermanent, as some might say, why not just test it and see if there’s any dramatic decrease in overall query duration at higher DOPs, like, say, 4 or 8?
8’s a pretty good number for most DOPs, as long as you have the CPU cores to back it up. So, that’s a pretty disappointing thing there. Now, this feature really feels like there was no adult in the room at the time when it was decided on.
This feels like developer pizza, where, you know, they decided that they wanted to, you know, pick out, like, the grossest pizza, and you’re like teenagers sitting there with a menu being like, ew, tuna fish and gravy or something, like, just like picking out intentionally gross pizza toppings.
This is just not a very good spec for this thing. I can’t imagine that there was very good telemetry involved in deciding on this one. It just kind of feels like a what if.
Like, developers get one wish per release, and this was the wish. Now, for a very, very long time, SQL Server has had the cost threshold for parallelism setting, and this setting has led many, many people to think that query cost is a very important metric.
And I guess it is in the, you know, because it does, you know, if a query costs under your cost threshold for parallelism setting, and it would be reasonable for that query to go parallel, but it does not get considered for parallelism because it does not have a high enough cost to go over your cost threshold for parallelism setting, then, you know, you could have some issues.
But far more often, I find that, you know, servers running at the default cost threshold for parallelism setting have far more problems than folks who have raised it up and might have a few straggler queries that could use some help. Now, way back when, the cost threshold for parallelism setting, or rather the cost number itself, meant seconds on one guy’s computer, a guy named Nick, in, like, 1997 or something.
That’s the, at least that’s the folklore around it. And there are still people I talked to today who think that cost threshold for parallelism means seconds, which is kind of weird. But I think that, well, two things.
One, a nitpick is that, you know, the cost threshold for parallelism setting should be part of the SQL Server installer. MAXDOP is in there, has been in there for a few releases now. You know, and I think it would be a generally useful thing since most sane and rational people, after they install SQL Server, will set cost threshold for parallelism away from the default.
Whatever they set it to is better than the default unless they set it lower for some reason. That would be a weird choice. You know, most folks I know will start at 50 and then adjust as necessary from there, assuming that 50 is not a successful change for them, or not a completely successful change for them.
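For anyone who hasn’t changed it before, it’s a server-level sp_configure setting; 50 here is just the common starting point mentioned above, not a magic number.

    EXEC sys.sp_configure 'show advanced options', 1;
    RECONFIGURE;

    /* Start somewhere reasonable and adjust based on your workload. */
    EXEC sys.sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;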
I think that a much better addition to SQL Server than degree of parallelism feedback, which only focuses on downgrading parallelism, would be a setting that would allow you to allow SQL Server to explore parallel plans for queries that do not break the cost threshold for parallelism setting, and look at their overall duration, perhaps their weights as well, if that’s the kind of thing that you care about, and figure out if a parallel plan would be better.
It would also be good if we could test parallelism at higher DOPs up to some upper limit, you know, whether that’s configurable or that’s just based on, you know, some internal math, and look at the number of cores and all that good stuff.
Cores and sockets and things, things that people really think hard about when they’re setting MAXDOP and all that, figuring out NUMA, getting crazy in there. But I think a much better setting would be, you know, to figure out what threshold you would want to explore a parallel plan at for a query that was deemed not expensive enough by the optimizer to consider a parallel plan for, because that would take a lot of the risk out of adjusting cost threshold for parallelism.
You know, just to throw a number out there, let’s say that you change cost threshold for parallelism to 50, and there’s some really important query that has an estimated subtree cost of 49.9 query bucks, does not quite break that threshold, and you have a regression to a parallel plan.
Well, this would take a lot of the sting out of that, because if that thing ran for, let’s just say, 10 seconds, the optimizer, you know, SQL Server’s internals, looked at that and said, oh, 10 seconds?

Well, that doesn’t seem very, that’s not very fast. Let’s try DOP 4. Okay, you know, now let’s just say that the parallelism scaled linearly, and now the whole thing runs in like 2.5 seconds.
Well, that was, that was a successful introduction of parallelism to a query. I realize that there is inherent risk with introducing parallel queries or upgrading parallelism. Downgrading parallelism does not, does not have, you know, the whole issue of, you know, parallel queries and worker threads and all that stuff, you know, there is risk in there.
But I think that is a risk that most workloads do not have. And the fact that this setting is one that you can turn on and off is one that you can use to mitigate those risks a bit for anyone out there who has a workload where they are at risk of that.
But I think most people who have workloads where they are at severe risk of that are at least fairly well managed as far as DBA staff and all that stuff goes. So anyway, those are my thoughts on degree of parallelism feedback.
It’s not a terribly good feature. I’ve seen the demos for it. It’s not very interesting.
DOP goes down, doesn’t go up. And that’s about it. I don’t know. Maybe this will be one of those things that just no one ever turns on and it dies on the vine and that would probably be the best thing for it unless significant improvements are made or other settings are introduced to allow you to better control parallelism on the server, specifically going from a serial plan to a parallel plan where it would be more efficient to do that.
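If you do want to turn it on anyway (or make sure it stays off), the switch is a database scoped configuration in SQL Server 2022; as far as I recall it also wants Query Store enabled so it has somewhere to persist feedback, and the catalog view below is from memory, so double check it against the documentation.

    /* Turn degree of parallelism feedback on for the current database... */
    ALTER DATABASE SCOPED CONFIGURATION SET DOP_FEEDBACK = ON;

    /* ...and back off if it does more harm than good. */
    ALTER DATABASE SCOPED CONFIGURATION SET DOP_FEEDBACK = OFF;

    /* Persisted feedback shows up in Query Store. */
    SELECT
        qspf.plan_id,
        qspf.feature_desc,
        qspf.state_desc,
        qspf.feedback_data
    FROM sys.query_store_plan_feedback AS qspf;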
Anyway, that’s it. Thank you for watching. You may not have learned anything, but hopefully you enjoyed yourselves anyway.
I’m going to go eat lunch now, I think. Brush my teeth after that. Make my dentist happy.
I don’t know. Maybe then I’ll go for a walk. We’ll see what happens. Anyway, thank you for watching. Remember, like, subscribe if you’re into that sort of thing, if you’re that kind of YouTube fanatic and also that kind of SQL Server fanatic.
And I will see you in another video that will be back to more of the regularly scheduled content. Anyway, thank you again for watching. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.