A Little About Improving Modification Query Performance In SQL Server



Video Summary

In this video, I delve into a common performance issue that can arise during modification queries, particularly focusing on how certain query plan patterns might impact overall performance. I share practical examples using the Stack Overflow 2013 database and demonstrate how to mitigate these issues by employing techniques such as temp tables for manual phase separation and leveraging automatic phase separation through hash joins. By walking you through these scenarios, I highlight the importance of understanding query plans and optimizing them to enhance efficiency without compromising data integrity or concurrency.

Full Transcript

Erik Darling here with Darling Data. And in this video, returning to my normal form after a nice haircut. Round of applause for my haircut. Feeling extra cute in my Adidas T-shirt today. Returning to my normal form and talking about a little bit of this and that, we’re going to talk about sort of a general query plan pattern that you can watch out for in modification queries that can have an impact on query performance. I don’t remember why I called it this. It just seemed funny at the time, I guess. I don’t know what to tell you. Thanks, SQL prompt. I needed that. We have that healthy list of things. So we have, let’s make sure query plans are turned on. Stick to the script, pal. And that should be good there. Now, make sure everything’s nice and saved and pretty. All right. So what I’m going to start with is sort of a silly looking update. And this update is going to use an auxiliary table that I create in the Stack Overflow 2013 database just to do some dirty work for me.

I end up doing stuff like this a lot because I really get sort of nervous modifying the user queries. I mean, the user tables. I have like a backup stage copy of the Stack Overflow 2013 database where I’ve never run any sort of modification, so that if I need to, like, emergency change some data back, I can do that. But I do like these auxiliary tables because I don’t really care what happens to them all that much. I can just, you know, drop and repopulate them and, you know, move on with my day. Whereas if I screw up a modification to one of the user tables, it can have a, let’s call it a profound impact on other demos that I might run. So let’s run this update query. And I only have that one equals one there so that Redgate SQL Prompt doesn’t keep throwing messages at me while I’m running the demos to make sure that I have a where clause.

I do find that warning generally helpful. So I don’t want to turn it off. But I also just don’t need to hear any guff. But apparently the old where one equals one gives SQL Prompt enough of a warm, fuzzy feeling that it doesn’t say, hey, you don’t have a where clause. Which is funny because I do have two where clauses elsewhere in there. But I understand that linting T-SQL is a difficult task. So if you look at this query plan, a few things stick out. One, it is single threaded. That is probably obvious from the lack of parallelism operators or parallelism icons that show up in the query plan. But looking at this, right up until we hit this merge join, our whole query only takes about 1.2 seconds.

And then we hit this table spool. And let’s see, 3.5 minus 1.2. That’s, let’s call it 2.3 seconds inside this eager table spool. And then we have another two and a half seconds doing the update. Which, you know, doing the update, like I do expect that to take time. You know, we are updating 1,434,726 — 1.4 million — rows. So like, I expect that to, you know, take a little bit of time to do the update. But what I don’t love is that eager table spool. Many times spools are useful, or at least SQL Server is trying to be useful and helpful by caching data. In this case, it’s there for Halloween protection, which is a very useful thing. Strange that Halloween protection was discovered on Halloween, right?

But the spool is there to keep track of rows that have been modified so that we don’t, you know, end up in some awful endless modification loop or some other weird, you know, edge condition where we keep trying to modify rows that have already been modified. And, you know, the old pipelined execution thing kind of gets, kind of gets icky in that regard. Now, everything’s a cursor behind the scenes. So if you’re going to keep complaining about cursors, you should probably just stop using databases.

Anyway, spools, despite their best intentions and best efforts, are not really highly optimized data structures. Rows get fit into them one at a time. They don’t have any of the optimizations that temp tables and stuff have gotten. Even though spools live in tempdb, they don’t get a lot of the optimizations that temp tables and other things have received over the years to make putting data into them more efficient, which is kind of a bummer.

So we can see where the time really starts to accumulate in this plan, right, across these three operators. We have the 1.2 seconds, jump up to 3.5 seconds, jump up to 7 seconds. So I’m not in love with this.

And the reason I’m not in love with this is because there are ways that we could get the same sort of effect as the table spool with a whole lot better performance. Now, this is a sort of small update, so the performance gains aren’t going to be like 7 seconds down to 1 second, but we can shave a bunch of time off this. And in larger scenarios, you could likely shave more time off of this sort of thing.

So the first way that we can introduce this sort of phase separation here is to use a temp table. And the reason why that works is because if we come back and look at the query itself, the total score by user table is the source of the data that we are selecting in order to update the total score by user table. So we want to find a user’s highest score across questions and answers.
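The self-referencing update described above looks roughly like this. This is a sketch, not the exact demo script: the table name `dbo.TotalScoreByUser` and the column names `Id`, `QuestionScore`, `AnswerScore`, and `TotalScore` are assumptions inferred from the transcript.

```sql
/* Sketch of the original update: the table is both the target and
   the source, which is what forces the eager spool for Halloween
   protection. Names are assumptions based on the transcript. */
UPDATE tsu
    SET tsu.TotalScore =
        (
            SELECT
                CASE
                    WHEN t.QuestionScore > t.AnswerScore
                    THEN t.QuestionScore
                    ELSE t.AnswerScore
                END
            FROM dbo.TotalScoreByUser AS t
            WHERE t.Id = tsu.Id
        )
FROM dbo.TotalScoreByUser AS tsu
WHERE 1 = 1; /* keeps SQL Prompt from nagging about a missing WHERE */
```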

And so to do that, we find the question score and we find the answer score and we get the max here. Now, I get it. There’s probably a way you could use a computed column or something else to get this. In case when question score greater than answer score, then question score, case when answer score greater than question score, whatever.

If they’re tied, you can stick that in there too. But in this case, we’re going to pretend that we’re doing this update for very good reasons that we cannot possibly overcome by other means. So one way that we can do this is with a temp table.

Because if we put everything in the temp table that we care to use for the source of the update, then we are no longer using the table itself to update itself. Stick with me on that. The thing is, if we’re going to do this and there’s any sort of concurrency in the database that might be changing data underneath, we need a transaction and we need to make sure that the total score by user table doesn’t change while we’re doing this.

Because anything that we put in the update in the temp table is going to be a snapshot of what the total score by user table looked like when we got the data. But then if something comes along and inserts, updates, or deletes data afterwards, then we could have some wonky results. So in this case, I’ve opted to use the hold lock hint.

And I’m going to partially run this to show you why. So if we grab this chunk of the query, let me say drop the temp table if it exists. We don’t have to do this in a stored procedure, where you’d most likely be doing this sort of thing.

But we do have to do this in the script itself. If we run this, we are going to utilize an inline table value function that I wrote. And it’s in my GitHub repo.

And what we’re going to do is use the HOLDLOCK hint. Alternately, we could also set the transaction isolation level to serializable for this. And what we get back from WhatsUpLocks there is we see that our query has been granted 29,830 page locks with a shared lock on them.

Which, if you’ve watched any of my other videos, you would know that the shared lock prevents modification queries, as shared locks are not compatible with modification locks. So if we come over here and we try to insert a row into total score by user, this thing will be hung up, getting blocked by the…

Well, it says set statistics XML off, but I promise you that’s our user session 67. That’s 67 up there. That’s the query with the HOLDLOCK hint.

So we can’t insert into the table. We see that insert waiting on an intent exclusive lock. If we cancel that and we get rid of useless prompts, we will see that we can’t update the table either. All right.

We come over here and look at this. We’ll see our update. Also going to be looking for an intent exclusive lock, but the update is stuck waiting on the locks taken by that HOLDLOCK hint. Likewise, we will not be able to delete any data from the table because of those locks.

So this delete would also be blocked while this is running. Now, of course, we wouldn’t really want to have a serializable lock on the table for that long, preventing other modifications, but we would need that for data correctness. Of course, we are allowed to select from the table.
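The blocked modifications described above look roughly like this, run from a second session while the HOLDLOCK transaction is still open. Column names and the literal values are made-up assumptions for illustration; the point is only that each statement waits on an incompatible lock until the first session commits.

```sql
/* Each of these, issued from another session while the serializable
   range locks are held, sits waiting on an intent exclusive (IX)
   lock rather than completing. Names and values are assumptions. */
INSERT dbo.TotalScoreByUser (QuestionScore, AnswerScore, TotalScore)
VALUES (0, 0, 0); /* blocked */

UPDATE dbo.TotalScoreByUser
    SET TotalScore = 0
WHERE Id = 1; /* blocked */

DELETE dbo.TotalScoreByUser
WHERE Id = 1; /* blocked */
```

Plain SELECT queries, by contrast, only need shared locks, which are compatible with the shared locks already held, so reads keep working throughout.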

So if we run this, we can still… Select queries can still run against the table. The only locks that are taken and held are there to prevent modifications. So let’s commit this because it’s just some selects.

And now let’s run the whole thing in one go. All right. So we’re going to drop our temp table. We’re going to insert data into the temp table that we care about. And then we are going to run the update from the temp table.

The update from the temp table is just what happens down here. Right. So I am indexing the temp table in this case because I did find it useful in my performance testing. I do suggest you also test indexing temp tables in those scenarios, or in any scenario, really, where performance is a concern.
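The whole manual phase separation pattern, as described, can be sketched like this. Again, table and column names are assumptions based on the transcript, and the clustered index choice mirrors the testing mentioned above rather than being a universal recommendation.

```sql
/* Manual phase separation: snapshot the source into a temp table,
   then update the base table from the temp table, so the table no
   longer reads from itself. HOLDLOCK takes serializable range locks
   so the source can't change between the snapshot and the update.
   Names are assumptions based on the transcript. */
DROP TABLE IF EXISTS #TotalScore;

BEGIN TRANSACTION;

    SELECT
        t.Id,
        TotalScore =
            CASE
                WHEN t.QuestionScore > t.AnswerScore
                THEN t.QuestionScore
                ELSE t.AnswerScore
            END
    INTO #TotalScore
    FROM dbo.TotalScoreByUser AS t WITH (HOLDLOCK);

    /* Indexing the temp table proved useful in testing here. */
    CREATE CLUSTERED INDEX c ON #TotalScore (Id);

    UPDATE tsu
        SET tsu.TotalScore = ts.TotalScore
    FROM dbo.TotalScoreByUser AS tsu
    JOIN #TotalScore AS ts
        ON ts.Id = tsu.Id;

COMMIT TRANSACTION;
```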

So what we get back here is 418 milliseconds to dump data into the temp table. This query down here is WhatsUpLocks running. So we don’t have to care about that time, because that wouldn’t be part of our normal transaction.

So we have 418 milliseconds there, about 328 milliseconds to index the temp table, and then about 2.7 seconds to do the update. And in this part, in this query plan down here, the majority of the query time just is the update. Right.

Like we don’t spend a lot of time up until then, 643 milliseconds total. So, you know, about two seconds doing the actual update, about 643 milliseconds leading up to it. But the update is the majority of the work in there.

And this is, you know, better performance than we got from the straight update, using the table itself as a source. So 418 plus 328, that’s about 750 milliseconds. And then 2.7 seconds here, that’s about 3.4 seconds total.

So that’s about three and a half seconds, down from seven seconds. We cut that time in half. Good, good, good. These are all positive things, right? Breaking things up a little bit is a grand way of doing things.

There’s another way SQL Server can perform what my dear friend Paul refers to as phase separation. What we did up there is manual phase separation. What we’re going to see in the next example is called automatic phase separation.

And what we’re going to do, instead of doing the update with that subquery as the source, is actually join to the result of it. And what I had to do in this case was hint a hash join.

The reason for that, the reason why this gets a little bit tricky, is because the ID column that we’re joining on is the clustered primary key of the table. And when SQL Server has two clustered primary keys, it generally tends to favor either nested loops or merge joins, rather than hash joins. Because we have ordered sets, we have, you know, well-defined indexes.

So SQL Server generally tends to lean towards those. There are plenty of circumstances where it might choose hash joins on its own. In this case, I just used a hint rather than fiddle and faddle with a bunch of other stuff.
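The join-and-hint rewrite described above looks roughly like this. As before, the table and column names are assumptions inferred from the transcript; the `HASH` keyword in the join is the hint being discussed.

```sql
/* Automatic phase separation: join to the derived result and hint a
   hash join. The hash build is a blocking (stop-and-go) operator, so
   it provides Halloween protection without an eager spool.
   Names are assumptions based on the transcript. */
UPDATE tsu
    SET tsu.TotalScore = x.TotalScore
FROM dbo.TotalScoreByUser AS tsu
INNER HASH JOIN
(
    SELECT
        t.Id,
        TotalScore =
            CASE
                WHEN t.QuestionScore > t.AnswerScore
                THEN t.QuestionScore
                ELSE t.AnswerScore
            END
    FROM dbo.TotalScoreByUser AS t
) AS x
    ON x.Id = tsu.Id;
```

Worth noting: a join hint like this also forces the join order for the whole query, which is part of why hinting is a last resort rather than a first move.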

But if we run this query and we use this update instead, this will finish in, I forget now. I did this late last night, so my mind’s a little less useful than it used to be. But this finishes in about 4.6 seconds total.

And again, most of the time in there is spent in the update, right? So we had 503 milliseconds up until the update, and then about 4.1 seconds in the update itself.

Only about 400 milliseconds’ difference from the 3.7 seconds when we did the manual phase separation thing, but still better than the 7 seconds total there. And this is, you know, without having to do a transaction or set serializable, because all the locking is performed, right, just within this query by itself.

But notice in this query that, I mean, A, we get a parallel plan, which we didn’t in the original one. So that helps things along. But also there’s no spools in here.

The reason why we don’t need a spool is because we have a few stop-and-go operators that pause output for rows to be kept track of. We have a hash join here.

The hash join is a stop-and-go operator. People call it like an internally blocking operator because all the rows have to get to the hash join. The hash table has to get built and other stuff has to happen.

We also have a sort in the query plan. The sort is another one of those stop-and-go internally blocking operators. All the rows have to arrive before they can come out. So in this case, because we have operators like this in the query plan, we don’t explicitly need a spool to keep track of which rows have been hit or not because all the rows that come out of that are going to be rows for the entire query, right?

All the results have to hit operators like that and then be passed along. We don’t have just an entirely streaming plan that can just pass rows along as it goes and may end up rereading rows on the other side, right?

Because in the case of the first query plan, and I’ll just get the estimated plan for that again so we can talk a little bit about why that’s a thing.

Part of the reason for that is, like when I’ve talked about how read committed is kind of not my favorite isolation level, none of these operators aside from the table spool will cache all the rows from the query.

So that’s why this gets inserted here. SQL Server could have chosen a sort or a hash or something else to do it. It just didn’t. Cost-based optimization chose a spool. So what could happen is, as we’re doing the update here, other operators could end up rereading rows in here.

But because we have this spool as a caching mechanism, because we use a temp table as a caching mechanism in the second example, and because we had hashes and sorts in the other query plan that can be used as caching mechanisms, the spool becomes not needed explicitly to do this because we have other things that keep track of the rows.

So anyway, if you’re looking at modification queries and you’re trying to tune those, and you find that performance is subpar, you now have a couple of things that you can try to improve the performance of the updates.

Of course, if most of the time is spent in the update operator, like, you know, let’s say the entire query runs for 10 seconds, and nine and a half seconds is spent in the update, you may have a different problem to solve, like the number of indexes that you’re modifying, you know, stuff like that.

You know, sometimes, you know, putting locking hints on there to skip like row or page level locking, jump right to a table level lock can be useful. But this is just kind of a good example of how you can rewrite queries in different ways to improve performance of the entire operation.
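The table-level lock idea mentioned above can be sketched like this. It’s a trade: you skip the overhead of taking and escalating row or page locks, but nothing else can modify the table while the update runs. Names are assumptions, as in the earlier sketches.

```sql
/* Jumping straight to a table-level lock on the update target,
   skipping row/page lock overhead at the cost of concurrency.
   Table and column names are assumptions based on the transcript. */
UPDATE tsu WITH (TABLOCK)
    SET tsu.TotalScore =
        CASE
            WHEN tsu.QuestionScore > tsu.AnswerScore
            THEN tsu.QuestionScore
            ELSE tsu.AnswerScore
        END
FROM dbo.TotalScoreByUser AS tsu;
```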

Anyway, that’s about it here. Thanks for watching. As usual, I hope you learned something, and I hope you thoroughly enjoyed yourselves.

And remember, do the old like and subscribe dance for me, and I will see you in another video probably pretty shortly. Thanks again for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.


