The Difference Between Read Committed And Read Committed Snapshot Isolation In SQL Server

Thanks for watching!

Video Summary

In this video, I delve into the differences between read-committed isolation and read-committed snapshot isolation (RCSI) in SQL Server, specifically addressing how RCSI mitigates some of the issues that can arise under the pessimistic isolation level. I demonstrate these concepts using a simple example with tables named consultants and clients from my CRAP database, showing how read queries behave differently when using RCSI compared to traditional read-committed mode. By walking through the process step-by-step, I explain why optimistic isolation levels like RCSI can be more aligned with developer expectations in many scenarios, while also highlighting potential trade-offs and considerations for different workloads.

Full Transcript

Erik Darling here with Darling Data. And spring is all around us here in the Northern Hemisphere, but for me more specifically, spring is all in my nose. So if I sound a little weird today, that’s why. And spring being in my nose is not a Deadpool-esque drug reference, it is an allusion to my allergies, which is alliteration. So last week I went on vacation, but before I did, that was a rhyme too. Wow, I’m nailing it today. Last week I went on vacation, but before I did, I recorded a video about different read phenomena that can happen under read-committed, the pessimistic isolation level that can make query results look weird. And in that video, I made several sweeping proclamations that this sort of thing wouldn’t happen with an optimistic isolation level like read-committed snapshot isolation.

Then of course, while I’m away, exploring business opportunities across Europe, I get questions in my email like, How would it be different? Why would that be different? Please tell me. So I’m recording a video to answer all of those questions today.

So the first thing that we’re going to do is we’re going to alter the CRAP database. I do not distribute the CRAP database. It is not an open source project. If you would like the CRAP database, you are free to create a database called that, and you are free to fill it with whatever you’d like, all the joy in life, your hopes, your dreams, whatever they may be. And then we’re going to redo a couple steps from the last video where we get rid of any tables that might exist that might cause strange things that would make the demo not work.

So I aim to be idempotent. It’s a tough word sometimes. And so what we’re going to do is create a table called consultants.

And even though there is only one consultant in there, it’s called consultants because I think naming tables as plural is kind of the right thing to do. You have consultants in a table. Each row is a consultant. Many consultants are plural.

So for now, there’s just one of me. And we’re going to stick old Erik Darling in there. And then we’re going to create a table called clients just like last time. And just like last time, the clients table will have a lovely primary key and will feature a magnificent foreign key that references the consultant ID in the consultants table.
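For reference, the setup described here looks something like the following. This is a hypothetical reconstruction, since the actual demo script isn't distributed; table names, column names, and values are approximations.

```sql
/* Hypothetical reconstruction of the demo tables; names and values are guesses. */
DROP TABLE IF EXISTS dbo.clients;
DROP TABLE IF EXISTS dbo.consultants;

CREATE TABLE dbo.consultants
(
    consultant_id integer PRIMARY KEY,
    first_name varchar(40),
    last_name varchar(40)
);

INSERT dbo.consultants
VALUES (1, 'Erik', 'Darling');

CREATE TABLE dbo.clients
(
    invoice_id integer PRIMARY KEY,
    consultant_id integer NOT NULL
        REFERENCES dbo.consultants (consultant_id),
    invoice_amount bigint
);

INSERT dbo.clients
VALUES
    (1, 1, 100000),      /* amounts made up for illustration */
    (2, 1, 9999999999);
```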

Good stuff there. All right. Cool. So let’s insert a couple rows into clients, because Erik Darling, the consultant at Darling Data, has many clients.

Right. So it makes sense to call that clients, even though there are only two here. I promise there have been more than a few. And then if we look at the data that we currently have in clients (well, sorry, consultants and clients joined together), everything looks good.

Right. Consultant ID, first name, last name, invoice ID. The consultant ID again, a bit repetitive, but that’s what you get with select star. And of course, the invoice amount, which is an amount that I would love to invoice in American dollars to someone someday. Perhaps if the government is looking for some SQL Server help, we could talk.

Print some more money for me so that whatever number that is, is worthless. Worth five dollars in real life. So over here and let’s take this.

Let’s stick this over here. I think, oh, look, I already did. I’m so, I’m so smart. I see ahead. I see into the future. I see all things. All right.

So that returns the right results. Now, in the last video, what happened was I said begin tran and did an update. And then over here in this window, I ran this query and this query got blocked. It stayed blocked until I committed the transaction in the other window, and then it returned inconsistent results.

Since read committed snapshot isolation is now working with the version store, we are getting a snapshot of this data from prior to the update happening. So we still see all the right stuff in here because if you look over what this update is doing, we are incrementing that invoice amount by $1. I got a dollar tip on that huge invoice.

Someone was just like, yeah, Erik Darling, you’re worth that extra buck; top you off before we overflow the bigint value in SQL Server. So in the last video, this query got blocked, and now, it doesn’t.

This query runs and returns the values as they existed prior to the update running. And the same thing will happen if I run this update, right, where I change my last name to Darling Data, because Erik Darling is married to the data game, right? So we run this.

This will still not be blocked, but it still won’t reflect the change within that transaction, right? That’s still out there in the open. Nothing going on there. If I run this query from within the transaction, well, then I can see all the changes that happened because this query is working within the transaction that made the changes.

And then finally, if I commit this transaction, let’s make sure we’re extra committed to that transaction. And we come over here. Now this query will finally see the changes.
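Sketched out, the two-window sequence described above looks something like this. The session layout is reconstructed from the narration, and the table and column names are assumptions.

```sql
/* Session 1: start a transaction and update, but don't commit yet. */
BEGIN TRANSACTION;

UPDATE c
SET c.invoice_amount += 1
FROM dbo.clients AS c;

/*
Session 2: under read committed, this select would block here.
Under RCSI, it returns immediately, showing row versions from
the version store as they existed before the update began.
*/
SELECT *
FROM dbo.consultants AS con
JOIN dbo.clients AS c
    ON c.consultant_id = con.consultant_id;

/* Session 1: commit; only now will session 2 see the new values. */
COMMIT TRANSACTION;
```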

So the difference, of course, is that your read queries don’t get blocked, but your read queries might be seeing older versions of how data existed before changes started happening to it. So that can be great for some people. That can be great for 90 something percent of the people who have a workload in SQL Server because it’s great for most people who have workloads on other database platforms that made better choices of default isolation levels like Oracle, Postgres, and probably DB2 if anyone can find DB2 out there.

So that’s how they’re different. And we can repeat the same thing with the other query that I showed you, where it looked like a unique constraint had been violated in SQL Server, where it looked like we had duplicate values in a unique constraint.

In this case, we won’t see any weird results. So we’ll run these first three updates right here. One, two, three.

And we’ll just refresh your soggy memories. There is a unique constraint on Butthead. And so if we were to try to insert a fourth row, it would fail because we would violate that unique constraint. And that was sort of the gist of the last demo where I was like, hey, if enough changes happen and things swap around, your query results could make it look like that column returned non-unique results.
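As a reminder of the setup being described, something along these lines. This is a hypothetical reconstruction; the real demo table isn't shown here, so names and values are made up.

```sql
/* Hypothetical reconstruction: a unique constraint on the Butthead column. */
CREATE TABLE dbo.laughs
(
    id integer PRIMARY KEY,
    Beavis varchar(30),
    Butthead varchar(30),
    CONSTRAINT uq_butthead UNIQUE (Butthead)
);

INSERT dbo.laughs
VALUES
    (1, 'heh heh', 'uh huh huh'),
    (2, 'fire fire', 'cool'),
    (3, 'huh', 'settle down Beavis');

/* A fourth row with a duplicate Butthead value would fail: */
--INSERT dbo.laughs VALUES (4, 'heh heh', 'cool');
```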

And that would be confusing and awful and terrifying. You would question SQL Server and you’d call up Microsoft and be like, Microsoft, SQL Server has a memory leak. And it’s broken.

You’d better go catch it, or whatever kids say these days when they make prank calls. So let’s grab this query just to prove out our point once again. Let’s paste that in there.

And if we run this select, we get exactly what we should. Because no changes have happened with the table. We get Beavis incorrectly saying huh and Butthead incorrectly saying other things. Well, I guess the one at the bottom is right.

But the first two, absolutely wrong. Non-canonical Beavis and Butthead laughs. All aflutter here in this demo. But now, let’s begin tran and update. And just like last time, we are not going to see any blocking here.

But we’re going to see the data the same way that it was before the update started. And if we run a couple more updates, we are still going to not be blocked. But we’re still not going to see any changes.

Now, if I were committing these changes along the way, of course, we would see them over here. But because these changes haven’t been committed yet, we’re still seeing that snapshot of the data from before the changes. And just like in the prior demo, if I select from this, and here we can see the changes, right?

Then if I commit and then extra commit that transaction over here, those changes will finally show over here. But the important thing is that this select will never look like it returned a violation of our unique constraint on the Butthead column.

So, when I talk about workloads that would benefit from RCSI, this is a big one. Your read queries not getting blocked, and returning correct results from prior to modifications starting, is usually what people want.

Using NOLOCK hints and being able to see transactions in flight is not good, right? Seeing dirty data, that’s dirty reads. The concern with RCSI, of course, is reading stale data, right?

Because if your select queries do need to get blocked, and then see the changes once the blocking clears, read committed might be the right isolation level for you.

But that comes with the huge caveats from the last video: if there are multiple queries in a transaction that make changes, say ten of them, and your query got blocked on the fifth one, your query could end up seeing five things from before they changed and five things from after they changed, and that can be really screwy for results too.

So, again, no isolation level is generally 100,000 million percent perfect for every workload. However, I do find that the way optimistic isolation levels work is far closer to developer expectations than the read committed reality that they’re faced with, and of course the no lock or read uncommitted shortcuts that they take when blocking happens, and they’re like, why, God, why have you forsaken me here, letting my queries get blocked like this?

So, that’s the difference between read committed, the pessimistic isolation level, and read committed snapshot isolation, the optimistic isolation level. I hope that this makes things more clear, and I hope that this helps you with anything that you might want to test in your own environment to make sure that if you are going to start using an optimistic isolation level, that you use the right one.

Because remember, you have snapshot isolation over here, which requires queries to ask for it, and you have read committed snapshot isolation over here, which affects all of the read queries that come into your database. So, if you find that there are queries that would operate incorrectly under read committed snapshot isolation, you could choose snapshot isolation and have only the queries that need it opt in to an optimistic isolation level, or you could add READCOMMITTEDLOCK hints to queries that require not reading from versioned data, so that those queries would obey the locking semantics of read committed.
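For reference, the two options described above look roughly like this. A sketch, not a migration plan, and the table name is an assumption.

```sql
/* Option 1: snapshot isolation, which queries must opt in to. */
ALTER DATABASE CRAP SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT c.* FROM dbo.clients AS c; /* reads row versions */

/* Option 2: RCSI, which changes every read query by default. */
ALTER DATABASE CRAP SET READ_COMMITTED_SNAPSHOT ON;

/* A table hint can opt individual queries back out of versioned reads. */
SELECT c.* FROM dbo.clients AS c WITH (READCOMMITTEDLOCK);
```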

But, just remember, read committed leaves a lot to be desired as far as guarantees go. The guarantees for read committed, the pessimistic isolation level, are very, very weak. If you truly have queries that need to operate off of the freshest data without having, like, changes and other things happening all around them mess them up, you might be looking at needing to use repeatable read or serializable as an isolation level for absolute correctness.
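As a quick sketch of what asking for a stricter isolation level looks like (table and column names assumed):

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;
    /*
    Shared locks taken here are held until commit, so rows that were
    read can't be changed underneath the transaction. SERIALIZABLE
    goes further, locking ranges so new rows can't sneak in either.
    */
    SELECT c.invoice_amount
    FROM dbo.clients AS c;
COMMIT TRANSACTION;
```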

Because under concurrency, read committed the pessimistic isolation level wipes out pretty hard a lot of the times. Does not, well, I would say it doesn’t live up to expectations, but the reality is, it’s really just behaving as it’s documented and your expectations were wrong.

Read committed is just kind of a crappy isolation level. Even in the crap database, it’s a crappy isolation level. So, before I need to blow my nose or sneeze or anything else that would be untoward to do on camera, I’m gonna end this video.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you missed me terribly while I was away exploring business opportunities in Europe. If you like this video, the thumbs up button is a great way to show that.

I also like comments that say, Yay, good job, Erik. And I also like subscribers, because the more subscribers I have, the more people I reach in teaching people how to use SQL Server correctly, which is nice, because very few people do that.

And I’d like to see more of it, so that one day I can do something else with my time. I don’t know. Start a channel about how to behave properly in a bar.

How to not annoy bouncers. I don’t know. How to not annoy bartenders. There’s lots of things. Lots of things that I could do. Lots of things I could do with my time.

But here I am, talking about SQL Server. So, that’s enough of that. Again, thank you for watching, and I will see you in another video, another time, another place, another you, another me.

Who knows? Goodbye. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Is Using OPTIMIZE FOR More Reliable Than Forced Plans In SQL Server?

Parameter Sniffing?


I often see clients using forced plans or plan guides (yes, even still, to this day) to deal with various SQL Server performance problems with plans changing.

There’s usually an execution plan or two floating around that seems to be a good general idea for a given query, and a couple weird high-end and low-end outliers for very specific populations of values.

This is especially common in third party vendor environments where code and/or index changes may not be allowed without the okay from the high priest of tech support who only answers questions when their celestial craft passes near Earth every 27 years.

Of course, forced plans and plan guides can both fail. You may also run into a “morally equivalent plan” in Query Store that looks quite morally ambiguous.

Recently while working with a client, we came across just such a scenario. And of course, of the many reasons why a forced plan might fail, this one was just a… general failure.

The fix we came up with was to track down the compile values for that nice middle ground plan, and use OPTIMIZE FOR to reliably push that plan shape into reality.
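One way to track down compiled parameter values (this is a generic plan cache sketch, not the exact query we used with the client) is to crack open cached plans and read the ParameterCompiledValue attributes out of the showplan XML:

```sql
WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    procedure_name =
        OBJECT_NAME(deps.object_id, deps.database_id),
    parameter_name =
        p.pr.value('@Column', 'sysname'),
    compiled_value =
        p.pr.value('@ParameterCompiledValue', 'nvarchar(256)')
FROM sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
CROSS APPLY deqp.query_plan.nodes
    ('//ParameterList/ColumnReference') AS p(pr)
WHERE deps.object_id = OBJECT_ID('dbo.OptimizeForStuff');
```

The same attributes are visible in SSMS by opening the cached plan and checking the Parameter List property on the root operator.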

Territory


Here’s a close-enough approximation of what we did, with a good-enough demo. Trying to get a more realistic one was hard without a much more complicated schema than the Stack Overflow database has.

An index!

CREATE INDEX 
    p
ON dbo.Posts
    (OwnerUserId)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

And a procedure!

CREATE OR ALTER PROCEDURE 
    dbo.OptimizeForStuff
( 
    @ParentId integer = NULL, 
    @PostTypeId integer = NULL,
    @OwnerUserId integer = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
    
    SELECT TOP (1000) 
        p.*
    FROM dbo.Posts AS p
    WHERE (p.ParentId = @ParentId OR @ParentId IS NULL)
    AND   (p.PostTypeId = @PostTypeId OR @PostTypeId IS NULL)
    AND   (p.OwnerUserId = @OwnerUserId OR @OwnerUserId IS NULL)
    ORDER BY 
        p.Score DESC, 
        p.Id DESC;
END;

All good so far, even if it does have an air of laziness.

Darwin


The problem was that when the query executed something like this:

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = 22656, 
    @ParentId = NULL, 
    @PostTypeId = 2;

It got a good-enough fast plan:

[sql server query plan screenshot: i like you.]

But when the query executed in almost any other way:

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = 8, 
    @ParentId = 0, 
    @PostTypeId = 1;

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = 1349, 
    @ParentId = 184618, 
    @PostTypeId = 2;

It got this sort of lousy plan.

[sql server query plan screenshot: star dust]

Even Worse


When stranger executions came along, things got way worse!

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = NULL, 
    @ParentId = 0, 
    @PostTypeId = 1;

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = NULL, 
    @ParentId = 184618, 
    @PostTypeId = 2;

[sql server query plan screenshot: condemned]

We need to avoid all of this.

Step Up


Here’s what we did (again, round about) to make sure we got the generally good plan across the board, without failures!

CREATE OR ALTER PROCEDURE 
    dbo.OptimizeForStuff
( 
    @ParentId integer = NULL, 
    @PostTypeId integer = NULL,
    @OwnerUserId integer = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
    
    SELECT TOP (1000) 
        p.*
    FROM dbo.Posts AS p
    WHERE (p.ParentId = @ParentId OR @ParentId IS NULL)
    AND   (p.PostTypeId = @PostTypeId OR @PostTypeId IS NULL)
    AND   (p.OwnerUserId = @OwnerUserId OR @OwnerUserId IS NULL)
    ORDER BY 
        p.Score DESC, 
        p.Id DESC
    OPTION
    (
        OPTIMIZE FOR 
        (
            @OwnerUserId = 22656,            
            @ParentId = 0, 
            @PostTypeId = 2
        )
    );
END;

Which gets us the original fast plan that I showed you, plus faster plans for all the other executions.

For example:

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = 8, 
    @ParentId = 0, 
    @PostTypeId = 1;

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = 1349, 
    @ParentId = 184618, 
    @PostTypeId = 2;

They go from 1.5 seconds to ~300ms:

[sql server query plan screenshot: dorsal]

And the two outlier queries improve quite a bit as well (though neither one is exactly great, admittedly).

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = NULL, 
    @ParentId = 0, 
    @PostTypeId = 1;

EXEC dbo.OptimizeForStuff 
    @OwnerUserId = NULL, 
    @ParentId = 184618, 
    @PostTypeId = 2;

[sql server query plan screenshot: subjected]

In all cases, the plan is generally better and faster, and sharing the plan across executions (though imperfect for the outliers) tamped down the extreme performance issues that were there before with the attempts at forced plans.

Posit Hell


While I’m no great fan of OPTIMIZE FOR UNKNOWN, using a specific value can act like a less faulty version of plan forcing.

You shouldn’t pull this out every time, because it is a bit of duct tape to keep a sinking ship above water, but in oddball cases, it can be a quick and rather painless fix.

At some point, better solutions should be explored and implemented, but emergencies don’t generally allow for the greatest care to be taken.

Thanks for reading!


Compiles! Recompiles! Stored Procedures! Temp Tables! Extended Events! In SQL Server

Thanks for watching!

Video Summary

In this video, I delve into the intricacies of stored procedures, temp tables, compiles, recompiles, and query hints—essentially packing as many fascinating topics as a human hand can hold. Using `sp_HumanEvents`, an extended events wrapper I developed to simplify tracking for SQL Server users, we explore how to monitor and manage these processes effectively. By running a series of queries with different hints in a loop, we uncover the nuances between recompiles and compiles, highlighting the potential benefits of using the `KEEP PLAN` hint to stabilize query plans across multiple executions. This exploration not only sheds light on common issues but also offers practical insights for optimizing stored procedures that rely heavily on temp tables.

Full Transcript

Alright, Erik Darling here with Darling Data, and in this happy, friendly, amazingly kind video, this community-spirited video, we’re going to talk about a cacophony of fascinating things. Namely, stored procedures, temp tables, compiles, recompiles, and query hints. It’s as many things as I can fit on a hand. If I were blessed by any of the gods with additional fingers, we would talk about more things today, but I’m all out of fingers, so you get what you get. In order to track compiles and recompiles, we’re actually going to look at this three different ways. One of those ways is not going to be Query Store, because Query Store is somewhat uninteresting in this regard. We’re going to use sp_HumanEvents in two different ways. We have one event up here to track compiles. We have one event up here to track recompiles. Alright, and then we have this plan cache query that we’re going to use to track the same thing in the plan cache. Cool, great, wonderful. What are we tracking exactly? Well, it’s really better if I start this loop off and then we talk through exactly which specific things we’re dealing with here.

So what I need to do is, in very quick succession... Not too quick. I gave myself a little room to wiggle. I do like having room to wiggle. I’m going to start this off here. sp_HumanEvents, that’s my stored procedure that I use to make extended events a little bit easier on the average SQL Server user, because Lord knows Microsoft didn’t make it easy. And as you all know, I love and care about you far more than Microsoft ever will. They’re a nameless, faceless, conglomerate bunch of people who say no. Me, I say yes. Darling Data, we say yes to you.
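The two sp_HumanEvents calls described here look something like this. The parameter values are made up for illustration; check the GitHub repo for the full, current parameter list.

```sql
/* Watch compiles for a minute in the demo database. */
EXEC dbo.sp_HumanEvents
    @event_type = N'compiles',
    @seconds_sample = 60,
    @database_name = N'StackOverflow2013';

/* Same thing, but for recompiles. */
EXEC dbo.sp_HumanEvents
    @event_type = N'recompiles',
    @seconds_sample = 60,
    @database_name = N'StackOverflow2013';
```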

So I’ve got a stored procedure, and this is the very stored procedure that’s running in the loop that’s executing right now, if you would believe that. And what this stored procedure does is it takes some parameters, obviously. That’s what stored procedures typically do. They take a parameter. It’s like we here at Darling Data will take a drink. Stored procedures will take a parameter. We create a table called filtered posts. We insert into that table the results of this query where our wonderful parameters are used, where SQL Server drinks greedily from our parameters. And then we run the same query, essentially, in four different ways. So the first way that we run this query is with no hints whatsoever, just looking for where the score column in our temp table is greater than zero.

And notice that I’m doing the greater than (select 0) thing here. The reason that I’m doing that is to avoid simple parameterization, not because it actually interferes with this demo, but just because a lot of people see it and say, oh, that looks weird. What is that? And then everything gets derailed, hopelessly derailed. We have another iteration of the same query, except this one has a recompile hint on it. So this one we’re explicitly saying, SQL Server, please recompile this plan. Tear down that old plan. And then we have two more queries that we run. One of them uses the keep plan hint right here. And I wanted to use this to see if keep plan would get us anything in this situation.

Not to jump too far ahead, but keep plan really seems to work better for, like, actual tables and an actual schema, not, like, temp tables. And then we have keep fixed plan down here. All right. Wondrous, wonderful, amazing. You believe that I do this for free, for your benefit. You believe the things that I do for you, all the typing that I do for you, brothers and sisters. And so, well, I mean, there’s our loop, right? Our loop is finished. And if we go get the results from this thing back and we look at what happened, we can see this loop ran 21 times with post type ID one.

And then another 21 times with post type ID two, for a total of 42 trips to that stored procedure, right? What this loop does is, as long as post type is less than three, and that’s not a weird heart. I don’t play those games. I’m married to the data. I can’t go making hearts at every single thing in SSMS.

And then while end date is less than this, we do this and we increment our dates by three months at a go. And then when we finally reach our limit here, then we reset our dates and we set post type ID one higher. And only on that trip around do we raise our loop counter.

So 21 times for post type ID one, 21 times for post type ID two, which is a total of 42 trips to that stored procedure. I did not have 42 drinks. You are watching me. There’s no way I could have had 42 drinks during the execution of that stored procedure.

Though Lord knows sometimes I wish I could. If we come over here and we look at the compiles, we are going to see a whole bunch of statistics gathering in tempdb. And if we expand this a bit, we’ll see that this is all coming from the filtered posts table in our stored procedure.

All right. This is all compiling filtered post statistics. And we have 35 rows of that, I guess. And we have most of these just have one compile, but this one at the top has eight compiles.

I’m not sure why this one thinks it’s so special that it can go and compile eight times in total, but it went and did it anyway. And if we look at the results of the recompile extended event that we had in here, look at what we’re going to see. This is actually interesting to me.

I found this part fascinating. Utterly fascinating. You might be wondering, Eric, why did you find this utterly fascinating? Well, let’s look at what we have some recompile causes in here, right?

For two of them, we have statistics changed. For one of them, you can probably guess which one this is. We have option recompile requested, right?

Which is great. Love the capitalization there. Capital O option, all lowercase recompile. And for the two down here, we have deferred compile. All right.

All right. Cool. Those all happened in Stack Overflow 2013, which is exactly where we executed our stored procedure. They were all from the stored procedure that we’re talking about. But if we come over here just a wee little bit, and I have no idea why one of them has that little extra space in it.

A little strange. And we’ve got to do a little bit of work to get this expanded out to where we want it. Because what we want to see are which compile, which query text with which hints or no lack of hints, ended up with all these compiles and all this stuff going on.

And now let’s slide over this way. Since we’ve got this all framed up nicely here, we have our first query, which is no hints, which had 42 total recompiles and almost 10 seconds of recompile CPU, on average taking 232 milliseconds to recompile this query.

All right. That’s a fairly crazy thing, right? And remember, the recompile cause, and this is statistics changed.

Down here, where we have our keep plan hint, we still ended up with 42 total recompiles, but we just used far less recompile CPU on that. Almost nothing there.

This is pretty wild, that just, like, a regular query required all that. Not requested, required. It didn’t just say, can I have, like, 10 seconds of recompile CPU? And SQL Server was like, yeah, I guess.

It was just like, no, I need to do this. If I don’t get this, I’m going to go crazy. And down here at the very bottom, we have our two other queries, which only recompiled once, which probably means they just kind of compiled once or something, I guess.

I’m not entirely clear on all of the inner workings of how Extended Events tracks these things. Nor do I have the moxie to care too much about all of that. So, you know, that’s fun.

Sorry if you don’t like it. But there’s only so much time in a day, and there’s only so much free I can give. So these two here, the keep fixed plan did like no work, and the insert select into the temp table did absolutely no work. Now, if we come over here and we run our plan cache query, we’re going to see nearly the same thing, except it’s going to be a little misleading.

I mean, sure, these three up here have crazy high plan generation numbers on them, probably much higher than they should. Well, I’m not going to say much higher than they should be. Just like 129.

We only executed the thing 42 times. I don’t know. I don’t know. Pretty wacky stuff.

Pretty wacky stuff. But what’s misleading is that these queries all look like they only executed once. So that’s because of all the compile, recompile, stats change, go get stats, go make a new plan, kick out the old plan, we don’t need it anymore, get the new plan in here. And all that stuff, which is, I don’t know, can be aggravating.

To be perfectly honest with you, that stuff can be quite aggravating. And the two queries down at the bottom, this is just kind of to show you that the plan cache does agree mostly with the extended event recompile stuff that we saw, where these queries both have 42 executions, right?

And they did very little on the recompile tip. So the moral of the story here kind of is, if you have stored procedures, and those stored procedures feed data into temp tables, and you’re having trouble with, like, high recompilations, and you want to, and you, like, you’re, like, confident and comfortable in the query plan that you’re getting across all those executions, which can be, granted, tough to figure out if you’re constantly recompiling.

But if you’re getting the same plan over and over again, right, and you want to reduce all that compile overhead, right? You have stored procedures that execute a ton, they use temp tables, you put, like, a relatively stable amount of data in those temp tables, and you’re just, like, SQL Server, just use a decent plan, just reuse it, I don’t care, then the keep fixed plan hint can be pretty useful to tamp down the recompiles and potentially take a lot of the stress and strain out of all the many, many executions of that stored procedure.
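The shape of the stored procedure from the demo, reconstructed from the description above, might look like this, with the keep fixed plan hint applied. The procedure name, parameters, and exact queries are assumptions; only the pattern (parameters feeding a temp table, then querying it) is from the video.

```sql
CREATE OR ALTER PROCEDURE
    dbo.TempTableStuff
(
    @PostTypeId integer,
    @StartDate datetime,
    @EndDate datetime
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    CREATE TABLE #filtered_posts
    (
        Id integer PRIMARY KEY,
        Score integer
    );

    /* SQL Server drinks greedily from our parameters here. */
    INSERT #filtered_posts (Id, Score)
    SELECT p.Id, p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = @PostTypeId
    AND   p.CreationDate >= @StartDate
    AND   p.CreationDate <  @EndDate;

    /* (SELECT 0) dodges simple parameterization, as in the video. */
    SELECT c = COUNT_BIG(*)
    FROM #filtered_posts AS fp
    WHERE fp.Score > (SELECT 0)
    OPTION (KEEPFIXED PLAN); /* skip stats-change recompiles */
END;
```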

Will that be perfect in every case? Of course not. But we live in a complex, data-driven database, data face, data world, and it’s up to us to decide when it’s appropriate to do these things. It’s up to us to figure out when these things are causing problems and come up with a reasonable fix for them.

In this case, I would recommend for the query in that stored procedure that we stick with the keep fixed plan hint because that would tamp down on the recompilations, and it would give us a nice stable execution plan that not only would we be able to reuse, right, but we would be able to track exactly how that plan is performing over time.

If we find that that plan is not performing so well, we could, you know, recompile the stored procedure and let SQL Server come up with a new plan the next time, and maybe that one will be better. We always, we pray that SQL Server will make good choices, don’t we?

Lord knows we don’t. It’s up to SQL Server to make these good choices. So, this is just kind of a fun little demo that I put up with while investigating something else entirely, but I thought it was worth sharing.

If you’re not familiar with sp_HumanEvents, the link to my GitHub repo where it lives will be in the show notes, and you’ll be able to go and install it on all of your servers and run it in all its glory, and be able to track down these interesting problems the same way I do, in this professional stored-procedure-y way.

So, anyway, that’s good for today, I think. I hope you enjoyed learning about compiles, recompiles, temp tables, stored procedures, and extended events. I don’t know if that was the same five from before, but again, we’ve hit the magic number, and this is also how I wave goodbye.

So, if you enjoyed this video, I do like thumbs-ups and… Family-friendly thumbs-ups. Do not stick your thumbs where they don’t belong.

I like comments, and I like subscribers. So, if you like this and you like me, then subscribe to the channel and you get more of me and more of this. And it’s kind of a win-win for everybody, because then I get what I like, you get what you like, and we’re all the happier for it.

We’re all better off, aren’t we? So, as usual, thank you for watching. I hope you learned something. I hope you enjoyed yourselves. And I will see you in another video, I guess probably tomorrow, when I’ve thought of something else to talk about, which amazingly I always do.

So, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Mild Annoyance With MERGE Statements And Triggers

I’m No Expert


I will leave the finer points of the problems with MERGE statements to the Michaels and Aarons of the SQL Server world.

This is just a… But why? post about them, because I’m honestly a bit puzzled by this missing implementation detail.

To get us to the point, I’m going to use a code snippet (with embellishments) from Aaron’s post here.

We’ll be starting with this table and trigger from the linked post, with a couple small tweaks to satisfy my OCD:

CREATE TABLE 
    dbo.MyTable
(
    id integer
);

INSERT 
    dbo.MyTable 
VALUES
    (1),
    (4);

CREATE OR ALTER TRIGGER 
    dbo.MyTable_All
ON dbo.MyTable
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    
    IF ROWCOUNT_BIG() = 0 RETURN;
    IF TRIGGER_NESTLEVEL() > 1 RETURN;
  
    PRINT 'Executing trigger. Rows affected: ' + RTRIM(@@ROWCOUNT);
    
    IF EXISTS (SELECT 1 FROM inserted) AND NOT EXISTS (SELECT 1 FROM deleted)
    BEGIN
      PRINT '  I am an insert...';
    END;
    IF EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted)
    BEGIN
      PRINT '  I am an update...';
    END;
    IF NOT EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted)
    BEGIN
      PRINT '  I am a delete...';
    END;
END;

After all, one of the best ways to make sure you get code right is to copy and paste it from the internet.

Quiet On The Set


As much as we all love to dunk on MERGE, like cursors, heaps, and UDFs of various types, it did give you some neat options with the OUTPUT clause, like the $action column, and the ability to get columns from other tables involved in the query for every action. You can’t do that with a normal insert when using the OUTPUT clause, though I think it would be cool if we could.

Working a bit with the code linked above, here’s an expansion on it showing the additional OUTPUT capability, but this is also where my annoyance begins.

BEGIN TRANSACTION
    DECLARE
        @t table
    (
        action varchar(6),
        i_id integer,
        d_id integer,
        s_word varchar(5)
    );

    SELECT
        mt.*
    FROM dbo.MyTable AS mt;

    MERGE 
        dbo.MyTable WITH (HOLDLOCK) AS Target
    USING 
        (
            VALUES
                (1, 'one'),
                (2, 'two'),
                (3, 'three')
        ) AS Source (id, word)
    ON Target.id = Source.id
    WHEN MATCHED 
    THEN UPDATE 
           SET Target.id = Source.id
    WHEN NOT MATCHED 
    THEN INSERT
           (id) 
         VALUES
           (Source.id)
    WHEN NOT MATCHED BY SOURCE 
    THEN DELETE
    OUTPUT
        $action,
        Inserted.id,
        Deleted.id,
        Source.word
    INTO @t 
    ( 
        action, 
        i_id, 
        d_id,
        s_word
    );
    
    SELECT
        t.*
    FROM @t AS t;

    SELECT
        mt.*
    FROM dbo.MyTable AS mt;
ROLLBACK TRANSACTION;

You likely can’t guess what I’m sore about just looking at this, because this isn’t what annoys me.

This is all fine, and rather a nice showing of capabilities for an oft-maligned (by myself included) bit of syntax.

The problem is really in the trigger.

What’s Wrong With Triggers?


If you work with trigger code enough, you’ll get used to seeing:

  • Catch all triggers that do a lot of gymnastics to differentiate insert from update from delete
  • A few separate triggers to catch each modification type separately, and still do some checking to verify

The problem is that in any trigger, the $action column is not exposed, so there’s no direct way to determine which action of a MERGE statement caused the trigger to fire.

Sure, you can add a column to a table to track it, or some other hacky workaround, but I consider it a quite poor design choice to not have the $action column as a part of the Inserted and Deleted virtual tables.

Having it there would also benefit generic modifications that are captured by triggers in some manner to make the absolute type of modification quite clear to query writers.
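For what it’s worth, here’s a sketch of the sort of hacky workaround I mean — the SESSION_CONTEXT key name is invented, and this only tells the trigger that a MERGE is running at all, not which action fired:

```sql
/* Sketch: smuggle a flag to the trigger via SESSION_CONTEXT.
   The key name 'dml_source' is invented for illustration. */
EXEC sys.sp_set_session_context
    @key = N'dml_source',
    @value = N'MERGE';

MERGE
    dbo.MyTable WITH (HOLDLOCK) AS Target
USING
    (
        VALUES
            (1),
            (2)
    ) AS Source (id)
ON Target.id = Source.id
WHEN NOT MATCHED
THEN INSERT
       (id)
     VALUES
       (Source.id)
WHEN MATCHED
THEN DELETE;

EXEC sys.sp_set_session_context
    @key = N'dml_source',
    @value = NULL;

/* Inside the trigger, you'd check it something like this:

IF CONVERT(nvarchar(10), SESSION_CONTEXT(N'dml_source')) = N'MERGE'
BEGIN
    PRINT '  ...and I came from a MERGE.';
END;
*/
```

It works, but it’s exactly the kind of bolt-on bookkeeping that having $action in the virtual tables would make unnecessary.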

Thanks for reading!


Join @bobwardms At SQL Saturday Boston For The Azure Workshop for SQL Professionals (@NESQLServer)

All’s Well That Friends Well


The nice folks at the New England SQL Server User Group (w|t) had me in town a week ago to present my workshop, The Foundations Of SQL Server Performance.

We had about 40 folks show up — which ain’t bad for a random Friday in May — including one attendee from Nigeria.

Not just like, originated in Nigeria. Like, flew from Nigeria for the workshop. That’s probably a new record for me, aside from PASS Precons where folks are already headed in from all corners.

Speaking of PASS Precons — me and Kendra are double teaming your Monday and Tuesday — have I mentioned that lately?

As a favor to some dear friends, here’s a promo code to get $125 off Bob’s Friday Precon, The Azure Workshop for SQL Professionals.

The precon takes place on October 4th, and the SQL Saturday is October 5th. You can register separately for the SQL Saturday only here.

If you want to catch Bob’s Friday workshop, use the QR code below, or register here and use the code PPP50 to get your discount.

bob ward

Thanks for reading!


Two Ways To Tune A Slow Query In SQL Server

Like All Assumptions


You and I might feel like the lower back end of a thing if we’re tuning a query that has other problems. Perhaps it’s running on one of those serverless servers with half a hyper-threaded core and 8kb of RAM, as an example.

When I’m working with clients, I often get put into odd situations that limit what I’m allowed to do to fix query performance. Sometimes code comes from an ORM or vendor binaries that can’t be changed, sometimes adding an index on a sizable table on standard edition in the middle of the day is just an impossibility, and of course other times things are just a spectacle du derrière that I’m allowed to do whatever I want. You can probably guess which one I like best.

This post is about the two other ones, where you’re stuck between derrière and ânesse. 

For the duration of reading this, make the wild leap of faith that it takes to embrace the mindset that not everyone who works with SQL Server knows how to write good queries or design good indexes.

I know, I know. Leap with me, friends.

The Query And Execution Plan


Here’s what we’re starting with:

SELECT TOP (10)
    DisplayName =
       (
           SELECT
               u.DisplayName
           FROM dbo.Users AS u
           WHERE u.Id = p.OwnerUserId
       ),
    p.AcceptedAnswerId,
    p.CreationDate,
    p.LastActivityDate,
    p.ParentId,
    p.PostTypeId,
    p.Score,
    p.CommentCount,
    VoteCount =
        (
            SELECT
                COUNT_BIG(*)
            FROM dbo.Votes AS v
            WHERE v.PostId = p.Id
        )
FROM dbo.Posts AS p
ORDER BY
    p.Score DESC;

And resulting plan:

sql server query plan
bas

We can surmise a few things from this plan:

  • If there are good indexes, SQL Server isn’t using them
  • That hash spill is some extra kind of bad news
  • Spools remain a reliable indicator that something is terribly wrong

Okay, so I’m kidding a bit on the last point. Sorta.

The Query Plan Details


You might look at all this work that SQL Server is doing and wonder why: With no good, usable indexes, and such big tables, why in the overly-ambitious heck are we doing all these nested loop joins?

And the answer, my friend, is blowing in the row goal.

The TOP has introduced one here, and it has been applied across all of the operators along the top of the plan.

Normally, a row goal is when the optimizer places a bet on it being very easy to locate a small number of rows and produces an execution plan based on those reduced costs.

In this case, it would be 10 rows in the Posts table that will match the Users table and the Votes table, but since these are joins of the left outer variety, they can’t eliminate results from the Posts table.

The row goals do make for some terrible costing and plan choices here, though.

sql server query plan
blue = row goal applied
orange = no row goal applied

This all comes from cardinality estimation and costing and all the other good stuff that the optimizer does when you throw a query at it.

The Query Rewrite


One way to show the power of TOPs is to increase and then decrease the row goal. For example, this (on my machine, at this very moment in time, given many local factors) will change the query plan entirely:

SELECT TOP (10)
    p.*
FROM
(
    SELECT TOP (26)
        DisplayName =
           (
               SELECT
                   u.DisplayName
               FROM dbo.Users AS u
               WHERE u.Id = p.OwnerUserId
           ),
        p.AcceptedAnswerId,
        p.CreationDate,
        p.LastActivityDate,
        p.ParentId,
        p.PostTypeId,
        p.Score,
        p.CommentCount,
        VoteCount =
            (
                SELECT
                    COUNT_BIG(*)
                FROM dbo.Votes AS v
                WHERE v.PostId = p.Id
            )
    FROM dbo.Posts AS p
    ORDER BY
        p.Score DESC
) AS p
ORDER BY
    p.Score DESC;

You may need to tinker with the TOP a bit to see the change on your machine. The resulting plan looks a bit funny. You won’t normally see two TOPs nuzzling up like this.

sql server query plan
scientific

But the end result is an improvement by a full minute and several seconds.

Because the inner TOP has a bigger row goal, the optimizer changes its mind about how much effort it will have to expend to fully satisfy it before clenching things down to satisfy the smaller TOP.

If you’re only allowed quick query rewrites, this can be a good way to get a more appropriate plan for the amount of work required to actually locate rows at runtime, when the optimizer is dreadfully wrong about things.

The Index Rewrite


In this case, just indexing the Votes table is enough to buy us all the performance we need, but in my personal row goal for completeness, I’m going to add in two indexes:

CREATE INDEX 
    v 
ON dbo.Votes 
    (PostId) 
WITH 
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

CREATE INDEX 
    p 
ON dbo.Posts 
    (Score DESC, OwnerUserId) 
WITH 
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

Going back to the original query, we no longer need to play games with the optimizer by pitting TOPs against each other.

sql server query plan
wisdom teeth

This is obviously much faster, if you’re in the enlightened and enviable position to create them.

Perhaps you are, but maybe not in the exact moment that you need to fix a performance problem.

In those cases, you may need to use rewrites to get temporary performance improvements until you’re able to.

Thanks for reading!


Actual Execution Plans Finally Show Lock Waits!

Shorty


This is a short post. I know you’ve been somewhat spoiled by longer posts and videos lately!

Just kidding, y’all don’t pay attention (with the exception of Kevin Feasel), so maybe this will be consumable enough for even the most squirrel brained amongst us.

A long time ago, I complained that wait stats logged by actual execution plans don’t show lock waits. That seemed like a pretty big deal, because if you’re running a query and wondering why sometimes it’s fast and sometimes it’s slow, that could be a pretty huge hint.

But now, if you run a query that experienced lock waits, you can see that in the details. Just highlight the root operator, hit F4 or right-click and go to Properties, then look under the wait stats node, and you’ll see this:

sql server query plan
cherry bomb
  • When did this get added? I have no idea.
  • How far was it back ported? I have no idea.

I could look on VMs with older versions of SQL Server, but it’s dinner time. Or as they call it in Saskatchewan, “supper”.
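If you want to see it for yourself, here’s a minimal two-session sketch — the table name is invented, and this assumes plain old read committed, not RCSI, so the SELECT actually blocks:

```sql
/* Session 1: take and hold an exclusive lock. */
BEGIN TRANSACTION;

UPDATE dbo.LockDemo /* invented table name */
    SET id = id
WHERE id = 1;

/* ...leave the transaction open for a bit... */

/* Session 2: with "Include Actual Execution Plan" turned on,
   this will rack up LCK_M_S waits until session 1 commits or
   rolls back, and they'll show up under the root operator's
   wait stats node. */
SELECT
    ld.id
FROM dbo.LockDemo AS ld
WHERE ld.id = 1;

/* Back in session 1, release the lock: */
ROLLBACK TRANSACTION;
```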

Thanks for reading!


T-SQL Tuesday: Your Favorite Job Interview Question

Hello Yes Nice To Meet You


T-SQL-Tuesday-Logo

This month we have an invitation from Kevin Feasel.

Since Kevin is often kind enough to link to my posts via his Curated SQL feed/aggregator, I figured I’d add to the SEO pyramid scheme this month.

As part of my consulting services, I often help clients decide if they could keep a full time database person (there are too many potential titles to list, here) busy for 40 hours a week for the foreseeable future.

If the Magic 8-Ball comes back with a Yes, I’ll also help them write or review their job posting, and screen candidates. If a resume makes it past various detectors and doesn’t end up in the randomly generated unlucky pile, I’ll interview them.

Me. Personally.

Not in-person, though.

There’s not enough armed security in the world for me to get that close to HR.

The Production DBA Question


If someone is going to be in charge of production DBA tasks, I’ll of course ask questions about experience with whatever model is in use, or is to be implemented by the company.

So like, if the company has Availability Groups, or wants Availability Groups, we’ll talk about those.

For flavor, I’ll also ask them why anyone would be so insane as to not just use a Failover Cluster with Log Shipping.

But the real question and answer that tells me if someone knows their business is this: When you set up backups, how often do you take log backups?

If anyone says “every 15 minutes” without any further clarification or qualification, they immediately go into the “No” pile.

See, 15 minute log backups are a meme in the same pantheon as 5% and 30% for index fragmentation. Neither answer shows any deeper understanding of what exactly they’re doing.

Log backup frequency is a business requirement based on RPO goals (Recovery Point Objective). If your RPO goal is less than 15 minutes of data loss, 15 minute log backups don’t meet that goal.
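If you want a quick way to check whether your current log backup cadence even pretends to meet an RPO, here’s a sketch — the 5 minute number is a stand-in for whatever your business actually signed off on:

```sql
/* Sketch: flag databases in FULL recovery whose most recent
   log backup is older than a hypothetical 5 minute RPO. */
DECLARE
    @rpo_minutes integer = 5; /* stand-in: use your real RPO */

SELECT
    d.name,
    last_log_backup = MAX(b.backup_finish_date)
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
  ON  b.database_name = d.name
  AND b.type = 'L' /* log backups only */
WHERE d.recovery_model_desc = N'FULL'
AND   d.database_id > 4 /* skip system databases */
GROUP BY
    d.name
HAVING MAX(b.backup_finish_date) IS NULL
OR     MAX(b.backup_finish_date) < DATEADD(MINUTE, -@rpo_minutes, SYSDATETIME());
```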

And anyone who says that Availability Groups help meet RPO goals gets their resume burned in front of them.

The Developer DBA Question


For anyone responsible for performance tuning, I need to make sure that they’re staying current with their learning and techniques.

I’ll show a lot of pictures of queries and query plans, ask about various anti-patterns and tuning techniques, but my absolute favorite is to show them pictures of query plans.

Actual execution plans, as it were.

 

sql server query plan
fine mess

Why is this important? Because at least half of the people I show this to will totally ignore the operator times and start talking about costs.

If you’re looking at a plan where this much information is available, and all you can dredge up to troubleshoot things are estimated costs, I’ll probably mail you a copy of this.

And I’m not kidding here — it shows a complete lack of attention and growth — what’s the point of calling yourself a performance tuning expert if your expertise peaked in 2008?

There Are Good People Out There


They may not spend every waking moment writing, recording, speaking, or chasing MVP status.

And that’s fine — I don’t care if you partake in that stuff or not — there are plenty of very smart people out there who don’t file quarterly paperwork or schedule posts months in advance.

What I care about is that you aren’t just an automaton seeking a new place to go through the motions in.

Thanks for reading!


Join Me And @Kendra_Little At @PASSDataSummit For 2 Days Of SQL Server Performance Tuning Precons!

I am a heading



Register here!

Thanks for watching!


Why Performance Tuners Need To Use The Right Type Of Join In SQL Server

Why Performance Tuners Need To Use The Right Type Of Join In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the importance of writing queries in a logically correct way, highlighting common pitfalls and offering practical advice to improve query performance. I share a client-inspired example where a left join was unnecessarily used, leading to a poor row-to-second ratio and inefficient parallel merge joins in the query plan. By changing the join type to an inner join without adding any indexes or making other changes, we significantly improved the query’s execution time from 9 seconds to just 3 seconds. This example underscores the importance of understanding the intent behind your queries and ensuring that the types of joins you use are necessary and appropriate for the task at hand.

Full Transcript

Erik Darling here with Darling Data. Darling Data. I don’t know why everything that attempts to do a transcription of my videos insists that I work for Darlene Data. Clearly saying Darling, not Darlene. Starting to hurt feelings a little bit. But today we’re going to talk about the importance of writing your queries in a logically correct way, which sounds like a very obvious thing that everyone should do and know how to do, but not everyone does. There is a profound sense of safetyism with people who write queries, either on their own or via an ORM, aka the lazy way. And this is a client-inspired query tuning exercise that I want to go through with you. Now, not just every query requires a left join. A lot of people freak out, think about, I don’t know what, but they just randomly sprinkle left joins everywhere just in case. And that’s not always necessary. Now, this tactic isn’t always going to work out. Sometimes you might need a left join. Sometimes changing the type of join might not make any sort of profound difference to your query, but it’s an important thing to keep in your brain.

as you are assessing queries that you are working on, either ones you’re writing for the first time, first birthing into the world, or ones that you’re trying to make faster. All right? It’s a good thing to keep in mind. Think about the intent of your query, and if you actually need the type of join that you’re writing. All right? It’s a smart thing to do. So, this query has a couple supporting indexes.

They’re mostly supporting indexes. There’s one index that’s not supporting in here, and that’s part of why the query isn’t as fast as it could be, but you can’t always just add an index immediately. Some people have change controls. Some people are unfortunate enough to be on standard edition and not be able to just create indexes when handsome young consultants are working with them, and they need to do it later because they don’t want to block things because standard edition is proof that Microsoft hates you. So, let’s look at this query. And what we’re doing is looking for the top 10 users ordered by a few things down here that kind of take the need for a left join out of the equation.

We’re looking at people with the highest question score, the highest answer score, and the most posts. Now, if this were like the first day that Stack Overflow existed, and maybe like the first hour, and let’s say we had like 11 users, and we were really afraid that like two of them hadn’t posted anything yet, then you would probably need a left join to round out the like 10 rows that we need to make this report, right? Or make this page or whatever. Make this, look at our top 10 users. They’re amazing. Whatever.

But, like, this is the Stack Overflow 2013 copy of the database. By this writing, there were like 2.5 million rows in the users table, and the odds are pretty good. I’m not much of a bookie, but the odds are pretty darn good that 10 of them have made posts. Questions and answers. Questions and answers jammed into one table. Absolute madhouse. Who would do that? Who would establish a parent-child relationship all in one table? I don’t know. It’s probably on my list of things to do to split out questions and, well, I guess, I was going to say questions and answers, but questions and every other kind of post into two separate tables and demo how queries are better when you normalize like a decent human being.

But that’s for another day. Day far, far, far, far, far away from now. So, what we’re going to do is just run this query. Just as it is, we’ve got query plans turned on, so we don’t need to worry too much about missing anything important here. And we have a left join in our query, and we’re going to run this, and we’re going to wait 8 or 9 seconds. I always forget. Oh, that’s my hand. 4, 5, 6, 7, 8. Okay, 8 seconds.

It’s a pretty long time to get 10 rows, right? Not a good row-to-seconds ratio. I keep stressing the importance of the row-to-seconds ratio. 8 seconds for 10 rows? Very bad. It’s like a little bit more than one row a second. It’s a terrible, terrible ratio.

Again, not much of a bookie, but I would not want to put money on that horse. Might want you to put money on that horse, so I make some money, but, you know, whatever. So let’s look at this query plan. We can see from the query plan itself, it actually took about 8 and a half seconds.

So my initial estimate of 9 seconds was only off by 500 milliseconds. That’s actually a pretty good call there. It’s not like I run this thing 15 times or anything, right? Definitely not.

And this gets a pretty shabby query plan. Part of it is, of course, because I don’t have any kind of good index on the comments table, and so we spend about 2 seconds in this branch of the plan aggregating some data, and then aggregating data some more, and then sorting data.

Now, the reason why this is one of my least favorite types of query plans is because there are not one, but two parallel merge joins. Uno dos, right?

And personally, I think that the optimizer should cost parallel merge joins out of existence. They’re nothing but trouble. They are hassles, and they have caused so many performance problems that I’ve seen in my query tuning time that I just don’t know why they continue to exist except to keep my bar tabs paid, I guess.

So the problem with merge joins is that they expect sorted input, and when you expect sorted input, you create intra-thread dependencies, right? So parallel plan, you have multiple threads.

Those threads all rely on each other, putting stuff through things in order, especially around parallel exchanges. If things go really bad in a parallel plan, you’re going to end up with intra-query parallel deadlocks, you said.

You’re so smart. You could also end up with just exchange spills. Either way, you’re not in for a good time performance-wise. I hate these things.

I wish they’d go away. I wish it was just parallel nested loops or hash joins. Just, like, if it can’t be an adaptive join, it shouldn’t be in the picture. Get rid of it.

So anyway, I’m, like, half-joking. I suppose there’s a use for them. Aside from keeping my bar tabs paid, I just haven’t found it yet. So this query takes nine seconds. A lot of bad stuff goes on.

And what we’ll notice is if we write this query in a way that more accurately expresses what we’re looking for, because if we’re trying to find the users who may have the top ten most impact on the site, they’re people who have had to have made posts, right?

They’re not going to be non-posting people with high question or answer scores or a high number of posts. We’re just not going to find them. See, they’re people who have had to have done something in the database.

They must have. They must have. So we’re going to change that join to an inner join without adding any indexes, without changing anything else. We’re not going to update statistics.

We’re not going to defragment anything. We’re not going to restart SQL Server. We’re not going to recompile, optimize for unknown. We’re not going to do anything, any of these bizarre things that I see people do when trying to figure out why a query is slow. So we’re just going to change that left join to an inner join, because there’s no way we need to worry about preserving people who haven’t posted anything.

And this query immediately goes from eight and a half, I’m going to say nine seconds. Nine seconds. I’m going to round up in my favor.

Nine seconds down to three seconds just by doing the right type of join. Now, you might notice that the shape of the plan changed quite a bit, and you might notice that certain join types will remain nameless, mainly parallel merge joins, have been stricken from the record. We got a couple hash joins.

We got a nested loops join. And this is more typical of the type of parallel plan that I want to see when I’m tuning queries. I almost never want to see a parallel merge join unless I want to write a good demo about parallel merge joins being terrible.

So just to repeat myself a little bit here, just to make sure that this is utterly clippable, if anyone decides to stitch this or whatever they call it. So when you’re writing a query, when you’re tuning a query, make sure you understand the intent of the query. You don’t need to understand absolutely all of the business logic.

But you should understand the intent of the query and use some of your God-given gray meats to figure out if the types of joins you’re requesting SQL Server to do are necessary or even correct. Right? Right?

Because a lot of the times when I see people using left joins is because they had no idea what they were doing. And they were just like, I see a lot of other left joins here. I’m going to do a left join everywhere. And you can end up with, you know, not great performance if you are not writing queries the correct way. Now, again, inner joins won’t always be faster.

Left joins won’t always be slower. There may be times when that crosses over. But when you’re analyzing query stuff, you really do need to keep in mind that these choices of joins do matter. Right?

They do matter when you’re writing queries. And sometimes how much they matter, like to query correctness, can also have an impact on how fast your query is. So it’s a really good thing to keep in mind, to pay attention to, to make sure you understand what your query actually needs to do so that you don’t add superfluous join types to your query and end up with a bunch of parallel merge joins.

It’d be sad for you. I don’t know. Maybe you’d have to hire me, which wouldn’t be so sad.

Keep those bar tabs paid. So, thank you for watching. Hope you enjoyed yourselves. Hope you learned something. If you like this video, thumbs up is the absolute best way to show that you like this video.

If you like SQL Server content about performance tuning, other things, bar tabs, sports betting, you know, stuff like that. Subscribe to my channel so that you can get a notification every time I talk about performance tuning or bar tabs or sports betting. You know, got to keep things lively over here.

Anyway, I guess that’s probably about close enough. Thank you for watching.
