Learn T-SQL With Erik: Serializable Is Not A Snapshot



Video Summary

In this video, I dive into the nuances of SQL Server’s serializable isolation level and how it impacts data consistency and blocking scenarios. We explore a practical example where two select queries within the same transaction see different views of the data because of how the transaction’s shared locks interact with concurrent inserts. This session is part of my comprehensive 23-hour T-SQL course, now available at a presale price of $250, which will double to $500 once the course is complete, so save yourself some money and buy it now. Let’s demystify serializable together and understand why it is not a substitute for the snapshot isolation levels.

Full Transcript

In this video, you and I, you and me, we’re gonna get real serializable together. This is of course a small taste of the material that I have in my T-SQL course. 23 hours of beginner content is all available now. It’s still at the presale price of $250 that will double to $500 when the course is done. So as usual, your job is going to be to save $250 and buy that thing now, before you regret it later and have to give me $500. Then I light a cigar with it and walk away laughing in my fur coat or something. I don’t know. Whatever. Leave me alone. So we’re gonna talk about serializable because a lot of developers have this weird view of isolation levels, that they really isolate everything. And that’s still not exactly true, even with serializable. Serializable does come kind of close, because it forces read queries to take and hold shared locks on data. It may take a lock on a single key if you’re using an equality predicate, or a range of keys if you’re searching for a range of values, right? Between, greater than, less than, that kind of thing.

So what serializable provides you with is what most people would call an unchanging view of data. It is still not necessarily a snapshot of the data within a transaction, though. What I’m going to show you today is different queries within the same transaction seeing data from different points in time. What serializable doesn’t guarantee is when that view of the data started. All you know is that that’s how the data looked when the query finished. That might be good enough for you. If it’s not, you might need to look at the snapshot isolation level. Read committed snapshot isolation does also come reasonably close to that, but snapshot isolation is the snapshotiest, especially if you need transaction-level consistency and not just statement-level consistency. That’s pretty much the place where read committed snapshot isolation and snapshot isolation differ. But we’ll talk about that more in depth later.
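For reference, opting into snapshot isolation looks something like this. This is only a sketch, and the database name is a placeholder, not something from the demo:

```sql
/* Allow snapshot isolation in the database (placeholder name) */
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

/* Then, per session */
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;
/* Every statement in here sees data as of the transaction's
   snapshot, not as of when each statement finishes */
COMMIT TRANSACTION;
```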

Uh, the view of data that serializable provides is not necessarily from when a transaction began, especially if the query was blocked. And that’s what we’re going to look at in these demos here. So what we’re going to do is just drop a table, create it, and stick some rows in it. You can see here, if the zoom will cooperate, that we have the odd numbers from one through 11 put into this table called surly eyes, which is one of the best bars in Boston.
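The setup looks roughly like this. The table and column names are my stand-ins, since the actual demo script isn’t shown in the transcript:

```sql
/* Demo setup: odd numbers from 1 through 11, with a
   clustered primary key on the id column */
DROP TABLE IF EXISTS dbo.SurlyEyes;

CREATE TABLE dbo.SurlyEyes
(
    id integer NOT NULL PRIMARY KEY CLUSTERED
);

INSERT dbo.SurlyEyes (id)
VALUES (1), (3), (5), (7), (9), (11);
```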

In case you’re wondering where the name came from. It’s getting towards the end of the day here. All right. So let’s make sure there’s no weird open transactions in here.

And we’re going to run through these demos in two different ways. The first way I’m going to run this, and we actually need to switch this back to seven first, is we’re going to start a transaction.

We’re going to set the isolation level to serializable. And then we’re just going to run a select from that table where ID equals seven. Now this table does have a primary key on the ID column, so we can just seek right to whatever we care about.

We’re not scanning pages or locking the whole table or anything. So let’s run this and grab ID seven. We get that very easily, right?

But now we have this transaction open, and serializable has taken a lock, right? It’s going to hold this shared lock on ID seven. If we come over here, what we’re going to find is that we are still allowed to insert ID eight into the table, right?

So we can put ID eight in because serializable doesn’t need a lock to make sure that like ID eight doesn’t change. But now if we come over here and we try to run this, this select query is going to get blocked because this select query is looking for where ID is less than or equal to 11.

Remember, 11 was the last key in our table, so it’s essentially just looking for everything up to and including 11. This query is now blocked, and you know, this would happen under read committed as well.

This query would get blocked because that insert has an open transaction and it’s holding a lock on ID eight, so we can’t read that row while we’re doing this. But if we finish this insert, right?

We do this and this all finally commits. What I think surprises a lot of people is that the serializable transaction with two select queries in it that ran at two different points in time, essentially saw two different views of the data.

Right? So this thing ran and it found ID seven. That’s not really all that weird. But what people find weird is that when we run this query to look for everything where ID is less than or equal to 11, not only does this query see the ID eight that we inserted, but also the ID 10 that we inserted. The even numbers, not just the odd numbers.
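Laid out end to end, the first demo goes something like this, again with stand-in names:

```sql
/* Session 1: serializable transaction, key lock on id = 7 */
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

SELECT se.id
FROM dbo.SurlyEyes AS se
WHERE se.id = 7; /* shared lock taken and held on this key */

/* Session 2: succeeds, the locked key isn't touched */
BEGIN TRANSACTION;
INSERT dbo.SurlyEyes (id) VALUES (8);

/* Session 1: blocks behind session 2's uncommitted insert */
SELECT se.id
FROM dbo.SurlyEyes AS se
WHERE se.id <= 11;

/* Session 2: insert id 10 and commit; session 1's blocked
   select then returns the even numbers too */
INSERT dbo.SurlyEyes (id) VALUES (10);
COMMIT TRANSACTION;
```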

So this is what kind of freaks people out. Cause like they’re expecting this, you know, unchanging, like consistent snapshot view of data, which is not what serializable provides.

Again, it’s only what the data looked like when the query finished. When that query began, and any locks that it ran into along the way, can certainly interfere with that. So let’s make sure that we don’t have anything weird going on here.

Uh, make sure this is all fully committed, so when we refresh the table over here, we don’t get blocked and embarrass ourselves. So let’s do this.

Let’s start a transaction and insert ID eight here. Now let’s come back over here and let’s actually change this query for a little bit of texture. And let’s run both select queries.

Let’s actually just run the whole thing at once. Right. And now we have these two select queries running. This one is blocked looking for ID eight, because ID eight is being held onto by this transaction. But now I can run this other query and insert ID 10.

This query that ran up here now found ID eight, because it ran after the lock was released. But this query, once again, saw both eight and 10. So the main thing here is that under serializable, select queries will take and hold shared locks much, much longer than they will under read committed, the default locking isolation level for all SQL Servers except Azure SQL Database.
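If you want to see those held shared and range locks for yourself while one of these transactions is open, sys.dm_tran_locks will show them. A rough sketch:

```sql
/* Look for the key-range locks (RangeS-S and friends) that
   serializable holds until the transaction ends */
SELECT
    dtl.request_session_id,
    dtl.resource_type,
    dtl.request_mode,
    dtl.request_status
FROM sys.dm_tran_locks AS dtl
WHERE dtl.resource_database_id = DB_ID();
```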

Serializable will take and hold shared locks where read committed doesn’t. And what that sometimes leads people to think is that there is some internal mechanism scheduling these queries, like queuing them up.

Like when you see the blog post diagrams of, here is T1, it is scheduled for now. It’s like a train schedule.

Here is T2, it is scheduled for later. But that’s really not what happens. These queries all run and they attempt to take these locks, but they’re all working off the same live table data. There’s no snapshot-consistent view of this data that’s stored anywhere.

So even two select queries in the same transaction can encounter different rows when they run. Again, this might be good enough for you, if you even need this high a level of isolation, but it is still not a snapshot view of the data.

And once again, you would still need a row versioning isolation level in order to obtain that. Anyway, that’s enough about serializable. You probably don’t need it.

You probably don’t want it. When you use serializable, you certainly prolong blocking operations, especially from read queries, and you very, very certainly increase your likelihood of encountering deadlocks, with read queries taking and holding shared locks in ways that are often unfamiliar to people used to working with either read committed, the locking isolation level, or just festooning their queries with no lock hints every which way they go.

So I hope you enjoyed yourselves. I hope you understand serializable a little bit better now. And I will see you over in the next video, where we still have some ground to cover with the transactions and isolation levels stuff.

So we’ll get to something a little bit less isolating than serializable in the next one. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Transactions and Blocking and Confusion



Video Summary

In this video, I delve into the world of transactions in SQL Server and explore how they can lead to blocking issues that might seem perplexing at first glance. We walk through a practical example where update and insert operations within a transaction block a simple select query, highlighting why using `NOLOCK` hints might not always be the best solution. I also discuss the limitations of the `sp_WhoIsActive` stored procedure and the blocked process report, showing how they can sometimes mislead you about what’s actually causing the blocking. The video emphasizes the importance of considering transaction length and isolation levels to avoid unnecessary complications, urging viewers to keep transactions as short and focused as possible for better performance and reliability.

Full Transcript

Erik Darling here with Darling Data. And we’re going to, of course, have some more T-SQL laughter, fun, joy. You know, all the stuff that comes along with writing T-SQL, really, aside from the pain, confusion, annoyance, and all that other stuff. We’re going to have the fun part, right? So again, all 23 hours of the beginner content are out there live for you to start consuming, digesting, mulling over. And it is all available at the pre-sale price of 250 US dollars. That’s good for life. But the price is going up to 500 bucks when the advanced material drops. So buy it now, I guess, is the call to action there. So in this video, we’re going to talk a little bit about transactions in SQL Server. And specifically, when it comes to dealing with blocking, what you see, and the way you see it, can be very, very confusing and misleading.
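The scenario we’re about to walk through looks roughly like this. The tables are from the Stack Overflow sample database I use in demos, but the subset table name and the insert values here are stand-ins:

```sql
/* Session 1: two statements in one transaction,
   left open on purpose */
BEGIN TRANSACTION;

UPDATE u
SET u.Age = 21
FROM dbo.Users AS u
WHERE u.DisplayName = N'Jon Skeet';

INSERT dbo.UsersSubset (Id, DisplayName)
VALUES (22656, N'Jon Skeet'); /* placeholder values */

/* no COMMIT yet; the update's locks on Users are still held */

/* Session 2: blocked by session 1's locks on Users */
SELECT u.DisplayName, u.Age
FROM dbo.Users AS u
WHERE u.DisplayName = N'Jon Skeet';
```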

So I have a transaction here, and within this transaction we have two statements, right? We have an update and we have an insert. And over in this window, we have a select query that’s going to try to select data. Well, I guess it worked there. Look at that. Hey, it worked. We’re not running anything yet. Good job, us. We have a select query that’s going to try to select some data from the users table. All right, good. So let’s come over here and let’s begin our transaction. And now let’s update this one row in the users table. And let’s pretend that that update worked very, very successfully. And now let’s insert a row into the users subset table. If we come over here and try to run this query, and again, we’re not going to be using any no lock hints, right? Well, I mean, I can show you, like if we say with no lock, da, da, da, da.

And we run this. Now we get this back. Of course, over here we updated the age column, so let’s add the age column in so we can see exactly what hell no lock has wrought. We can see that we have updated Jon Skeet to be 21 years old. Maybe this is great for Jon Skeet. I don’t know if Jon Skeet liked being 21, but Jon Skeet is now a 21-year-old software developer. But without the no lock hint, we just get blocked forever and ever, right? Because that other query has a lock, and it’s not doing anything bad to anyone. It’s not doing anything illegal, right?

We’re annoyed. We’re definitely imposed upon. We are aggravated by being blocked, but that other query, that other transaction, is not doing anything that it’s not supposed to be doing. If we come over here and we run sp_WhoIsActive, the results are going to be somewhat confusing, because what it looks like is happening is an insert into the users subset table is blocking a select from the users table, right? And we can see that I’ve been talking for almost 30 seconds while this poor query has been blocked.

Now, if we use the @get_locks parameter, that will enumerate all the locks. I don’t recommend using this if you’re in a system with a lot of blocking going on, because it can take a long time if there are a lot of blocking sessions. It’s a cursor that does this, plus some XML stuff.

So don’t use @get_locks if you’re in a real blocking crisis. But if we come over here and look at the locks that the select query is trying to take, it’ll say, yeah, we were trying to take a lock on the users table. But up here, in the locks XML for the blocking session, we’ll see two different things.

We’ll see that we have locks on both the users table, from that first update that ran in the transaction, and locks on the users subset table. So we can at least dig in a little bit further. But if you’re running sp_WhoIsActive without the @get_locks parameter, you’re not going to have that information, right?
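For reference, the two calls look like this. sp_WhoIsActive is Adam Machanic’s free community stored procedure:

```sql
/* Plain call: shows the blocked select, with the insert
   (the transaction's most recent statement) as the blocker */
EXEC dbo.sp_WhoIsActive;

/* With @get_locks, the locks XML shows the blocking session
   holding locks on both tables touched in its transaction */
EXEC dbo.sp_WhoIsActive
    @get_locks = 1;
```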

We don’t have that handy XML thing over here. All we see is this insert blocking the select. And this could be very confusing, because this is a completely different table.

And how could this completely different table be blocking a select from this other table? It’s not a very fun situation to be in. So if we just, you know, roll this thing back, right?

We say, hey, you know, screw it. We made a mistake there. Let’s get rid of that. Now we can run this and we’ll see Jon Skeet’s age is back to null. Jon Skeet, he’s null years old.

He’s not 21. But what’s even worse is the blocked process report. Now, the blocked process report is considered by Microsoft to be a best-effort report of what was blocking. Unfortunately for us, what we see in the blocked process report is the same thing that we saw in sp_WhoIsActive.

What we have is what looks like an insert into users subset blocking a select from the users table, right? That’s all this stuff over here from users. Now, in real life, if you were to run sp_HumanEventsBlockViewer, my wonderful free stored procedure, you would look at this and you would say, Erik might be on crack.

Erik might have done something weird in this. But it was not me. I did not do this. And, you know, you might start looking at the users subset table and being like, well, is there a foreign key to users?

Is there a trigger? Is there some indexed view that holds these things together? No, it’s just the blocked process report being annoying, right?

Because all this stuff happened in the scope of a transaction, the locks on the users table from the earliest statement in the transaction are still being held. Even though we have moved on and done something else in the transaction, that earlier statement is what’s blocking our select query. No wonder people just put no lock everywhere.

This stuff sucks, right? But this is where a row versioning isolation level would help you, because then your select wouldn’t get blocked, and it wouldn’t read wrong data. But, you know, the blocked process report: if we scroll way over here and we click on this and we scroll past all this stuff that got added in SQL Server 2022, then we will at least be able to see some more information about what went on in here.

Way earlier in the XML, there’s this object ID and some other stuff. And depending on how nice the blocked process report is feeling, sometimes you’ll get an actual object ID that works, and sometimes you’ll just get some weird randomish number.

I don’t, where did this come from? I don’t know. So do either of these objects have this object ID? Nope. Certainly do not. So here we are.
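Checking whether an object ID from the report maps to anything real is a one-liner. The ID here is a placeholder:

```sql
/* Returns NULL if the object ID doesn't exist in this database */
SELECT OBJECT_NAME(123456789) AS object_name;

/* Or go the other way and compare against the suspects */
SELECT
    OBJECT_ID(N'dbo.Users') AS users_id,
    OBJECT_ID(N'dbo.UsersSubset') AS users_subset_id;
```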

Anyway, the blocked process report, well, it can be very useful to show you generally what’s going on with blocking, but it is sort of not your friend here. But there’s a larger message here. And that is that something a lot of people misunderstand about transactions is lock lifetime.

So that update ran, right? We started a transaction. We ran that update.

We hung around for a minute, smoked a few cigarettes. Then we inserted some data into another table, smoked another few cigarettes, and then tried to run that select query. The whole time that we were being cool, partying, smoking cigarettes, drinking champagne, doing all that fun stuff, those locks were being held from the update. The more stuff you put in a transaction, and the longer that stuff takes, the longer those locks hold, the more of a chance you have of causing problems.
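Compare that with keeping the transaction tight, a sketch of the same two writes without the cigarette breaks in between. Table names and values are stand-ins:

```sql
SET XACT_ABORT ON; /* roll the whole thing back on any error */

BEGIN TRANSACTION;

    UPDATE u
    SET u.Age = 21
    FROM dbo.Users AS u
    WHERE u.DisplayName = N'Jon Skeet';

    INSERT dbo.UsersSubset (Id, DisplayName)
    VALUES (22656, N'Jon Skeet'); /* placeholder values */

COMMIT TRANSACTION; /* locks from both statements release here */
```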

Again, people just use no lock everywhere. They’re like, I don’t want these problems. But this would cause problems for other competing modification queries too, right?

Other updates, inserts, deletes, merges, all that other good stuff. That would get blocked as well. So think about transactions, and the length of time that a transaction is open.

Sort of being like the surface area of your query. The bigger that surface area is, the more problems it can cause. The more problems that can be caused to it.

So if you’re in the habit of using transactions, don’t shove everything in the world in there. Like, it’s gonna, it’s gonna hurt you, right? Especially because you’re using SQL Server and the default isolation level is read committed.

Right? Otherwise, you’ll read a bunch of wrong data with no lock. So when you’re using transactions, please make sure to pay special attention to which queries you’re putting in there. Make sure that they all actually need to be logically grouped together.

Make sure that they all actually need to succeed or fail together. Tune those queries as diligently as possible. Do not put anything in there that does not need to be in there.

Do not put any extraneous calls, especially calls outside of the database, in there. I worked with one client years ago who was, in a transaction, doing a bunch of stuff, and then also using xp_cmdshell to call an executable to write a file and then upload that file.

And then do some other stuff. All that extraneous non-database work made for a real bad time. So be very, very careful when you start using explicit transactions, because the lock lifetime on those runs from when you take those locks, right?

When that update took locks until we hit rollback there. So be very, very careful. Be very, very diligent with your transactions.

They should be as short and sweet as possible. All right. That’s probably enough here. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video.

We’re going to, uh, I think we’re going to talk about the serializable isolation level next. Cause boy, boy, howdy. She’s a humdinger that one. That’s serializable. Anyway, thank you for watching.

You’re very, you’re very, you’re very smart.


Learn T-SQL With Erik: Row Versioning vs Locking Isolation Levels



Video Summary

In this video, I dive into the fascinating world of isolation levels in SQL Server, specifically focusing on the differences between locking and row versioning isolation levels. I share my experience of briefly testing out a Wario voice at the start, which didn’t quite hit the mark as I was still digesting lunch. However, things are about to get lively as we explore how read-committed snapshot isolation and snapshot isolation can offer more efficient query execution compared to traditional locking methods. I also highlight why Microsoft’s decision to default to read committed snapshot isolation in Azure SQL Database is a step forward, encouraging viewers to consider the benefits of row versioning isolation levels for their own databases.

Full Transcript

Eats-a-me, Erik! That was my best Wario voice. I don’t know, I don’t think I did very well. We’re going to talk a little bit about isolation levels in this video. I just want to briefly describe the difference between locking and row versioning isolation levels.

If I seem a little sleepy, it’s because I just ate lunch and I’m like, just like a bear full of salmon right now. Anyway, I promise it’ll be very lively in a moment. As soon as that fish oil kicks in, you see what it’s like. My skin starts to glow and my hair gets shiny. All 23 hours of the beginner content is up and running and out there for you to start learning from.

It is all available at the presale price of $250. And all this content is going to double in value after the summer when the advanced material drops out of my head. So I would, again, encourage you to buy early and save yourself some money. All right, let’s talk about isolation levels. In SQL Server, isolation levels can be broadly defined as locking, row versioning, and of course, read uncommitted.

The locking isolation levels are read committed, repeatable read, and serializable. And the row versioning isolation levels are read committed snapshot isolation and snapshot isolation. And then we have this uncategorized mess, this monster, read uncommitted, which is a synonym for no lock, right?

No lock and read uncommitted, same thing. There’s no difference except where you write them. They still take schema stability locks, and they will read data from uncommitted transactions, commonly referred to as dirty reads. This is the one that you generally want to avoid, though most of you have done quite a poor job, failing grades, on avoiding read uncommitted slash no lock.
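The two spellings of the same thing, with a stand-in table name:

```sql
/* Session-wide */
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT u.DisplayName
FROM dbo.Users AS u;

/* Or per table, as a hint; identical behavior */
SELECT u.DisplayName
FROM dbo.Users AS u WITH (NOLOCK);
```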

So, the locking isolation levels work directly with data in tables and indexes by acquiring shared or exclusive locks as data is read. That’s an important thing to understand, because the locks don’t get taken en masse, right? I guess you could specify a tablock hint or something, but under most circumstances, as you read data, the locks are acquired.

Even a tablock isn’t guaranteed to be taken the second you ask for it, because you might have competing locks that would prevent it. Read queries under locking isolation levels might be blocked by modification queries while waiting to take their shared locks on rows. They may also block modification queries under certain conditions where shared locks are held until the statement has finished executing.

This is particularly true of serializable and repeatable read, but it is also quite definitely true under read committed. Read queries can indeed block write queries. I will have examples of that later.

Depending on the strictness of the locking isolation level, shared locks may either be held again until the statement is finished or very quickly released. Because like read committed, you’re just kind of like chewing along. You take your read lock, you take your little shared lock, you get whatever data you need, whether it’s rows or pages, and then you let that shared lock go and you carry on.

Both repeatable read and serializable will hold those shared locks to prevent modifications to data that they have already read. That doesn’t mean that they’re looking forward and trying to jump ahead to take locks, though. As they read data, that’s when that all goes down.

The row versioning isolation levels are a little bit different. The row versioning isolation levels will read copies, or versions, of locked rows, which can of course result in fewer detrimental blocking and deadlocking scenarios. I am a big proponent of the row versioning isolation levels, you know, especially for all of you SQL Server people out there who are fawning over Postgres, who just can’t wait to leave SQL Server and get over to Postgres land.

Postgres uses multi-version concurrency control by default, which means row versioning, right? Except they do it in the worst way possible: they keep all their versions in the table, and you have to clean that up. You have to vacuum that mess up. So you’re going to have a whole new set of problems once you get over to your free database.

But just to understand, in SQL Server, the implementation of the row versioning isolation levels is, I believe, far superior. It’s still not free, right? Because you have to version the rows. You have to read from potentially large stores of versioned rows, and even queries that just have to go and check whether there are versions of rows they might have to read pay a cost.

Adding pointers to your rows so that you can follow the version chain, that’s not free either, right? So there’s a lot of stuff that’s still not free about it. Even though it’s better, it’s still not free. But I think the row versioning isolation levels are generally better and less troublesome than your locking alternatives.

And I think the very easy way to tell that this is true, at least in the context of SQL Server (we’re not talking about Postgres anymore), is to consider that by default, SQL Server uses read committed locking as its isolation level. This is of course different if you use Azure SQL Database; Azure SQL Database uses read committed snapshot isolation by default, but you can turn it off. But nearly every single application I see, nearly every query that someone sends me to ask, hey, what’s going on with this thing, is just absolutely festooned with no lock hints, right?

No lock everywhere. People have such a hard time with read committed, the default locking isolation level, that they’re like, screw it. I’d rather have wrong results than deal with this.

Right. Just no lock hints everywhere. There have been years of warnings about the perils of no lock, all the blog ink and blood that has been spilled and drank and spilled again, and you can’t keep people away from it.

It is one of the most toxic database relationships that I have ever seen in my life. And, like I said, also consider that when it came time for Microsoft to offer SQL Server to the world as something they were hosting, they were managing, they’re like, we’re in charge.

We can do this, right? Separate subject, but they at least got this part right. They were like, no, we’re hosting this. You know what?

We don’t want to deal with a bunch of stupid reader-writer blocking and deadlocking. This is so annoying. Why would anyone want to deal with this? And meanwhile, that’s what they’ve been forcing the general public to deal with for the last 30-something years. Right.

When Microsoft got their hands on it, they’re like, nah, we got better things to do. So they made the default isolation level read committed snapshot isolation. So a lot of this is just to set the stage for some somewhat simpler statements. And we’re actually going to close this one out here.
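Opting a database into the row versioning isolation levels yourself looks like this. The database name is a placeholder, and the WITH ROLLBACK IMMEDIATE part will kick out in-flight transactions, so be careful where you run it:

```sql
/* Read committed queries start reading row versions
   instead of blocking behind writers */
ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

/* Check what your databases are currently set to */
SELECT
    d.name,
    d.is_read_committed_snapshot_on,
    d.snapshot_isolation_state_desc
FROM sys.databases AS d;
```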

Under row versioning isolation levels, read queries are not blocked by write queries. They go read versions of the rows, copies of the rows that are locked. Also, read queries do not block write queries.

Another good thing, right? It’s almost like using no lock, but better, because there are no dirty reads. And also, deadlocks do not occur between read queries and write queries. There are, of course, some edge cases and caveats that complicate those statements a little bit.

Things like foreign key cascading actions, or maintaining indexed views that reference multiple tables, can do weird stuff to the row versioning isolation levels. But those are kind of out-there edge cases. They’re not everyday stuff that’s going to be happening to you.

Not like every three seconds in your database, in a way that’s going to make you use no lock hints. Under the locking isolation levels, read queries can be, and quite frequently are, blocked by write queries. Read queries can also block write queries, and deadlocks can occur between read queries and write queries.

So there is a lot bad to be said about the default situation in SQL Server. I think Microsoft could have made a better choice back in, say, 2005, when the row versioning isolation levels were introduced to the world. Perhaps not making all of your existing databases use a row versioning isolation level, but making any new databases use one by default.

You could turn it off if you wanted, sort of like with Azure SQL Database, where it’s on by default but you can opt out of it. Had that happened, I think SQL Server would be in a much shinier, happier place than it is today, where people still have all sorts of unwarranted fears and misconceptions about row versioning isolation levels.

Perhaps from reading terrible blog posts about them that involve marbles and whatnot. Anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you in the next video, where we will talk about — I forget exactly what, but it’s still transactions and isolation levels. So it’ll be a good time, because that’s what we do here. We have a good time.

You can’t have a good time. What can you have? Nothing. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: A.C.I.D.



Video Summary

In this video, I delve into the concept of ACID properties in SQL Server databases, specifically focusing on how they apply to T-SQL operations. Starting from the basics, I explain each of the four ACID properties—Atomicity, Consistency, Isolation, and Durability—and provide clear examples to help you understand their practical implications. While the video doesn’t include demos, it does offer a thought-provoking example that illustrates how isolation can be misunderstood in SQL Server. By exploring scenarios where queries might not behave as expected due to read-committed isolation, I aim to highlight the importance of understanding and managing transactional integrity. Additionally, I touch on durability, explaining its significance and the potential caveats when dealing with features like delayed durability or in-memory databases. This video is part of my comprehensive T-SQL course, which will soon be expanding with advanced material after summer. If you’re looking to enhance your SQL Server skills, now might be a great time to join at the presale price before it doubles in value!

Full Transcript

Erik Darling here with Darling Data, and we’re going to continue on with the teaser content from my T-SQL course. All 23 hours of the beginner stuff are out there, published, ready for you to start consuming. The advanced stuff will be dropping after the summer. The course is still at the presale price of $250, and it will literally just absolutely double in value once the advanced material pumps out. So I would strongly suggest that you get in on this now while it’s at the cheaper price, because I realize how price and cost sensitive a lot of people are. So in this video, we’re going to talk about acidity — because acidity in the context of databases, and more specifically in SQL Server, is not quite the level of acidity you might expect if you read a highly academic Wikipedia article outlining all of the effects of ACID, which is Atomicity, Consistency, Isolation, and Durability. So we’re just going to cover those. There are no demos in here; it’s just one kind of example query to give you some food for thought.

So let’s look at this one: exactly how isolated might an executing query be? Atomicity means each transaction either happens or it doesn’t. SQL Server by default runs in autocommit mode. So if you run a select, it’s sort of like, in the background, SQL Server on its own — without you typing anything — is saying begin transaction, run the select, commit. Same with inserts, updates, and deletes: it’s auto-committing everything, so you don’t have to declare an explicit transaction for every piece of code you write. You can, of course, declare a specific and explicit transaction. But when you do that — assuming you do it for a reason beyond putting one query into an explicit transaction, and there are multiple queries in the transaction — it is for the purpose of having those queries succeed or fail as a single unit. We do not want to leave those queries in a strange, lingering, stranded state.

But let’s say that you’re running an insert query and you’re trying to insert two rows. The first row goes fine, but the second row causes a — I don’t know, say a primary key or foreign key violation, a constraint violation, something along those lines. That just means the first row, even though that one was okay, isn’t going to still be in the table. The statement will roll back, and the successfully inserted row will get removed.
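A quick sketch of that statement-level rollback (the table and values here are made up for the demo):

```sql
CREATE TABLE dbo.Widgets (id integer PRIMARY KEY);

INSERT dbo.Widgets (id) VALUES (1);

/* This multi-row insert fails on the second row (duplicate key),
   and the WHOLE statement rolls back -- the value 2 does not stick
   around just because its own row would have been fine. */
INSERT dbo.Widgets (id) VALUES (2), (1);

SELECT COUNT(*) FROM dbo.Widgets; /* still 1: only the first insert survived */
```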

Zooming out a little though, all atomicity means in a database is that each transaction will succeed or fail as a unit. It does not guarantee that other queries can’t interact with the underlying data while they’re running. Right? Like, you know, while you are inserting your two rows, someone else might insert a different two rows and someone else might delete a row somewhere else in the table.

And someone else might update a row somewhere else in the table. And someone might run a no lock query and select your two half inserted rows, right? A row and a half is in there or something.

Consistency for SQL Server is especially disappointing. Because the only thing that the C gets you is a guarantee that your constraints are obeyed. By constraints, I mean primary key, foreign key, unique, not null, and check.

Default constraints — I mean, well, I suppose they will be obeyed, but they hardly qualify here. I forget if I covered that in the teaser material for inserts; if not, I covered it in the full material.

So you should go buy that. But, you know, just don’t expect too much out of the consistency thing, at least in the context of SQL Server. However, isolation is the toughest one because most people are completely confused about what isolation means, like what isolation levels guarantee and at what point they provide the various guarantees that they make.

So let’s say that you have a code pattern that looks like this. You begin a transaction. You declare a local variable.

We’re just going to call this next user. And when you go to assign this local variable a value — which is going to be the ID column from the users table — you have to access, in the context of this query, the users table twice. We don’t have any locking hints on here.

We don’t have any isolation level hints on here. So just assume that we’re using the default locking read committed isolation level. That doesn’t really guarantee much.

We’re going to talk way more about that in some of the other videos. But just understand that, like, you know, when you read, when you say select from, that’s one read of the users table. And when you say select from here, this is another read of the users table.

These two reads are not going to show data from the same point in time. Read committed is not a point-in-time view of data. So the outer read of the users table and the inner read might happen at different times from each other.

And you still, if you’re scanning through the data, you have to read along those pages. All sorts of stuff can happen to your data as you’re reading along those pages. Again, it’s not a snapshot of the data.

It’s just that you’re taking little read locks as you go along and do stuff. As you’re taking those little read locks, all sorts of stuff can happen all around you. So, if we were to look at the query plan for this — just hit Control and L.

We have one scan of the users table here; that’s going to happen at one point. And then you have another scan down here.

But this isn’t going to run at precisely the same time, because we have to get all this data and aggregate it, right? Scan the table — all two-point-something million rows — aggregate it, gather the streams, aggregate it again, distribute the streams again, and then go into a nested loops join, so that every row that comes out of this goes in here and then reads the data.

And since that loop join is just going like this over and over again, these are all at different points in time too. Again, there’s no snapshot happening. So if the goal here is to find the top, like the most recent accessed user in the users table and then go assign them some award, right?

Say, hey, you’re the most recent user. You win a million dollars or reputations. You can go buy a cool stack overflow windbreaker or canoe or Zippo or something.

You know, you don’t have a guarantee that that person actually was, when this update ran, the person who should get it. Because all sorts of stuff can happen all around you. Someone else could log in right after you ran this query, and whatever value you assigned to this would no longer be valid.

So this is where you would have to think about increasing the isolation of this query. Perhaps you would need serializable. Perhaps you would need updlock, right?

Perhaps you would need to do something to protect this data when you read from it, both times, to make sure that nothing else sneaks in and does something to this data — changes the validity of the value that you assign to that next user — before you go and assign some stuff here.
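A sketch of what that protection could look like. The column names here (`last_access_date`, `reputation`) are stand-ins for whatever the real table has, and this is one of several ways to write the pattern:

```sql
BEGIN TRANSACTION;

DECLARE @next_user integer;

/* UPDLOCK takes update locks instead of shared locks and holds them
   for the duration of the transaction; SERIALIZABLE adds key-range
   protection so no new "most recent" row can sneak in underneath us. */
SELECT TOP (1)
    @next_user = u.id
FROM dbo.Users AS u WITH (UPDLOCK, SERIALIZABLE)
ORDER BY u.last_access_date DESC;

UPDATE u
    SET u.reputation += 1000000
FROM dbo.Users AS u
WHERE u.id = @next_user;

COMMIT TRANSACTION;
```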

Durability is an easy one, right? Durability, this one is very simple. What this means is that once a transaction commits, you won’t lose it.

There are some caveats around features like delayed durability — and I think the in-memory stuff might make this a little weird too. But in most cases, unless you experience data corruption, committed transactions will be safe.

So, thinking about the ACID stuff in the context of SQL Server: A is most likely the strongest guarantee. C is a very iffy proposition.

And depending on the isolation level you’re using, the I might not mean a damn thing. Especially if you’re using read uncommitted, or NOLOCK — I means nothing. The I might as well not even be there; it might as well just be ACD. And durability is up there with A as far as strength goes, though there are some caveats to durability depending on database settings and the reliability of your underlying storage hardware.

So anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. In the next video, we’re going to talk a little bit more about isolation levels and stuff like that. So hold on to your bucket hats for that.

Anyway, thank you for watching.


SQL Server Performance Office Hours Episode 26




To ask your questions, head over here.

Some of my reporting queries finish a little faster with NOLOCK hints even when nothing is going on in the database. Any idea why?
I get the argument that new features sell better than fixed up old features, so how come highly upvoted User Voice items for new features have sat around for years with no action?
There are a lot of applications out there that do horrible things to SQL Server. Why don’t software vendors hire people like you to fix things?
What is batch mode not a good fit for?
Is it always possible to tune queries to save money in the cloud with smaller hardware?

Video Summary

In this video, I delve into some intriguing questions submitted by viewers during my Office Hours session. We explored topics ranging from why certain queries perform better with no lock hints to the challenges of getting new features implemented in SQL Server. I also touched on the dynamics between software vendors and database performance experts, offering insights on how to optimize cloud costs through query tuning and hardware reduction strategies. Whether you’re a seasoned DBA or just starting out, there’s plenty here to help you navigate the complexities of SQL Server optimization.

Full Transcript

Erik Darling here with Darling Data, and we have another exciting rendezvous with Office Hours. My favorite. That’s where I get to be a real man of the people and answer some questions that you submit. If you want to submit your own question, you can go to that link, which is down in the video description. If you feel so inclined and you like the material that I produce on this YouTube channel, you can also sign up for a membership and support me with as few as $4 every month. If you need SQL Server consulting help, I am out there in the world, working with clients, helping people, tuning queries, doing all sorts of stuff. And as always, my rates are reasonable. If you want to get in on my performance tuning training, you can get all 24 hours of it for about $150 buckaroos at that link with that discount code. That is for the Everything Bundle, which covers my performance tuning training. If you would like to buy my new T-SQL course, that is also available there. All 23 hours of beginner content are done and recorded. This course will double in value from $250 up to $500 after the advanced material comes out after the summer months.

So save yourself some money now and buy that while you can. Redgate is being a bunch of real sweethearts and dragging me out of my house to varying degrees this summer. New York City, August 18th and 19th. Dallas, Texas, September 15th and 16th. And Utrecht in the Netherlands, October 1st and 2nd. This is all part of the PASS on Tour events that they are putting on — little mini PASS events. And this is all leading up to PASS Data Community Summit in Seattle, November 17th to 21st. But it is time to go answer some questions now — time to have a bit of question asking and answering fun. All right. So first up, here is an interesting one. Some of my reporting queries finish a little faster with — let’s just call that a NOLOCK hint.

Even when nothing is going on in the database. Any idea why? So when you use NOLOCK hints, aside from the fact that you are requesting the read uncommitted isolation level — which will allow your queries to ignore locks taken by modification queries that are executing concurrently — you also allow the SQL Server storage engine to access data a little bit differently. It can use what’s called an allocation order scan, using IAM pages rather than index pages to read through stuff. And that can sometimes be faster.

So that is most likely why. I have an old video about it where there was a really dramatic difference between a query with NOLOCK hints and a query without, with nothing else running on the server. Part of it was reading from a cold cache — reading from disk into memory. If you look at the difference between those, you might see the Ordered property of the index scan operators being False.
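If you want to see this for yourself, a sketch — the Posts table here is just a stand-in for any big table:

```sql
/* Plain scan: driven by the index B-tree, in logical key order. */
SELECT COUNT_BIG(*)
FROM dbo.Posts AS p;

/* With NOLOCK (read uncommitted), the storage engine is allowed to
   use an allocation order scan driven by IAM pages instead. In the
   actual execution plan, check the scan operator's Ordered property --
   it shows False when this happens. A TABLOCK hint can get you the
   same scan type without the dirty reads. */
SELECT COUNT_BIG(*)
FROM dbo.Posts AS p WITH (NOLOCK);
```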

Now keep in mind, this is for index scans. Index seeks don’t really get that behavior, because you’re seeking along a B-tree — but scanning a big index will definitely see it. All right. Next up we have — oh boy, getting personal here. I get the argument that new features sell better than fixed-up old features.

So how come highly upvoted User Voice items for new features have sat around for years with no action? Well, apart from the fact that Microsoft seems to have the attention span of a squirrel that got into a bag of meth — every release, they have a certain amount of budget that they can spend, people’s time and salary, to develop stuff.

And some narratives get pushed a lot harder than others — you know, like me turning pink because of some weird light thing here. But if you look at a lot of recent releases, some narratives got pushed really hard that ended up just being nowheresville. You know, you had big data clusters, which were a complete flop, right?

They’re completely deprecated now. Ledger tables? No one cared, right? Blockchain stuff. There are just a lot of things that get poked out there, driven by these sort of checkbox items — making sure you have the new shiny thing in your database that executives and C-levels currently care about.

And not a lot of that stuff aligns with what users care about. From what I know, talks about improving partitioning come up every release cycle, and everyone’s just like, well, I mean, whatever. The argument is this stupid circular argument:

Well, the feature kind of sucks, so no one uses it. Yeah, we could fix it — but no one uses it, so it wouldn’t really have that much impact.

Well, no one uses it because it sucks. Right? You made it painful to use. Of course no one uses it. Of course it doesn’t have wide adoption. You made it miserable. So don’t expect that to change, I guess. That’s the bottom line.

Unless you are, like, Walmart or SAP, and you swing a big purse at Microsoft and really want to push for stuff, you probably have the worst voice in the room — the quietest voice in the room. And that User Voice site is just a black hole for ideas anyway.

So unless you’re reporting a serious bug: good luck. Good luck out there. There are a lot of applications out there that do horrible things to SQL Server.

Why don’t software vendors hire people like you to fix things? You know, that’s a great question. I have actually worked with a number of software vendors, and I am always happy to do so.

And it is always nice to improve products as a whole. I have noticed that this trend has actually picked up a bit since software as a service became more and more ingrained in what a lot of companies are offering. You know, when it was: we sell you a piece of software and you install it on a SQL Server that you’re responsible for—

you couldn’t trust anyone to install anything and do anything right. You had no idea what version anyone was on. Like, you could put out a best practices guide, right? You could say: here’s a setup checklist.

Here’s the stuff you should do, here are the bare minimum requirements — hardware requirements depending on database size, stuff like that. There is no guarantee that anyone’s going to follow any of it. So the software as a service thing has gotten me hired by more software vendors, because now they’re responsible in-house for infrastructure, the code, the indexes, everything.

And so I have been doing more work like that — but I would love to do additional work like that. So if you know of a software vendor who you think should hire, perhaps, an Erik Darling, who has very reasonable rates, to improve their software offering, feel free to send them... well, I don’t have a resume.

You can send them to my website, and that’s probably as good as it’s going to get as far as resumes go. Let’s see. I seem to have lost my ZoomIt cursor there. We’re going to try that again.

So here’s an interesting question: what is batch mode not good for? Well, I assume this question is because I’ve been doing some videos lately about stuff that batch mode is good for — writing batch mode friendly queries, how to think like batch mode, stuff like that.

So batch mode is typically good for your heavy lifters, right? Your bigger queries that have to go through a lot of stuff. If you just have to find a few rows and send a few rows along, stuff like that—

think OLTP-ish workloads — batch mode isn’t generally a great fit for the stuff you do there. Typically, where batch mode comes into play is when you have to do big scans of tables and aggregate a lot of rows down, or hash join a lot of rows, or hash aggregate a lot of rows, stuff like that. If a lot of your query plans are single-threaded, little loopy-join things anyway, that’s really not a good fit for batch mode.

Obviously, modification queries are not a great fit for batch mode, things like that. But, you know, I am happy that Microsoft has stopped locking some of the intelligent query processing stuff behind batch mode, because that was kind of a drag. I do wish that batch mode on rowstore was a little bit more lenient about where it would kick in.

But anyway, let’s go on to the next one here, because this is kind of a fun thought-bubble question, isn’t it? Is it always possible to tune queries to save money in the cloud with smaller hardware?

So I assume that means by reducing hardware. I would say generally yes, but it really does depend on where you’re starting. If the hardware that you’re on is sufficient for what you’re doing, and you’re thinking to yourself, I need to add hardware to keep up with this—

then yeah, absolutely. But if you’re at a point where you have terabytes and terabytes of data, a comparatively tiny amount of memory, and — depending on your user concurrency and parallelism situation — a small number of CPUs, it gets harder to tune things enough that you can reduce hardware.

You can certainly tune things to make better use of the hardware you currently have, which can give you a better idea of what you might need to increase things to. So one way to think about it is: if you’re on a smaller server and you know you need to go to bigger hardware, it could certainly pay off to do a lot of query and index tuning first, so that you can make a more conservative choice about the next hardware step up.

So instead of quadrupling hardware, maybe you only double it, or 1.5x it. You don’t have to go nuclear and get four terabytes of memory or something. Right.

Not that a lot of cloud instances even go up that high — like Managed Instance, piece of crap. But anyway. It is always possible, but really, it depends. If you just went to the absolute biggest server you could possibly find, and no one has historically done meaningful query and index tuning—

then yes, absolutely, you can bring a lot of stuff down that way. But a lot of what I see is sort of painfully under-provisioned from a hardware perspective as a starting place.

And that just makes it a lot harder to do stuff that makes enough of an impact to say: oh, now you can reduce hardware further. Because you’re already on some tiny little instance size, and you’re expecting this miraculous 24/7, mission-critical, 30,000-batch-requests-plus reporting workload to all of a sudden fly on that thing. It’s a little bit tougher of a proposition.

But there are things you can do, obviously, aside from the generic index and query tuning. One of the biggest things that I push is compressing indexes, which makes them smaller on disk and in the buffer pool — so you make better use of those expensive IOPSy disks that you pay too much for, and you also make better use of the memory on the server.
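The compression push boils down to something like this — the index and table names are placeholders, and checking the estimate first is cheap:

```sql
/* Estimate how much smaller a page-compressed copy would be. */
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'Posts',
    @index_id         = NULL,  /* all indexes */
    @partition_number = NULL,  /* all partitions */
    @data_compression = N'PAGE';

/* Rebuild with page compression: smaller on disk AND in the buffer pool. */
ALTER INDEX ix_example
    ON dbo.Posts
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```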

There are of course times when you can absolutely hit home runs with this, but there are also times when you put in all the effort and you just allow the server at that size to tread water a little bit longer. So: always? No. But generally, you can do meaningful things to improve the situation almost regardless of the starting point. There are certainly some outlier starting points that I’ve seen where you could tune to your heart’s content and you just wouldn’t move the needle enough to allow for a downgrade of hardware. Anyway, that’s about good for this one. We have answered five questions, me and the voices.

Uh, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video. All right. Thank you.


A Little About Row & Range Window Functions vs Batch Mode in SQL Server



Video Summary

In this video, I delve into the nuances of window functions and how they interact with rows and range modes in SQL Server. I explore the differences between using `ROWS UNBOUNDED PRECEDING` versus `RANGE UNBOUNDED PRECEDING`, explaining why the former was historically slower due to on-disk spooling, while the latter could be more efficient. By comparing these methods through practical examples and query plans, I demonstrate how modern SQL Server versions and features like batch mode can significantly improve performance for window function queries. Whether you’re working with large datasets or just looking to optimize your T-SQL code, this video offers valuable insights into leveraging these powerful tools effectively.

Full Transcript

Hey, it’s Erik Darling, Darling Data, your number one source for SQL Server content on the internet. I hope. If you go anywhere else, I mean, I can’t say what you’ll get will be correct or high quality. So, careful out there. Anyway, in today’s video, we’re going to talk about rows and range versus batch mode. Now, if this all sounds terribly foreign to you, you may not work with window functions very much. Back in the, let’s say, pre-batch-mode days of window functions, SQL Server had two different ways of giving you a window of things to look at within a partition. You can specify a certain set of row elements to be in there, or you can specify a range of row elements to be in there. Now, back in the bad old days, you had to be very careful to specify rows, because rows was a lot faster. If you used range, you got this terrible on-disk window spool, which was sometimes very, very slow. Let’s just say range was much slower than rows.

But now, with batch mode, all that kind of goes away. So we’re going to look at examples of that in this video here today. But before we do, if you would like to sign up for a channel membership to support the best SQL Server content on the internet, there is a link in the video description below. And if you want to ask me questions privately that I will answer publicly on my Office Hours episodes, there is a link right up there for you to do that exact thing with. Otherwise, you can just do the usual YouTube stuff and keep me happy in different ways. If you need SQL Server consulting help, I am here for you, baby. I got you no matter what you need. You know, within reason. I mean, I don’t want to like carry a server somewhere for you. You got to do what you got to do. I’ll rent a dolly if I have to. And as always, my rates are reasonable.

My performance tuning training is about $150 for life. If you go to that link and use that discount code, that’s the Everything Bundle. And if you would like to pick up my new T-SQL course, all of the beginner content is recorded, published, and available. It is currently on sale for the presale price of $250, which will double in value to $500 once the advanced material is out there in the world. And some other fun summer stuff is coming up for me: Redgate is taking me on tour with the PASS mini events. That’s going to be New York City, Dallas, and Utrecht.

And of course, all that leads up to PASS Data Community Summit, November 17th to the 21st. So I look forward to seeing you at as many of those events as humanly possible. But with that out of the way, let’s talk about this window function stuff. Now, what I’m talking about when I say rows versus range is specifically this part of the window function, where you say `ROWS UNBOUNDED PRECEDING`.

Now, I don’t technically have to say `AND CURRENT ROW`, because that’s the default when you do that. So if you just say `ROWS UNBOUNDED PRECEDING`, you will get a running sum of all the prior rows up to and including the current row. Now, I’ve got this query hinted.

Well, let’s make sure query plans are turned on. I’ve got this query hinted to use optimizer compatibility level 140. The reason for this is that my current database, which is on SQL Server 2022, is at database compatibility level 160. Being on Developer Edition, which is an Enterprise-equivalent SKU, and being on SQL Server 2022 at database compatibility level 160 means that I get the batch mode on rowstore feature automatically, without having to do any strange things.

If my database were at compatibility level 140, I would have to use a hint to say 160 in order to show it off — so we’re just doing this in reverse this time. But when I run this query with `ROWS UNBOUNDED PRECEDING`, we are going to get back some results.

And they take a second, because there are a lot of rows — about a million rows come out of this. But it all takes just about a second to run.

You can see over here, just one second. It makes life easy. And, you know, things in here are not too terrible. If we run that same query, but we use RANGE UNBOUNDED PRECEDING — we’ve changed this from rows to range — then things are going to look a little bit different.

And if we look at the query plan, you may have felt that that took a little bit longer. It took twice as long, at two seconds. Now, seeing part of the reason why requires running both of these at the same time.

I’ve turned STATISTICS IO on for this because it’s a slightly easier way of seeing where that other second of time went. So now, if we look at the query plans next to each other, we will see just about that same pattern here, right, where it’s one second versus two seconds. And so the range one is twice as slow.
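The side-by-side comparison might look something like this (again, table and column names are stand-ins for whatever the demo used):

```sql
SET STATISTICS IO ON;

/* ROWS version: the work table reports scan count 0, logical reads 0 */
SELECT
    v.UserId,
    running_total =
        SUM(v.BountyAmount) OVER
        (ORDER BY v.CreationDate ROWS UNBOUNDED PRECEDING)
FROM dbo.Votes AS v;

/* RANGE version: in row mode, the on-disk work table racks up
   millions of logical reads behind the window spool */
SELECT
    v.UserId,
    running_total =
        SUM(v.BountyAmount) OVER
        (ORDER BY v.CreationDate RANGE UNBOUNDED PRECEDING)
FROM dbo.Votes AS v;
```

The only textual difference is ROWS versus RANGE, which makes the work table numbers in the Messages tab easy to attribute.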

And if we look at the STATISTICS IO results, notice that there’s this work table. Let’s frame that up a little bit better. There we go. And for the rows query, there is a work table with a scan count of zero and logical reads of zero.

But for the range, there is a work table with a scan count of 1,058,840 and logical reads of 5,430,748. So in the first query, where we used rows, that work table was there, but nothing actually happened to it. It’s almost like a many-to-many merge join: when the join really is many-to-many, you’ll see a lot of work table activity.

But when SQL Server thinks a merge join might be many-to-many and it turns out not to be, the work table doesn’t actually see anything. Now, let’s contrast this situation with database compatibility level 160, where we get the batch mode on rowstore feature to kick in.

I’m going to run this one, right? We’re going to take a look at how long this takes. And this, I mean, like we still have to display a million rows, so it may not feel quite as snappy as you’d hope, but it still runs very quickly.

We’re at just under a second in total for this. So we improved from the non-batch mode version by about, I don’t know, a couple hundred milliseconds or something. That’s not really the thing that is my big deal for this query.

The bigger deal for these queries is that the RANGE UNBOUNDED PRECEDING version does not suffer and take twice as long when things run in batch mode. Now, this one takes just about the same amount of time as the previous one. So to talk a little bit about why, the main difference between these, what I want to do is just get the estimated plan for this.

And if you look at what happens in here, we have the clustered index scan, we have a repartition streams, we have a sort, we have a segment and another segment, and then a window spool. The window spool over there, let’s clear these out so it’s just one thing in the middle that we care about. The window spool is where that work table occurs.

When this spills to disk with RANGE under row mode execution, it’s a bad time, right? This is where that extra second in that work table lives. When we run these in batch mode, right?

And if I run these together, we’re going to get a different query plan. It’s a lot more compact and it has different operators in it. So if we run both of these, they’re both going to take around 800 milliseconds to run. Again, there’s like a million rows that come out.

So things are, you know, not quite perfect. But if you look at these, we still have the clustered index scan, and then we have a sort to put data in order. But then we have this window aggregate.

And this window aggregate doesn’t need the on-disk work table, whether you use the default, which is range, or go out of your way to specify rows. And if we look at the messages tab, and we look at the work table for both of these, we’re going to see a scan count of zero in here and logical reads of zero from the work table in here. So neither one of these did any of that on-disk window spool work, because in the query plan, we have the window aggregate rather than the window spool operator.

So this is a sign that your window function query ran in batch mode: you see the Window Aggregate operator and a lack of the usual Segment, Sequence Project, stuff like that, and of the Window Spool operator, which is where that hidden work table occurred in the other queries.

So if you are using a SKU of SQL Server and a database compatibility level where you can get batch mode, or if you are allowed to play some tricks on your queries, which I’ve explained in many other videos, to get batch mode to happen for your queries without that, then you have a little bit less to fear with window functions if you need the RANGE behavior, because the RANGE behavior is slightly different from the ROWS behavior. So if you’re getting batch mode, you can be slightly less afraid of your window function queries that may require the RANGE behavior over the ROWS behavior.
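One commonly described trick for coaxing batch mode on rowstore-only workloads — an assumption here, since the video defers to other videos for the details — is joining to an empty table that has a columnstore index, in a way that can never change the results:

```sql
/* Empty helper table whose only job is to have a columnstore index,
   which makes the optimizer consider batch mode operators */
CREATE TABLE #batch_mode_helper
(
    id integer NULL,
    INDEX c CLUSTERED COLUMNSTORE
);

SELECT
    v.UserId,
    running_total =
        SUM(v.BountyAmount) OVER
        (ORDER BY v.CreationDate RANGE UNBOUNDED PRECEDING)
FROM dbo.Votes AS v
LEFT JOIN #batch_mode_helper AS b
    ON 1 = 0; /* never matches, so results are unchanged */
```

The join predicate is always false and the helper table is empty, so the query returns the same rows; the columnstore index just opens the door to batch mode.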

If you are getting batch mode execution, you no longer have to deal with that pesky on-disk window spool. So some light at the end of the tunnel out there for the big data analysis folks. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video where we’ll talk about something else interesting. I promise. All right. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: A Neat Thing with MERGE and OUTPUT



Video Summary

In this video, I dive into an intriguing aspect of T-SQL using the `MERGE` statement and its `OUTPUT` clause, demonstrating something you can’t achieve with a regular `INSERT`. I walk through creating sample data in SQL Server Management Studio 21 (version 21.something.6.3) to illustrate how we can modify our target table by adding 10 to the ID column, incrementing dates by one day, and replacing values in another column—all while retaining visibility into what exactly was inserted. This feature is particularly powerful for complex processes where you need detailed insights into the changes made during a merge operation, allowing you to output this information to a different table for historical tracking or auditing purposes. I also highlight why trying to replicate this functionality with a standard `INSERT` statement results in errors, emphasizing the unique benefits of using `MERGE`.

Full Transcript

Hey! Hello, friends! My friends, my only friends, Erik Darling here with Darling Data. And we are going to do some more super fun T-SQL learning today. We’re going to talk about kind of a neat thing that you can do with merge and the output clause that you cannot do with a regular insert and the output clause. So I hope that is majestically entertaining for you as you take time out of your busy day and schedule to watch this video. All right. So let’s get over to SQL Server Management Studio 21 version, 21.something.6.3. There’s been 15 updates in the last week or so. So I don’t know exactly which version I’m on. But what I want to show you here — so first, let’s create some sample data to work with. So we’re going to create a table called the target and a table called the source. Don’t ask me where I got these clever naming conventions from, because it took a whole lot of — you know, really, I needed a higher muse. There was a lot of creative work effort that went into those. But we’re going to insert 20 rows into this table called the source. And the data that we inserted into here is going to look — oh, that’s a good interesting keystroke — is going to look like this, right? So it’s the numbers one through 20.

So we’re going to look at the numbers one through 20 in the ID column, a bunch of dates in the some date column, and this sort of half of a Christmas tree or half of a triangle, however you prefer to call it, in the some thing column, right? So we just have this nice sort of shape in there, right? Pythagoras would be proud. What’s that? It’s a triangle. Nope, it’s a bloop shape. All right. Cool. So let’s make sure that there is no data in the target table, just in case. And what I’m going to do in this merge query is something that you cannot normally do with an insert query, which I’ll show you next. We are going to say MERGE INTO the target table. I do not have a serializable hint here because we are only doing an insert. Serializable is only necessary if you have multiple actions assigned to your merge statement.

So, like insert and update — the upsert pattern is the most common, but deletes would qualify as well. And we’re going to say USING the source, ON 1 = 0. So essentially, there’s no match here, right? We’re making the implicit assumption that no rows will match between these two. And then we’re just going to say WHEN NOT MATCHED BY TARGET, THEN INSERT into id, some date, some thing.

And we’re going to use the VALUES clause here, because that’s usually what you do with merge and insert. But we’re going to tweak what goes into the table a little bit. So we’re going to add 10 to the id column. We’re going to add a day to the some date column. And we’re going to replace the bloop shape of As with a bloop shape of Bs.

All right. And then down here, we’re going to have our output clause. And in the output clause — well, first, there’s something you can do with merge that you cannot do with standard modification queries. And that is, you can have this action column. And this action column will tell you what came out of the merge. So if you write a different type of merge that maybe has multiple actions, this action column will tell you if it was an insert, an update, or a delete, which is very helpful.

One thing that annoys me about output is that you can’t do the, like, column equals expression syntax. You have to say as. You have to say this as this, this as this, this as this. I guess it would be confusing for some people. But then I’m going to alias, just so we can see where the different columns start.

This is going to be where columns from the inserted table begin here. And then this is going to be, this is the other thing that you can’t do with a normal insert query is look at the source values in it, right? So we’re going to have a set of columns here called source values.

And this is where they start, this s dot star. And if we look at what happens when I run this whole thing, we are going to get back the output clause. And, this is the really helpful thing, you can see the source values, which you cannot normally see, right?

So this is where things started. It was the numbers 1 through 20. Every date was 2025-06-25, right?

And then this bloop shape was all A’s going down. But the values that we actually inserted look like this, right? Well, these are different now.

We can see that the IDs go from 11 to 30 rather than 1 to 20. Some date is all 2025-06-26 rather than 2025-06-25. And our bloop shape has taken on the form of all B’s rather than all A’s. So this has a lot of interesting uses, I think, that not a lot of people sort of get into, especially because as you write more and more complicated processes that use merge, you might need to have visibility into this stuff.
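The whole pattern walked through above can be sketched like this — a reconstruction, since the video doesn’t show the exact script, with table and column names that match the narration:

```sql
MERGE dbo.the_target AS t
USING dbo.the_source AS s
    ON 1 = 0 /* nothing ever matches, so every source row inserts */
WHEN NOT MATCHED BY TARGET
THEN INSERT (id, some_date, some_thing)
     VALUES
     (
         s.id + 10,
         DATEADD(DAY, 1, s.some_date),
         REPLACE(s.some_thing, 'A', 'B')
     )
OUTPUT
    $action AS did_what, /* INSERT, UPDATE, or DELETE */
    inserted.*,          /* what actually landed in the target */
    s.*;                 /* the source values, which only MERGE can show */
```

Pointing that OUTPUT clause at another table via `OUTPUT ... INTO` is how you’d retain a history of the changes.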

And you might even want to output this stuff to a different table to sort of retain a history of things as well. But notice that if we try to do this same thing with a normal insert and we write our output clause, this thing has a little squiggly under it. And even though this little squiggly thing should come from here, SQL Server is not able to, I guess, look forward enough to output values from this, right?

We’re not able to look in here and output any values from that. So this will just give us an error. SQL Server will say, no, we can’t do that.

The column prefix TS does not match with a table name or alias name, even though it’s just, it’s right here. But SQL Server is not able to get into that. So this is just, you know, something that annoyingly, very annoyingly, you can do with merge, but you can’t do with a normal, like, modification.

We’re using insert as an example here, but this is something you can’t do with a normal, like, insert, update, or delete either. You can’t see that source data. You can only see the change data, which is, well, I mean, I guess for updates, that’s, you know, you can see the before and after, but deletes, you know.
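For contrast, the failing version described above looks roughly like this — reconstructed, and the exact alias from the video may differ — where a plain INSERT’s OUTPUT clause can only reference the inserted pseudo-table, not the SELECT’s source alias:

```sql
INSERT dbo.the_target
    (id, some_date, some_thing)
OUTPUT
    inserted.*,
    s.* /* fails: the column prefix does not match a table name or alias */
SELECT
    s.id + 10,
    DATEADD(DAY, 1, s.some_date),
    REPLACE(s.some_thing, 'A', 'B')
FROM dbo.the_source AS s;
```

SQL Server rejects the `s.*` reference even though the alias is sitting right there in the SELECT.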

Anyway, looking at the source rows in there can be very useful for some query patterns. Anyway, just a neat thing that you can do with output and merge, I think. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you over in the next video, where we will continue to express and enjoy ourselves in the T-SQL query language. Oh, let me repair that.

The Transact-SQL query language. All right. Thank you for watching.


Learn T-SQL With Erik: Some Merge Tips



Video Summary

In this video, I delve into some essential tips and tricks for working with SQL Server’s `MERGE` statement, focusing on making your code more efficient and less prone to errors. I start by emphasizing the importance of using the `SERIALIZABLE` hint when performing multiple actions in a merge statement to prevent concurrency issues. Then, I demonstrate how simplifying the matched clause can save you from complex null checks, showing a cleaner way to update records with concise SQL. Lastly, I discuss the `USING` clause and explain why it’s more akin to a `FROM` clause than a traditional join, highlighting its importance in ensuring that your merge logic behaves as expected. By the end of this video, you’ll have a better understanding of how to write effective and maintainable `MERGE` statements.

Full Transcript

Hey, it’s T-SQL time with Erik Darling. All right. Usual song and dance here. All 23 hours of the beginner content from my T-SQL course, Learn T-SQL with Erik, is available now. The price is $250, and it will double in value.

It will go up to $500 when the advanced material drops after the summer months have concluded. So you can short me if you want.

So in this video, we’re going to talk a little bit about some, like a couple of things you should know about merge. Like I’m not going to sit here and be like, this is how you write a merge statement. I’m going to just show you some stuff that can kind of make working with merge somewhat less painful and maybe explain why you might get weird results with merge sometimes.

So let’s do that. Let’s have that kind of fun today. The first thing: when you’re writing a merge statement that you intend to have multiple actions.

So, like update and insert — the upsert form of merge is probably the most common. The first thing is that you absolutely need this serializable hint here to prevent strange things from happening when you run it. There are all sorts of strange concurrency phenomena that may occur if you do not use this.
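The hint goes on the target table, like so (table and column names here are placeholders for whatever your merge touches):

```sql
MERGE dbo.the_target WITH (SERIALIZABLE) AS t
USING dbo.the_source AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.some_thing = s.some_thing
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, some_thing)
    VALUES (s.id, s.some_thing);
```

`HOLDLOCK` is an equivalent spelling of the same hint; without it, concurrent upserts can race between the matched check and the insert.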

So this is the very first thing here. The second thing I want to show you in this section is how to make the matched section, which performs the update portion of the merge, a little bit easier to write. What I see in a lot of matched clauses is not this, right?

What I see is a lot of stuff like: where t.name is not equal to s.name, and t.name is null, or s.name is null, and t.user_id is not equal to s.user_id, or t.user_id is null, or s.user_id is null. It just goes on forever with these not-equal-to-or-null checks, because you might have NULLs, and you can’t do not-equal-to comparisons against NULLs.

And the whole thing just turns into a nightmare. This is a much more clean and concise way of writing this. You say SELECT the target columns, EXCEPT SELECT the source columns.

You could potentially reverse it, but you’d have to rework some other stuff if you did. If you do this, it will save you all the null checking, because EXCEPT handles NULLs graciously for you. So that is the main thing here.
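The EXCEPT pattern described above might look like this — a sketch, assuming badges-style staging tables and columns that mirror the narration:

```sql
MERGE dbo.badges WITH (SERIALIZABLE) AS t
USING dbo.badges_stage AS s
    ON t.id = s.id
WHEN MATCHED
AND  EXISTS
(
    /* NULL-safe change detection: EXCEPT treats NULL = NULL as a match,
       so this returns a row only when something actually differs */
    SELECT t.name, t.user_id
    EXCEPT
    SELECT s.name, s.user_id
)
THEN UPDATE
     SET t.name = s.name,
         t.user_id = s.user_id;
```

One correlated EXISTS replaces the whole wall of `<>` / `IS NULL` predicates, and rows that haven’t changed skip the update entirely.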

The other thing that I want to talk about is the using clause. Now, the using clause for a lot of people feels like a join, because there is an on clause, and that’s reasonable. But what you should be aware of is that USING is somewhat more like a FROM clause than a join clause.

And what I mean by that is, if you were to write a query like this and say USING badges_stage AS s ON s.id = t.id, any row from the source that didn’t match exactly would fall into the WHEN NOT MATCHED BY TARGET clause and go to the insert portion. Which is probably not what you intend.

So when you’re writing your using clause, a lot of the time what you want to do — and this might seem similar to when we talked about pivot and unpivot, where if you use a derived table expression when you write the pivot query, you get better control over the columns that SQL Server will attempt to do the implicit grouping on — so when you write a query where you only want to get certain stuff from a table to use for your merge, what you want to do is wrap that up into something like this.

So the data source that you’re using is a select from the staging table where user_id equals 22656. This will limit it to just that portion of the data, and you won’t end up with weird bugs and potential other things going on when you hit the when-not-matched portion. So, just a couple of things that might help you write somewhat better merge statements in there.
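Confining the source to a derived table might look like this (a sketch; the table names and the user id are reconstructed from the narration):

```sql
MERGE dbo.badges WITH (SERIALIZABLE) AS t
USING
(
    /* Filter the staging data here, so the not-matched clauses
       only ever see the rows you actually intend to merge */
    SELECT bs.id, bs.name, bs.user_id
    FROM dbo.badges_stage AS bs
    WHERE bs.user_id = 22656
) AS s
    ON s.id = t.id
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name, user_id)
    VALUES (s.id, s.name, s.user_id);
```

If that filter lived in the ON clause instead, the non-qualifying staging rows would still be in scope, and would fall into WHEN NOT MATCHED BY TARGET.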

One: if you are performing multiple actions, you must use the serializable locking hint on the target table. Two: if, in your matched clause, you have to write an excessive amount of null checking, it is a lot easier to just say AND EXISTS, SELECT the columns that you care about, EXCEPT SELECT the other columns that you care about, to do your update. And when you are writing the using clause, if there is any additional filtering or anything else that you want to do there, confine it to a derived table so that you start with the correct data source and you don’t have weird things flying around your matched and not-matched clauses in your merge statement.

So, that’s about it here. You know, if you want to read a lot about merge, I would highly suggest looking at Michael J. Swart’s blog. He says lots of fantastic things about merges and upserts, because he uses them a lot.

I don’t know if he says any of this stuff directly, but if he doesn’t, I’m sure he says other very, very smart things about it. So, anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you in the next video, where I believe we’re going to talk a little bit about output next. So that’ll be great fun for all of us, won’t it? All right.

Thank you for watching.


Learn T-SQL With Erik: Neat Update Tricks



Video Summary

In this video, I dive into some advanced T-SQL techniques focused on updates, showcasing how to write and validate update queries effectively. I emphasize two key rules: always start by writing your updates as a select statement to ensure the intended results before committing any changes, and highlight from the bottom up to minimize errors. Additionally, I demonstrate a unique trick involving updating local variables or parameters within an update query, which can be incredibly useful for tracking changes without using OUTPUT clauses. This video is part of my comprehensive T-SQL course, offering insights that might not be commonly discussed in everyday practice but are invaluable for mastering the language. The 23 hours of beginner content are now available at a presale price of $250 and will double to $500 after the summer release, giving you plenty of time to start learning and improving your T-SQL skills.

Full Transcript

Erik Darling, Darling Data, same as it ever was, same as it ever will be. Consulting without end. So this is going to be some more preview material from my T-SQL course. We’re going to talk about some neat stuff you can do with updates that has nothing to do with anything being in order, like it did with deletes. All 23 hours of the beginner content is out there and available for you to start learning and getting better at stuff — specifically T-SQL. I don’t know if it’s going to make you better at anything else. Who knows? Maybe it will. I don’t know. Give you some superpowers. But this is still available at the course presale price of 250 bucks, which will double in value to $500 when the advanced material publishes after the summer, when I have had time to recuperate some brain cells. So let’s talk about updates. Now, when it comes to modification queries, I have two rules that I am pretty steadfast about. For inserts, a bit less, because inserts involve writing a select anyway when you are writing the portion of the query that’s going to get data to insert into a table. But for updates and deletes specifically, the two big rules that I have are: one, you should always write them as a select first. Always write them as a select first, so you can validate that the results are what you intend. To me, that makes a lot of sense. Sure, you can say begin transaction and then have a rollback and a commit at the end. But, you know, then you’re also performing some modification, hoping that you remember to highlight begin transaction, exploring data after you have modified it within your transaction, scrolling through stuff, checking things out. Oh, is that right? Does that look okay to you? I’m not sure.

And then hoping that you remember to commit or roll back the transaction. And even worse — I know you, I see you out there — all your other queries have NOLOCK hints on them. And so while you are in the midst of your transaction, perhaps not causing any blocking problems because all your other queries have NOLOCK hints on them, you may very well just be letting users read a whole bunch of uncommitted data while you figure out if everything is right or not. So the first rule is to always write your updates and deletes as selects. The second rule is to always highlight your modification queries from the bottom up. And the reason I say that is because if we were to highlight this update from the top down, we might flake on something. We might only get to here. We might get distracted and look away.

And you know, if we run this, we would in fact update all the dates in the table to add one day to them. Granted, this would be fairly easy to fix, because we would just have to say DATEADD with day, minus one, to adjust it back. But depending on how long that takes, and some other local factors, that might not be an enjoyable experience for you. So the reason I say to always highlight from the bottom up is because, let’s say we do the same thing and we get distracted and we only get to here. We get this very, very helpful error message: Incorrect syntax near ‘.’.

Okay. No incorrect syntax before the dot, though. Nothing else is wrong with this query. There’s just incorrect syntax near a dot. Okay. Okay. Anyway, always highlight your updates and deletes from the bottom up. I suppose that would also go for inserts, too, because if your select query has some form of where clause on it, that might be important as well. So: modification queries. Write them as selects first. Highlight from the bottom up.
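Both rules can be sketched together (table and column names here are illustrative): the SELECT you validate first shares its FROM and WHERE with the UPDATE, and highlighting from the bottom up either runs the complete, filtered statement or fails with a syntax error instead of silently updating everything.

```sql
/* Step one: validate what you're about to change */
SELECT
    d.some_date,
    new_date = DATEADD(DAY, 1, d.some_date)
FROM dbo.dates AS d
WHERE d.some_date < '20250101';

/* Step two: the UPDATE reuses the same FROM/WHERE, so a partial
   top-down highlight that stops before the WHERE won't run silently */
UPDATE d
    SET d.some_date = DATEADD(DAY, 1, d.some_date)
FROM dbo.dates AS d
WHERE d.some_date < '20250101';
```

Highlighting from the SELECT or UPDATE line down to the semicolon always carries the WHERE clause along for the ride.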

All right. Now I want to show you a cool update trick. And this is something that not a lot of people maybe even know exists, but that’s the value of these amazing courses: you get to learn about stuff that you might never practically hear about or use in your life. So in this badges update table that I have created, we have a row for user ID 41, but no rows for user ID 6, right?

Something exists for this one, but nothing exists for this one, right? And this is going to play into the next thing I show you. So one thing that you can do with updates is you can not only update columns, but you can update either declared local variables, like I have up there, or parameters. Sometimes this can be very useful if you’re affecting a single row and you don’t want to use OUTPUT for it, because you can’t OUTPUT into variables like this. But you can update these local variables or parameters to values based on what you updated in the table.

So in this case, we are going to update old user ID to the user ID column. We are going to say new user ID equals user ID equals six. So we’re setting this to six, and then we’re setting this to six. And then we are going to set did it update to true. Right now, these three things up here are declared: old user ID and new user ID are assigned null marks, and did it update is assigned a false value. So what we’re going to do after this is just make sure that everything happened the way that we thought it would. We’re going to select our values from these local variables to see what they ended up as.

And we’re also going to look at the, uh, badges update table to ensure that our row changed from 41 to six. So now highlighting from the bottom up, we are going to, uh, get all of, all of this query that we care to run in one go, and we’re going to execute it. And when we check in on our local variables, they have all been assigned the correct values.

Old user ID was 41, new user ID is six and did it update is true. And then when we look in the badges update table, we will see that we did indeed change user ID 41 to user ID six. Cool. We’re all good there.

And we did roll that transaction back, because we don’t want that to stay in there — it would ruin the next thing I’m going to show you. Okay. When there are no rows, this will not get changed, and these will remain null. So now we’re going to do this sort of in reverse, and we’re going to say: hey, let’s try to update where user ID equals six. We’ll do the same thing in here.

And we’ll try to change user ID to 41. So we’ve got a new one. But of course, now highlighting from the bottom up again, if we run this, we will see that we got back nulls, because no values were assigned — nothing overwrote the null marks here. And did it update remains a zero, because nothing came back to change it from being false. When we attempted to set it to true here, we didn’t get a row in order to actually make that assignment change. And of course, when we look in the badges update table, we still have user ID 41. We were not able to switch 6 to 41, because there is no row for 6. So: some neat stuff that you can do with updates that is not just, you know, changing some data, doing a join, a CTE, blah blah blah. Hope you enjoyed yourselves. Hope you learned something. And I’ll see you over in the next video, where I believe we’re going to talk a little bit about some merge stuff. There are a couple things that you should know about merge, aside from the fact that, you know, maybe for a lot of cases you should be avoiding merge, of course. All right. Anyway, thank you for watching.
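The variable-assignment trick walked through above can be sketched like this — a reconstruction, with hypothetical names matching the narration:

```sql
DECLARE
    @old_user_id integer = NULL,
    @new_user_id integer = NULL,
    @did_it_update bit = 0;

UPDATE dbo.badges_update
    SET @old_user_id = user_id,     /* grabs the value before the change */
        @new_user_id = user_id = 6, /* chained: column and variable both become 6 */
        @did_it_update = 1          /* only flips if a row qualified */
WHERE user_id = 41;

/* With a matching row: 41, 6, 1. With no matching row (e.g. WHERE
   user_id = 6): the variables keep NULL, NULL, 0 */
SELECT @old_user_id, @new_user_id, @did_it_update;
```

If the WHERE clause finds no rows, none of the variable assignments happen, which is exactly how the demo shows the nulls and the false flag surviving.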


SQL Server Performance Office Hours Episode 25




To ask your questions, head over here.

I know using MAX for columns has downsides but what about 4000/8000?
What is the hardest subject to teach people about?
Have you ever had a consulting engagement not go well? What happened?
I get really frustrated while tuning queries because a technique that worked one time won’t work another time. Does that happen to you? How do you deal with it?
Do you ever want to get out of tech/working with databases?

Video Summary

In this video, I embarked on an office hours adventure where we tackled some interesting and practical SQL Server questions. We discussed the downsides of using large string columns in tables, delved into teaching isolation levels—a notoriously difficult topic—explored consulting challenges when clients aren’t ready to implement solutions, and shared insights on why tuning techniques that work one time might not work another. It was a great session where we aimed to provide value through real-world examples and practical advice. I hope you found it as informative and enjoyable as I did!

Full Transcript

Oh, look who it is. How you doing? Erik Darling here with Darling Data, and we are going to embark on an office hours adventure for this video. I hope that you find it enjoyable. So if you want to ask questions that I answer on these things, there’s a link here that allows you to do that. It’s a wonderful deal. You get to submit a question for free. I get to answer it for free. And, I don’t know, I guess that’s about the end of it. If you want to support this channel, you can sign up for a membership for as low as $4 a month. You can do that down in the old video description. Otherwise, you can just sit there and get it all for as much as it costs you to, I don’t know, have the internet, pay for electricity, all that other stuff that kind of goes along with watching things online. If you need SQL Server consulting help, I am, of course, the best in the world at all of these things — outside of New Zealand, of course; the lawyers make me say that every time. And as always, my rates are reasonable. If you would like to get my performance tuning training, there’s all 24 hours of it available to you via the everything bundle, which is everything that I’ve done about performance tuning. And you can get that for about 150 USD, and that’ll last you for the rest of your life.

So, you know, it's kind of a good one-time purchase there. Again, as we get into these summer months, when you want to just lock yourself in a room with an air conditioner and not deal with anything, what better way to do that than get some SQL Server performance training while you're at it? Who needs music and movies and whatnot? Joy. Get better at databases. All right. My new T-SQL course, all 23 hours of the beginner content, is out there and available. It is currently $250 at the presale price. It will double in value to $500 when the advanced material is done, after the summer. I will also be leaving the house a lot this summer. I will be leaving my pleasant air-conditioned life, and I will be traveling to the faraway land of New York City, August 18th and 19th, for the PASS on Tour series of events.

Also going to Dallas, which hopefully is air conditioned, even though it'll be September in Texas. Yeah, that's a thing. September 15th and 16th; the hamlet of Utrecht in the Netherlands, October 1st and 2nd; and of course, PASS Data Community Summit in Seattle, November 17th to 21st.

All good things coming up over the summer and fall months. But with that out of the way, let’s do some office hours in here. All right.

We have some questions. We have some very important questions to answer. I know using MAX for columns has downsides, but what about 4,000 or 8,000? I assume you mean VARCHAR(4000) and VARCHAR(8000).

Yeah, of course there are downsides. I mean, you know, your developers get to be lazy and perhaps not deal with truncation errors when they insert their dirty data sources into your beautiful database.

But you also run into some stuff where you can't really have those columns in the key of an index, which might be important at some point.

You know, when you select data out of that table, if there's any memory grant, it will be inflated by those much, much larger string columns.

And this isn't me being like a disk cheapskate, saying disk is expensive, you should always use the right data type because disk space, blah, blah, blah.

This is practical performance stuff. Like, don't do it. Still a bad idea. Don't hose yourself with overly long string columns, because they can come back to bite you in many ways.
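A quick sketch of both of those downsides, using a hypothetical table (all object names here are made up for illustration, not from the question):

```sql
-- Hypothetical table with an overly wide string column.
CREATE TABLE dbo.Users
(
    id integer IDENTITY PRIMARY KEY,
    display_name varchar(8000) NOT NULL -- declared far wider than the data
);

/* Downside 1: index keys. A nonclustered index key is capped at
   1,700 bytes, so this fails outright for varchar(max), and for
   varchar(8000) it only succeeds with a warning, after which any
   insert of a value longer than 1,700 bytes will fail. */
CREATE INDEX ix_users_display_name
    ON dbo.Users (display_name);

/* Downside 2: memory grants. For sorts and hashes, SQL Server
   assumes variable-length columns are about half their declared
   size, so this sort asks for memory as if every name were
   roughly 4,000 bytes, even if the real data averages 20 characters. */
SELECT u.id, u.display_name
FROM dbo.Users AS u
ORDER BY u.display_name;
```

Same query, same data: the only thing driving the oversized grant is the declared column width.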

All right. Oh, here's an interesting question. What is the hardest subject to teach people about? Oh, without a doubt, it is isolation levels. That is the toughest material to teach people about, because almost no one is approaching it mentally from the right perspective.

And they often come with a lot of preconceived notions about isolation levels that require unteaching before you can actually apply any new teaching.

There is a lot of really, really bad blog content out there, especially about row versioning isolation levels, that should be thrown in the dumpster and set ablaze.

But, you know, that'll be the day. All right. Have you ever had a consulting engagement not go well? What happened?

So yeah, of course, not all of them are as successful as some of the others. I think what ultimately makes a consulting engagement unsuccessful is when someone needs help, but is not ready to get help.

It's sort of like, you know, you go in there and you can give them the analysis and the stuff to do, and you can show them: hey, if we change this query to do this, this will get better.

If we change this index to do this. You can show them real proof that this stuff works, but they might not follow through on anything. Right?

They just might not end up doing anything. And, you know, they'll still hit you up later and be like, hey, we're still having problems. And you go look, and nothing's been done.

And it's frustrating for you, because you're giving these people everything, all the tools that they need to solve problems and be successful and get things done. But there's just no follow-through.

There's no willingness to actually make changes. And when you ask them why, it's just: oh, we haven't gotten to it yet. Oh, we were afraid of this. You know, there's a list of excuses.

So those are the toughest sorts of things to deal with: people whose servers are just in bad, bad shape, but no one willing to take responsibility for it.

You know, actually go through and do stuff. And when you ask, well, is there anything you want me to do? They're like, well, I don't know.

We'll see. It's just hems and haws, and the can gets kicked down the road. So yeah, those are probably the worst ones. All right, let's see here. I get really frustrated while tuning queries because a technique that worked one time won't work another time.

Does that happen to you? How do you deal with it? Well, yeah, of course it happens to me. You know, you build up this bag of tricks and knowledge and stuff over the years.

And, you know, something that works beautifully to tune up one query has no effect on another query, or maybe has the opposite effect on a query that you're working with.

I think the trick is realizing that every change is feedback, right?

So yes, it can be frustrating, but ultimately it's kind of what you make of it. You know, if you try something and it doesn't work, that's a data point, right?

That's you figuring out: oh, when I do this, it didn't get the right change, or it didn't change the plan the way I thought it would. What do I need to do next to try and get past that? And as you go through that process more and more, it'll happen less and less.

As you expose yourself to more times when something doesn't work, you start to understand when it doesn't work and why it doesn't work. And you start to not try it when you recognize that those circumstances are cropping up again.

So it'll happen to you less. It'll still happen. It still happens to me, where I try things that I think would probably be a meaningful change.

Like changing a left join with a null check to a not exists, or introducing a top into something to get a row goal, or using cross apply when I'm generating a row number rather than joining to a whole derived table, or introducing batch mode. Stuff that usually has a fairly quick and obvious performance impact when it works.
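As one example of the rewrites mentioned above, here's the left join with a null check turned into a not exists, on hypothetical tables (the names are made up for illustration):

```sql
-- Anti-join written as a LEFT JOIN with a NULL check:
-- orders that have no matching refund.
SELECT o.order_id
FROM dbo.Orders AS o
LEFT JOIN dbo.Refunds AS r
    ON r.order_id = o.order_id
WHERE r.order_id IS NULL;

-- The same logic as NOT EXISTS. The optimizer can often turn this
-- into an anti semi join that stops probing each order after the
-- first matching refund, instead of joining everything and then
-- filtering the matches back out.
SELECT o.order_id
FROM dbo.Orders AS o
WHERE NOT EXISTS
(
    SELECT 1 -- the select list of an EXISTS is never evaluated
    FROM dbo.Refunds AS r
    WHERE r.order_id = o.order_id
);
```

Both queries return the same rows; whether the rewrite actually helps depends on the plan you started with, which is exactly the point about techniques not always working.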

Sometimes you try it and you're like, oh, well, that didn't go the way I thought this time. You kind of go back to the drawing board. But again, the more you do that kind of work, the more you can recognize the situation that made it unsuccessful and kind of skip over that.

Or, I don't know, maybe just try it, but be a little more prepared for failure. That's another good thing to master: being prepared for something to not work out.

Right, because sometimes things don't work out. Speaking of not working out, there we go. Wow. Do you ever want to get out of tech and working with databases? Yeah, of course.

You know, I mean, as much as I enjoy it, it's hard to imagine doing this at, like, 70 or 80 or something. That'd be kind of wild.

But if I got out of tech and working with databases, I think what I would love, even though I realize it's not a money-printing machine, is to own a gym. Just real basic barbell equipment: not a single Nautilus machine, no leg press, no leg extensions, none of the goofy stuff that I hate.

When I go to a gym: just barbells, squat racks, benches, a limited set of dumbbells, and no real nonsense. That would be my take on it.

So, maybe someday, who knows? Anyway, that is five questions. One, two, three, four, five. We did it. We made it through. We didn't swear. We didn't curse.

Congratulations. We did a great job on that. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I will see you in tomorrow's video, which I think will be something equally worth seeing, but who knows?

All right. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.