Learn T-SQL With Erik: Serializable Is Not A Snapshot

Video Summary

In this video, I dive into the nuances of SQL Server’s serializable isolation level and how it impacts data consistency and blocking scenarios. We explore a practical example where two select queries within the same transaction see different views of the data due to shared locks held by the transaction. This session is part of my comprehensive 23-hour T-SQL course, now available at a presale price of $250—doubling in value once the course is complete. I urge you to save on this valuable resource and purchase it before missing out later. Let’s demystify serializable together and understand why, for most scenarios, it might not be the best choice for achieving snapshot isolation levels.

Full Transcript

In this video, you and I, you and me, we’re gonna get real serializable together. This is of course a small taste of the material that I have in my T-SQL course. 23 hours of beginner content is all available now. It’s still at the presale price of $250 that will double in value to $500 when the course is done. So as usual, your job is going to be done. Your job is to save $250 and buy that thing now before you regret it later. You have to give me $500. Then I light a cigar with it and walk away laughing in my fur coat or something. I don’t know. Whatever. Leave me alone. So we’re gonna, again, we’re gonna talk about serializable because, you know, a lot of developers have this weird view of, you know, isolation levels that, you know, they really isolate everything. And, that’s still not exactly true even with serializable. Serializable does come kind of close because it forces read queries to take and hold shared locks on data. When, you know, it may even take like either like a lock on a, on a key if you’re using an equality predicate or a range of keys. If you are, you know, searching for like, like a range of values, right? Between greater than less than zero.

All those, all those range of things. So, uh, this, what serializable does prevent you with what most people would call an unchanging view of data. Uh, it is still not necessary. It’s still not a snapshot of the data within a transaction. Like, you know, like different, what I’m going to show you today is different queries, uh, within the same transaction, seeing data from different points in time. Uh, what serializable doesn’t guarantee is particularly when that view of the data started. All you know, is that that is how the data looked when the query finished. That might be good enough for you. If it’s not, you might need to look at, uh, the snapshot isolation level. Uh, read committed snapshot isolation does, you know, also come reasonably close to that. But snapshot isolation is, uh, is the snapshotiest, especially if you need, uh, transaction level consistency and not just statement level consistency. That’s one, that’s, uh, pretty much a place where read committed snapshot isolation and snapshot isolation differ. But we’ll, we’ll talk about that more in depth later.

Uh, the view of data that serializable provides is not necessarily, uh, from like when a transaction began, especially if the query was blocked. And that’s, that’s kind of what, that’s what we’re going to look at in these demos here. So, uh, what we’re going to do is just drop a table, create it and stick 11 rows in it. Uh, you can see here that we have, if zoom, it will cooperate. We have the odd numbers from one through 11 put into, uh, this table called surly eyes, which is, uh, one of the best bars in Boston.

If you’re in case you’re, in case you’re wondering where the name come from, came from, come from, it’s getting towards the end of the day here. All right. So let’s make sure there’s no weird open transactions in here.

And we’re going to, we’re going to run through these demos in two different ways. Uh, the first way I’m going to run this is we actually need to switch this back to seven. Is we’re going to start a transaction.

We’re going to set the isolation level, the serializable. And then we’re just going to run a select from that query where ID equals seven. Now this table does have a primary key on the ID column. So we don’t like, like we can just seek right to whatever we care about.

We’re not like scanning pages or like locking the table or anything. So, uh, let’s run this and let’s grab ID seven. We can, we get that very easily, right?

But now we have, we have this, this transaction and serializable has taken a lock, right? Like it’s going to hold this shared lock on ID seven. If we come over here, what we’re going to find is that we are still allowed to insert ID eight into the table, right?

So we can put ID eight in because serializable doesn’t need a lock to make sure that like ID eight doesn’t change. But now if we come over here and we try to run this, this select query is going to get blocked because this select query is looking for where ID is less than or equal to 11.

Remember 11 was the last key in our table. So it’s essentially just looking for everything behind 11. This query is now blocked because like this, and you know, this would happen under read committed as well.

This query would get blocked because they’re like this insert has an open transaction and it’s holding the lock on ID eight. So we can’t read that row while we’re doing this. So we can’t read that. But if we run this insert, right?

We do this and this all finally commits. What I think surprises a lot of people is that the serializable transaction with two select queries in it that ran at two different points in time, essentially saw two different views of the data.

Right? So like this thing ran and it found ID seven. That’s not really all that weird. But what people find weird is that when we run this query to look for where everything where ID is less than or equal to 11, not only does this query see the ID eight that we inserted, but also the ID 11, sorry, the ID 10 that we inserted, the even numbers, not the, not the odd numbers.

So this is what kind of freaks people out. Cause like they’re expecting this, you know, unchanging, like consistent snapshot view of data, which is not what serializable provides.

Again, it’s only what the data looked like when the query finished. Like when it, when that query began or any locks that it ran into along the way can, can certainly interfere with that. So let’s make sure that we don’t have anything weird going on here.

Uh, make sure this is all fully committed. So when we refresh the table over here, we don’t get blocked and get embarrassed by ourselves. So, uh, let’s do this.

Let’s, uh, let’s insert. Let’s start a transaction and insert ID eight here. Now let’s come back over here and let’s actually change this query for a little bit of texture. And let’s run both select query.

Let’s actually just run the whole thing at once. Right. And now we have these two select queries running. This one is blocked looking for ID eight. Cause ID eight is held onto with this one, but now I can run this query and I can insert ID 10. And like, not only is this query that ran, right?

We like this query that ran up here now found ID eight. Cause this happened after the lock was released, but this query once again, saw both eight and 10. So the main thing here is that, you know, serializable, like, you know, select queries will take and hold shared locks much, much longer than they will with read committed, the default locking isolation level for all SQL servers, except for Azure SQL database.

But, um, like serializable will take and hold shared locks and where read committed doesn’t. And what that, what that sometimes leads people down the path of is think that they’re thinking that there is like, you know, some internal mechanism scheduling these queries, like queuing them up.

Like, like when you see like the, like the blog post diagrams of like, here is T1. It is scheduled for now. It’s like a train schedule.

Here is T2. It is scheduled for now, but that’s, that’s really not what happens. These queries all run and they attempt to take these locks, but they’re all working off the same, like live table data. There’s no, you know, there’s no snapshot consistent view of this data, like goat, like that’s often stored anywhere.

So like even two select queries in the same transaction can encounter different rows when they run. Again, this might be good enough for you if, if you even need this high of a level of isolation, but it is still not a snapshot view of the data.

And once again, you would still need a, a row versioning isolation level in order to obtain that. Anyway, that’s enough about serializable. You probably don’t need it.

You probably don’t want it. When you use serializable, you certainly prolong blocking operations, especially from read queries, and you very, very certainly increase your likelihood of encountering deadlocks with read queries, taking and holding shared locks in ways that are often unfamiliar to people used to working with either read committed, the locking isolation level, or just festooning your queries with no lock hints, every which way you go.

So I hope you enjoyed yourselves. I hope you understand serializable a little bit better now. And I will see you over in the next video where we’re going to, we still have a, we still have some ground to cover with the transactions and isolation level stuff.

So we’ll, we’ll, we’ll get to something a little bit less isolating than serializable in the next one. Anyway, thank you for watching.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.