Should An Optimistic Isolation Level Be The New Default For SQL Server Databases?

Two Words


Alright, I’ve lied to you already. There are way more than two words involved, here.

I recently had breakfast (JUST BREAKFAST, GOD) with Microsoft’s Most Handsome Man™, and the topic came up.

Partially because there are such fundamental changes coming to SQL Server, and also because it’s already the default in Azure SQL DB.

If I had to name a redeeming quality of Azure SQL DB, that would be it.

Unboxing


I’ve seen pessimistic locking causing problems and heartache on many, many servers.

While it wasn’t totally the fault of the isolation level (query and index tuning was needed, and there was some other silliness), it shouldn’t take that kind of dedication to make It Just Run Faster©.

Possibly the worst side effect is people leaning heavily on reading dirty data (via read uncommitted/nolock) to avoid issues.

You can’t preach about the evils of dirty reads without offering ways to avoid blocking.

Yes, I’m looking at you.

You all cackle at seeing NOLOCK everywhere, but I don’t hear much about solving blocking problems without it.

Standards and Futures


Right now “other major database platforms” offer optimistic locking as the default.There are implementation differences, but the net result is the same.

Readers and writers don’t suffer locking contention, and only fully committed data is read.

SQL Server currently offers optimistic locking via Read Committed Snapshot Isolation (RCSI from here), and Snapshot Isolation (SI from here).

Right now, they both send row versions to tempdb. But in the works for SQL Server 2019 is a feature called Accelerated Database Recovery (ADR from here). The mechanism that makes this possible is a Persistent Version Store (PVS from here) that’s local to the database, not tempdb.

While RCSI and SI could end up sending quite a bit of data to tempdb, which could cause contention there without a lot of precautions and setup work, any potential contention from ADR would be localized.

I know, Microsoft has been addressing tempdb, too. Setup now configures tempdb in a mostly sane way, and trace flags 1117 and 1118 are the default behavior in 2016+. That should make optimistic locking a more comfortable setting for people, but no. No. I hardly ever see it in use.

Dependencies


Breaking changes are hard. I know, there are people out there who depend on pessimistic locking for certain patterns to maintain correctness.

The thing is, it’s a lot easier to have just these processes use pessimistic locking while allowing the rest of the us to not have to sprinkle NOLOCK hints every which where to get a report to run.

An example of a process that requires some locking hints is Paul White’s post about Sequence Tables.

At this point in time, if you’re implementing a queue in SQL Server, you should be the one learning about how to get the kind of locking you need to get it working. Normal people who just want their app to work shouldn’t be the ones staring at articles about isolation levels, locking hints, lock escalation, etc.

Get It, Get It


This is a good problem to get in front of. I’d be quite happy to not have to talk about the reader/writer aspects of locking anymore.

In the same way, it would be nice to not have to talk to users who are endlessly frustrated by locking problems, explain dirty reads, explain optimistic isolation levels, caution them against certain issues, and then tell them to have fun removing all those old lock hints.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

The Real Value Of Optimistic Isolation Levels In SQL Server

Erroneous


While a lot of people speak highly of optimistic locking because it cuts down on blocking issues, I think it’s of equal or higher value that dirty reads are prevented.

“Imagine” a situation where someone ran a query they shouldn’t have.

Either forgetting a join, a where, or some other limiting condition.

You know, while they’re continuously integrating something.

Out, damn spot


    BEGIN TRAN;
    UPDATE h
    SET h.Score = 2147483647
    FROM dbo.HighQuestionScores AS h
    WHERE 1=1;

This will update every row in the table. I know this because once upon a time, I wrote a similar query and learned the importance of backups.

I also learned that tools that throw a warning when modification queries are missing WHERE clauses think this one is just fine.

But now if I go into another window, this query will return wrong results.

    SELECT TOP 10 *
    FROM dbo.HighQuestionScores AS hqs WITH(NOLOCK)
    ORDER BY hqs.Id DESC;
SQL Server Query Results
BIGINT FOR U

In fact, any query that runs with a nolock hint will show incorrect values here.

  • Boss runs a report? Wrong.
  • Trigger fires based on a score change? Wrong.
  • User checks their account? Wrong.

Fully Correct


    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    
    SELECT TOP 10 *
    FROM dbo.HighQuestionScores AS hqs
    ORDER BY hqs.Id DESC;

This query uses the snapshot isolation level.

It would do the same thing under read committed snapshot isolation, but without needing to set the isolation level.

But now we don’t get someone’s accident in our query results.

SQL Server Query Results
Big Baby

Remember folks: yeah, blocking sucks, but incorrect data is usually way, way worse.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Separating Long Running Queries From Modifications To Fix Blocking Problems In SQL Server

Let’s Say We Have A Bad Query


I know you don’t have bad queries. You’re smart people. But some people do!

Let’s also say that bad query is taking part in a modification.

UPDATE u2
SET u2.Reputation *= 2
FROM Users AS u
JOIN dbo.Users AS u2
ON CHARINDEX(u.DisplayName, u2.DisplayName) > 0
WHERE u2.Reputation >= 100000
AND u.Id <> u2.Id;

This query will run for so long that we’ll get sick of waiting for it. It’s really holding up writing this blog post.

Let’s try something a little different.

SELECT u.Id, u.DisplayName
INTO #Users
FROM dbo.Users AS u
WHERE u.Reputation >= 100000;

UPDATE u2
SET u2.Reputation *= 2
FROM #Users AS u
JOIN dbo.Users AS u2
ON CHARINDEX(u.DisplayName, u2.DisplayName) > 0
AND u.Id <> u2.Id;

This time, the query finishes in 43 seconds.

Plans and Events


The query plan looks like this:

SQL Server Query Plan
You’re not cool.

Extended events can tell us when locks were taken and released to perform the update.

SQL Server Extended Event Lock Escalation
Ducked out

An exclusive lock was held on the table for over 35 seconds, so all but ~7 seconds of the plan has locks on the table we’re updating.

In real life, that’d probably mean a lot of blocking.

Alt Rock


Another way to handle this is to remove the long-running query from the update, and take advantage of the Primary Key/Clustered Index on the table.

First, isolate the rows we care about:

SELECT u2.Id, u2.Reputation * 2 AS DoubleRep
INTO #Updater
FROM #Users AS u
JOIN dbo.Users AS u2
ON CHARINDEX(u.DisplayName, u2.DisplayName) > 0
AND u.Id <> u2.Id;

Then do the update:

UPDATE u2
SET u2.Reputation = u.DoubleRep
FROM #Updater AS u
JOIN dbo.Users AS u2
ON u.Id = u2.Id;

This query only runs a few hundred milliseconds, so the locks taken are shorter.

Keep in mind, though, that the first insert query is still horrible.

It runs for ~40ish seconds, but at least we don’t lock any objects that other queries might care about.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Bad Cardinality Estimates Can Make Blocking Worse In SQL Server

Stay Sober


I recently came across a bad locking problem. It was one of those things that happened “suddenly” and “for no reason”.

When I looked at the stored procedure, the code was something like this:

CREATE PROCEDURE dbo.bad_news(@PostTypeId INT)
AS	
BEGIN

DECLARE @PTID INT
SELECT @PTID = CASE WHEN @PostTypeId < 1
                         OR @PostTypeId > 8
				    THEN 4
			   END
UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = @PTID;

END

I mean, it would look like that if we were using Stack Overflow.

We weren’t, but I’ll leave the rest to your imagination.

Outside The Proc


We have this index:

CREATE INDEX ix_spanky ON dbo.Posts(PostTypeId);

Let’s run this code, and look at what happens.

BEGIN TRAN
UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = 3;
ROLLBACK

The locks look like this:

<Lock resource_type="KEY" index_name="PK_Posts__Id" request_mode="X" request_status="GRANT" request_count="28" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Posts__Id" request_mode="IX" request_status="GRANT" request_count="17" />

Our nonclustered index helps us find the rows we want to update. We don’t need to lock it because it doesn’t have the Score column in it.

We do need to lock the clustered index, but we’re able to take a small number (28) of Key locks.

This is the best outcome!

The worst outcome happens when we do this, which is what was happening in the stored procedure.

DECLARE @PostTypeId INT = 3;
BEGIN TRAN
UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = @PostTypeId;
ROLLBACK

The locks look like this:

<Lock resource_type="OBJECT" request_mode="X" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Posts__Id" request_mode="X" request_status="GRANT" request_count="17" />

We not only have exclusive locks on pages, but on the entire table now.

Overestimaters Anonymous


The execution plans for these two queries are much different.

SQL Server Query plan
umop

Likewise, the estimates are much different.

SQL Server Query Plan
upside

Oh So Loco


This is a fairly well-documented outcome of using a “declared variable”, or the optimize for unknown hint.

The optimizer makes a usually-generally-bad guess at the number of rows it’ll have to deal with.

In this case, the stored procedure had been modified to account for bad values passed in from the application.

The outcome was severe blocking because modification queries were taking far more intrusive locks than necessary.

So, you know, don’t do that.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Explicit Transactions Can Make Blocking Worse In SQL Server

But I Need To Roll That Back


It all sounds like a great idea.

You want your stored procedure to be a unit of work that you can roll back if something goes awry.

At first, it’s a few quick operations.

But over the years, the code has gotten more complicated, data has gotten larger, and now you’ve got terrible blocking problems.

Do The Worm


Assuming you can’t get rid of the transaction, you need to think of it like a movie about interstellar travel.

You know the kind where someone folds a piece of paper in half and pokes a hole through it?

Your job is to make all the stuff that happens inside as fast as possible.

This won’t fix the blocking, but it will shorten it.

Other Approaches


It might be possible for you to get rid of the transaction if you can store the before and after data somewhere.

This will get rid of the transaction, so locks won’t be held for the duration of it, but it makes you do more work in the procedure.

For every row or set of rows you change, you have to get the current values, put them somewhere, and hang onto them.

If you need to roll back your work, you need to go through and put them back. This can cause all sorts of problems, especially if someone else is working on the same data at the same time. Who should win?

You can partially solve this with sp_getapplock, but then you’re locking access to code until the whole thing is done, and…

You’re back to having locking problems.

Another thing you might be able to implement if you can change table structure is add an “in use” bit column, and skip over rows that are in use by someone else. This is common in queue tables, but can be extended to other types of work tables.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Why Does Using The Rowlock Hint In SQL Server Create More Locks?

Taking A Hint


There are all sorts of fun hints you can staple onto your queries that will make them behave differently.

Probably the most common one I see is the NOLOCK hint.

Don’t worry, this post has nothing to do with that.

You’ve been finger wagged to death about it already.

Other Lock Hints


Sometimes you want to try to control lock granularity. A noble effort, indeed.

To do that, you can use hints like ROWLOCK or PAGLOCK. They’ll let you tell the weevils inside your server to prefer those types of locks.

The thing is, both are still subject to lock escalation (unless you turn that off, too).

If we take an update query like this, and use sp_WhoIsActive to look at the locks:

    BEGIN TRAN;
    WITH x AS 
    (
        SELECT TOP (5000)
                h.Score
        FROM dbo.HighQuestionScores AS h
        ORDER BY h.Id
    )
    UPDATE x
    SET  x.Score = 2147483647
    FROM x WITH (ROWLOCK);

    ROLLBACK
EXEC dbo.sp_WhoIsActive @get_locks = 1;

The lock info that comes back looks like this:

<Lock resource_type="KEY" index_name="PK__HighQues__3214EC072602ECCF" request_mode="X" request_status="GRANT" request_count="5000" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK__HighQues__3214EC072602ECCF" request_mode="IX" request_status="GRANT" request_count="28" />

The only lock type that has an X type is the Key lock. The page and object lock types are only intent exclusive. You can read more about intent locks here.

Now, if we change the 5000 to 7000, we get much different results from sp_WhoIsActive.

<Lock resource_type="OBJECT" request_mode="X" request_status="GRANT" request_count="1" />

We’ve hit lock escalation, and our precious ROWLOCK hint has been disregarded.

Funny, huh?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Isolation Levels Are Inherited In SQL Server Queries

Client Yell


While working with a client who has a nasty blocking problem involving triggers, we were talking about using Snapshot Isolation to help fix the issue.

The thing is, there’s more than just stored procedures causing problems. Some of the issue is with triggers.

The prospect of having to modify a bunch of objects was a rightfully frightful one.

Funboy & Funboy & Funboy


In my database, I have Snapshot Isolation enabled.

I have a table with a trigger on it. The trigger doesn’t do anything except show me what my session’s isolation level is

CREATE TABLE dbo.t ( id INT NULL, dimdate DATETIME NULL );
CREATE TRIGGER dbo.dim_trigger
ON dbo.t
AFTER INSERT
AS
    BEGIN
        SET NOCOUNT ON;

        DECLARE @UserOptions TABLE ( SetOption VARCHAR(100), Value VARCHAR(100));
        DECLARE @IsolationLevel VARCHAR(100);

        INSERT @UserOptions
        EXEC ( 'DBCC USEROPTIONS WITH NO_INFOMSGS' );

        SELECT @IsolationLevel = 'Trigger:' + Value
        FROM   @UserOptions
        WHERE  SetOption = 'isolation level';

        PRINT @IsolationLevel;

    END;

I also have a stored procedure that checks the isolation level, and inserts into my table so that the trigger fires off.

CREATE OR ALTER PROCEDURE dbo.insert_t
AS
    BEGIN
        SET NOCOUNT ON;

        SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

        DECLARE @UserOptions TABLE ( SetOption VARCHAR(100), Value VARCHAR(100));
        DECLARE @IsolationLevel VARCHAR(100);

        INSERT @UserOptions
        EXEC ( 'DBCC USEROPTIONS WITH NO_INFOMSGS' );

        SELECT @IsolationLevel = 'Procedure:' + Value
        FROM   @UserOptions
        WHERE  SetOption = 'isolation level';

        PRINT @IsolationLevel;

        INSERT dbo.t ( id, dimdate )
        VALUES ( 0, GETDATE());

    END;

On Notice


I’m only setting the isolation level in the stored procedure here.

When I run it, both will return the same isolation level no matter what you set it to.

Consistency

Thanks to Scott Ivey, whose code I borrowed for this.

And thank you for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Why Plan Cache Scripts For SQL Server Can Be Misleading

A Long Way From Home


Most scripts (even ones I’ve worked on!) that look at the plan cache, have had the ability to sort it by different metrics: CPU, reads, writes, duration, etc.

A lot of people are very interested in long running queries — and I am too!

Heck, they’re how I make money. Blogging pays like crap.

But there’s a slight problem with only looking at query duration.

Fake Locks


Let’s say I have a stored procedure that looks like this:

CREATE OR ALTER PROCEDURE dbo.dblock (@id INT)
AS
BEGIN
    BEGIN TRAN
        UPDATE u
        SET u.Reputation = 2147483647 
        FROM dbo.Users AS u
        WHERE u.Id = @id
        
        WAITFOR DELAY '00:00:07.000'
    ROLLBACK
END;
GO

It’s there to simulate a long running modification query. You know, the kind of thing that’ll lead to blocking.

The kind of blocking that can make other queries feel like they take a long time.

Street Team


Here’s another stored procedure:

CREATE OR ALTER PROCEDURE dbo.the_lox (@id INT)
AS
BEGIN
    SELECT u.Id
    FROM dbo.Users AS u
    WHERE u.Id = @id
END;
GO

This will finish instantly. The Id column of the Users table is the Primary Key, as well as the Clustered Index.

There is literally nothing to tune here.

But it may look like it, sometimes…

D’evils


If I run these in different windows, the lox will be blocked by dblock.

--One Window
EXEC dbo.dblock @id = 22656;
GO 

--Other window
EXEC dbo.the_lox @id = 22656;
GO

If you were wondering why I had a 7 second wait, it’s because it generally takes me two seconds to flip windows and run a query.

When the situation resolves, this is what the metrics look like:

SELECT OBJECT_NAME(deps.object_id) AS stored_procedure_name,
       ( deps.last_worker_time / 1000. ) AS cpu_time,
       ( deps.last_elapsed_time / 1000. ) AS run_time
FROM   sys.dm_exec_procedure_stats AS deps
WHERE  deps.object_id = OBJECT_ID('dbo.the_lox');
GO
Everything You Got

The query barely used any CPU, but it ran for 5 seconds.

If you order your plan cache by elapsed time, you might see blameless queries up at the top.

There are a lot of reasons why they could end up there, and blocking is one of them.

Unfortunately, there’s currently nothing to tell you why. If you’re just getting started with query tuning, this could look befuddling.

Are Averages Better?


Sometimes it helps to look at what runs for a long time on average — you can figure that out by looking at the number of executions.

That can make better sense of things, but not if a query has only run once, or if it gets blocked a lot.

It may make more sense to factor in metrics that represent physical work, like CPU, reads, or writes, rather than just relying purely on wall clock time.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Why SQL Server’s READPAST Hint Doesn’t Always Work

Oh, Bother


Sometimes, you only wanna work on one thing. Other times, you only wanna work on something if it’s not locked by something else.

Locking hints can be really handy in these situations, especially the READPAST hint. The documentation for it says that it allows you to skip over row level locks (that means you can’t skip over page or object level locks).

What it leaves out is that your READPAST query may also need to try to take row level shared locks.

Here’s an example!

Sterling Reputation


If I run this query, it’ll take out locks we don’t want (without an index on Reputation).

BEGIN TRAN
UPDATE u
SET u.Reputation += 1
FROM dbo.Users AS u
WHERE u.Reputation = 1047863;

ROLLBACK

If we use sp_WhoIsActive @get_locks = 1; we’ll get this back:

<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Users_Id" request_mode="X" request_status="GRANT" request_count="1" />

We take a single exclusive page lock, which means our readpast query can’t skip it.

SELECT TOP (1)  
        u.Id,
        u.DisplayName, 
        u.Reputation
FROM dbo.Users AS u WITH(READPAST)
ORDER BY u.Reputation DESC;

We’ll get blocked. We’ll need to change our update query to do this:

BEGIN TRAN
UPDATE u
SET u.Reputation += 1
FROM dbo.Users AS u WITH(ROWLOCK)
WHERE u.Reputation = 1047863;

ROLLBACK

We’re now asking SQL Server to only lock rows if it can, for as long as it can.

Now our locks change to this:

<Lock resource_type="KEY" index_name="PK_Users_Id" request_mode="X" request_status="GRANT" request_count="1" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Users_Id" request_mode="IX" request_status="GRANT" request_count="1" />

Rather than exclusive page locks, we’re taking an exclusive key lock, and only intent exclusive locks elsewhere.

What’s amusing here is that our select query will… Still be blocked!

But why?

Going back to look at the locks it’s trying to take, we’ll see this:

<Lock resource_type="OBJECT" request_mode="IS" request_status="GRANT" request_count="5" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Users_Id" request_mode="S" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Users_Id" request_mode="S" request_status="WAIT" request_count="1" />

So uh. Our read query is shooting itself in the foot by trying to lock pages rather than rows.

That’s… brave.

We need to change the locking hints slightly to avoid that.

SELECT TOP (1)  
        u.Id,
        u.DisplayName, 
        u.Reputation
FROM dbo.Users AS u WITH(READPAST, ROWLOCK)
ORDER BY u.Reputation DESC;

With the rowlock hint here, our query finishes.

Index Alternative


If we start the process over, but add this index:

CREATE INDEX ix_dammit_please_just_work
    ON dbo.Users(Reputation);

Some things will be different:

  • We don’t need a rowlock hint on the update
  • We don’t need a rowlock hint on the select

This is true despite there being more overall locks taken for the update:

<Lock resource_type="KEY" index_name="ix_dammit_please_just_work" request_mode="X" request_status="GRANT" request_count="2" />
<Lock resource_type="KEY" index_name="PK_Users_Id" request_mode="X" request_status="GRANT" request_count="1" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="ix_dammit_please_just_work" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="PK_Users_Id" request_mode="IX" request_status="GRANT" request_count="1" />

But the addition of the index makes row level locks a more obvious choice for both queries.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server’s sp_getapplock Is Pretty Cool

Magicool


Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.