The Art Of The SQL Server Stored Procedure: Transactions

Sitting In A Tree


Transactions and error handling often go hand-in-glove, to make better use of XACT_ABORT, manually manage rollbacks, and try to make the best use of all those ACID properties that database people seem to care deeply about.

The main things to understand about transactions in SQL Server are:

  • There’s no respect for nested transactions
  • The default read committed isolation level can be a bad time
  • Not everything requires an outer transaction

The lack of nested transactions has been well-documented over the years. Though savepoints do exist, they’re not really the same thing.

For the remaining points, let’s talk a little bit more.

Blocking, Blocking, Blocking


This is a big enough problem with singular modification queries under the default Read Committed pessimistic isolation level.

Trust me. Roughly half of my consulting life is looking at the blocked process and deadlock reports.

If you’re going to take it upon yourself to involve multiple queries in a transaction, it also falls upon you to make sure that:

  • You use the appropriate locking hints
  • You handle any errors and rollbacks
  • You make sure that thing is as short and sweet as possible

Everything you do between BEGIN TRANSACTION and COMMIT TRANSACTION will hold whatever locks are required or specified until the bitter end.

Think of the duration of a query, and then the duration of a transaction as a whole, as the surface area.

The larger that surface area gets, the more opportunity there is for it to interfere, or be interfered with, by another surface area with a similar trajectory.

You also need to really understand the isolation level in use, and how that can make results weird.

Let’s take this piece of pseudo code, and assume it’s running under the default Read Committed pessimistic isolation level.

BEGIN TRANSACTION
    DECLARE
        @UserToPromote integer = NULL;
    
    SELECT TOP (1)
        @UserToPromote = u.Id
    FROM dbo.Users AS u
    WHERE u.Reputation = 1
    AND   EXISTS
    (
        SELECT
            1/0
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = u.Id
        AND   p.PostTypeId = 1
        AND   p.Score = 0
    )
    ORDER BY
        u.CreationDate,
        u.Id;
    
    /*IRL you might bail here if this is NULL or something*/
    
    WITH
        UserToPromote
    AS
    (
        SELECT TOP (1)
            p.*
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = @UserToPromote
        AND   p.PostTypeId = 1
        AND   p.Score = 0
        ORDER BY
            p.Score,
            p.CreationDate
    )
    UPDATE utp
        SET utp.Score += 1
    FROM UserToPromote AS utp;
COMMIT TRANSACTION;

Leaving aside some of the obvious stuff that a professional performance tuner would avoid (like local variables), and ignoring the fact that I haven’t done any error handling, what’s wrong with this code?

  • We only get the first user whose data was currently committed in the Users table
  • … And who has a question in the Posts table
  • Then we try to update a row in Posts for that user

What that leaves out is:

  • The user could delete their profile after we find them
  • Someone could vote on their question after we find them
  • They could delete their question after we find them

Read Committed is not a consistent snapshot of your data during a transaction. In a highly transaction environment, all sorts of things can change right underneath you.

All Read Committed guarantees is that the data you read was committed at the time it is read. Quite flimsy once you think about it for a moment.

If you want to avoid those changes, you’d need to add hints like UPDLOCK and SERIALIZABLE to your select query (and exists subquery) to prevent those changes until the update completes.

Even if you were to do all that work in a single query rather than two, you could run into the exact same issues without those locking hints.

Once those read cursors find the row(s) they want, anything goes until the exclusive locks start being taken.

Don’t Transact Me, Bro


Let’s look at some more psuedo code. It’s a pattern I’ve noticed with more than a few clients now.

BEGIN TRY
BEGIN TRANSACTION
    INSERT
        dbo.HighQuestionScores
    (
        Id,
        DisplayName,
        Score
    )
    SELECT
        u.Id,
        u.DisplayName,
        p.Score
    FROM dbo.Users AS u
    CROSS APPLY
    (
        SELECT
            Score = 
                MAX(p.Score)
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = u.Id
        AND   p.PostTypeId = 1
    ) AS p
    WHERE NOT EXISTS
    (
        SELECT
            1/0
        FROM dbo.HighQuestionScores AS hqs
        WHERE hqs.Id = u.Id
    );
    
    UPDATE hqs
        SET hqs.LastQuestionBadge = b.Name
    FROM dbo.HighQuestionScores AS hqs
    CROSS APPLY
    (
        SELECT TOP (1)
            b.Name
        FROM dbo.Badges AS b
        WHERE b.Id = hqs.Id
        AND   b.Name LIKE N'%Question%'
        ORDER BY
            b.Date DESC
    ) AS b;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK
    END;

    /*Do some error stuff maybe*/
    THROW;
END CATCH;

The idea of the code is to insert any new users to the HighQuestionScores table. For the sake of completeness, let’s say there’s another query in the mix that would update the high score for existing users too.

It’s just invisible 👻

The problem here is also fairly obvious. There’s absolutely no reason to ROLLBACK new users inserted into the table just because the ensuing update query fails for some reason.

Let’s say that the Badges table was recently modified to accommodate a new, longer, badge name, but the HighQuestionScores table wasn’t.

We would get a truncation error, obviously. But that truncation error should not invalidate the new users inserted at all.

Likewise, if our invisible Score update query produced a bigger integer and failed trying to insert it into an integer column, it should not invalidate new users inserted.

It’s not their fault.

The bottom line: Transactions should only encapsulate queries where correctness would be effected by one failing.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.



3 thoughts on “The Art Of The SQL Server Stored Procedure: Transactions

    1. Well, you have to put something in the select list, but what you put in there doesn’t matter.

      I see a lot of people doing goofy things, like adding TOP (1) or COUNT inside of (NOT) EXISTS subqueries for no good reason.

      Thanks!

Comments are closed.