Erik Darling – Page 11

The Art Of The SQL Server Stored Procedure: Locking Hints

Posted on December 18, 2023December 12, 2023 by Erik Darling

None Other

Locking hints are funny. Normally I see them in awful products like Dynamics, on select queries, and hilarious levels of blocking and deadlocking issues ensue.

Another place where they’re common is when developers think that adding hints like ROWLOCK means you’ll always only ever lock rows, and that you’ll never have lock escalation when you use it.

Both of those things are wrong. Here’s a video about why:

Where I most frequently do not see locking hints, or the use of an appropriate isolation level, is when developers still for some reason insist on using MERGE statements.

Perhaps I’m a bit salty because I spent a lot of today rewriting merge statements into separate actions. Perhaps I’m just salty generally.

When Do You Need Locking Hints?

Locking hints (aside from NOLOCK), have the most impact when you need to protect data beyond the scope of a single query in a transaction.

They have the second most impact when you’re doing strangeland things in the database, like processing queues. You can see my example of how to do that here: Building Reusable Queues In SQL Server Part 2

More often than not, though, you’ll want to use them if you do things like this:

BEGIN TRANSACTION
DECLARE
    @NextUser integer;

SELECT TOP (1)
    @NextUser = u.Id
FROM dbo.Users AS u WITH(UPDLOCK, SERIALIZABLE)
WHERE u.LastAccessDate >=
(
    SELECT
        MAX(u2.LastAccessDate)
    FROM dbo.Users AS u2 WITH(UPDLOCK, SERIALIZABLE)
)
ORDER BY
    u.LastAccessDate DESC; 

INSERT
    dbo.Badges
(
    Name,
    UserId,
    Date
)
VALUES
(
    N'Most Recent User Award',    
    @NextUser,      
    GETDATE()
);
COMMIT;

Why? What if someone else logs in after you assign this variable?

What locking hints do you need each time you touch the Users table?
Do you need a transaction to protect the entire thing?

Look at the join order

What if someone logs in after the first scan of the Users table?

Is this great for concurrency? Well, that depends on how fast the queries are. With some Good Enough™️ indexes, this’ll be just fine.

Without them, you’ll be in a HEAP of trouble. Ahem. This is the time for laughter.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Andy Pavlo and Erik Darling Talk Databases (@AndyPavlo @OtterTune)

Posted on December 15, 2023December 10, 2023 by Erik Darling

Andy Pavlo and Erik Darling Talk Databases

I somehow talked database genius and actual professor Andy Pavlo to talk to me about databases, OtterTune, the future, and who has the best query optimizer.

Going Further

Documenting What Microsoft Won’t: The sys.sp_configure_automatic_tuning Stored Procedure

Posted on December 14, 2023March 28, 2024 by Erik Darling

Moment Of Silence

Automatic tuning is a half-assed feature that has a lot of problems. I wouldn’t trust it any further than I could throw Entity Framework.

Back in October, I opened an issue with Microsoft to document a new-ish stored procedure. It was closed, and to be clear, it was the PM, not the docs team, that refused the request.

Here’s why that’s a bad choice:

This stored procedure exists and can be executed
The only documentation for it is going to be this blog post
Many other configuration procedures that can be misused are documented and regularly executed by end users

For example, sp_configure is well-documented, and there are many changes one could make via those calls to totally hose a SQL Server.

There is documentation available to force and unforce plans in Query Store. There’s also documentation to turn automatic tuning on or off.

If there’s fear that something could be misused, the only cure is to document its proper use and warn of its improper use.

Leaving things up to the imaginations of end-users is quite a mistake.

Valid Use

Just in case Microsoft decides to wipe that issue, here are the parameters currently available for sys.sp_configure_automatic_tuning:

EXECUTE sys.sp_configure_automatic_tuning
    @option = N'???',
    @type = N'???',
    @type_value = ???,
    @option_value = N'???';

If one wanted to insert valid arguments for those parameter values, one could use these:

EXECUTE sys.sp_configure_automatic_tuning
    @option = N'FORCE_LAST_GOOD_PLAN', /*Option*/
    @type = N'QUERY', /*Target*/
    @type_value = 1, /*query_id from Query Store*/
    @option_value = N'OFF'; /*Disable*/

What this would effectively do is turn automatic tuning off for a specific query based on its query_id in Query Store.

If there’s a plan already being forced, you’ll have to unforce the plan too. You can do that with sp_query_store_unforce_plan.

There you have it. Erik Darling cares about you more than Microsoft does.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: Transactions

Posted on December 13, 2023December 12, 2023 by Erik Darling

Sitting In A Tree

Transactions and error handling often go hand-in-glove, to make better use of XACT_ABORT, manually manage rollbacks, and try to make the best use of all those ACID properties that database people seem to care deeply about.

The main things to understand about transactions in SQL Server are:

There’s no respect for nested transactions
The default read committed isolation level can be a bad time
Not everything requires an outer transaction

The lack of nested transactions has been well-documented over the years. Though savepoints do exist, they’re not really the same thing.

For the remaining points, let’s talk a little bit more.

Blocking, Blocking, Blocking

This is a big enough problem with singular modification queries under the default Read Committed pessimistic isolation level.

Trust me. Roughly half of my consulting life is looking at the blocked process and deadlock reports.

If you’re going to take it upon yourself to involve multiple queries in a transaction, it also falls upon you to make sure that:

You use the appropriate locking hints
You handle any errors and rollbacks
You make sure that thing is as short and sweet as possible

Everything you do between BEGIN TRANSACTION and COMMIT TRANSACTION will hold whatever locks are required or specified until the bitter end.

Think of the duration of a query, and then the duration of a transaction as a whole, as the surface area.

The larger that surface area gets, the more opportunity there is for it to interfere, or be interfered with, by another surface area with a similar trajectory.

You also need to really understand the isolation level in use, and how that can make results weird.

Let’s take this piece of pseudo code, and assume it’s running under the default Read Committed pessimistic isolation level.

BEGIN TRANSACTION
    DECLARE
        @UserToPromote integer = NULL;
    
    SELECT TOP (1)
        @UserToPromote = u.Id
    FROM dbo.Users AS u
    WHERE u.Reputation = 1
    AND   EXISTS
    (
        SELECT
            1/0
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = u.Id
        AND   p.PostTypeId = 1
        AND   p.Score = 0
    )
    ORDER BY
        u.CreationDate,
        u.Id;
    
    /*IRL you might bail here if this is NULL or something*/
    
    WITH
        UserToPromote
    AS
    (
        SELECT TOP (1)
            p.*
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = @UserToPromote
        AND   p.PostTypeId = 1
        AND   p.Score = 0
        ORDER BY
            p.Score,
            p.CreationDate
    )
    UPDATE utp
        SET utp.Score += 1
    FROM UserToPromote AS utp;
COMMIT TRANSACTION;

Leaving aside some of the obvious stuff that a professional performance tuner would avoid (like local variables), and ignoring the fact that I haven’t done any error handling, what’s wrong with this code?

We only get the first user whose data was currently committed in the Users table
… And who has a question in the Posts table
Then we try to update a row in Posts for that user

What that leaves out is:

The user could delete their profile after we find them
Someone could vote on their question after we find them
They could delete their question after we find them

Read Committed is not a consistent snapshot of your data during a transaction. In a highly transaction environment, all sorts of things can change right underneath you.

All Read Committed guarantees is that the data you read was committed at the time it is read. Quite flimsy once you think about it for a moment.

If you want to avoid those changes, you’d need to add hints like UPDLOCK and SERIALIZABLE to your select query (and exists subquery) to prevent those changes until the update completes.

Even if you were to do all that work in a single query rather than two, you could run into the exact same issues without those locking hints.

Once those read cursors find the row(s) they want, anything goes until the exclusive locks start being taken.

Don’t Transact Me, Bro

Let’s look at some more psuedo code. It’s a pattern I’ve noticed with more than a few clients now.

BEGIN TRY
BEGIN TRANSACTION
    INSERT
        dbo.HighQuestionScores
    (
        Id,
        DisplayName,
        Score
    )
    SELECT
        u.Id,
        u.DisplayName,
        p.Score
    FROM dbo.Users AS u
    CROSS APPLY
    (
        SELECT
            Score = 
                MAX(p.Score)
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = u.Id
        AND   p.PostTypeId = 1
    ) AS p
    WHERE NOT EXISTS
    (
        SELECT
            1/0
        FROM dbo.HighQuestionScores AS hqs
        WHERE hqs.Id = u.Id
    );
    
    UPDATE hqs
        SET hqs.LastQuestionBadge = b.Name
    FROM dbo.HighQuestionScores AS hqs
    CROSS APPLY
    (
        SELECT TOP (1)
            b.Name
        FROM dbo.Badges AS b
        WHERE b.Id = hqs.Id
        AND   b.Name LIKE N'%Question%'
        ORDER BY
            b.Date DESC
    ) AS b;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK
    END;

    /*Do some error stuff maybe*/
    THROW;
END CATCH;

The idea of the code is to insert any new users to the HighQuestionScores table. For the sake of completeness, let’s say there’s another query in the mix that would update the high score for existing users too.

It’s just invisible 👻

The problem here is also fairly obvious. There’s absolutely no reason to ROLLBACK new users inserted into the table just because the ensuing update query fails for some reason.

Let’s say that the Badges table was recently modified to accommodate a new, longer, badge name, but the HighQuestionScores table wasn’t.

We would get a truncation error, obviously. But that truncation error should not invalidate the new users inserted at all.

Likewise, if our invisible Score update query produced a bigger integer and failed trying to insert it into an integer column, it should not invalidate new users inserted.

It’s not their fault.

The bottom line: Transactions should only encapsulate queries where correctness would be effected by one failing.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: Batching Modifications

Posted on December 12, 2023December 12, 2023 by Erik Darling

Command

The goal of batching modifications is to accomplish large numbers of row modifications with exacerbating locking problems, and being kinder to your server’s transaction log.

There is generally an appreciable difference in transaction time in modifying 1,000 rows and modifying 10,000,000 rows. Go try it, if you don’t believe me.

Probably the canonical post about batching modifications is written by my dear friend Michael J. Swart.

In many cases, that code is good to go right out of the box. Just replace the table and column names.

But you may have other requirements, too.

Control

One post I had an enjoyable time writing was this one, about how to track before and after values when batching updates.

Batching Modification Queries Is Great Until…

I also talk a little bit at the end about how to retry modifications when they hit deadlocks, but anyway!

This is something most developers don’t think about at the outset of writing batching code: Should I keep all the prior changes if one set of changes produces an error?

Another thing you may need to think about is this: What if data changes in the course of your code?

Lemme give you a couple examples.

Champion

If you do this, which is a subset of my code from the above post:

SELECT
    @NextBatchMax  = 
        MAX(x.id)
FROM
(
    SELECT TOP (1000) 
        aia.id
    FROM dbo.as_i_am AS aia
    WHERE aia.id >= @LargestKeyProcessed
    ORDER BY 
        aia.id ASC
) AS x;

/*Updateroo*/
UPDATE
    aia
    SET aia.some_date = 
            DATEADD(YEAR, 1, aia.some_date),
        aia.some_string = 
            aia.some_string + LEFT(aia.some_string, 1)
    OUTPUT 
        Deleted.id, Deleted.some_date, Deleted.some_string
    INTO
        dbo.as_i_was (id, some_date, some_string)
FROM dbo.as_i_am AS aia
WHERE aia.id >= @LargestKeyProcessed
AND   aia.id <= @NextBatchMax;

Or even this:

/*Updateroo*/
UPDATE
    aia
    SET aia.some_date = 
            DATEADD(YEAR, 1, aia.some_date),
        aia.some_string = 
            aia.some_string + LEFT(aia.some_string, 1)
    OUTPUT 
        Deleted.id, Deleted.some_date, Deleted.some_string
    INTO
        dbo.as_i_was (id, some_date, some_string)
FROM dbo.as_i_am AS aia
WHERE aia.id IN
      (
          SELECT TOP (1000) 
              aia.id
          FROM dbo.as_i_am AS aia
          WHERE aia.id >= @LargestKeyProcessed
          AND   aia.id <= @NextBatchMax
          ORDER BY
              aia.id ASC                
      );

There’s a problem, isn’t there?

Isolation Levels

If you’re using:

Read Committed (Pessimistic)
Read Uncommitted (Anarchy)

Only the table that is either only locked when the update runs (first example), or only the instance referenced by the update (second example) will ever have exclusive locks taken against it.

The part of the operation in both examples that only reads from the table will take normal shared locks on rows or pages as the reads happen.

Even under Read Committed (Pessimistic), some lousy things can happen:

You can miss rows that are updated by other processes
You can see rows twice that are updated by other processes
You can include rows that have been deleted by other processes

Read Committed is not a point in time read of all the data your query needs.

Heck, even Repeatable Read (Quite Pessimistic) only locks rows as it needs them. That means rows can change ahead of your seek/scan position freely.

For really interesting cases, you might need to use Serializable (Perfectly Pessimistic) to do your batching.

And Stay Out

I ran into a client process recently that actually had to do this.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION
/*Updateroo*/
UPDATE
    aia /* You could also do this WITH (SERIALIZABLE) */
    SET aia.some_date = 
            DATEADD(YEAR, 1, aia.some_date),
        aia.some_string = 
            aia.some_string + LEFT(aia.some_string, 1)
    OUTPUT 
        Deleted.id, Deleted.some_date, Deleted.some_string
    INTO
        dbo.as_i_was (id, some_date, some_string)
FROM dbo.as_i_am AS aia
WHERE aia.id IN
      (
          SELECT TOP (1000) 
              aia.id
          FROM dbo.as_i_am AS aia /* You could also do this WITH (SERIALIZABLE) */
          WHERE aia.id >= @LargestKeyProcessed
          AND   aia.id <= @NextBatchMax
          ORDER BY
              aia.id ASC                
      );
COMMIT TRANSACTION;

If you’re deleting a small number of rows, and you have appropriate indexes in place to support your queries finding the data they need to modify, this isn’t painful.

Just be wary of:

Triggers
Foreign keys
Indexed views

Which can make life more interesting than most people care to deal with.

Now I know, a couple posts ago I told you that there’s very little benefit to error/transaction handling if you’re not going to do anything with them — and that I’ve left that out of this pseudo code — you’ll just have to forgive me my transgression here.

I had to save something for tomorrow.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: Debugging

Posted on December 11, 2023December 6, 2023 by Erik Darling

Multitudes

Debugging, like error handling, is a design choice that is fairly easy to make at the outset of writing a stored procedure, and is usually a lot easier to get in there if you do it from the get-go.

The number of times I’ve had to go back and add debugging into something only to introduce debugging bugs is actually a bit tragic.

One of the first things I added debugging to is sp_Blitz, because sometimes queries would fail, and we’d have no idea which one was failing.

Sure, there would be an error message, and a line number, but that all goes pear shaped when you’re dealing with dynamic SQL and nested dynamic SQL and whatnot.

My bright idea was to add lines like this to know which check was starting:

IF @Debug IN (1, 2) RAISERROR('Running CheckId [%d].', 0, 1, 185) WITH NOWAIT;

They’ve changed a little over the years, but it took a few tries to get all the substitution numbers right, and to get them in the right place so they didn’t prevent checks from running.

But now if you @Debug, everything is cool. And if a check fails, you can figure out which one and be a top notch open source contributor by reporting (and even fixing) issues.

Semiglutide

I blogged fairly recently about how and why you should do this. I even recorded a video about it:

Where I’m making a really pleasant face in the thumbnail, apparently.

You should check those out if you haven’t already.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: Error Handling

Posted on December 8, 2023December 8, 2023 by Erik Darling

To Be Or Not To Be

Okay, look, the best post about this is, of course, by Erland Sommarskog: Error and Transaction Handling in SQL Server

Just like Erland, it has three parts and three appendices. If you want to learn how to do a whole lot of things right, give yourself a couple days to read through and digest all that.

What I’m here to talk about is some of the stuff you should think about before implementing any error handling at all.

Namely, what do you hope to do with certain errors and conditions?

As far as I’m concerned, there are four outcomes:

Print an informational message and keep chugging along
Raise and log the error somewhere and stop everything (maybe rollback, maybe don’t)
Raise and log the error somewhere and keep going
Use the type of error to retry the thing you’re trying to do

If you’re not going to do one of those things, there’s no point in even considering error handling (and to a large extent, transactions) in your procedures.

Let’s talk through some examples of each one.

Is This Right?

Here’s a simple example: You have a stored procedure that accepts start and end date parameters.

What do you want to do if someone passes in a start date that’s after the end date?

Throw an error and call someone a bonehead
Be nice and swap them around
Be psychic and change them to dates that they probably meant
Add or subtract time from one or both of the parameters

But let me ask you: if this is coming from a a front end application, why would you wait until the query gets to SQL Server to figure out that the start date is after the end date?

The application should tell you that. The application is capable of telling you that. The application server is a lot cheaper than the SQL Server.

You don’t need error handling in SQL Server for that.

Is This Permanent?

There may be times when it’s sensible to halt a stored procedure when it hits an error.

Let’s say a critical step is to populate a temporary object. If that step fails, all further effort is wasted because no results will be returned.

You could also have a check to look at the ROWCOUNT_BIG() after the insert to have it bail if zero rows end up in the temporary object, too. But that’s not an error, really, that’s just pure human kindness.

Other times, like if your query is just returning a bunch of selects (or if a minor snag somewhere only has a minor impact on the overall results), an error in one place might not be enough of an issue to stop everything.

If you want everything to stop and rollback, use SET XACT_ABORT ON; at the beginning of your stored procedure.
If you want to accept small errors as long as you get the larger result set, don’t do anything.
If you want to log and swallow transient errors, use an empty TRY/CATCH block around queries that may not matter to the outcome.

There are situations where all of those things make sense. Make sure you choose the right one for your arrangement.

A good example of this is in sp_BlitzLock. Sometimes part of the script will fail to get Agent job names depending on permissions and stuff, but that doesn’t impact the overall analysis much.

Try And Retry

There are plenty of good uses for retrying things. Like the start/end date example above, this is usually more wisely done in application code.

But if your stored procedure isn’t being called by an application, doing it in T-SQL isn’t a horrible alternative.

In my life, I’ve used retry logic in stored procedures to:

Catch deadlock errors and attempt the modification again
Wait for a lock timeout and try again
Check in sp_getapplock locks and try again
Look for available records in a queue and try again

I’m sure there are more, but these are some simple examples where writing retry logic into T-SQL was the best available option.

You Don’t Always Need Error Handling

Some processes require a certain amount of robustness in terms of a few things:

Error handling and logging
Retrying certain types of transactions
Self-healing in the event of certain failures

I didn’t talk much about the self-healing part, because it’s pretty complicated depending on what you’re doing.

But as a small preview, let’s say you’re building a process to move data from one big table to a more normalized schema.

You have to anticipate failures, and be able to pick back up from any point when the process is re-executed.

Exciting times we live in, no?

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: Formatting Code

Posted on December 7, 2023December 7, 2023 by Erik Darling

Spaces

First, a list of things that are horrible:

Tabs
Leading commas
All lowercase
All uppercase
Not putting new things on new lines (from join on where and or order by group by)
Not indenting things
Leaving AS out of table aliases
Leaving out column aliases
Top expressions without parentheses
Not aligning parentheses

There are others, but I’m tired of thinking about things that are demonstrably wrong to do, like putting ice in scotch or sugar in wine.

Thank you for complying.

Not All Code

When you think about formatting code, you probably think primarily of organizing a single query so that it’s inoffensive to civilized society.

But you should also apply this to your code as a whole, too. People who use words wrong will call this “holistic”.

Create all your temporary objects in the same section of code
Declare all your variables in the same section of code
Set as many of your variables as you can in the same block of code
Do as much of your validation in the same section of code as you can

Few things (aside from the list at the beginning of the post) are more grating than seeing random things being created and declared at random points in the code.

It makes things far more difficult to scope, follow, and understand.

I’ll lend some leniency to using SELECT INTO here, because sometimes that’s the best way to get a fully parallel insert.

I’ll also lend some leniency to cursors, because sometimes it’s only sensible to declare them when they’ll be used.

Consistency Is Key

Formatting should follow a consistent set of (my) rules. It is sometimes possible to get used to inferior style choices as long as long as they’re predictable.

What’s impossible to get used to is code formatting that’s constantly changing gears.

There are many ways to format code out there in the world. Some are more acceptable than others.

For example, some companies may shy away from online code formatting tools because who knows who might see the code.

So much valuable intellectual property exists in your “joins” and “column names” and such. Someone else may steal the bright idea to also key on “identifiers”.

I personally use SQL Prompt, because it gets me about 80% of the way to how I want code to look.

While I do wish there were more ways to implement single formatting rules, like changing how columns are aliased from the shabby “column AS new_name” to the correct “new_name = column” format, it’s often trivial to tweak certain things after applying everything.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: How To Leave Good Comments

Posted on December 6, 2023December 5, 2023 by Erik Darling

Attn Pls.

Most code I look at has been a disaster in the making for many years, worked on by many people who seemed to find one thing that sort of worked, and kept doing it.

I can’t lie: there are areas of my life where I do the same thing. SQL Server is the one area where I confidently know the rules, when I can break them, and explain reasons for both.

It’s not that I don’t make mistakes, it’s just that I learn well from them.

Usually.

There are others things that I’ll do the same way for ages, thinking they’re needlessly complicated, before looking into less needlessly complicated ways of doing them.

What’s shocking is that sometimes there’s no better way of doing things.

Like xquery. What were they thinking?

Find Yourself

Possibly the least helpful, but most humorous, way of leaving comments, is a large block of green text up at the top of a module.

There are all sorts of helpful insights buried in those comments to help me as a consultant understand my audience.

But… even though a lot of changes are described, there are often dozens to hundreds of queries in those stored procedures and function, or the view definition has 50 joins that join to 50 views that have 50 joins in them.

Fixed performance??? Avoided parameter sniffing??? Needed DISTINCT to fix a bug???

WHERE? WHICH QUERY?

SHOW ME TO YOUR MAGICK!

If you use a ticketing system, and there’s an issue number in your comment way up at the top of your stored procedure, add the ticket number in where you changed the code.

If there’s not, make one up. Give me some way to figure out where a change was made.

The only thing more frustrating is seeing some chunk of code quoted out, with no explanation of why it was quoted out.

Explain Yourself

In my stored procedures, I like to quote any code blocks as I go so they’re easier for me to find when I scroll through things.

Not because I can’t read my own code (okay, sometimes I get a lil’ lost), but because there’s often a lot of repetitive code, and it’s usually easier to read plain-English green text when I’m looking for something.

Here’s an example from sp_LogHunter:

This code isn’t hard to understand, but it’s a tiny chunk of code, and pretty easy to miss if I were looking for it.

Leaving little notes about even small sections of code can make life a lot easier for others trying to work for it.

Block It Off

While we’re on the subject, I must beg this of you: USE BLOCK QUOTES

Like I talk about in the post, using the -- method can screw a lot of things up.

Depending on where you get the query from, and where the query originates, the whole query might end up on one line.

Having a bunch of double-dash comments makes it really hard to distinguish what’s code, what’s comment, and what’s quoted out code.

Worse, it breaks most every automatic code formatting tool out there.

They can’t figure it out either.

Tidbits

Comments in code are good for all sorts of things. Comments on the internet generally are mistakes and should be ignored.

Leaving comments not only helps others, but can really help you.

Think about how many times you’ve written code with a bunch of BEGIN/END blocks, and you get totally lost in which begin goes with which end and how many ends you need to stop getting an error at the very end of the stored procedure.

Heck, even while writing this post I went through a couple of my bits of code to add better comments in.

Thanks for reading!

Going Further

The Art Of The SQL Server Stored Procedure: ANSI/SET Options

Posted on December 5, 2023December 7, 2023 by Erik Darling

Top Down

I start off all of my stored procedures with the following block:

SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
SET IMPLICIT_TRANSACTIONS OFF;
SET STATISTICS TIME, IO OFF;
GO

This aligns my stored procedures with the necessary settings to accomplish a couple things:

Allow the optimizer to use indexed views, filtered indexes, and computed columns
Avoid errors when modifying tables involved with indexed views, filtered indexes, and computed columns

Many ORMs mess up with ARITHABORT, and let’s be honest, most of them time ORMs aren’t calling stored procedures.

If you’ve spent some time right-clicking around in SSMS at all, you might have seen how missing index requests and various module templates are generated.

They all start off with enabling ANSI_NULLS and QUOTED_IDENTIFIER, but there are some others that it makes total sense to include.

That’s why I have that list up there.

AS BEGIN

Inside of your stored procedures, you may want to do some other stuff, depending on how you’ll use it.

Inside of a real, production-ready stored procedure, you’ll probably want to do this:

SET NOCOUNT, XACT_ABORT ON;

If your stored procedure are being called by a Java or Python app, you’ll probably want this too:

SET IMPLICIT_TRANSACTIONS OFF;

Because by default, they both use implicit transactions, and you probably don’t want those happening across all the queries in a stored procefure.

That can cause a whole hell of a lot of blocking.

NOCOUNT is of course a good idea to not send additional results back.

XACT_ABORT is a great idea for stored procedures with multiple queries that modify data, because it allows you to roll all of the changes they made back without a TRY/CATCH block surrounding the whole ordeal.

If your query is just reading a bunch of data from different places and there’s nothing to roll back, XACT_ABORT matters not a bit.

Otherwise, use it, because it is wise to do so.

Locally Owned

If you’re writing a stored procedure (like most of my troubleshooting procedures) where you don’t want to:

Fail completely if you hit one error
Get blocked by all the transient stuff that locks system views and functions
Accidentally get query plans for your troubleshooting procedures

You’ll probably want to do something like this:

SET STATISTICS XML OFF;
SET NOCOUNT ON;
SET XACT_ABORT OFF;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

That’s what I do inside all of my admin procedures to avoid those things.

Of course, you should not use some of those if you care about:

Easily troubleshooting query performance (getting actual execution plans in SSMS)
Leaving some changes in place when another part of the procedure fails (XACT_ABORT)
Not returning data from potentially dirty reads (READ UNCOMMITTED)

SET NOCOUNT on is still a generally wise idea, regardless of the intent or application of the code.

Deal with it.

Exercise

I’ve been bitten by a lot of different weird things by not doing this stuff in the past.

At one point I had suggested a really helpful filtered index to a client that used Cold Fusion as a front end, which used some weird driver (Adobe, maybe?) that used a bunch of bad ANSI settings.

Implementing the index immediately caused a bunch of queries coming from there to start failing. It was an easy fix — rebuild the index without the filter — tada!

But the better fix was to move the queries to a stored procedure that requested the correct SET options, and re-filter the index.

Consulting life teaches you a lot of weird lessons, probably the first is to be paranoid of everything, and never trust that even the largest companies are doing anything correctly.

In tomorrow’s post, we’ll talk about one of my favorite subjects: helpful and professional (ha ha ha) code comments.

Thanks for reading!