Improving The Performance Of Modifications In SQL Server

Pssst!

If you landed here from Brent’s weekly links, use this link to get my training for 90% off.

The access is for life, but this coupon code isn’t! Get it while it lasts.

Discount applies at checkout, and you have to purchase everything for it to apply.

Obviously

There are going to be situations where it’s smarter to change different aspects of code like this:

But I know how it is out there! Sometimes it’s hard to get in and change a bunch of logic and tinker with things.

In some cases, you can improve performance by wrapping chunks of code in transactions.

Fear Of Commitment

In this example, there’s an automatic commit every time the update completes. That means every time we step through the loop, we send a record to the transaction log.

This can result in very chatty behavior, which even good storage can have a tough time with. There are likely other aspects of transaction logging impacted by this, but I only have so much time before this call starts.

SET NOCOUNT ON;

DECLARE 
    @cur_user int = 0,
    @max_user int = 0;

SELECT 
    @cur_user = MIN(u.Id), 
    @max_user = MAX(u.Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL;

WHILE @cur_user <= @max_user
BEGIN

    UPDATE u
        SET u.Age = DATEDIFF(YEAR, u.CreationDate, u.LastAccessDate)
    FROM dbo.Users AS u
    WHERE u.Id = @cur_user
    AND   u.Age IS NULL;

    SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user);

END;

This code runs for nearly 5 minutes before completing. Looking at a ~60 second sample turns up some gnarly gnumbers.

Batched Commit

Without changing the logic of the update, we can get things in better shape by using transactions and periodically committing them.

SET NOCOUNT ON;

DECLARE 
    @rows bigint = 0,
    @cur_user int = 0,
    @max_user int = 0;

SELECT 
    @cur_user = MIN(u.Id), 
    @max_user = MAX(u.Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL;

BEGIN TRANSACTION;

WHILE @cur_user <= @max_user
BEGIN

    UPDATE u
        SET u.Age = DATEDIFF(YEAR, u.CreationDate, u.LastAccessDate)
    FROM dbo.Users AS u
    WHERE u.Id = @cur_user
    AND   u.Age IS NULL;

    IF @rows = (@rows + @@ROWCOUNT)
    BEGIN
        COMMIT TRANSACTION;
        RETURN;
    END;
    ELSE
    BEGIN
        SET @rows = (@rows + @@ROWCOUNT);
        SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user AND u.Age IS NULL);
    END;

    IF @rows >= 50000
    BEGIN
        RAISERROR('Restarting', 0, 1) WITH NOWAIT;
        SET @rows = 0;
        COMMIT TRANSACTION;
        BEGIN TRANSACTION;
    END;

END;

IF @@TRANCOUNT > 0
COMMIT

The first thing we’ll notice is that the code finishes in about 1 minute rather than 5 minutes.

How nice! I love when things move along. The metrics look a bit better, too.

We have almost no waits on WRITELOG, and we write far less to the transaction log (35MB vs 13MB).

We also got to do some snazzy stuff with @@ROWCOUNT. Good job, us.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

8 thoughts on “Improving The Performance Of Modifications In SQL Server”

Nice article (but I hope I never have to work with stuff that I can’t rewrite to real batches)

I had to think about two minutes about the sense of
IF @rows = (@rows + @@ROWCOUNT)
Why do you not just write
IF @@ROWCOUNT = 0
to make it absolute clear for everybody – or miss I a point?

And in the “FETCH NEXT-Statement you ar missing the check for Age IS NULL (without this it would quit the loop after the first already aged user :-))
SET @cur_user = (SELECT MIN(u.Id) FROM dbo.Users AS u WHERE u.Id > @cur_user AND u.Age IS NULL);

Erik Darling says:

February 22, 2021 at 12:49 pm

Good catch on that! Fixed!

Doesn’t the test of the value of @@ROWCOUNT with the IF end up resetting the value of @@ROWCOUNT?
i.e. the call “SET @rows = (@rows + @@ROWCOUNT);” then does nothing.
I’d normally put it (and @@ERROR too) into e.g. @local_rowcount if I needed to evaluate/use it multiple times.

Erik Darling says:

February 23, 2021 at 9:18 am

The lovely part about posts like this is that you can copy and paste the code out and test it ?
1. Richard says:
  
  February 23, 2021 at 9:41 am
  
  I did 🙂 I added a statement after the one I highlighted:
  PRINT ‘[Rows] = ‘ + CAST(@rows AS varchar(10));
  Plus more raiserror calls after the commits to see which ones fired.
  I created a small table of 10 rows with a limit of 3 per “batch” instead of 50,000.
  The PRINT displayed 0 rows every time and only called the final commit after testing @TRANCOUNT.
  1. Erik Darling says:
    
    February 23, 2021 at 9:47 am
    
    Weird, maybe I messed something up making the code blog-friendly. Running through it over here produced the results that I wanted.
    1. Richard says:
      
      February 23, 2021 at 9:56 am
      
      I know!
      I was trying to work out whether there were any SSMS settings which might have an impact.
      I’m on SQL2016 SP2 CU16 here with SSMS 18.8.
      There seems to be lots of discussion on the net about whether IF resets @@ROWCOUNT and some people convinced that only rowcount-affecting statements update it, other people not so much.

Pingback: Can you switch recovery mode to simple from full in an Always ON cluster setup?

Comments are closed.