Building Reusable Queues In SQL Server Part 2

The Tenant


In part 1, we designed a table for our process to work off of. Now we’re going to look at how we can write a procedure that works off of the table effectively.

For the table, our main priorities were indexing to make sure we can find work easily, and not allowing duplicate items.

For our worker procedure, the main goals are going to be

  • Looping until we run out of work
  • Finding and reserving work with minimal locking
  • Making that process atomic

Thankfully, this is a lot easier than it sounds. The table design does a lot of the work for us.

Queue Ball


I’m going to step through the code, and then show you the fully assembled babby at the end.

First up, making a loop happen:

WHILE EXISTS
(
  SELECT 1/0
  FROM dbo.four_queue AS fq WITH(READPAST)
  WHERE fq.in_process = 0
)

While any rows that aren’t in process exist in the table, we want this to keep picking up work. If you need an eternal loop, just use 1 = 1 or something.

We’re using readpast here, because we won’t want to hit any blocking every time we check to see if we need to go back into the loop.

Second up, finding and reserving work

WITH q4 AS
(
    SELECT TOP (1) fq.*
    FROM dbo.four_queue AS fq WITH(READPAST, ROWLOCK, UPDLOCK)
    WHERE fq.in_process = 0
    ORDER BY fq.id
)
UPDATE q4
    SET q4.in_process = 1,
        q4.start_date = SYSDATETIME(),
        @id = q4.id,
        @reputation = q4.reputation
FROM q4;

This is where all the cool stuff is, to be honest.

In our CTE, we isolate a single row. The TOP (1) with an ORDER BY gives us the lowest id that hasn’t been processed yet.

With the SELECT, we’re using some locking hints:

  • Readpast: Means we can skip over locked rows, but not ignore locks like NOLOCK would do
  • Rowlock: We nicely ask the optimizer to only lock a single row
  • Updlock: Even though this is a SELECT, take an UPDATE lock (LCK_M_U)

The UPDATE that occurs outside of the SELECT is pretty nifty, too. I tried to get OUTPUT to work here, but the variable assignment seemed to disagree with it.

A lot of people don’t know that you can assign a value to a variable like this, but anyway.

  • Mark the row as being processed
  • Get the processing start time
  • Assign the table id to a parameter
  • Assign the reputation we want to find to a parameter

Thirdly and thusly

DECLARE @sql NVARCHAR(MAX) = N'
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
WHERE u.Reputation = @i_reputation;
';

RAISERROR(@sql, 0, 1) WITH NOWAIT;
EXEC sys.sp_executesql @sql, N'@i_reputation int', @reputation;

We’ll pass the @reputation to some dynamic SQL and execute it. Like I mentioned before, this is just to keep the example brief. There’s absolutely no need to do this.

A fifth of fifthly

UPDATE fq
    SET fq.end_date = SYSDATETIME()
FROM dbo.four_queue AS fq
WHERE id = @id
AND   in_process = 1;

We’ll use the id parameter to update the end time, so we can find any processes that took a long time and look into them.

Out Of The Way


In real life, I’d probably want some error handling in here, but for a quick blog post example, this is good enough.

Now I can use a tool like SQL Query Stress to throw a bunch of workers at it and see if I run into any blocking or deadlocking.

We know there aren’t any deadlocks, because SQS doesn’t have any errors in it.

CHEERS

All ~20k rows get processed in about 26 seconds.

We can even use sp UNDERSCORE HumanEvents to look at wait stats for the duration.

EXEC sp_HumanEvents @event_type = 'waits', @seconds_sample = 30, @database_name = 'StackOverflow2013';
Humanly

Not too shabby, eh?

Thanks for reading!

CREATE OR ALTER PROCEDURE dbo.queue_four(@id INT, @reputation INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

WHILE EXISTS
(
  SELECT 1/0
  FROM dbo.four_queue AS fq WITH(READPAST)
  WHERE fq.in_process = 0
)
    BEGIN
    
        WITH q4 AS
        (
            SELECT TOP (1) fq.*
            FROM dbo.four_queue AS fq WITH(READPAST, ROWLOCK, UPDLOCK)
            WHERE fq.in_process = 0
            ORDER BY fq.id
        )
        UPDATE q4
            SET q4.in_process = 1,
                q4.start_date = SYSDATETIME(),
                @id = q4.id,
                @reputation = q4.reputation
        FROM q4;
        
        DECLARE @sql NVARCHAR(MAX) = N'
        SELECT COUNT_BIG(*) AS records
        FROM dbo.Users AS u
        WHERE u.Reputation = @i_reputation;
        ';
        
        RAISERROR(@sql, 0, 1) WITH NOWAIT;
        EXEC sys.sp_executesql @sql, N'@i_reputation int', @reputation;
    
        UPDATE fq
            SET fq.end_date = SYSDATETIME()
        FROM dbo.four_queue AS fq
        WHERE id = @id
        AND   in_process = 1;
    
    END;

END;

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Building Reusable Queues In SQL Server Part 1

How Long Must I Stay On This Stuff?


One thing about queues in databases: hey kinda suck. But with a little finesse, there are ways to make them function pretty well.

Before you ask about Service Broker: it’s usually overkill, and unnecessary XML in these scenarios.

When you’re building queues, there are lots of different things to consider.

  • Do I want to reuse it?
  • Can I just dequeue (delete) items?
  • Do I need to process things in any order?
  • Is there a “pending” element to the processing?

An example of a pending queue is in sp_AllNightLog, where there’s a list of databases that worker Agent jobs hit to take log backups repeatedly.

Four Queue


Let’s gin up a simple scenario, and say we wanna stick a list of unique Reputations from the Users table into another table, and then have a whole bunch of queries work off of it to do something.

The something in this case is getting a simple count, which is just to be pithy, and probably a bad use of a queue IRL.

Here’s what my ideal table would look like:

CREATE TABLE dbo.four_queue
(
    id BIGINT IDENTITY NOT NULL,
    in_process BIT NOT NULL DEFAULT 0,
    reputation INT NOT NULL,
    start_date DATETIME NULL DEFAULT GETDATE(),
    end_date DATETIME NULL DEFAULT GETDATE() + 0.00001,
    duration_milliseconds AS (DATEDIFF(MILLISECOND, start_date, end_date)),
    CONSTRAINT big_idea PRIMARY KEY CLUSTERED(in_process, id),
    INDEX fq_uq_r UNIQUE NONCLUSTERED (reputation) WITH(IGNORE_DUP_KEY = ON)
);

A little explanation of the table: this is good for a reusable, ordered queue, that’ll give us some feedback on how long things took. It could also be used if there were a pending element, but we’d probably wanna change the indexing so that we could find either the last start time, or the last end time efficiently.

The thing that probably needs the most explanation here is the indexing and constraints. Because I know you, and you’re staring at the way I have my primary key, and you’re getting an itch and a twitch. That’s okay, you’ve been conditioned for years to put the most selective column first. Unfortunately, that won’t work here.

It’ll make a little more sense in part 2 when we take a closer look at the queries, but the gist of it is that when we’re searching for rows to process (in_process = 0), we want to be able to find them quickly, which is harder to do when it’s the second column in the index and we’re not searching on the id column.

We’ve also got a fun constraint on there to make sure we only have unique values for Reputation, but we’re just going to silently throw away duplicates without throwing an error. This is there just in case someone tries to be sneaky and insert a row on their own to thwart our Perfect Queue Process™

Data Dumping


The cool part about the way the table is set up is that we don’t need to do anything special to get data into it.

INSERT dbo.four_queue ( reputation )
SELECT u.Reputation
FROM dbo.Users AS u;
Insertion Technique

Fair warning here, IGNORE_DUP_KEY does fun stuff to our identity column:

Sequential is inconsequential.

Since the identity doesn’t have any meaning beyond the order that rows were inserted in, it doesn’t matter that it’s not sequential. Order is order, even with gaps.

If you need it to be sequential for some in(s)ane reason, don’t use an identity and use ROW_NUMBER instead.

Totally Tabular


We’ve got a table, and we’ve got data in it.

In Part 2, we’ll look at designing a procedure to work off the queue.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

The Best Worst Code That Never Was

We All Hear What We Want To Hear


When Kendra Little (b|t) announced the #BESTWORSTCODE contest, I thought we’d be showing off bad and quirky code that we’d seen or written.

So I went digging through my personal archives, did a little updating, and proudly came up with the below examples.

It turns out that wasn’t at all what was going to happen! Kendra, of course, had much better ideas. And now I have the pubs database probably forever.

Anyway, presented without comment is what I was going to submit.

Thanks for reading!

/*We don't really have to name anything*/
DECLARE @ TABLE(_ INT DEFAULT 0);
INSERT @ (_)
DEFAULT VALUES;

/*Do we ever divide?*/
SELECT ROW_NUMBER() OVER(ORDER BY 1/0) AS [1/0]
FROM @ #@
WHERE EXISTS
(
    SELECT 1/0 AS [1/0]
    FROM @ AS ##@
    WHERE #@._ = ##@._
);


/* Highly illogical. ??? */
IF (1 + 2 = 4)
    BEGIN
        DECLARE @ TINYINT;
    END;
SELECT @ = 1;
SELECT @ AS [?];
GO

IF (2 + 1 = 4)
    BEGIN
        DECLARE @ TINYINT = 1/0;
    END;
SELECT @ AS [?];
GO


/*Conventional Naming.*/
DROP TABLE IF EXISTS #;

CREATE TABLE # (int INT, INDEX xml CLUSTERED (int));
INSERT # ( int )
SELECT varbinary.sysname
FROM 
(
    SELECT TOP (1000 * 1000) 1e0 AS sysname
    FROM sys.messages AS smallmoney
    CROSS JOIN sys.messages AS money
) AS varbinary;


/*This is fine.*/
SELECT TOP (2-1) sql_variant.int
FROM # AS sql_variant;

/*This is also fine.*/
ALTER TABLE # ADD real AS int / 0e0;

/*How many errors should I get?*/
SELECT SUM(ROWCOUNT_BIG()) AS float
FROM # AS sql_variant
GROUP BY ();


/*Oh, now you care.*/
CREATE INDEX columnstore ON #(real);

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Giving Back: Free And Discounted Online Training

Launchtime


Just about a month ago, I launched my online training.

It had been in the works for a while, as I was building up material I was really happy with, alongside working with my smart, wonderful, good-looking clients.

And, you know, those family people. I can hear them once in a while.

It has been a spectacular month, especially considering all that’s going on. As of this writing (at just before the 30 day mark), I’ve given away nearly $40k in training to those in need.

That number doesn’t include other sales — it’s just the giveaways — and for some reason that report cost me a $150.

Free like bears

It’s been incredibly gratifying helping everyone going through a tough time right now. It’s something I plan to keep doing, though hopefully not because of these pesky global pandemic things, that have ruined more brunches than cheap prosecco and broken heels.

After that, you’ll have to wait for my Spooky Samhain sale.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server 2019 Disappointment: sp_estimate_data_compression_savings

Missed It The First Time


This is documented, but I know a lot of people are allergic to documentation, whether it’s writing or reading.

In SQL Server 2019:

  • Exciting stuff: In memory tempdb!
  • Exciting stuff: sp_estimate_data_compression_savings can evaluate columnstore compression!
  • Disappointing stuff: If you use in memory tempdb, you can’t have any columnstore anything in tempdb

That means if you’re using sneaky tricks like clustered columnstore indexes on temp tables to induce batch mode, you’re gonna get a lot of errors.

Likewise, you won’t be able to evaluate if columnstore will help your tables.

EXEC sp_estimate_data_compression_savings 
     @schema_name = 'dbo',
     @object_name = 'Badges',
     @index_id = 1,
     @partition_number = 1,
     @data_compression = 'COLUMNSTORE';

Msg 11442, Level 16, State 1, Line 4
Columnstore index creation is not supported in tempdb when memory-optimized metadata mode is enabled.

There’s no workaround for this, either. You can’t tell it to use a different database, this is just the way it’s built.

Hopefully in the future, there will be more cooperation between these two features.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Constraints And Foreign Keys Can Hurt Data Loading Performance In SQL Server

Oh So Helpful


I got a question from a past client recently about how a new data loading tool was side-stepping constraints. After taking a quick look at what was running, it was using BULK INSERT statements to do the load.

There’s nothing wrong that that! I’ve used it quite a bit, and found it very handy to efficiently load data from files. There really is a ton you can do with it.

But with flexibility comes, well, happy accidents.

Check Me Out


To show you what I mean, I’m going to use some quick and dirty examples. This isn’t meant to show you a perfect export process by any means.

I’m going to use BCP to output a small text file, and load it into a table in another database.

EXEC xp_cmdshell 'bcp "SELECT TOP (1000) * FROM StackOverflow2013.dbo.Badges ORDER BY Id;" queryout "c:\temp\Badges.txt" -w -T -S NADABRUTO\SQL2019';

Queue Ball


Let’s look at the table for a minute. There’s not a lot going on with it, but there is a check constraint:

ALTER TABLE Crap.dbo.Badges 
    ADD CONSTRAINT ck_bootypie 
        CHECK(LOWER(Name) NOT IN (N'beavis', N'butthead'));

Yes, it’s a bit of nonsense, but it doesn’t need to be realistic to show you the side effect.

It just has to entertain me for the duration of writing this post while stone cold sober.

Mostly.

See, right now, it’s trusted. Or not not trusted.

SELECT cc.name,
       SCHEMA_NAME(cc.schema_id) AS schema_name,
       OBJECT_NAME(cc.parent_object_id) AS object_name,
       cc.type_desc,
       cc.is_not_trusted
FROM sys.check_constraints AS cc
So solo

Inserts From A Different Room


Let’s take a regular old insert. If we do this, our constraint will still be trusted.

INSERT Crap.dbo.Badges ( Name, UserId, Date )
SELECT TOP (1000) Name, UserId, Date 
FROM StackOverflow2013.dbo.Badges ORDER BY Id;

You’ll have to trust me on that. I mean, would anyone really know if I reused the same picture of the output?

Now here’s an insert with BULK INSERT.

BULK INSERT dbo.Badges
FROM 'c:\temp\Badges.txt'
WITH
    ( DATAFILETYPE = 'widechar',
      BATCHSIZE = 1048576,
      CODEPAGE = 'RAW',
      FIRSTROW = 1,
      FIELDTERMINATOR = '\t',
      ROWTERMINATOR = '\n',
      KEEPNULLS,
      MAXERRORS = 2147483647,
      CHECK_CONSTRAINTS, /*Quote me in, Quote me out*/
      TABLOCK );

If you notice the second to last option there, CHECK_CONSTRAINTS? That’s what SQL Server uses to figure out if it should evaluate constraints when the file imports.

This goes for both foreign keys and check constraints. It does not go for primary key or unique constraints. Those will still throw errors if violated.

If you run your BULK INSERT commands without that included, they’ll be marked as untrusted after you insert data.

But now it comes time to choose your own adventure.

The Optimizing Game


Now that we know the behavior, we can examine our choices.

Option 1: Evaluate Constraints On Load

  • This can slow down data loads
  • May cause additional blocking
  • You won’t have any work to do later
  • But… your loads may fail.
  • Then you’ll have to go look in text files to figure out why

Option 2: Load it all, deal with issues later

  • Data loads will be quicker
  • You’ll likely see less blocking
  • But you’ll have to try to re-trust your check constraints afterwards
  • If you’ve got large tables, or no good indexes to support constraints, this can be awful
  • Trusting constraints may fail, but figuring out which rows are broken is probably easier

If we want to re-trust our constraint, we’ll need to run a command like this:

ALTER TABLE Crap.dbo.Badges 
    WITH CHECK CHECK CONSTRAINT ck_bootypie;

If you wanted to build dynamic commands to re-trust them across a bunch of tables, you’d need something like this:

SELECT N'ALTER TABLE ' 
       + QUOTENAME(s.name) 
       + N'.' 
       + QUOTENAME(o.name) 
       + N' WITH CHECK CHECK CONSTRAINT '
       + QUOTENAME(f.name) AS utrusted_fk
FROM sys.foreign_keys AS f
INNER JOIN sys.objects AS o
    ON f.parent_object_id = o.object_id
INNER JOIN sys.schemas AS s
    ON o.schema_id = s.schema_id
WHERE f.is_not_trusted = 1;


SELECT N'ALTER TABLE ' 
       + QUOTENAME(s.name) 
       + N'.' 
       + QUOTENAME(o.name) 
       + N' WITH CHECK CHECK CONSTRAINT '
       + QUOTENAME(c.name) AS utrusted_ck
FROM sys.check_constraints AS c
INNER JOIN sys.objects AS o
    ON c.parent_object_id = o.object_id
INNER JOIN sys.schemas AS s
    ON o.schema_id = s.schema_id
WHERE c.is_not_trusted = 1;

Go ahead and give those a run. You might be surprised what you find.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Error Handling In SQL Server Cursors

Spit My Last Error


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

If you wanna play with the code I used to try things out on your own, here it is:

CREATE TABLE zero_ten (id INT NOT NULL PRIMARY KEY);

INSERT dbo.zero_ten ( id )
SELECT 0 AS id
UNION ALL 
SELECT TOP (10) ROW_NUMBER() OVER(ORDER BY 1/0)
FROM sys.messages AS m

SELECT * FROM dbo.zero_ten AS zt;
RETURN;

DECLARE c CURSOR LOCAL STATIC FOR
SELECT N'DECLARE @atmosphere INT; SELECT @atmosphere = 1/' + RTRIM(zt.id)
FROM dbo.zero_ten AS zt;

DECLARE @joydivision NVARCHAR(MAX) = N'';
DECLARE @errors TABLE
(
    id INT PRIMARY KEY IDENTITY,
    error_number INT,
    error_severity INT,
    error_state INT,
    error_message NVARCHAR(MAX)
);

OPEN c;
FETCH NEXT FROM c
INTO @joydivision;

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        RAISERROR(@joydivision, 0, 1) WITH NOWAIT;
        EXEC sys.sp_executesql @joydivision;
        FETCH NEXT FROM c INTO @joydivision;
    END TRY
    BEGIN CATCH
        INSERT @errors ( error_number, error_severity, error_state, error_message )
        SELECT ERROR_NUMBER(), ERROR_SEVERITY(), ERROR_STATE(), ERROR_MESSAGE();
        --FETCH NEXT FROM c INTO @joydivision;
    END CATCH;
END;

SELECT *
FROM @errors AS e;

CLOSE c;
DEALLOCATE c;
GO

 

Extracting Only Letters Or Numbers From A String In SQL Server

All The Bad


There’s a lot of terrible advice out there about how to do this. It’s terrible because it involves the kinds of functions that really hurt performance: the scalar and multi-statement variety.

Worse, they’re usually full of while loops that build strings. These kinds of scalar functions aren’t currently eligible for inlining in 2019 either, so don’t think you’re going to get helped there, because they build strings.

SELECT with variable accumulation/aggregation (for example, SELECT @val += col1 FROM table1) is not supported for inlining.

Ain’t Perfect


I don’t think my solutions are perfect. Heck, doing this with T-SQL at all is a bad idea. You should be using CLR for this, but CLR has had so little support or betterment over the years, I don’t blame you for not embracing it. My dear friend Josh has taken the liberty of doing this part for you.

It would be nice if SQL Server had the kind of native support for writing in other languages that free databases do (especially since SQL Server supports Python, R, and Java now). But you know, we really needed uh… Well, just pick any dead-end feature that’s been added since 2005 or so.

My solutions use a numbers table. You’re free to try replacing that aspect of them with an inlined version like Jeff Moden uses in his string splitter, but I found the numbers table approach faster. Granted, it’s also less portable, but that’s a trade-off I’m willing to make.

What I don’t like about either solution is that I have to re-assemble the string using XML PATH. If you’ve got another way to do that, I’m all ears. I know 2017 has STRING_AGG, but that didn’t turn out much better, and it wouldn’t be usable in other supported versions.

Both scripts are hosted on my GitHub repo. I don’t want to set the example of using a blog post as version control.

Getting Numbers


If you need to isolate only the numbers from a string:

SELECT u.DisplayName, gn.*
FROM dbo.Users AS u
    CROSS APPLY dbo.get_numbers(u.DisplayName) AS gn
WHERE u.Reputation = 11;

Getting Letters


If you need to get just the letters from a string:

SELECT u.DisplayName, gl.*
FROM dbo.Users AS u
    CROSS APPLY dbo.get_letters(u.DisplayName) AS gl
WHERE u.Reputation = 11;

Complaint Department


If you’ve got ideas, bugs, or anything else, please let me know on GitHub. I realize that both scripts have holes in them, but you may find them good enough to get you where you’re going.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

The Best SQL Server Training On The Internet

Binoculars


Look, there’s a lot of training out there. If it’s not expensive, it’s outdated, and that’s not fair to you.

Getting a quality, up to date education in SQL Server performance tuning should be affordable, even when there’s not a global pandemic.

That’s what I’m looking to provide to you, dear reader.

About You


You’re a DBA or Developer who needs to buckle down and focus on performance tuning SQL Server, and you need a clear path to gain understanding and expertise. Maybe you need to learn about something that you’re seeing in a query you’re trying to tune. It could be parameter sniffing; it could be a key lookup — doesn’t matter.

You’re busy, and you don’t have hours or days to watch long courses, hoping what you need to learn is in there. Over here, you can zoom to what you care about, and spend focused time learning about it. No distractions, no fast forwarding, no trying to remember where the solution to your problem was.

Whether you need to know how to fix a problem quickly, or you want to deeply understand it and gain knowledge you can use for the rest of your career, my training library has got you covered.

Best of all, learning doesn’t stop when the video ends. I’m here to help you keep learning.

About Me


I’m Erik, and all I focus on is SQL Server performance tuning.

Queries, indexes, and hardware are my breakfast, lunch, and dinner. Oxford commas are light snacks.

Right now, I’ve got over 24 hours of training videos dedicated to those subjects available at prices that can’t be beat.

Unless you have a coupon!

And you do! Wow! What a sales pitch. Four seams on that one.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Are Table Variables Different In SQL Server 2019?

WallaWallaBingBang


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.