An Undocumented Restriction For Parallel Inserts In SQL Server That Can Kill Performance

Insert Card


I’d like to start this post off by thanking my co-blogger Joe Obbish for being lazy and not blogging about this when he first ran into it three years ago.

Now that we’re through with pleasantries, let’s talk turkey.

Over in this post, by Arvind Shyamsundar, which I’m sure Microsoft doesn’t consider official documentation since it lacks a GUID in the URL, there’s a list of… things about parallel inserts.

  • Just as it is with SQL Server 2016, in order to utilize the parallel insert in Azure SQL DB, do ensure that your compatibility level is set to 130. In addition, it is recommended to use a suitable SKU from the Premium service tier to ensure that the I/O and CPU requirements of parallel insert are satisfied.
  • The usage of any scalar UDFs in the SELECT query will prevent the usage of parallelism. While usage of non-inlined UDFs are in general ‘considered harmful’ they end up actually ‘blocking’ usage of this new feature.
  • Presence of triggers on the target table and / or indexed views which reference this table will prevent parallel insert.
  • If the SET ROWCOUNT clause is enabled for the session, then we cannot use parallel insert.
  • If the OUTPUT clause is specified in the INSERT…SELECT statement to return results to the client, then parallel plans are disabled in general, including INSERTs. If the OUTPUT…INTO clause is specified to insert into another table, then parallelism is used for the primary table, and not used for the target of the OUTPUT…INTO clause.
  • Parallel INSERT is used only when inserting into a heap without any additional non-clustered indexes. It is also used when inserting into a Columnstore index.
  • Watch out when IDENTITY or SEQUENCE is present!

In the actual post, some of these points are spread out a bit; I’ve editorially condensed them here. Some of them, like OUTPUT and UDFs, I’ve blogged about a bazillion times over here.

Others may come as a surprise, like, well, the rest of them. Hm.

But there’s something missing from here, too!

Lonesome


Let’s create a #temp table, here.

DROP TABLE IF EXISTS 
    #parallel_insert;

CREATE TABLE
    #parallel_insert
(
    id int NOT NULL
)

Now let’s look at a parallel insert. I’m using an auxiliary Numbers table for this demo because, whatever, it’s my demo.

INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    n.Number
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);

The query plan does exactly what we want it to do, and stays parallel through the insert.

SQL Server Query Plan
heavy bags

Referential


If we drop and re-create the #temp table, and then run this insert instead, that doesn’t happen:

INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    n.Number
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
AND NOT EXISTS
    (
        SELECT
            1/0
        FROM #parallel_insert AS p
        WHERE p.id = n.Number
    )
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);

Note that the NOT EXISTS is against an empty table, and won’t eliminate any rows. The optimizer estimates this correctly, and yet…

SQL Server Query Plan
collect call

The insert is in a fully serial zone. This happens because we reference the table that we’re inserting into in the select portion of the query. It’s no longer just the target for the select to insert into.

Zone Out


If you’re tuning queries like this and hit situations where this limitation kicks in, you may need to use another #temp table to stage rows first.
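Here’s a rough sketch of what that looks like with the demo tables from above. The #stage table is just something I made up for illustration, not part of the original demo:

DROP TABLE IF EXISTS 
    #stage;

/* Stage the filtered rows first. This query can read from #parallel_insert
   freely, because it isn't the insert target here. */
SELECT
    n.Number
INTO #stage
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
WHERE NOT EXISTS
    (
        SELECT
            1/0
        FROM #parallel_insert AS p
        WHERE p.id = n.Number
    );

/* The final insert no longer references its own target in the select,
   so it's eligible to stay parallel. */
INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    s.Number
FROM #stage AS s
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);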

I’m not complaining about this limitation. I can only imagine how difficult it would be to guarantee correct results in these situations.

I do want to point out that the fully parallel insert finishes in around 250ms, and the serial zone insert finishes in 1.4 seconds. This likely isn’t the most damning thing, but in larger examples the difference can be far more profound.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Avoiding Normalization Contributes To Skewed Data In SQL Server Tables

One Who Overflew


In the Stack Overflow database, the biggest (and probably most important) table is Posts.

The Comments table should be truncated hourly. Comments were a mistake.

But the Posts table suffers from a serious design flaw in the public data dump: Questions and Answers are in the same table.

I’ve heard that it’s worse behind the scenes, but I don’t have any additional details on that.

Aspiring Aspirin


This ends up with some weird data distributions. Certain attributes can only ever be “true” for a question or an answer.

For example, only questions can have a non-zero AcceptedAnswerId or AnswerCount. Some questions might have a ClosedDate or a FavoriteCount, too. In the same way, only answers can have a ParentId. The result is some really weird patterns in the data.

SQL Server Query Results
breezy
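If you want to eyeball the skew in your own copy of the database, a quick grouping query does the trick. This is just a sketch, assuming the public data dump schema, where PostTypeId 1 is a question and 2 is an answer:

/* Count how many rows of each post type actually use the
   question-only and answer-only attributes. */
SELECT
    p.PostTypeId,
    total = COUNT_BIG(*),
    has_accepted_answer = SUM(CASE WHEN p.AcceptedAnswerId > 0 THEN 1 ELSE 0 END),
    has_parent = SUM(CASE WHEN p.ParentId > 0 THEN 1 ELSE 0 END)
FROM dbo.Posts AS p
GROUP BY
    p.PostTypeId
ORDER BY
    p.PostTypeId;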

Was it easier at first to design things this way? Probably. But introducing skew like this only makes dealing with parameter sniffing issues worse.

Even though questions and answers are the most common types of Posts, they’re not the only types of Posts. Even if you make people specify a type along with other things they’re searching for, you can end up with some really different query plans.

SQL Server Query Results
4:44

Designer Drugs


When you’re designing tables, try to keep this sort of stuff in mind. It might not be a big deal for small tables, but once you realize your data is getting big, it might be too late to make the change. It’s not just a matter of changes to the database, but the application, too.

Late stage redesigns often lead to the LET’S JUST REWRITE THE WHOLE APPLICATION FROM THE GROUND UP projects that take years and usually never make it.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Changes Coming To SQL Server’s STRING_SPLIT Function: Optional Ordinal Position

Blueprint


In 2016, we got the STRING_SPLIT function. That was nice, because prior implementations had a lot of problems.

But out of the gate, everyone looked at what we got and couldn’t figure out why this would drop without a column to tell you the position of each element in the string.

Recently I updated SSMS to 18.10, and went to work on a couple scripts that use STRING_SPLIT.

I was immediately confronted by a bunch of RED SQUIGGLY LINES.

Why?

Not Yet But Soon


Huh.

STRING_SPLIT
reminder

Insufficient! You’re insufficient!

STRING_SPLIT
stones

Oh, enable_ordinal. Neat.

STRING_SPLIT
selected

At least it’s a bigint.

STRING_SPLIT
get it
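If you’d rather not squint at screenshots, here’s roughly what the call shape looks like. Hedged, of course, because at the time of writing this only works where Microsoft has shipped it (Azure SQL DB), not in any boxed version of SQL Server:

/* The optional third argument turns on the ordinal column,
   which tells you each element's position in the string. */
SELECT
    ss.value,
    ss.ordinal
FROM STRING_SPLIT('bread,milk,eggs', ',', 1) AS ss
ORDER BY
    ss.ordinal;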

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Azure Managed Instance Doesn’t Manage tempdb Well

Update!


You can now configure size and autogrowth settings for tempdb, but the article doesn’t mention anything about in-memory settings.

Twelve Angry Files


This post is admittedly short, but since we’re short of First Party Solutions™ to gripe to Microsoft with… I need to save this stuff somewhere.

  1. You get 12 tempdb files, even if your instance has 16, 24, 32, 40, 64, or 80 cores
  2. There’s no way to turn on in-memory tempdb
death of auto tune
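If you want to verify what you’re getting on an instance, here are a couple of quick checks. These are my own illustrations, not anything official; the SERVERPROPERTY call returns NULL on versions that don’t know about memory-optimized tempdb metadata:

/* How many tempdb data files did we actually get? */
SELECT
    tempdb_data_files = COUNT_BIG(*)
FROM tempdb.sys.database_files AS df
WHERE df.type_desc = 'ROWS';

/* Is memory-optimized tempdb metadata turned on? */
SELECT
    in_memory_tempdb = 
        SERVERPROPERTY('IsTempdbMetadataMemoryOptimized');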

If your workload is tempdb-heavy, and you have a special configuration for it on-prem, you might have a tough time moving it to Managed Instance. Also consider that disk performance (even though tempdb is “local”) is garbage, and you can’t do anything about it.

I think the lack of in-memory tempdb is odd too, since many features are cloud-first. One may wonder aloud why a performance feature that’s a couple years old now still isn’t available in flagship business critical cloud offerings.

It was only recently (September 2021) that scalar UDF inlining became available in Azure SQL DB and Managed Instance.

Who’s really on the cutting edge? All those stodgy holdouts hanging about in their earthly garb.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Inline Table Valued Functions And Catch All Queries In SQL Server

This Is Not A Solution


Right off the bat, I want you to know that this is not a solution, and I’ll get to why in a minute. I’m writing this mainly because every once in a while I’ll try something different to get this working, and it always ends up disappointing.

I wish I had better news for you, here. Hell, I wish I had better news for me here. But alas we’re at the mercy of parameters.

And yeah, I know, recompile, recompile, recompile. All the live long day. But I’ve seen some weird stuff happen with that too under high concurrency.

So what’s the point? Let’s talk about that.

Dot Dot Dot


CREATE INDEX p1 ON dbo.Posts(OwnerUserId, CreationDate);

CREATE INDEX p2 ON dbo.Posts(Score, LastActivityDate);

We need some indexes. That’s a fact. I’m intentionally creating them in this way to show you that SQL Server can sometimes be smart about catch all queries.

And here’s the inline table valued function we’ll be working with:

CREATE OR ALTER FUNCTION
    dbo.kitchen_sink
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
RETURNS table
AS
RETURN
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE 
        (p.OwnerUserId = @OwnerUserId OR @OwnerUserId IS NULL)
    AND (p.CreationDate >= @CreationDate OR @CreationDate IS NULL)
    AND (p.Score >= @Score OR @Score IS NULL)
    AND (p.LastActivityDate >= @LastActivityDate OR @LastActivityDate IS NULL);

This pattern usually eats the optimizer alive, and there are a lot of posts about using dynamic SQL to fix it.

But when we call this function with literal values, it does just fine.

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, '20130101', NULL, NULL) AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(NULL, NULL, 100, '20130101') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, NULL, NULL, '20130101') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(NULL, '20131225', NULL, '20131225') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, NULL, NULL, '20131215') AS ks;

Das Plan


You can run those all yourself and look at the plans. I’m just gonna throw a couple of the more interesting examples in the post, though.

The first two queries do exactly what we’d hope to see.

SQL Server Query Plan
sparkling

We use the right indexes, we get seeks. Cardinality estimation is about as reliable as ever with the “””””default””””” estimator in place 🙄

And at one point, we even get a really smart index intersection plan where the optimizer uses both of our nonclustered indexes.

SQL Server Query Plan
units

Parameter Problem


The problem is that no one really makes database calls like that.

If you’re using an ORM, you could intentionally not parameterize your queries and get this to “work”, but there are downsides to that around the plan cache. Being honest, most plan caches are useless anyway.

Long Live Query Store, or something.

Most people have their catch all code parameterized, so the query looks like what’s in the function. I’m going to throw the function in a stored procedure now.

CREATE OR ALTER PROCEDURE
    dbo.kitchen_wrapper
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    SELECT
        ks.c
    FROM dbo.kitchen_sink
    (
        @OwnerUserId, 
        @CreationDate, 
        @Score, 
        @LastActivityDate
    ) AS ks;

END;

If we execute the procedure like this, everything goes to hell rather quickly.

EXEC dbo.kitchen_wrapper
    @OwnerUserId = 22656,          
    @CreationDate = '20131215',    
    @Score = NULL,                  
    @LastActivityDate = NULL;

EXEC dbo.kitchen_wrapper
    @OwnerUserId = NULL,
    @CreationDate = NULL, 
    @Score = 100, 
    @LastActivityDate = '20131215';

Baywatch


The first execution uses the “right” index, but we lose our nice index seek into the p1 index.

SQL Server Query Plan
barfbag

We also end up with Predicates on the Key Lookup, just in case they end up not being NULL. And boy, when they end up not being NULL, we end up with a really slow query.

SQL Server Query Plan
me one too

We re-use the execution plan we saw before, because that’s how SQL Server works. But since the parameters that the p1 index covers are NULL now, nothing gets filtered there, and all 17 million rows get passed to the Key Lookup to be filtered instead. And since the Key Lookup is a Nested Loops Join, we do that… one row at a time.

Fun.

Floss Too Much


There’s no great fix for this, either. This is a problem we’re stuck with when we write queries this way without using dynamic SQL, or a recompile hint.
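For the record, here’s roughly what the dynamic SQL version of the wrapper looks like. This is a sketch with the same parameters as the procedure above, not something I’m vouching for under your workload:

CREATE OR ALTER PROCEDURE
    dbo.kitchen_wrapper_dsql
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    /* Only predicates for parameters that were actually supplied make it
       into the query text, so each combination gets its own plan. */
    DECLARE
        @sql nvarchar(MAX) = N'
    SELECT
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE 1 = 1';

    IF @OwnerUserId IS NOT NULL
        SET @sql += N'
    AND p.OwnerUserId = @OwnerUserId';

    IF @CreationDate IS NOT NULL
        SET @sql += N'
    AND p.CreationDate >= @CreationDate';

    IF @Score IS NOT NULL
        SET @sql += N'
    AND p.Score >= @Score';

    IF @LastActivityDate IS NOT NULL
        SET @sql += N'
    AND p.LastActivityDate >= @LastActivityDate';

    EXEC sys.sp_executesql
        @sql,
      N'@OwnerUserId int, @CreationDate datetime, @Score int, @LastActivityDate datetime',
        @OwnerUserId,
        @CreationDate,
        @Score,
        @LastActivityDate;

END;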

I’ve seen people try all sorts of things to “fix” this problem. Case expressions, ISNULL and COALESCE, magic values, and more. They all have this exact same issue.

And I know, recompile, recompile, recompile.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

I’m Not Ready To Go To Conferences Without My Friends

Icky Yucky


Conferences have always been gross. Always.

This was true before COVID, too. You’d be hard-pressed to not find “nerd flu” trending around them, and they were gross for a lot of the same reasons that are being called out now.

At the first SQL Saturday I ever went to, the janitorial staff had to tape signs to the bathroom wall asking people not to pee on the floor, and that was before lunch. For the record, my aim is impeccable. You just try me.

That SQL Saturday was also the first time I learned that you should always leave for lunch. Always.

Exposed finger food or not, the eating situation at conferences is historically horrible. You have one big room full of the smell of what’s essentially cafeteria food and stale coffee and the farts everyone has been saving up since breakfast.

Always avoid the food hall at conferences. Always.

Why Bother?


What I liked most about conferences was traveling to new places and socializing with people. Getting away from conferences with the dozen or so people I consider friends, and maybe some new ones I made along the way.

And always to get food somewhere that wasn’t in the fart hut. Always.

You couldn’t pay me to eat food 1000 other people had been breathing on, regardless of any demics: Pan or Epi or Aca. That’s foul, and I’m not anywhere close to having a germ phobia. My only phobia these days is getting blog comments.

Despite being a regular precon speaker, I never got rich off that. I’ve made good money — no complaints at all — but only at bigger conferences like SQLBits.

Played correctly, they were break-a-little-better-than-even scenarios, once you factor in travel and hotels and time away from client work and all that.

I don’t teach precons for the money, I teach because I love teaching.

And traveling. And seeing my friends. Always.

Personable


Now, look, I’ve got all my shots, and I don’t sweat wearing a mask when required. I do what I do in the hope that it will help things get back to how they were. I do not have strong political convictions.

In that way, if a conference wanted that stuff for entry, I’d be able to go, but… What it really comes down to is that I’m not likely to go to a conference that my friends aren’t going to. And I don’t know if just those two things are enough to get people going back.

As much as I love being followed into the bathroom by someone asking me how to set MAXDOP, I hate eating alone (drinking alone is fine, though).

Gotta have balance, here.

I suppose it depends on what the world looks like when the bigger conferences start up in person again. I’m sure some people will never go to another conference, and I totally understand that point of view.

Just like some people will never go back to working in an office. And I’ll never eat in the food hall at a conference.

I do hope that I can go back to teaching in person again soon, though. I miss it, and teaching online classes didn’t scratch the itch in the same way.

Plus, my wife really wants me the hell out of the house, and I want to see my friends again.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Simple test results for query I/O performance on Azure SQL Managed Instance

Lately I’ve been doing some exploratory performance testing on Azure SQL Managed Instances in preparation for a migration to that platform. This blog post documents some storage testing results and may even have practical advice near the end. All testing was done on a gen5 general purpose instance with 8 vCores.

The Test Query

The test case is relatively simple. I want to do about 1 GB of physical reads without doing any read-ahead reads. To accomplish that, I loaded 130k rows into a table making sure that only one row could fit on each data page. I also wrote a simple nested loop join query that doesn’t allow for nested loop prefetching. Here’s a picture of the query plan:

The presence of the concat operator on the inner side of the nested loop prevents the prefetch optimization. This limitation is unusually annoying, but I’m using it to my advantage here to stress I/O as much as I can. Here’s the T-SQL used to generate the query plan in the picture:

use tempdb;
 
CREATE TABLE #outer (
    ID BIGINT IDENTITY(1, 1) NOT NULL,
    DUMMY BIGINT NOT NULL,
    PRIMARY KEY (ID)
);

INSERT INTO #outer (DUMMY)
SELECT TOP (130000) 0
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;


CREATE TABLE #inner (
    ID BIGINT IDENTITY(1, 1) NOT NULL,
    BIG_DUMMY CHAR(5000) NOT NULL,
    PRIMARY KEY (ID)
);

INSERT INTO #inner (BIG_DUMMY)
SELECT TOP (130000) '0'
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

CHECKPOINT;
DBCC DROPCLEANBUFFERS;
 

SELECT COUNT_BIG(*)
FROM #outer o
WHERE o.DUMMY = 1
OR EXISTS (
    SELECT 1
    FROM #inner i
    WHERE i.ID = o.ID
)
OPTION (MAXDOP 1, QueryRuleOff BuildSpool);

For the different cases, I ran the SELECT query at MAXDOP 1 and MAXDOP 2 and cleared the buffer pool before each query execution.

Is Tempdb no longer the outhouse?

For user databases on managed instances, the documentation states that you should expect I/O latency of around 5-10 ms. However, tempdb is attached locally so we might get better latency there. Wanting to start off on a positive note, I elected to try testing in tempdb first. Here are the test results for the tempdb database:

The managed instance spends more time waiting for I/O compared to testing on my local machine’s tempdb, but the MAXDOP 2 query has the nice property of nearly being twice as fast as the MAXDOP 1 query. Both queries on the managed instance have roughly the same amount of I/O wait time. The MAXDOP 2 query is primarily faster because the I/O waits are spread fairly evenly between both threads. I will note that I was a bit lazy with my test query and I didn’t ensure that an appropriate amount of work is sent to each thread, but things worked out well enough for this kind of test.

My next test was performed in a user database that was created solely for the purpose of this testing. The initial database size was about 1.5 GB. The code is the same as before, but I just created user tables instead of temp tables. The test results were not good. Both queries ran for over a minute and the MAXDOP 2 query was slower than the MAXDOP 1 query. The documentation says the following:

If you notice high IO latency on some database file or you see that IOPS/throughput is reaching the limit, you might improve performance by increasing the file size.

In this case, I might improve performance if I increase my data file size to greater than 128 GB. I didn’t want to do that for a few reasons:

  1. It costs money and I don’t like spending money.
  2. There’s an instance level storage limit of 2 or 8 TB for general purpose (depending on vCore count).
  3. Creating a database with 99% free space is silly and a well-meaning DBA could shrink it without realizing the performance implications.

For you, dear reader, I increased the database size to 132 GB. I did not observe any performance improvements after doing so, despite testing many times. Here are the test results so far:

Tempdb has 12 data files btw

The chart below makes me wonder if creating multiple files for a small database would be helpful:

If I can get 500 IOPS per database file and I have four database files, logically speaking I would expect to get 2000 IOPS for a single database with four files. I’m personally a big fan of databases with multiple files. I’ve seen a lot of performance problems fixed or mitigated by going beyond the default one data file per database rule.

There is at least one downside to creating multiple small files for one database on managed instances: there’s a documented limit of 280 total files for the general purpose tier. This limit is there because each file takes up a minimum of 128 GB on the storage backend and a general purpose instance can only use up to 35 TB of storage on the backend: 280 * 128 = 35 * 1024. It is amusing to consider how master, model, and msdb take up about 750 GB of storage behind the scenes.

On my instance, I’m nowhere near the 280 file limit/35 TB storage backend limit, so I created a small database (significantly below 128 GB) with four data files and tested the query again. Here are the full results:

MAXDOP 1 query performance is pretty much the same as before, but the MAXDOP 2 query runtime decreased from 77 seconds to 37 seconds. That’s a huge improvement. The MAXDOP 2 query is also roughly twice as fast as the MAXDOP 1 query which is nice to see. The only thing that I did to improve parallel query performance was to create a new database with 4 data files instead of 1.

Practical Applications

To get the obvious out of the way: if you’re writing temporary data to user tables instead of temp tables on Managed Instance, you probably shouldn’t be doing that.

I don’t have any production workloads in managed instances yet, but I’ll go ahead and attempt to give some guidance on file counts. You should consider exceeding 1 data file per database if all of the following are true:

  1. Your database is under 128 GB
  2. You care about I/O performance for that database or your I/O waits for that database are higher than you’d like
  3. Your instance isn’t close to the 35 TB backend limit

It will always be workload dependent, but you may see a performance improvement by splitting your database into multiple files. Do note that you’ll need to spread your data over all of the files (rebuild your tables and indexes after adding files) and you’ll want the files to be the same sizes with the same autogrowth settings. Also, there are other situations where you may want more than one data file for a database. Do not interpret the above to mean that databases already above 128 GB only need a single file.
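For reference, adding the files is the easy part. Here’s a minimal sketch; the database and table names are placeholders, and since this is a managed instance there’s no FILENAME to specify:

/* Add data files of the same size with the same autogrowth settings
   so proportional fill spreads writes evenly. */
ALTER DATABASE YourDatabase
ADD FILE (NAME = N'YourDatabase_2', SIZE = 8GB, FILEGROWTH = 1GB);

ALTER DATABASE YourDatabase
ADD FILE (NAME = N'YourDatabase_3', SIZE = 8GB, FILEGROWTH = 1GB);

ALTER DATABASE YourDatabase
ADD FILE (NAME = N'YourDatabase_4', SIZE = 8GB, FILEGROWTH = 1GB);

/* Existing data stays in the original file until you rebuild,
   so rebuild the tables and indexes afterwards. */
ALTER INDEX ALL ON dbo.YourBigTable REBUILD;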

If you’ve got a managed instance with only one database, perhaps you’re wondering if it would be a good idea to give it 96 data files. In theory, that will allow that database to hit the instance IOPS limit of 30k-40k. I can say that I’ve run with 96 file user databases in Enterprise and I didn’t observe any issues other than an annoying initial setup. The scenario for that was a large ETL system and I was trying to reduce PFS and GAM contention, so it’s quite a bit different than what you would run on a managed instance.

I would be somewhat cautious with creating significantly more data files for one database than your vCore count. In that configuration, I would also try to avoid excess autogrowth events as that is one place you might run into trouble. In general, when doing unusual things, you should test very carefully. That same advice is applicable here. It might work out well or it might not help at all. Stop asking me, I’ve used the cloud for like a week.

Final Thoughts

In some scenarios on Azure SQL Managed Instances, you may be able to improve I/O performance for small databases FOR FREE by spreading your data over multiple data files. Isn’t that wonderful?

Thanks for reading!

Forced Parameterization Extended Events That Don’t Work In SQL Server

Would I Could I


There are a couple Extended Events that I’ve tried to get working at various times:

  • forced_param_clause_skipped_reason: Fired for every clause that was skipped during forced parameterization.
  • forced_param_statement_ignored_reason: Fired when forced parameterization was not applied for the whole statement.

Mostly because I wanted to see if the list of limitations on this ancient documentation page held up, or if there were new ones.

Somewhat curiously, there’s stuff in sys.dm_xe_map_values for them.

SELECT
    map_value
FROM sys.dm_xe_map_values
WHERE
    name LIKE '%forced_param_clause_skipped_reason%';

SELECT
    map_value
FROM sys.dm_xe_map_values
WHERE
    name LIKE '%forced_param_statement_ignored_reason%';

forced_param_clause_skipped_reason:

+---------------------------+
|         map_value         |
+---------------------------+
| CheckSum                  |
| ConstFoldableExpr         |
| EcDMLOutput               |
| EcDoubleColonFunctionCall |
| EcGroupBy                 |
| EcHaving                  |
| EcLike                    |
| EcOpenRowset              |
| EcOrderBy                 |
| EcSelectList              |
| EcTableSample             |
| EcTopOrPaging             |
| EcTSEqualCall             |
| None                      |
| StyleConvert              |
| XvtDate                   |
| XvtGuid                   |
| XvtNull                   |
+---------------------------+

forced_param_statement_ignored_reason:

+------------------------------------+
|             map_value              |
+------------------------------------+
| AnsiNullsOff                       |
| AnsiPaddingOff                     |
| BucketEndStatementNoVarsBucketized |
| BucketEndStatementUnreplacedVar    |
| BucketVarCursor                    |
| BucketVarNotParameter              |
| BucketVarOutput                    |
| BucketXvtEmpty                     |
| BucketXvtXml                       |
| Cursor                             |
| Error                              |
| HintCompileVarValue                |
| HintRecompile                      |
| HintSimpleParam                    |
| InsideFunction                     |
| InsideReplProc                     |
| InsideStoredProc                   |
| InsideTrigger                      |
| InsideView                         |
| Max                                |
| MaxVars                            |
| None                               |
| QueryStoreHintSimpleParam          |
| RegularPlanGuide                   |
| ReplacedTooMuchConstants           |
| TableVariable                      |
| Variable                           |
| VariableAssignment                 |
| WhereCurrentOf                     |
+------------------------------------+

Ain’t No Love


Unfortunately — and I’ve confirmed recently with support — these events don’t actually fire for anything.

They’re just empty shells, but at least there are some interesting details in the DMVs about what might work someday, and the reasons that might fire when it does.
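If you want to poke around yourself, the event metadata is queryable too. A quick sketch:

/* Check whether the events are registered on your build. */
SELECT
    xo.name,
    xo.description
FROM sys.dm_xe_objects AS xo
WHERE xo.object_type = 'event'
AND   xo.name LIKE '%forced_param%';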

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

A SARGability Riddle In SQL Server: Why Do Some Queries Seek and Some Queries Scan?

Use The Force


I had a RABID FAN ask me an interesting question about a query. I can’t use theirs, but I can repro the question.

The question was: if using ISNULL in a where clause isn’t SARGable, how come I can use a FORCESEEK hint in this query?

Formatting and linking my own, of course.

The query looked something like this:

SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u WITH(FORCESEEK)
WHERE ISNULL(u.Id, 0) = 22656;
GO
SQL Server Query Plan
seekable!

What Gives?


The first thing you should notice is that the optimizer throws out ISNULL, here.

Why? Because the Id column isn’t NULL-able, and since ISNULL is Microsoft’s special non-ANSI baby, it can do this one special thing.
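In other words, because Id can’t be NULL, the simplified predicate the optimizer ends up with is roughly this (my paraphrase, not the exact internal rewrite):

/* With a NOT NULL column, ISNULL(u.Id, 0) = 22656 simplifies to a
   plain equality predicate, which is happily seekable. */
SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u WITH(FORCESEEK)
WHERE u.Id = 22656;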

If we use COALESCE instead, we’ll get an error.

SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u WITH(FORCESEEK)
WHERE COALESCE(u.Id, 0) = 22656;
GO

Msg 8622, Level 16, State 1, Line 8
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

And if we try to use ISNULL on a NULL-able column like Age, we’ll get the same error:

SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u WITH(FORCESEEK)
WHERE ISNULL(u.Age, 0) = 22656;
GO

Coacase? Caselesce?


Under the covers, COALESCE is just a crappy CASE expression.

Without the FORCESEEK hint, we can get the query to actually run.

SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u
WHERE COALESCE(u.Id, 0) = 22656;
GO 
SQL Server Query Plan
southa

And ISNULL is just… ISNULL.

SELECT
    c = 
        COUNT_BIG(*)
FROM dbo.Users AS u
WHERE ISNULL(u.Age, 0) = 22656;
GO
SQL Server Query Plan
ribs

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.