When Index Sort Direction Matters For Query Performance In SQL Server

Ever Helpful


I got a mailbag question recently about some advice that floats freely around the internet regarding indexing for windowing functions.

But even after following all the best advice that Google could find, their query was still behaving poorly.

Why, why why?

Ten Toes Going Up


Let’s say we have a query that looks something like this:

SELECT
    u.DisplayName,
    u.Reputation,
    p.Score, 
    p.PostTypeId
FROM dbo.Users AS u
JOIN
(
    SELECT
        p.Id,
    	p.OwnerUserId,
    	p.Score,
    	p.PostTypeId,
    	ROW_NUMBER() OVER
    	(
    	    PARTITION BY
    		    p.OwnerUserId,
    			p.PostTypeId
    		ORDER BY
    		    p.Score DESC
    	) AS n
    FROM dbo.Posts AS p
) AS p
    ON  p.OwnerUserId = u.Id
    AND p.n = 1
WHERE u.Reputation >= 500000
ORDER BY u.Reputation DESC,
         p.Score DESC;

Without an index, this’ll drag on forever. Or about a minute.

But with a magical index that we heard about, we can fix everything!

Ten Toes Going Down


And so we create this mythical, magical index.

CREATE INDEX bubble_hard_in_the_double_r
ON dbo.Posts
(
    OwnerUserId ASC, 
    PostTypeId ASC, 
    Score ASC
);

But there’s still something odd in our query plan. Our Sort operator is… Well, it’s still there.

SQL Server Query Plan
grinch

Oddly, we need to sort all three columns involved in our Windowing Function, even though the first two of them are in proper index order.

OwnerUserId and PostTypeId are both in ascending order. The only one that we didn’t stick to the script on is Score, which is asked for in descending order.

Dram Team


This is a somewhat foolish situation, all around. One column being out of order causing a three column sort is… eh.

We really need this index, instead:

CREATE INDEX bubble_hard_in_the_double_r
ON dbo.Posts
(
    OwnerUserId ASC, 
    PostTypeId ASC, 
    Score DESC
);
SQL Server Query Plan
mama mia

Granted, I don’t know that I like this plan at all without parallelism and batch mode, but we’ve been there before.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Signs You Need Dynamic SQL To Fix Query Performance Problems In SQL Server

Nothing Works


There are things that queries just weren’t meant to do all at once. Multi-purpose queries are often just a confused jumble with crappy query plans.

If you have a Swiss Army Knife, pull it out. Open up all the doodads. Now try to do one thing with it.

If you didn’t end up with a corkscrew in your eye, I’m impressed.

En Masse


The easiest way to think of this is conditionals. If what happens within a stored procedure or query depends on something that is decided based on user input or some other state of data, you’ve introduced an element of uncertainty to the query optimization process.

Of course, this also depends on if performance is of some importance to you.

Since you’re here, I’m assuming it is. It’s not like I spend a lot of time talking about backups and crap.

There are a lot of forms this can take, but none of them lead to you winning an award for Best Query Writer.

IFTTT


Let’s say a stored procedure will execute a different query based on some prior logic, or an input parameter.

Here’s a simple example:

IF @i = 1
BEGIN
    SELECT
        u.*
    FROM dbo.Users AS u
    WHERE u.Reputation = @i;
END;

IF @i = 2
BEGIN
    SELECT
        p.*
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = @i;
END;

If the stored procedure runs for @i = 1 first, the second query will get optimized for that value too.

Using parameterized dynamic SQL can get you the type of optimization separation you want, to avoid cross-optimization contamination.

I made half of that sentence up.

For more information, read this article.

Act Locally


Local variables are another great use of dynamic SQL, because one query’s local variable is another query’s parameter.

DECLARE @i int = 2;
SELECT
    v.*
FROM dbo.Votes AS v
WHERE v.VoteTypeId = @i;

Doing this will get you weird estimates, and you won’t be happy.

You’ll never be happy.

For more information, read this article.

This Or That


You can replace or reorder the where clause with lots of different attempts at humor, but none of them will be funny.

SELECT
    c.*
FROM dbo.Comments AS c
WHERE (c.Score >= @i OR @i IS NULL);

The optimizer does not consider this SARGable, and it will take things out on you in the long run.

Maybe you’re into that, though. I won’t shame you.

We can still be friends.

For more information, watch this video.

Snortables


Dynamic SQL is so good at helping you with parameter sniffing issues that I have an entire session about it.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Mind Your OUTPUT Targets In SQL Server, Some Of Them Hurt Query Performance

Browser History


I’ve blogged about OUTPUT a couple times, and those posts are Still Accurate™

But it’s worth noting that, for the second post OUTPUT forced the query to run serially with no target; just returning data back to SSMS.

Depending on the query behind the putting of the out, parallelism could be quite important.

That’s why in the first post, the put out into a real table didn’t cause performance to suffer.

Of course, if you OUTPUT into a table variable, you still have to deal with table variables being crappy about modifications.

Samesies


If you compare the performance of queries that output into a @table variable vs one that outputs into a #temp table, you’ll see a difference:

SQL Server Query Plan
bang bang bang

Even though the parallel zone is limited here, there’s a big difference in overall query time. Scanning the Votes table singe-threaded vs. in parallel.

When you’re designing processes to be as efficient as possible, paying attention to details like this can make a big difference.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

DROPCLEANBUFFERS: A Better Test Of Storage Than Query Performance For SQL Server

Relics


When I’m tuning queries, people will often ask me what metrics I look at to judge efficiency. Usually, it’s just getting things to be done faster.

Sometimes it’s okay to use more CPU via a parallel plan to get your query faster. Sometimes it’s okay to do more reads to get a query faster.

Sure, it’s cool when it works out that you can reduce resources overall, but every query is special. It all sort of depends on where the bottleneck is.

One thing I’ve been asked about several times is about how important it is to clear out the plan cache and drop clean buffers between runs.

While this post is only about the dropping of cleanly buffers, let’s touch on clearing the plan cache in some way quickly.

Dusted


Clearing out the plan cache (or recompiling, or whatever) is rarely an effective query tuning mechanism, unless you’re working on a parameter sniffing issue, or trying to prove that something else about a query is causing a problem. Maybe it’s local variables, maybe it’s a bad estimate from a table variable.

You get the point.

But starting with a new plan every time is overblown — if you change things like indexes or the way the query is written, you’re gonna get a new plan anyway.

If you’re worried about long compile times, you might also want to do this to prove it’s not necessarily the query that’s slow.

Busted


Let’s look at a big picture. The script that generates this picture is as follow:

--Calgon
DBCC DROPCLEANBUFFERS;

--o boy
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p;

--table manners
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p;

--so considerate
CREATE INDEX p ON dbo.Posts(Id);

--y tho?
DBCC DROPCLEANBUFFERS;

--woah woah woah
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p;

--hey handsome
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p;

We’re gonna drop them buffferinos, run the same count query twice, add a real narrow index, then count twice again.

Great. Great. Great.

SQL Server Query Plan
different lifetime

Annalieses


For the first two executions, we performance tuned the query by about 30 seconds, just by… reading data from memory.

Hm. Okay. Unless you’re trying to prove that you don’t have enough memory, or that storage sucks, you’re not really convincing me of much.

Yes, RAM is faster than disk. Now what?

For the second two executions, query performance got way better. But reading the smaller index from disk hardly changed overall execution time.

If it’s not a strong enough argument that getting a query from 14 seconds down to half a second with a better index means you need an index, you might be working for difficult people.

Of course, Size Matters™

brick to back

The second query finishes much faster because we have a much smaller amount of data to read, period. If we had a where clause that allowed our index to seek to a specific chunk of data, we could have done even less work. This shouldn’t surprise anyone, though. Reading 450MB is faster than reading 120,561MB.

This is not a math test.

Coasting


Starting queries out with an empty buffer pool doesn’t really offer any insights into if you’ve tuned the query. It only exaggerates a difference that is usually not a reality.

It is a useful tool if you want to prove that:

  • You need more memory
  • You need better storage
  • You need a different index

But I sure wouldn’t use it to prove that I made a query better.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Residual Predicates In SQL Server Query Plans

We Will Talk About Things And Have Fun Now


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

USE StackOverflow;

EXEC dbo.DropIndexes; 

/*
CREATE INDEX east 
    ON dbo.Posts
        (PostTypeId, Score, OwnerUserId) 
WITH ( MAXDOP = 8, 
       SORT_IN_TEMPDB = ON, 
       DATA_COMPRESSION = ROW );
*/

DROP TABLE IF EXISTS #t;
GO 

SELECT   
    u.Id,
    u.Reputation,
    u.DisplayName,
    p.Id AS PostId,
    p.Title
INTO #t
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation >= 1000
AND   p.PostTypeId = 1
AND   p.Score >= 1000 
ORDER BY u.Reputation DESC;



/*
CREATE INDEX east 
    ON dbo.Posts(PostTypeId, Score, OwnerUserId);
*/
SELECT 
    t.Id, 
    t.Reputation, 
    ( 
        SELECT 
            MAX(p.Score) 
        FROM dbo.Posts AS p 
        WHERE p.OwnerUserId = t.Id 
        AND   p.PostTypeId IN (1, 2) 
    ) AS TopPostScore,
    t.PostId, 
    t.Title
FROM #t AS t
ORDER BY t.Reputation DESC;


/*
Usually I love replacing select 
list subqueries with APPLY

Just show the saved plan here
*/
SELECT 
    t.Id, 
    t.Reputation, 
    pq.Score,
    t.PostId, 
    t.Title
FROM #t AS t
OUTER APPLY --We have to use outer apply to not restrict results!
(
    SELECT 
        MAX(p.Score) AS Score
    FROM dbo.Posts AS p 
    WHERE p.OwnerUserId = t.Id 
    AND   p.PostTypeId IN (1, 2)
) AS pq
ORDER BY t.Reputation DESC;


/*
TOP (1) also spools
*/
SELECT 
    t.Id, 
    t.Reputation, 
    ( 
        SELECT TOP (1) 
            p.Score
        FROM dbo.Posts AS p
        WHERE p.PostTypeId IN (1, 2)
        AND   p.OwnerUserId = t.Id
        ORDER BY p.Score DESC 
    ) AS TopPostScore,
    t.PostId, 
    t.Title
FROM #t AS t
ORDER BY t.Reputation DESC;

SELECT 
    t.Id, 
    t.Reputation, 
    pq.Score,
    t.PostId, 
    t.Title
FROM #t AS t
OUTER APPLY
(
    SELECT TOP (1) 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId IN (1, 2)
    AND   p.OwnerUserId = t.Id
    ORDER BY p.Score DESC
) AS pq
ORDER BY t.Reputation DESC;


/*
CREATE INDEX east 
    ON dbo.Posts(PostTypeId, Score, OwnerUserId);
*/
SELECT 
    t.Id, 
    t.Reputation, 
    pq.Score,
    t.PostId, 
    t.Title
FROM #t AS t
OUTER APPLY --This one is fast
(
    SELECT TOP (1) 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 1
    AND   p.OwnerUserId = t.Id
    ORDER BY p.Score DESC
) AS pq
ORDER BY t.Reputation DESC;

SELECT 
    t.Id, 
    t.Reputation, 
    pa.Score,
    t.PostId, 
    t.Title
FROM #t AS t
OUTER APPLY --This two is slow...
(
    SELECT TOP (1) 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 2
    AND   p.OwnerUserId = t.Id
    ORDER BY p.Score DESC
) AS pa
ORDER BY t.Reputation DESC;


/*
Use the Score!
*/
SELECT 
    t.Id, 
    t.Reputation, 
    ISNULL(pa.Score, pq.Score) AS TopPostScore,
    t.PostId, 
    t.Title
FROM #t AS t
OUTER APPLY --This one is fast
(
    SELECT TOP (1) 
        p.Score --Let's get the top score here
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 1
    AND   p.OwnerUserId = t.Id
    ORDER BY p.Score DESC
) AS pq
OUTER APPLY --This two is slow...
(
    SELECT TOP (1) 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 2
    AND   p.OwnerUserId = t.Id
    AND   pq.Score < p.Score --Then use it as a filter down here
    ORDER BY p.Score DESC
) AS pa
ORDER BY t.Reputation DESC;


SELECT 
    t.Id, 
    t.Reputation, 
    ISNULL(pq.Score, 0) AS Score,
    t.PostId, 
    t.Title
INTO #t2
FROM #t AS t
OUTER APPLY --This one is fast
(
    SELECT TOP (1) 
        p.Score --Let's get the top score here
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 1
    AND   p.OwnerUserId = t.Id
    ORDER BY p.Score DESC
) AS pq;


SELECT 
    t.Id, 
    t.Reputation, 
    ISNULL(pa.Score, t.Score) AS TopPostScore, 
    t.PostId, 
    t.Title
FROM #t2 AS t
OUTER APPLY 
(
    SELECT TOP (1) 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 2
    AND   p.OwnerUserId = t.Id
    AND   t.Score < p.Score --Then use it as a filter down here
    ORDER BY p.Score DESC
) AS pa
ORDER BY t.Reputation DESC;



/*
What happened?
 * Index key column order
   * (PostTypeId, Score, OwnerUserId)

Other things we could try:
 * Shuffling index key order, or creating a new index
   * (PostTypeId, OwnerUserId, Score)
 
 * Rewriting the query to use ROW_NUMBER() instead
  * Have to be really careful here, probably use Batch Mode

*/

/*
CREATE TABLE dbo.t
(
id int NOT NULL,
INDEX c CLUSTERED COLUMNSTORE
);
*/

SELECT 
    t.Id, 
    t.Reputation, 
    pa.Score,
    t.PostId, 
    t.Title
FROM #t AS t
LEFT JOIN dbo.t AS tt ON 1 = 0
OUTER APPLY
(
    SELECT 
        rn.*
    FROM 
    (
        SELECT
            p.*,
            ROW_NUMBER()
                OVER
                (
                    PARTITION BY 
                        p.OwnerUserId
                    ORDER BY
                        p.Score DESC
                ) AS n
        FROM dbo.Posts AS p
        WHERE p.PostTypeId IN (1, 2)
    ) AS rn
    WHERE rn.OwnerUserId = t.Id
    AND   rn.n = 1
) AS pa
ORDER BY t.Reputation DESC;


DROP TABLE #t, #t2;

 

Your Reputation = 1

Untechnical


The title might look like it’s technical, since usually when I’m talking about Reputation = 1, I’m talking about querying the Stack Overflow database.

This post isn’t about that Reputation, it’s about your Reputation, and hopefully how you choose to protect it while you exist, and after you not exist.

This also isn’t a horn-tooting post. If you think I am horn-tooting, refer back to this sentence. There is, likewise, no high-horsing here. There are no horses, high or otherwise.

I try to teach people things here. This is just another thing.

Hazy Shade Of Writer


Late last year I agreed to write a book. It came to my attention after the fact that the publisher had recently released a work by someone who, several years ago, was found to have exhibited a pattern of untoward behavior with female members of the SQL Server community. I brought this up, and they were dismissive of my concerns on the matter.

I’m stopping short of adding any further details, because I don’t want anyone who has experienced trauma from this to experience more. Quite the opposite.

After that interaction, I withdrew from the project. There are plenty of ways for me to help people learn about SQL Server that don’t make profits for companies that turn a blind eye to that sort of behavior.

Continuing would have meant I turn a blind eye to it, and that’s not how I want to be known by anyone, especially people who are far more likely to be victimized than I am. I’m your friend, too.

I’m not pretending that I have the strongest morals and ethics, or that I’m a paragon of virtue, but continuing with the project just didn’t sit right with me.

After all, from where I sit in my office, I can hear my two daughters playing.

Fighting.

Playing again.

Not sure.

Something in between.

They’ll figure it out.

Choice Cuts


It’s easy to build a bad reputation, quickly or slowly. Anyone can do it. It’s hard to build and keep a good reputation.

When choosing who to be involved with, it’s important to consider who they’re involved with. I don’t mean that you have to blacklist people who you disagree with (that can be a dangerous personal precedent to set), but do think carefully about the kind of behavior you’re not okay with.

If you see that behavior, you’ve probably found someone you don’t want to associate with. If you see someone condoning that behavior, you’ve probably found two.

Redemption needs to exist, otherwise there would be no point in trying to learn and change. I’ve learned some tough lessons in life and changed because of them. Many people I know have, too. It’s generally a good thing, lest we all remain babbling and unwashed.

Not that that’s a bad way to spend Saturday night, but I digress.

People also need to be able to make mistakes and come back from them. There are good ways to do that: taking ownership, apologizing, etc. I’d never give sincerity the cold shoulder.

Lunch Alone


This doesn’t just go for when it’s easy to make the right choice. One reason there is no horn nor horse for me here is that it was a very easy decision to make.

But that does make it an easy lesson to teach, hopefully. Even if no one ever said a word about the entanglement, I would have felt wrong about it. If you don’t have a good reputation with yourself, you probably won’t have one with other people.

When you hit those moments in your life, and in your career, follow your instincts. I realize that it’s hard when a project promises some esteem, or monetary gain when monetary gains are scarce, but in the long run it’s worth it.

The least regret is no regret.

Or regerts, as a wise tattoo artist once inscribed.

Thanks for reading!

Keep Away From Runnable Queues

Re-Run


There are a lot of posts about THREADPOOL on the internet, but what happens before your server threads go into the red?

You might see other signs of CPU contention — lots of waits on stuff like the CX waits, or SOS_SCHEDULER_YIELD.

But there are other signs, too, and they can show up in the lifecycle of a query.

Zombie Skeleton


The main states a query can be in are:

  • Runing: Chugging happily along
  • Runnable: Waiting for a CPU
  • Suspended: Waiting on something else (locks, pages, whatever)
  • Sleeping: Should probably disconnect maybe, I guess

If you’ve got a lot of queries that are runnable, they’re ready to run, but not getting CPU time.

There are a ton of reasons why you might see this rack up, like:

  • Too few CPUs
  • Totally untuned queries
  • VM issues like noisy neighbors, CPU limits, etc.
  • RBAR-style code

Good News!


Checks for this are coming to sp_BlitzFirst, and sp_PressureDetector.

If > 25% of  your queries are runnable, we’ll warn you about it. What you do with that information is up to your skilled hands.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Signs Your Schema Is Hurting SQL Server Performance

To The Point


If you need to search for:

  • Leading wildcards
  • Substrings
  • Charindexes
  • Patindexes
  • Replaces
  • Lefts
  • Rights
  • Concats
  • Any combination of TRIMs
  • Columns with prefixed values
  • Columns with suffixed values
  • Concatenated columns
  • String split columns

Something is broken in the way that you store data.

You’re overloading things, and you’re going to hit big performance problems when your database grows past puberty.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Stressing tempdb and Observing Contention In SQL Server

As Recently As Now


Even on SQL Server 2019, with in-memory tempdb metadata enabled, and an appropriate number of evenly sized data files, you can experience certain types of contention in tempdb.

It’s better. It’s definitely and totally better, but it’s still there. With that in mind, I wrote a stored procedure that you can stick in your favorite stress tool, to see how tempdb handles different numbers of concurrent sessions. You can download it here, on GitHub.

If you need a tool to run a bunch of concurrent sessions against SQL Server, my favorite two free ones are:

If you need tools to help you identify tempdb contention, the ones I use are:

Strikey


While I’m running one of those stress tools, this is how I use the stored procedures above to look for contention.

For sp_WhoIsActive, it’s really simple:

EXEC sp_WhoIsActive @get_task_info = 2;

The results will start to look like this when contention heats up. Again, things are a lot better now, but it can still happen.

sp_WhoIsActive
hello… you

For sp_HumanEvents, it’s still pretty simple:

EXEC dbo.sp_HumanEvents 
    @event_type = 'waits', 
    @seconds_sample = 10, 
    @wait_type = N'PAGELATCH_UP, PAGELATCH_EX, PAGELATCH_SH';

Since I want to specifically look for waits that indicate tempdb is mixing it up, I’ll put the most common PAGELATCH waits in.

SQL Server Query Results
cheesey plate

The output here is admittedly a bit truncated, because of limitations with the ring buffer extended event. But, you know, I think we can see enough.

If You See Contention


Check the basics first.

  • You might need more data files
  • Data files might be unevenly sized
  • If you’re on < SQL Server 2016, you might need trace flags 1117 and 1118
  • You might have a bunch of other stuff hemming up tempdb, too

Check out this video for some other things that can cause problems too.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server Filtered Indexes Need Serious Help

Minorly


Filtered indexes are really interesting things. Just slap a where clause on your index definition, and you can do all sorts of helpful stuff:

  • Isolate hot data
  • Make soft delete queries faster
  • Get a histogram specific to the span of data you care about

Among other things, of course. There are some annoying things about them though.

  • They only work with specific ANSI options
  • If you don’t include the filter definition columns in the index, it might not get used
  • They only work when queries use literals, not parameters or variables

Majorly


Part of the optimizer’s process consists of expression matching, where things like computed columns, filtered indexes, and indexed views are considered for use in your query.

I mean, if you have any of them. If you don’t, it probably just stares inwardly for a few nanoseconds, wondering why you don’t care about it.

Something that this part of the process is terrible at is any sort of “advanced” expression matching. It has to be exact, or you get whacked.

Here’s an example:

DROP TABLE IF EXISTS dbo.is_deleted;

CREATE TABLE dbo.is_deleted
(
    id int PRIMARY KEY,
    dt datetime NOT NULL,
    thing1 varchar(50) NOT NULL,
    thing2 varchar(50) NOT NULL,
    is_deleted bit NOT NULL
);

INSERT dbo.is_deleted WITH(TABLOCK)
(
    id,
    dt,
    thing1,
    thing2,
    is_deleted
)
SELECT 
    x.n,
    DATEADD(MINUTE, x.n, GETDATE()),
    SUBSTRING(x.text, 0, 50),
    SUBSTRING(x.text, 0, 50),
    x.n % 2
FROM (
SELECT 
    ROW_NUMBER() OVER 
    (
        ORDER BY 1/0
    ) AS n,
    m.*
FROM sys.messages AS m
) AS x;

CREATE INDEX isd 
ON dbo.is_deleted
    (dt)
INCLUDE
    (is_deleted)
WHERE 
    (is_deleted = 0);

Overly


If you run that setup script, you’ll get yourself a table that’s ripe for a filtered index on the is_deleted column.

But it doesn’t work with every type of query pattern. Some people are super fancy and want to find NOTs!

SELECT 
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id
WHERE id.dt >= GETDATE() + 200
AND   (NOT 1 = id.is_deleted)
AND   1 = (SELECT 1);

I have the 1 = (SELECT 1) in there for reasons. But we still get no satisfaction.

SQL Server Query Plan
hurtin

If we try to force the matter, we’ll get an error!

SELECT 
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id WITH(INDEX = isd)
WHERE id.dt >= GETDATE() + 200
AND   (NOT 1 = id.is_deleted)
AND   1 = (SELECT 1);

The optimizer says non.

Msg 8622, Level 16, State 1, Line 84
Query processor could not produce a query plan because of the hints defined in this query. 
Resubmit the query without specifying any hints and without using SET FORCEPLAN.

It has no problem with this one, though.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id
WHERE id.dt >= GETDATE() + 200
AND   (0 = id.is_deleted)
AND   1 = (SELECT 1);

Underly


It would be nice if there were some more work put into filtered indexes to make them generally more useful.

In much the same way that a general set of contradictions can be detected, simple things like this could be too.

Computed columns have a similar issue, where if the definition is col1 + col2, a query looking at col2 + col1 won’t pick it up.

It’s a darn shame that such potentially powerful tools don’t get much love.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.