SQL Server Filtered Indexes Need Serious Help

Minorly

Filtered indexes are really interesting things. Just slap a where clause on your index definition, and you can do all sorts of helpful stuff:

Isolate hot data
Make soft delete queries faster
Get a histogram specific to the span of data you care about

Among other things, of course. There are some annoying things about them though.

They only work with specific ANSI options
If you don’t include the filter definition columns in the index, it might not get used
They only work when queries use literals, not parameters or variables

Majorly

Part of the optimizer’s process consists of expression matching, where things like computed columns, filtered indexes, and indexed views are considered for use in your query.

I mean, if you have any of them. If you don’t, it probably just stares inwardly for a few nanoseconds, wondering why you don’t care about it.

Something that this part of the process is terrible at is any sort of “advanced” expression matching. It has to be exact, or you get whacked.

Here’s an example:

DROP TABLE IF EXISTS dbo.is_deleted;

CREATE TABLE dbo.is_deleted
(
    id int PRIMARY KEY,
    dt datetime NOT NULL,
    thing1 varchar(50) NOT NULL,
    thing2 varchar(50) NOT NULL,
    is_deleted bit NOT NULL
);

INSERT dbo.is_deleted WITH(TABLOCK)
(
    id,
    dt,
    thing1,
    thing2,
    is_deleted
)
SELECT 
    x.n,
    DATEADD(MINUTE, x.n, GETDATE()),
    SUBSTRING(x.text, 0, 50),
    SUBSTRING(x.text, 0, 50),
    x.n % 2
FROM (
SELECT 
    ROW_NUMBER() OVER 
    (
        ORDER BY 1/0
    ) AS n,
    m.*
FROM sys.messages AS m
) AS x;

CREATE INDEX isd 
ON dbo.is_deleted
    (dt)
INCLUDE
    (is_deleted)
WHERE 
    (is_deleted = 0);

Overly

If you run that setup script, you’ll get yourself a table that’s ripe for a filtered index on the is_deleted column.

But it doesn’t work with every type of query pattern. Some people are super fancy and want to find NOTs!

SELECT 
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id
WHERE id.dt >= GETDATE() + 200
AND   (NOT 1 = id.is_deleted)
AND   1 = (SELECT 1);

I have the 1 = (SELECT 1) in there for reasons. But we still get no satisfaction.

If we try to force the matter, we’ll get an error!

SELECT 
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id WITH(INDEX = isd)
WHERE id.dt >= GETDATE() + 200
AND   (NOT 1 = id.is_deleted)
AND   1 = (SELECT 1);

The optimizer says non.

Msg 8622, Level 16, State 1, Line 84
Query processor could not produce a query plan because of the hints defined in this query. 
Resubmit the query without specifying any hints and without using SET FORCEPLAN.

It has no problem with this one, though.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.is_deleted AS id
WHERE id.dt >= GETDATE() + 200
AND   (0 = id.is_deleted)
AND   1 = (SELECT 1);

Underly

It would be nice if there were some more work put into filtered indexes to make them generally more useful.

In much the same way that a general set of contradictions can be detected, simple things like this could be too.

Computed columns have a similar issue, where if the definition is col1 + col2, a query looking at col2 + col1 won’t pick it up.

It’s a darn shame that such potentially powerful tools don’t get much love.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.