Common SQL Server Consulting Advice: Enabling Forced Parameterization

Not The Bad Kind


I need to start this post off by saying something that may not be obvious to you: Not all parameter sniffing is bad.

Sure, every time you hear someone say “parameter sniffing” they want to teach you about something bad, but there’s a lot more to it than that.

Parameter sniffing is normally great. You heard me. Most of the time, you don’t want SQL Server generating new query plans all the time.

And yet I see people go to extreme measures to prevent parameter sniffing from ever happening, like:

  • Local variables
  • Recompiling

What you care about is parameter sensitivity. That’s when SQL Server comes up with totally different execution plans for the same query depending on which parameter value it gets compiled with. In those cases, there’s usually a chance that later executions with different parameter values don’t perform very well using the original query plan.

The thing is, sometimes you need to introduce potentially bad parameter sensitivity in order to fix other problems on a server.

What’s Your Problem?


The problem we’re trying to solve here is application queries being sent in with literal values, instead of parameterized values.

The result is a plan cache that looks like this:

[Image: SQL Server plan cache ("unethical")]

Of course, if you can fix the application, you should do that too. But fixing all the queries in an application can take a long time, if you even have access to make those changes, or a software vendor who will listen.
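
Before you flip the switch, you can get a rough sense of how much single-use clutter is sitting in the cache right now. This is just a quick sketch that counts ad hoc and prepared plans that have only ever been used once:

SELECT
    single_use_plans = COUNT_BIG(*),
    single_use_mb = SUM(CONVERT(bigint, decp.size_in_bytes)) / 1024. / 1024.
FROM sys.dm_exec_cached_plans AS decp
WHERE decp.usecounts = 1
AND   decp.objtype IN (N'Adhoc', N'Prepared');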

The great thing about this setting is, of course, that it applies to everything all at once, unless you’re doing weird things.

You can turn it on for a specific database by running this command:

ALTER DATABASE 
   [YourDatabase] 
SET PARAMETERIZATION FORCED;
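
If you want to double-check where a database stands before and after, the setting is exposed in sys.databases. Same placeholder database name as above:

SELECT
    d.name,
    d.is_parameterization_forced
FROM sys.databases AS d
WHERE d.name = N'YourDatabase';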

Good or Bad?


The argument for doing this is that it drastically reduces the CPU spent constantly compiling query plans, and cuts down on the churn of constantly caching and evicting plans, which leaves you with an unstable plan cache.

Of course, after you turn it on, you now open your queries up to parameter sensitivity issues. The good news is that you can fix those, too.

99% of parameter sniffing problems I see come down to indexing issues.

  • Non-covering indexes that give the optimizer a choice between Seek + Lookup and Clustered Index Scan
  • Lots of single key column indexes that don’t make sense to use across different searches
  • Suboptimal indexes suggested by various tooling that got implemented without any critical oversight

And of course, if you’ve got Query Store enabled, you can pretty easily force a plan.

Speaking of which, I still have to talk a lot of folks into turning that on, too. Let’s talk about that tomorrow.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Enabling Resource Governor To Fix Memory Grants

With Or Without You


One thing I really hate about Standard Edition is the lack of a way to globally reduce the allowed memory grant default.

Out of the box, any query can come along and ask for ~25% of your server’s max server memory setting, and SQL Server is willing to loan out ~75% at once across a bunch of queries.
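
To put rough numbers on that: with max server memory set to 100GB, a single query can ask for a grant of about 25GB, and around 75GB can be out on loan across concurrent queries before new requests start waiting on RESOURCE_SEMAPHORE.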

That impacts Standard Edition users way more than Enterprise Edition users, because of the Standard Edition buffer pool limit of 128GB.

A lot of folks misinterpret that limit — I’ve had several exchanges with big name hardware vendors where they insist 128GB is the overall RAM limit, so you’re likely getting bad advice from everywhere — and they end up with a server that only has 128GB of RAM in it.

Big mistake. Bump that up to 192GB and set Max Server Memory to ~180GB or so.

But I digress. Or whatever the word is for getting back to the real point.

I forget, if I ever knew.

Control Top


For all you high rollers out there on Enterprise Edition, you have an easy way to fight against SQL Server’s bad memory grant habits.

Before we do that, it’s important to make a few notes here:

  • SQL Server introduced batch mode memory grant feedback in 2016
  • SQL Server introduced batch mode on row store in 2019

Depending on your compatibility level, you may not be taking advantage of those things, but in either case the feedback might be kicking in too late. See, it’s not a runtime decision, it’s a decision that takes place after a query runs.

By then, it’s already sucked up 25% of your memory and probably stolen a whole bunch of space from your precious buffer pool. A properly filled buffer pool is important so your queries don’t get dry-docked going out to slowpoke disk all the live long day.

A lot of the time, folks I work with will have a ton of queries asking for bunk memory grants that are way bigger than they should be.

You can use this query to examine your plan cache for those things.

WITH 
    unused AS
(
    SELECT TOP (100)
        oldest_plan = 
            MIN(deqs.creation_time) OVER(),
        newest_plan = 
            MAX(deqs.creation_time) OVER(),
        deqs.statement_start_offset,
        deqs.statement_end_offset,
        deqs.plan_handle,
        deqs.execution_count,
        deqs.max_grant_kb,
        deqs.max_used_grant_kb,
        unused_grant = 
            deqs.max_grant_kb - deqs.max_used_grant_kb,
        deqs.min_spills,
        deqs.max_spills
    FROM sys.dm_exec_query_stats AS deqs
    WHERE (deqs.max_grant_kb - deqs.max_used_grant_kb) > 1024.
    AND   deqs.max_grant_kb > 5242880.
    ORDER BY 
        unused_grant DESC
)
SELECT      
    plan_cache_age_hours = 
        DATEDIFF
        (
            HOUR,
            u.oldest_plan,
            u.newest_plan
        ),
    query_text = 
        (
            SELECT [processing-instruction(query)] =
                SUBSTRING
                (
                    dest.text, 
                    ( u.statement_start_offset / 2 ) + 1,
                    (
                        ( 
                            CASE u.statement_end_offset 
                                 WHEN -1 
                                 THEN DATALENGTH(dest.text) 
                                 ELSE u.statement_end_offset 
                            END - u.statement_start_offset 
                        ) / 2 
                    ) + 1
                )
                FOR XML PATH(''), 
                    TYPE
        ),
    deqp.query_plan,
    u.execution_count,
    u.max_grant_kb,
    u.max_used_grant_kb,
    u.min_spills,
    u.max_spills,
    u.unused_grant
FROM unused AS u
OUTER APPLY sys.dm_exec_sql_text(u.plan_handle) AS dest
OUTER APPLY sys.dm_exec_query_plan(u.plan_handle) AS deqp
ORDER BY 
    u.unused_grant DESC
OPTION (RECOMPILE, MAXDOP 1);

If you have a bunch of those, and you want a quick fix until you can do more meaningful query and index tuning, you can use Resource Governor to reduce the 25% default to a lower number.

Scripted, For Your Pleasure


You can use this script to enable and reconfigure Resource Governor to use a lower memory grant percent.

/*The first time enables Resource Governor*/
ALTER RESOURCE GOVERNOR 
    RECONFIGURE;

/*This reduces the memory grant cap to your chosen percent*/
ALTER WORKLOAD GROUP 
    [default] 
WITH
(
    REQUEST_MAX_MEMORY_GRANT_PERCENT = ?
);

/*This completes the change*/
ALTER RESOURCE GOVERNOR 
    RECONFIGURE;

You’ll have to fill in the question mark yourself, of course. Without looking at your system, I have no idea what it should be.
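
If you want to confirm where the default workload group currently stands, and that your change took, the cap is visible in the Resource Governor DMVs:

SELECT
    wg.name,
    wg.request_max_memory_grant_percent
FROM sys.dm_resource_governor_workload_groups AS wg
WHERE wg.name = N'default';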

If you’d like help with that, hit the link below to schedule a sales call.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Enabling An Optimistic Isolation Level

Shame Shame Shame


One of the very first performance problems that developers will run into when using SQL Server is locking, and often deadlocks.

Though deadlocks have their root in (b)locking behavior, not all blocking leads to deadlocks.

Some (b)locking will just go on forever, ha ha ha.

The reason there are so many SQL Server blog posts about NOLOCK hints, and so much confusion about what it does, is because of defaults.

The worst part is that it’s a default that shouldn’t be, and… somehow people have this sunny view of what NOLOCK does, but a very negative view of better solutions to the problem.

What NOLOCK Really Does


I have this conversation at least twice a week, despite having written about it plenty of times.

And other people writing about it many times.

For years.

It does not mean your query doesn’t take locks, it means your query doesn’t respect locks taken by other queries. It’s not that read committed is so great, it’s that read uncommitted is so bad.

Think of modifications sort of like an Infinite Improbability Drive.

As soon as the drive reaches infinite Improbability, it passes through every conceivable point in every conceivable universe simultaneously. An incredible range of highly improbable things can happen due to these effects.

Perhaps not quite that eccentric, but you get the idea. While a modification is in flight, a query with a NOLOCK (or READ UNCOMMITTED) hint may read those changes while they’re happening.

  • Incomplete inserts
  • Incomplete deletes
  • Incomplete updates

Inserts and deletes are a bit more straightforward. Say you’re inserting or deleting 10 rows, and either one is halfway done when your select query, running with flaming knives and scissors (a NOLOCK hint), comes along.

You would read:

  • The first five inserted rows
  • The remaining five rows to be deleted

For updates, things are a little trickier because you might end up with an in-place update or per-index update.

You can read:

  • Partially changed rows
  • From an index that hasn’t been modified yet
  • Something in between

This is not what you want to happen.

Even if you have a million excuses as to why it works okay for you (it’s just a mobile app; they can refresh, we only need close-enough reports; users make changes and then read them later) I promise you that it’s not something you want to happen, because you can’t fully predict the ramifications of many concurrent modifications and selects running all together.

What You Really Want To Happen Instead


The utter beauty of these solutions is that they give you reliable results. They may not be perfect for every situation, but they work for probably 99% of the cases where you’re using NOLOCK hints everywhere anyway.

Rather than futz about with the Infinite Improbabilities that could be read from modifications that are neither here nor there but are certainly not completed, you read the last known good version of a row or set of rows that are currently locked.

There you have it! No more uncertainty, puzzled users, additional database requests to refresh wonky-looking data, or anything like that.

It’s just you and your crisp, clean data.

If you’re on SQL Server 2019 and using Accelerated Database Recovery, the known-good versions of your precious data will be stored locally, per-database.

In all other scenarios, the row versioning goes off to tempdb.

Your select queries can read that data without being impeded by locks, and without all of the incorrectness.

What options do you have to take advantage of these miraculous functionalities? Read Committed Snapshot Isolation (RCSI), and Snapshot Isolation (SI).

Those sound pretty close, but let’s talk a little bit more about them.

Isolation Levels, Not In Depth


It’s difficult to cover every potential reservation or concern someone may have about isolation levels. If you have an application that depends on certain locking guarantees to correctly process certain events, you may need read committed, or something beyond read committed (like repeatable read or serializable) to hold the correct locks.

If that’s your app design, then your job becomes query and index tuning to make sure that your queries run as quickly as possible to reduce the locking surface area of each one. This post is not for you.

This post is largely geared towards folks who have NOLOCK everywhere like some sort of incantation against performance problems, who hopefully aren’t totally stuck in their ways.

Here are some of the potential downsides of optimistic isolation levels:

  • Prior to SQL Server 2019, you’re going to add some load to tempdb
  • You’re going to add an up to 14-byte pointer to each row for tracking in the version store (but that happens on any table where there’s a trigger anyway)
  • You need to remove local locking hints on queries you want to use row versioning
  • You need to set the isolation level to allow queries to use Snapshot Isolation
  • Read query performance may slow down if the version store gets particularly large
  • You need to more closely monitor long running modifications to make sure they don’t fill tempdb

But this stuff is all worth it, because you can get around lots of weird, transient application issues.

If you’re currently using NOLOCK everywhere, or if someone starts suggesting you use it everywhere for better performance, know that you have better options out there.
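
For the record, here’s what flipping those options on looks like at the database level. This is only a sketch with a placeholder database name, and turning on RCSI changes the behavior of every read committed query in that database, so test it somewhere safe first:

/*Make read committed queries use row versioning*/
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

/*Let queries ask for snapshot isolation explicitly*/
ALTER DATABASE [YourDatabase] SET ALLOW_SNAPSHOT_ISOLATION ON;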

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Indexed Views

I am a heading


In the age of column store indexes, indexed views have a bit less attractiveness about them. Unless of course you’re on Standard Edition, which is useless when it comes to column store.

I think the biggest mark in favor of indexed views over column store in Standard Edition is that there is no DOP restriction on them, where batch mode execution is limited to DOP 2.

https://erikdarling.com/sql-server/how-useful-is-column-store-in-standard-edition/

One of the more lovely coincidences that has happened of late was me typing “SQL Server Stranded Edition” originally up above.

Indeed.

There are some good use cases for indexed views where column store isn’t a possibility, though. What I mean by that is they’re good at whipping up big aggregations pretty quickly.

Here are some things you oughtta know about them before trying to use them, though. The first point is gonna sound really familiar.

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF
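
If you want to be explicit about those in your own scripts or Agent job steps, they’re all plain SET options you can put at the top of a batch:

SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NUMERIC_ROUNDABORT OFF;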

Second, you’ll wanna use the NOEXPAND hint when you touch an indexed view. Not only because that’s the only way to guarantee the view definition doesn’t get expanded by the optimizer, but also because (even in Enterprise Edition) that’s the only way to get statistics generated on columns in the view.

If you’ve ever seen a warning for missing column statistics on an indexed view, this is likely why. Crazy town, huh?
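
In practice, the hint looks like this, borrowing the shabu_shabu view that gets created a little further down:

SELECT
    s.Id,
    s.Dracula
FROM dbo.shabu_shabu AS s WITH (NOEXPAND)
WHERE s.Reputation > 200000;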

Third, indexed views maintain changes behind the scenes automatically, and that maintenance can really slow down modifications if you don’t have indexes that support the indexed view definition.

https://erikdarling.com/sql-server/indexed-view-maintenance-is-only-as-bad-as-your-indexes/

Fourth, you have to be REALLY careful if your indexed view is going to span more than one table.

Locking can get really weird, and as tables get super big maintenance can turn into a nightmare even with good indexes to back the join up.

Fifth, there are a ridiculous number of restrictions. The current docs look like this:

[Image: SQL Server indexed view limitations, from the current docs ("ouch")]

Sixth, you need to be really careful when you alter an indexed view.

When you do that, all of the indexes and statistics get dropped.

Seventh, indexed views can be used a lot like other constructs we’ve talked about this week.

Eighth, if your indexed view has an aggregation in it, you need to have a COUNT_BIG(*) column in the view definition.

Buuuuuut, if you don’t group by anything, you don’t need one.

Ninth, yeah, you can’t use DISTINCT in an indexed view, but you can use GROUP BY, and the optimizer can match queries that use DISTINCT to your indexed view. For example, this setup:

CREATE OR ALTER VIEW 
    dbo.shabu_shabu
WITH SCHEMABINDING
AS 
SELECT
    u.Id,
    u.DisplayName,
    u.Reputation,
    Dracula = 
        COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Reputation > 100000
GROUP BY
    u.Id,
    u.Reputation,
    u.DisplayName;
GO 

CREATE UNIQUE CLUSTERED INDEX 
    cuqadoodledoo
ON dbo.shabu_shabu
(
    Id
);

SELECT DISTINCT   
    u.Id
FROM dbo.Users AS u
WHERE u.Reputation > 100000;

Ends up with this query plan:

[Image: SQL Server query plan ("balance")]

Tenth, the somewhat newly introduced GREATEST and LEAST functions do work in indexed views, which certainly makes things interesting.

I suppose that makes sense, since they’re probably just CASE expressions internally, but after everything we’ve talked about, sometimes it’s surprising when anything works.

Despite It All


When indexed views are the right choice, they can really speed up a lot of annoying aggregations among their other utilities.

This week we talked a lot about different things we can do to tables to make queries faster. This is stuff that I end up recommending pretty often, but there’s even more stuff that just didn’t make the top 5 cut.

Next week we’ll talk about some database and server level settings that can help fix problems that I end up telling clients to flip the switch on.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Computed Columns

Active Blooper


Remember yesterday? Yeah, me either. But I do have access to yesterday’s blog post, so I can at least remember that.

What a post that was.

We talked about filtered indexes, some of the need-to-know points, when to use them, and then a sad shortcoming.

Today we’re going to talk about how to overcome that shortcoming, but… there’s stuff you need to know about these things, too.

We’re gonna start off with some Deja Vu!

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF

Second, computed columns are sort of like regular columns: you can only search them efficiently if you index them.

This may come as a surprise to you, but indexes put data in order so that it’s easier to find things in them.

The second thing you should know about the second thing here is that you don’t need to persist computed columns to add an index to them, or to get statistics generated for the computed values (but there are some rules we’ll talk about later).

For example, let’s say you do this:

ALTER TABLE dbo.Users ADD TotalVotes AS (UpVotes + DownVotes);
CREATE INDEX u ON dbo.Users (TotalVotes) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

The index gets created just fine. This is incredibly handy if you need to add a computed column to a large table, because there won’t be any blocking while adding the column. The index is another matter, depending on if you’re using Enterprise Edition.

Third, SQL Server is picky about them, kind of. The problem is a part of the query optimization process called expression matching that… matches… expressions.

For example, these two queries both have expressions in them that normally wouldn’t be SARGable — meaning you couldn’t search a normal index on (Upvotes, Downvotes) efficiently.

But because we have an indexed computed column, one of them gets a magic power, and the other one doesn’t.

Because it’s backwards.

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.UpVotes + u.DownVotes) > 1000;
SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.DownVotes + u.UpVotes) > 1000;
[Image: SQL Server query plan ("connection")]

See what happens when you confuse SQL Server?

If you have full control of the code, it’s probably safer to reference the computed column directly rather than rely on expression matching, but expression matching can be really useful when you can’t change the code.
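
Referencing it directly just means filtering on the TotalVotes column we added up above, so the order of the underlying expression stops mattering:

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.TotalVotes > 1000;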

Fourth,  don’t you ever ever never ever ever stick a scalar UDF in a computed column or check constraint. Let’s see what happens:

CREATE FUNCTION dbo.suck(@Upvotes int, @Downvotes int)
RETURNS int
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN
    RETURN (SELECT @Upvotes + @Downvotes);
END;
GO 

ALTER TABLE dbo.Users ADD TotalVotes AS dbo.suck(UpVotes, DownVotes);
CREATE INDEX u ON dbo.Users (TotalVotes) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.DownVotes + u.UpVotes) > 1000;

Remember that this is the query that has things backwards and doesn’t use the index on our computed column, but look what happened to the query plan:

[Image: SQL Server query plan ("me without makeup")]

Querying a completely different index results in a plan that SQL Server can’t parallelize because of the function.

Things that won’t fix this:

  • SQL Server 2019 scalar UDF inlining
  • Persisting the computed column
  • Using a different kind of T-SQL function; you can’t use them in computed columns

Things that might fix this: getting rid of the function, and defining the computed column with a plain expression instead, like the first example up above.

Fifth: Column store indexes are weird with them. There’s an odd bit of a matrix, too.

  • Anything before SQL Server 2017, no dice
  • Any nonclustered columnstore index through SQL Server 2019, no dice
  • For 2017 and 2019, you can create a clustered columnstore index on a table with a computed column as long as it’s not persisted
--Works
CREATE TABLE clustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date));
CREATE CLUSTERED COLUMNSTORE INDEX c ON dbo.clustered_columnstore;

--Doesn't work
CREATE TABLE nonclustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date));
CREATE NONCLUSTERED COLUMNSTORE INDEX n ON dbo.nonclustered_columnstore(id, some_date, next_date, diff_date);

--Clean!
DROP TABLE dbo.clustered_columnstore, dbo.nonclustered_columnstore;

--Doesn't work, but throws a misleading error
CREATE TABLE clustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date) PERSISTED);
CREATE CLUSTERED COLUMNSTORE INDEX c ON dbo.clustered_columnstore;

--Still doesn't work
CREATE TABLE nonclustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date) PERSISTED);
CREATE NONCLUSTERED COLUMNSTORE INDEX n ON dbo.nonclustered_columnstore(id, some_date, next_date, diff_date);

--Clean!
DROP TABLE dbo.clustered_columnstore, dbo.nonclustered_columnstore;

General Uses


The most general use for computed columns is to materialize an expression that a query has to filter on, but that wouldn’t otherwise be able to take advantage of an index to locate rows efficiently, like the UpVotes and DownVotes example above.

Even with an index on UpVotes, DownVotes, nothing in your index keeps track of what row values added together would be.

SQL Server has to do that math every time the query runs and then filter on the result. Sometimes those expressions can be pushed to an index scan, and other times they need a Filter operator later in the plan.

Consider a query that inadvisably does one of these things:

  • function(column) = something
  • column + column = something
  • column + value = something
  • value + column = something
  • column = case when …
  • value = case when column…
  • convert_implicit(column) = something

As long as all values are known ahead of time — meaning they’re not a parameter, variable, or runtime constant like GETDATE() — you can create computed columns that you can index and make searches really fast.

Take this query and index as an example:

SELECT c = COUNT_BIG(*) FROM dbo.Posts AS p WHERE DATEDIFF(YEAR, p.CreationDate, p.LastActivityDate) > 9;
CREATE INDEX p ON dbo.Posts(CreationDate, LastActivityDate) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

The best we can do is still to read every row via a scan:

[Image: SQL Server query plan ("still not good")]

But we can fix that by computing and indexing:

ALTER TABLE dbo.Posts ADD ComputedDiff AS DATEDIFF(YEAR, CreationDate, LastActivityDate);
CREATE INDEX p ON dbo.Posts(ComputedDiff) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE, DROP_EXISTING = ON);

And now our query plan is much faster, without needing to go parallel, or more parallel, to get faster:

[Image: SQL Server query plan ("improvement!")]

SQL Server barely needs to flinch to finish that query, and we get an actually good estimate to boot.

Crappy Limitations


While many computed columns can be created, not all can be indexed. For example, something like this would be lovely to have and to have indexed:

ALTER TABLE dbo.Users ADD RecentUsers AS DATEDIFF(DAY, LastAccessDate, SYSDATETIME());
CREATE INDEX u ON dbo.Users (RecentUsers);

While the column creation does succeed, the index creation fails:

Msg 2729, Level 16, State 1, Line 177

Column ‘RecentUsers’ in table ‘dbo.Users’ cannot be used in an index or statistics or as a partition key because it is non-deterministic.

You also can’t reach out to other tables:

ALTER TABLE dbo.Users ADD HasABadge AS CASE WHEN EXISTS (SELECT 1/0 FROM dbo.Badges AS b WHERE b.UserId = Id) THEN 1 ELSE 0 END;

SQL Server doesn’t like that:

Msg 1046, Level 15, State 1, Line 183

Subqueries are not allowed in this context. Only scalar expressions are allowed.

There are others, but these are the most common disappointments I come across.

Some of the things that computed columns fall flat with are things we can remedy with indexed views, but boy howdy are there a lot of gotchas.

We’ll talk about those tomorrow!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Filtered Indexes

Home Star


For all the good indexes do, sometimes you just don’t need them to cover all of the data in a table for various reasons.

There are a number of fundamentals that you need to understand about them (that any good consultant will tell you about), but I wanna cover them here just in case you had an unfortunate run-in with a less-than-stellar consultant.

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF

Second, you need to be careful about parameterized queries. If your filtered index and query looks like this:

CREATE INDEX u ON dbo.Users(Reputation) WHERE Reputation > 100000;

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > @Reputation;

The optimizer will not use your filtered index, because it has to pick a safe and cache-able plan that works for any input to the @Reputation parameter (this goes for local variables, too).

To get around this, you can use:

  • Recompile hint
  • Literal value
  • Potentially unsafe dynamic SQL
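
The recompile route is the easiest one to sketch out. With the hint in place, the optimizer gets to see the runtime value of @Reputation, and can choose the filtered index when that value qualifies:

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > @Reputation OPTION (RECOMPILE);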

Third, you need to have the columns in your filtering expression be somewhere in your index definition (key or include) to help the optimizer choose your index in some situations.

Let’s say you have a filtered index that looks like this:

CREATE INDEX u ON dbo.Users(DisplayName) WHERE Reputation > 100000;

As things stand, all the optimizer knows is that the index is filtered to Reputation values over 100k. If you need to search within that range, like 100k-200k, or >= 500k, it has to get those values from somewhere, and it has the same options as it does for other types of non-covering indexes:

  • Ignore the index you thoughtfully created and use another index
  • Use a key lookup to go back to the clustered index to filter specific values

General Uses


The most common uses that I see are:

  • Indexing for soft deletes
  • Indexing unique values and avoiding NULLs
  • Indexing for hot data
  • Indexing skewed data to create a targeted histogram
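
To make the first one concrete, the soft delete pattern usually ends up looking something like this sketch, where dbo.Orders, CustomerId, OrderDate, and IsDeleted are made-up names standing in for your own schema:

CREATE INDEX o
ON dbo.Orders (CustomerId, OrderDate) --hypothetical table and columns
WHERE IsDeleted = 0
WITH (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);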

There are others, but one thing to consider when creating filtered indexes is how much of your data will be excluded by the definition.

If more than half of your data is going to end up in there, you may want to think hard about what you’re trying to accomplish.

Another potential use that I sometimes find for them is using the filter to avoid needing an overly-wide index key.

Let’s say you have a super important query that looks like this:

SELECT
    u.DisplayName,
    TotalScore = 
        SUM(p.Score)
FROM dbo.Posts AS p
JOIN dbo.Users AS u
    ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1
AND   p.AcceptedAnswerId > 0
AND   p.ClosedDate IS NULL
AND   p.Score > @Score
AND   p.OwnerUserId = @OwnerUserId
ORDER BY TotalScore DESC;

You’re looking at indexing for one join clause predicate and four additional where clause predicates. Do you really want five key columns in your index? No?

How about this?

CREATE INDEX 
    p 
ON dbo.Posts
    (OwnerUserId, Score)
INCLUDE
    (PostTypeId, AcceptedAnswerId, ClosedDate)
WHERE
    (PostTypeId = 1 AND AcceptedAnswerId > 0 AND ClosedDate IS NULL)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

This will allow your query to pre-seek to the literal value predicates, and then evaluate the parameterized predicates in the key of the index.

Crappy Limitations


There are lots of things you might want to filter an index on, like an expression. But you can’t.

I think one of the best possible use cases for a filtered index that is currently not possible is to isolate recent data. For example, I’d love to be able to create a filtered index like this:

CREATE INDEX u ON dbo.Users(DisplayName) WHERE CreationDate > DATEADD(DAY, -30, CONVERT(date, SYSDATETIME()));

So I could just isolate data in the table that was added in the last 30 days. This would have a ton of applications!

But then the index would have to be self-updating, churning data in and out on its own.

For something like that, you’ll need a computed column. But even indexing those can be tricky, so we’ll talk about those tomorrow.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Indexes To Replace Multiple Indexes

Buffet Tables


If you’ve ever spent any time reading about relational database theory, you’ve probably read all sorts of things about normal forms and all that good stuff. You’ve also probably immediately forgotten all of it.

At least that’s what I’d guess is what happened based on what I see when I look at database designs out there. Tables are often dozens of columns wide, with all sorts of  things in there that should be in their own table.

  • Columns with a common prefix, like Customer_Name, Customer_Address, etc.
  • Columns with a common suffix, like PhoneNumber_1, PhoneNumber_2, etc.

Even fairly-well designed tables often get somewhat overloaded with information, especially when years of additional development requirements lead to a choice between adding one more column to a table and normalizing several columns out to a new table.

Sound familiar?

Let’s move on.

Indexing


The more columns you have in a table, the more potential column combinations there are for indexes. Much like columns, indexes tend to get added following the path of least resistance.

Very rarely does someone consider current indexes when deciding to add an index. They might be doing something like:

  • Evaluating missing index requests in a query plan
  • Evaluating missing index requests in the DMVs
  • Running DTA like a darn fool and saying yes to everything

I’ll often run into tables that have indexes like these:

CREATE INDEX p1 ON dbo.Posts(OwnerUserId);
CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId);
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate);
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate);
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);
CREATE INDEX p6 ON dbo.Posts(CreationDate, LastActivityDate, Score);

I’m leaving off includes here, because you can combine those in any order without disturbing much.

Key columns are a different story, of course. Order matters quite a bit, because, via the magic of co-dependency, key column order defines how queries can efficiently access data.

Merging


Looking at this list of indexes, we know a couple things immediately:

  • The index on just OwnerUserId isn’t of any obvious value, and we can get rid of it
  • The index that leads on CreationDate  serves a totally different type of query, and we should leave it alone for now

Assuming all of these indexes have a not-crappy read to write ratio, what do we do next?

  • Make sure none of them are unique
  • Make sure none of them are filtered

These indexes all have multiple key columns that sort of line up:

CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); 
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate); 
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);

Of these, three of them are on OwnerUserId and PostTypeId, and one isn’t. We can throw that one out.

CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); 
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);

Okay, now what? Just looking at the definitions here, we can safely get rid of the first index, because any query that uses it can safely use the other two indexes.

But for the remaining two, we can’t do much. It’s not necessarily safe to take the third key column from one and add it to the other as a fourth key column.

We have to look at how they’re used, first. If one index is unused or rarely used, we could make that change.

Things might get more difficult to figure out if we have another index like this:

CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);
CREATE INDEX p7 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate, LastActivityDate);

Sure, it means we could get rid of the index that ends with CreationDate, but it makes the picture a little less clear about how we should deal with Score.

Ending


To be safe, I’d probably stop after just getting rid of the ones where the key columns are a match as far as order goes, and are a subset of wider indexes.

CREATE INDEX p1 ON dbo.Posts(OwnerUserId); --Drop this one
CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); --Drop this one
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate); --Keep this one
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); --Drop this one
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score); --Keep this one 
CREATE INDEX p6 ON dbo.Posts(CreationDate, LastActivityDate, Score); --Keep this one
CREATE INDEX p7 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate, LastActivityDate); --Keep this one

After that, I’d probably look at usage stats, to see if any of them just aren’t holding up their end of the bargain.
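
If you want a quick way to eyeball that, something like this rough sketch against sys.dm_db_index_usage_stats shows reads versus writes per index. The numbers reset when SQL Server restarts, so check your uptime before trusting them:

SELECT
    table_name = OBJECT_NAME(i.object_id),
    index_name = i.name,
    total_reads = ISNULL(ddius.user_seeks + ddius.user_scans + ddius.user_lookups, 0),
    total_writes = ISNULL(ddius.user_updates, 0)
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ddius
    ON  ddius.object_id = i.object_id
    AND ddius.index_id = i.index_id
    AND ddius.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'dbo.Posts')
ORDER BY total_writes DESC;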

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding And Removing Indexes

A Good Thing


Since you’re here reading this, I’m going to assume you know what food is. Further, I’m going to assume that you also know what salt is, because salt plays a pretty big part in how good or bad food tastes.

  • Skimp on salt, and you’ll see a lot of disappointed faces
  • Go overboard on salt, and you’ll see a lot of sour faces

This is a lot like what you’ll see if you do either one of those things with indexes in your database, unless you have Special Medical Exemptions©.

  • If it’s a data warehouse, you’ll probably want fewer indexes, but you’ll also probably want to use column store indexes
  • If it’s normal throughput OLTP, you’ll probably want more indexes than a data warehouse, but with narrower key column choices to support seek and lookup plans
  • If it’s OLTP + Reporting (or analytics or whatever you wanna call it) you’re going to need to make a lot of tough indexing choices for them to live together
  • If it’s high throughput OLTP, you’ll probably want even fewer overall indexes, and you may need to look into in-memory options for the busiest tables

With all that in mind, there might be some local factors that make some of this stuff not align particularly well with your workload. Things like version, edition, query patterns, and hardware might be holding you back. And of course, being at the mercy of a terrible vendor application would also likely put a damper on many of these plans.

If you want a quick and easy way to find indexes to both remove and add, go ahead and grab sp_BlitzIndex.

Practically Common


In my consulting work, the most common type of workload I see is OLTP with reporting on top of it, followed by analytics workloads. Pure OLTP workloads are pretty rare.

The index strategy for these workloads might look something like this:

  • Narrow key indexes with very few includes for OLTP queries
  • Fairly narrow key indexes with wider includes for reporting queries, or…
  • Nonclustered column store indexes for reporting queries

It’s always tricky figuring out what the best set of indexes to have around for your workload is. There are so many missing index requests, new queries, users complaining, and new tables getting added that it’s hard not to feel like you’re drowning.

You can’t have the perfect index for every query, and you shouldn’t try to. You’d end up with the sour-salt face kind of database. But what I need you to do is make a deal with yourself: Any time you want to add an index, you have to look for an index to get rid of.

Why?

Because without fail, I see these patterns all the time:

  • Indexes with a single key column alongside wider indexes that lead with the same column
  • Indexes with the exact same key columns and key column order (sometimes one has more key columns at the end)
  • Indexes that are totally unused by read queries but get modified a lot
  • Indexes with only a few reads and a ton of writes

It’s all very bothersome clutter, and you’ve got to clear these out before you can add in more.

Commonly Practical


There are many good reasons to avoid over-indexing. Everyone says “writes get slow”, but that’s a shabby meme. Your writes are slow because your storage is crappy and possibly because of your poorly thought out Availability Group.

Here’s why having an excessive number of indexes (again, barring Special Medical Exemptions©) can be bad for your database performance:

  • Locking:
    • More objects to lock means more locking generally
    • More objects to lock means a greater chance of lock escalation attempts
  • Buffer pool:
    • More indexes means more objects in the buffer pool
    • Even indexes unused by reads need to be modified when the table changes
  • Transaction logging:
    • More indexes to modify means more to keep track of in the transaction log
    • Even databases in simple recovery log the same amount 99% of the time

So the next time you want to go add an index to a table, I want you to try a few things:

  • Look for indexes on the table that are totally unused
  • Look for indexes on the table with overlapping key columns to merge
  • Look for indexes that could be tweaked rather than creating a whole new one

To do that, run sp_BlitzIndex like so:

EXEC dbo.sp_BlitzIndex
    @DatabaseName = N'YourDatabase',
    @SchemaName = N'ProbablyDBO',  
    @TableName = N'YourIndexingCatastrophe';

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Setting Target Recovery Interval

Everywhere You Look


Finding out about this was probably like when some probably guy probably in a wig or something discovered atoms and was like “ohh wowwe therre are’st atommes ev’rywhislt I doth looketh, happie dais!”.

If that’s not historically accurate, I apologize. I’m a child of divorce and public school systems.

If you want a quick and dirty way to see if your server is having a tough time with checkpoint activity, you can run this to search your most recent error log for relevant messages.

  • If you recently restarted your server, you may need to search additional error logs
  • If you cycle your error logs frequently, you may need to search additional error logs
  • If your server is associated with Cloud Nonsense© these commands may not work
CREATE TABLE 
    #ErikLog
(
    LogDate datetime,
    ProcessInfo varchar(50),
    Text varchar(MAX)
);

INSERT 
    #ErikLog WITH(TABLOCK)
(
    LogDate,
    ProcessInfo,
    Text
)
EXEC sys.xp_readerrorlog 
    0, --Just the first log file
    1, --SQL Server error log
    N'FlushCache'; --Search string

INSERT 
    #ErikLog WITH(TABLOCK)
(
    LogDate,
    ProcessInfo,
    Text
)
EXEC sys.xp_readerrorlog 
    0, --Just the first log file
    1, --SQL Server error log
    N'saturation'; --Search string

SELECT
    el.*
FROM #ErikLog AS el
ORDER BY el.LogDate;

DROP TABLE IF EXISTS
    #ErikLog;

You can adjust the first parameter there to look back at older log files, too, if you need to.

So what happened? Was it everything you imagined?

Did You Find Anything?


If you got some hits searching through the error log, it may be a sign that you need to make a change.

  • Who can change this? You!
  • What change? Target Recovery Interval!
  • Where do I change it? At the database level!
  • When should I change it? Whenever, my friend!
  • Why database level and not at the server level? If your databases are in an AG or something, you want the change to follow them to other servers!
  • How do I change it? Like so!
DECLARE 
    @sql nvarchar(max) = N'';
 
SELECT 
    @sql += 
        N'ALTER DATABASE ' + 
        QUOTENAME(d.name) + 
        N' SET TARGET_RECOVERY_TIME = 60 SECONDS;' + 
        NCHAR(13)
FROM sys.databases AS d 
WHERE d.target_recovery_time_in_seconds <> 60
AND   d.database_id > 4 
AND   d.state = 0 
AND   d.is_in_standby = 0 
AND   d.is_read_only = 0;
 
SELECT 
    cmd = @sql;
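
The script above only builds and prints the commands so you can review them first. Once you’re happy with the output, you can copy and paste it, or execute the string directly; this last step is just a sketch of that:

/*Only run this after reviewing the printed commands*/
EXEC sys.sp_executesql
    @sql;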

Why Are We Doing This?


Long checkpoints can cause all sorts of weirdball issues while they’re happening, and they don’t just happen when you run a checkpoint command.

Log and differential backups cause them, as well as internal stuff around recovery time.

The good news is that this is the default for databases created on SQL Server 2016 or better. The bad news is that most of you out there started creating your databases way before that, and you’ve just been restoring them to newer versions without checking out any of the hot new gadgets and gizmos available.

Not that I blame you. Stuff like this doesn’t exactly make headlines, and you’re busy with important stuff, like hemming and hawing over fill factor.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Setting Instant File Initialization

Did You Forget?


This is another one of those settings that’s so important, Microsoft put a switch for it in the installer. And yet, I end up looking at servers where it’s not enabled.

When I ask why, I usually expect to hear some 2003 wisdom about security. But that doesn’t happen; it’s just blank stares.

How do you know if this is enabled?

SELECT  
    dss.servicename,
    dss.service_account,
    dss.instant_file_initialization_enabled 
FROM sys.dm_server_services AS dss;

Simple as that.

Why Enable Instant File Initialization?


Turning this on allows SQL Server data files to grow without writing a bunch of zeroes out to disk. That can be useful in a number of ways.

Just picture SQL Server giving disk space a big ol’ hug and Windows not even getting jealous.

But Erik, you said data files, what about all those important transaction log files?

Sorry, there’s no helping them.

Put them on the fastest storage you can get and pray.

Enabling Instant File Initialization


You can use dbatools to enable IFI from PowerShell. If you’re not comfortable with that, you can get it done this way:

If for some reason you didn’t turn this on when you installed SQL Server, just take the service account you got from up yonder, and type “secpol.msc” into the search bar.

[Image: Local Security Policy ("tell somebody")]

After that, click on “User Rights Assignment” and look for the “Perform Volume Maintenance Tasks” entry in the Policy tab.

[Image: Local Security Policy ("no mistake")]

From there, it’s a matter of adding the service account here. If you’re using the default SQL Server service account, “NT Service\MSSQLSERVER…” it will likely already have it.

But you know, just in case. G’head and click that.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.