Dealing With Wide Missing Index Requests In SQL Server

We’ve All Been There


You’re running a query that selects a lot of columns, and you get a missing index request.

For the sake of brevity, let’s say it’s a query like this:

SELECT
    p.*
FROM dbo.Posts AS p
WHERE p.ParentId = 0;

The missing index request I get for this query looks about like so:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Posts] ([ParentId])
INCLUDE ([AcceptedAnswerId],[AnswerCount],[Body],[ClosedDate],[CommentCount],[CommunityOwnedDate],
[CreationDate],[FavoriteCount],[LastActivityDate],[LastEditDate],[LastEditorDisplayName],[LastEditorUserId],
[OwnerUserId],[PostTypeId],[Score],[Tags],[Title],[ViewCount])

But that’s laughable, because it’s essentially a shadow clustered index. It’s every column in the table ordered by <some column>.

And Again


Under many circumstances, you can trim all those included columns off and make sure there’s a usable index with ParentId as the leading column.
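
That trimmed-down version would look something like this (the index name is mine):

CREATE NONCLUSTERED INDEX
    p_ParentId
ON dbo.Posts (ParentId);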

I’m not a fan of single key column indexes most of the time, so I’d avoid that practice.

But sure, if you have reasonably selective predicates, you'll get a decent seek + lookup plan. That's not always going to be the case, though. For various reasons, you may get a poor enough estimate on a reasonably selective predicate to end up with a pretty bad plan anyway.

Of course, other times you may not have very selective predicates at all. Take that query up there, for example. There are 17,142,169 rows in the Posts table (2013), and 6,050,820 of them qualify for our predicate on ParentId.

This isn’t a case where I’d go after a filtered index, either, because it’d only be useful for this one query. And it’d still be really wide.

There are four string columns in there, all nvarchar.

  • Title (250)
  • Tags (150)
  • LastEditorDisplayName (40)
  • Body (max)

Maybe Something Different


If I’m going to create an index like that, I want more out of it than I could get with the one that the optimizer asked for.

On a decently recent version of SQL Server (preferably Enterprise Edition), I’d probably opt for creating a nonclustered column store index here.
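
Something like this sketch, with the column list trimmed to things I'd plausibly search and aggregate on (the name and exact columns here are my picks, not a prescription). Note that Body couldn't come along anyway, since nvarchar(max) columns aren't allowed in a nonclustered column store index:

CREATE NONCLUSTERED COLUMNSTORE INDEX ncsi_posts
ON dbo.Posts
(
    ParentId,
    OwnerUserId,
    PostTypeId,
    Score,
    CreationDate,
    LastActivityDate,
    AnswerCount,
    CommentCount,
    FavoriteCount
);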

You get a lot of benefits from that, which you wouldn’t get from the row store index.

  • Column independence for searching
  • High compression ratio
  • Batch Mode execution

That means you can use the index for better searching on other predicates that aren’t terribly selective, the data source is smaller and less likely to be I/O bound, and batch mode is aces for queries that process a lot of rows.

Column store indexes still have some weird limitations and restrictions, especially around data types and included columns. I don't quite understand why there isn't better parity between clustered and nonclustered column store.

Thanks for reading!


LOB Data, Recompile, And Too Much tempdb Usage In SQL Server

Stumper


I got a very interesting issue about sp_HumanEvents recently, and I couldn’t quite figure out why things were going wonky.

The problem was that at some point when monitoring blocking long-term, something was taking up a whole lot of tempdb.

I’m gonna say up front that the problem was using a recompile hint with LOB variable assignment in a loop. That doesn’t mean you should avoid any one of those things, but be careful when you use them in concert.

This post wouldn’t be possible without my friends Joe and Paul, who helped me track down the issue and understand the internals behind it.

Thumper


Rather than make you go through setting up and causing blocking, here's a demo that reproduces the issue:

DECLARE @lob nvarchar(max) = N'x';
DECLARE @x xml = N'<x>' + REPLICATE(@lob, 1024 * 1024) + N'</x>';
DECLARE @loop integer = 0;
DECLARE @t table (x nvarchar(1));

WHILE @loop < 5
BEGIN

    INSERT 
        @t
    (
        x
    )
    SELECT 
        x = @x.value('(./x/text())[1]', 'nvarchar(1)')
    OPTION (RECOMPILE);

    DELETE @t;

    SELECT 
        ddtsu.internal_objects_alloc_page_count, 
        ddtsu.internal_objects_dealloc_page_count,
        internal_object_reserved_page_count = 
        (
            SELECT 
                SUM(internal_object_reserved_page_count)
            FROM tempdb.sys.dm_db_file_space_usage
        )
    FROM sys.dm_db_task_space_usage AS ddtsu
    WHERE ddtsu.session_id = @@SPID;

    SET @loop += 1;
END;

The trick here is to run it once with the recompile hint in place, and then again with the hint commented out.

Jumper


With recompile available, this is the result:

[Image: results with the recompile hint in place]

The internal objects keep going up, except the deallocated number. That’s the column in the middle.

With the recompile hint commented out, the numbers are a lot different.

[Image: results with the recompile hint commented out]

All numbers stay stable.

Lumper


As Mr. White points out in his post, assigning to a variable prevents parameter embedding.

  • When it’s allowed, tempdb objects get cleaned up at the end of the statement.
  • When it’s not, they get cleaned up at the end of the batch.

This has been reported to Microsoft for some analysis. Hopefully there’s some remedy for it in the future.

Thanks for reading!


Updates To sp_HumanEvents, sp_QuickieStore, And More!

For Free!


Here’s some stuff I’ve been working on lately:

sp_HumanEvents:

  • Added the recompile_cause column to the output of the recompile event
  • Added the wait_resource column to the blocking event output
  • Fixed a bug around table creation
  • MsSQLServerDBA fixed an issue where GO statements were preventing deployment with dbatools
  • Removed a recompile hint that was leading to high tempdb allocations (more on that tomorrow)

sp_QuickieStore:

  • Fixed a bug where date filtering was messing up query results

Enjoy, and as always, please let me know if there’s anything I can do to improve the scripts’ usability, or about any issues you encounter while using them.

Thanks for reading!


How To Use sp_HumanEvents To Troubleshoot A Slow Stored Procedure In SQL Server

Number One Record


You’ve got a stored procedure that runs slowly, but the problem is that it’s… long. It’s really hard to tell which part is slow, sometimes.

With shorter procedures you can probably just collect actual execution plans and slam F5 like a tall glass of gin at 6am.

But you don’t wanna do that with the larger procedures, for a few practical reasons:

  • Lots of little queries run quickly, and we don’t care about those
  • Navigating through lots of plans in SSMS is tedious
  • There’s no differentiation when other procedures, etc. are invoked
  • You introduce a lot of overhead retrieving and rendering all those plans
  • The full query text might not be captured, which is a limitation in many places

Let’s save the day with sp_HumanEvents, my stored procedure that makes using Extended Events really easy.

Wanna Ride


There are a lot of issues you can run into with Extended Events. They’re rather unpleasant, and there’s almost zero guidance from Microsoft about usage.

Wouldn’t it be nice to just hit F5?

Once you have a window open with your procedure ready to run, take note of the session id that it’s using, and tailor this command for your situation:

EXEC sp_HumanEvents
    @event_type = 'query',      /*collect executed query events*/
    @query_duration_ms = 1000,  /*only queries that run for a second or more*/
    @session_id = N'58',        /*just this session id*/
    @keep_alive = 1;            /*leave the session running afterwards*/

Let’s break down what this does:

  • The events we want to focus on are for executed queries
  • We only care about queries that run for more than one second
  • We’re going to focus in on the session id for the query window we’re tuning in
  • We want the session to stay running so we can watch and re-watch after we make changes

I know it looks a little funny that @session_id is a string. That’s because there are some sampling capabilities, for when you want to look at a whole workload instead of a single procedure without overwhelming a server by collecting everything.
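
If memory serves, sampling looks something like this; treat the sampling parameters as my recollection rather than gospel, and check the documentation before relying on them:

EXEC sp_HumanEvents
    @event_type = 'query',
    @query_duration_ms = 1000,
    @session_id = N'sample',    /*sample sessions instead of naming one*/
    @sample_divisor = 5;        /*watch roughly every fifth session*/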

Once that’s done, you’ll have a new Extended Event session like this, and you’ll wanna watch live data from it:

[Image: the new Extended Event session in SSMS]

Watch Out Now


Once you’ve got that window open and you run your procedure, any queries that meet the duration criteria will show up, and should look something like this.

[Image: live data showing captured queries]

The procedure that I’m looking at in here is sp_BlitzCache, because it’s a good example of a procedure with a lot of queries in it, where only some of them (like the XML parsing) might get slowed down.

Some notes on the output:

  • The statement doesn’t get collected with the query plan (more on that in a second)
  • INSERT…EXEC shows up as two statements (lines 2 and 3 over there)

You only see these two columns at first, but you can go through and add in any other columns that you find useful from each specific session type.

Two Point Two


The output can be a little confusing at first. Generally the pattern is query plan, then statement associated with it.

If you click on each event, you can choose different elements from it to show in the main result window.

[Image: picking elements from an event]

Here’s how I usually set things up to find what I wanna go after:

[Image: my usual column setup]

I like to grab CPU, duration, information about requested, granted, and used memory, and the statement.

Not every element is available in every event. Where things are NULL, that’s where they’re not available. I wish there were a general extended event that captured everything I need in one go, but that doesn’t seem to exist.

One last point on the output is that if you click on the query_post_execution_showplan lines, the query plan is available in the second result set:

[Image: the query plan in the second result set]

Crossover


Before you ask, the lightweight profiling events are useless. They’re so nerfed and collect so little helpful detail that you might as well be looking at a cached plan or Query Store plan.

Lightweight, baby 🙄

To that point, I don’t recommend running something like this across a whole workload. Though Extended Events are purported to have far less observer overhead than Profiler, I’ve seen even heavily-filtered sessions like this slow workloads down quite a bit.

That goes for other traces and forms of monitoring as well. Don’t think this is a special case.

Anyway, this is how I work when I’m trying to tune things for clients. Hopefully you find it useful as well.

Grab sp_HumanEvents from GitHub and lemme know how it goes.

Thanks for reading.


An Undocumented Restriction For Parallel Inserts In SQL Server That Can Kill Performance

Insert Card


I’d like to start this post off by thanking my co-blogger Joe Obbish for being lazy and not blogging about this when he first ran into it three years ago.

Now that we’re through with pleasantries, let’s talk turkey.

Over in this post, by Arvind Shyamsundar, which I’m sure Microsoft doesn’t consider official documentation since it lacks a GUID in the URL, there’s a list of… things about parallel inserts.

  • Just as it is with SQL Server 2016, in order to utilize the parallel insert in Azure SQL DB, do ensure that your compatibility level is set to 130. In addition, it is recommended to use a suitable SKU from the Premium service tier to ensure that the I/O and CPU requirements of parallel insert are satisfied.
  • The usage of any scalar UDFs in the SELECT query will prevent the usage of parallelism. While usage of non-inlined UDFs are in general ‘considered harmful’ they end up actually ‘blocking’ usage of this new feature.
  • Presence of triggers on the target table and / or indexed views which reference this table will prevent parallel insert.
  • If the SET ROWCOUNT clause is enabled for the session, then we cannot use parallel insert.
  • If the OUTPUT clause is specified in the INSERT…SELECT statement to return results to the client, then parallel plans are disabled in general, including INSERTs. If the OUTPUT…INTO clause is specified to insert into another table, then parallelism is used for the primary table, and not used for the target of the OUTPUT…INTO clause.
  • Parallel INSERT is used only when inserting into a heap without any additional non-clustered indexes. It is also used when inserting into a Columnstore index.
  • Watch out when IDENTITY or SEQUENCE is present!

In the actual post, some of these points are spread out a bit; I’ve editorially condensed them here. Some of them, like OUTPUT and UDFs, I’ve blogged about a bazillion times over here.

Others may come as a surprise, like well, the rest of them. Hm.

But there’s something missing from here, too!

Lonesome


Let’s create a #temp table, here.

DROP TABLE IF EXISTS 
    #parallel_insert;

CREATE TABLE
    #parallel_insert
(
    id int NOT NULL
);

Now let’s look at a parallel insert. I’m using an auxiliary Numbers table for this demo because, whatever, it’s my demo.

INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    n.Number
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);

The query plan does exactly what we want it to do, and stays parallel through the insert.

[Image: a fully parallel insert plan]

Referential


If we drop and re-create the #temp table, and then run this insert instead, that doesn’t happen:

INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    n.Number
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
AND NOT EXISTS
    (
        SELECT
            1/0
        FROM #parallel_insert AS p
        WHERE p.id = n.Number
    )
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);

Note that the NOT EXISTS is against an empty table, and will not eliminate any rows. The optimizer estimates this correctly, and yet…

[Image: the insert running in a serial zone]

The insert is in a fully serial zone. This happens because we reference the table that we’re inserting into in the select portion of the query. It’s no longer just the target for the select to insert into.

Zone Out


If you’re tuning queries like this and hit situations where this limitation kicks in, you may need to use another #temp table to stage rows first.
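
A sketch of that workaround (the #stage table name is mine): read from #parallel_insert while inserting into a different table, then do the final insert without referencing the target in the select.

DROP TABLE IF EXISTS
    #stage;

CREATE TABLE
    #stage
(
    id int NOT NULL
);

/*this insert can stay parallel; #parallel_insert is only read here, not targeted*/
INSERT
    #stage WITH (TABLOCK)
(
    id
)
SELECT
    n.Number
FROM dbo.Numbers AS n
JOIN dbo.Numbers AS n2
    ON n2.Number = n.Number
WHERE NOT EXISTS
(
    SELECT
        1/0
    FROM #parallel_insert AS p
    WHERE p.id = n.Number
)
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')
);

/*the final insert no longer references its own target in the select*/
INSERT
    #parallel_insert WITH (TABLOCK)
(
    id
)
SELECT
    s.id
FROM #stage AS s;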

I’m not complaining about this limitation. I can only imagine how difficult it would be to guarantee correct results in these situations.

I do want to point out that the fully parallel insert finishes in around 250ms, and the serial zone insert finishes in 1.4 seconds. This likely isn’t the most damning thing, but in larger examples the difference can be far more profound.

Thanks for reading!


How Avoiding Normalization Contributes To Skewed Data In SQL Server Tables

One Who Overflew


In the Stack Overflow database, the biggest (and probably most important) table is Posts.

The Comments table should be truncated hourly. Comments were a mistake.

But the Posts table suffers from a serious design flaw in the public data dump: Questions and Answers are in the same table.

I’ve heard that it’s worse behind the scenes, but I don’t have any additional details on that.

Aspiring Aspirin


This ends up with some weird data distributions. Certain attributes can only ever be “true” for a question or an answer.

For example, only questions can have a non-zero AcceptedAnswerId, or AnswerCount. Some questions might have a ClosedDate, or a FavoriteCount, too. In the same way, only answers can have a ParentId. That leads to some really strange patterns in the data.

[Image: SQL Server query results showing the skew]
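
If you want to see it for yourself, an aggregate along these lines (my sketch, not necessarily the exact query behind the screenshot) will do it:

SELECT
    p.PostTypeId,
    total = COUNT_BIG(*),
    has_accepted_answer = SUM(CASE WHEN p.AcceptedAnswerId > 0 THEN 1 ELSE 0 END),
    has_parent = SUM(CASE WHEN p.ParentId > 0 THEN 1 ELSE 0 END)
FROM dbo.Posts AS p
GROUP BY p.PostTypeId
ORDER BY total DESC;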

Was it easier at first to design things this way? Probably. But introducing skew like this only makes dealing with parameter sniffing issues worse.

Even though questions and answers are the most common types of Posts, they’re not the only types of Posts. Even if you make people specify a type along with other things they’re searching for, you can end up with some really different query plans.

[Image: SQL Server query results]

Designer Drugs


When you’re designing tables, try to keep this sort of stuff in mind. It might not be a big deal for small tables, but once you realize your data is getting big, it might be too late to make the change. It’s not just a matter of changes to the database, but the application, too.

Late stage redesigns often lead to the LET’S JUST REWRITE THE WHOLE APPLICATION FROM THE GROUND UP projects that take years and usually never make it.

Thanks for reading!


Changes Coming To SQL Server’s STRING_SPLIT Function: Optional Ordinal Position

Blueprint


In 2016, we got the STRING_SPLIT function. That was nice, because prior implementations had a lot of problems.

But out of the gate, everyone looked at what we got and couldn’t figure out why this would ship without a column to tell you the position of each element in the string.

Recently I updated SSMS to 18.10, and went to work on a couple scripts that use STRING_SPLIT.

I was immediately confronted by a bunch of RED SQUIGGLY LINES.

Why?

Not Yet But Soon


Huh.

[Image: red squiggly lines under STRING_SPLIT]

Insufficient! You’re insufficient!

[Image: the insufficient arguments complaint]

Oh, enable_ordinal. Neat.

[Image: the enable_ordinal argument]

At least it’s a bigint.

[Image: the ordinal column, typed as bigint]
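
Once you’re on a version that actually supports the argument (Azure SQL DB first, with the boxed product to follow), usage looks like this; passing 1 for enable_ordinal turns the new column on:

SELECT
    ss.value,
    ss.ordinal
FROM STRING_SPLIT(N'dont,break,the,chain', N',', 1) AS ss;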

Thanks for reading!


Azure Managed Instance Doesn’t Manage tempdb Well

Update!


You can now configure size and autogrowth settings for tempdb, but the article doesn’t mention anything about in-memory settings.

Twelve Angry Files


This post is admittedly short, but since we’re short of First Party Solutions™ to gripe to Microsoft with… I need to save this stuff somewhere.

  1. You get 12 tempdb files, even if your instance has 16, 24, 32, 40, 64, or 80 cores
  2. There’s no way to turn on in-memory tempdb
[Image: death of auto tune]
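
Checking the file count yourself is easy enough:

SELECT
    data_files = COUNT_BIG(*)
FROM tempdb.sys.database_files AS df
WHERE df.type_desc = 'ROWS';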

If your workload is tempdb-heavy, and you have a special configuration for it on-prem, you might have a tough time moving it to Managed Instance. Also consider that disk performance (even though tempdb is “local”) is garbage, and you can’t do anything about it.

I think the lack of in-memory tempdb is odd too, since many features are cloud-first. One may wonder aloud why a performance feature that’s a couple years old now still isn’t available in flagship business critical cloud offerings.

It was only recently (September 2021) that scalar UDF inlining became available up in Azure SQL DB and Managed Instance.

Who’s really on the cutting edge? All those stodgy holdouts hanging about in their earthly garb.

Thanks for reading!


Inline Table Valued Functions And Catch All Queries In SQL Server

This Is Not A Solution


Right off the bat, I want you to know that this is not a solution, and I’ll get to why in a minute. I’m writing this mainly because every once in a while I’ll try something different to get this working, and it always ends up disappointing.

I wish I had better news for you, here. Hell, I wish I had better news for me here. But alas we’re at the mercy of parameters.

And yeah, I know, recompile, recompile, recompile. All the live long day. But I’ve seen some weird stuff happen with that too under high concurrency.

So what’s the point? Let’s talk about that.

Dot Dot Dot


CREATE INDEX p1 ON dbo.Posts(OwnerUserId, CreationDate);

CREATE INDEX p2 ON dbo.Posts(Score, LastActivityDate);

We need some indexes. That’s a fact. I’m intentionally creating them in this way to show you that SQL Server can sometimes be smart about catch all queries.

And here’s the inline table valued function we’ll be working with:

CREATE OR ALTER FUNCTION
    dbo.kitchen_sink
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
RETURNS table
AS
RETURN
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE 
        (p.OwnerUserId = @OwnerUserId OR @OwnerUserId IS NULL)
    AND (p.CreationDate >= @CreationDate OR @CreationDate IS NULL)
    AND (p.Score >= @Score OR @Score IS NULL)
    AND (p.LastActivityDate >= @LastActivityDate OR @LastActivityDate IS NULL);

This pattern usually eats the optimizer alive, and there are a lot of posts about using dynamic SQL to fix it.

But when we call this function with literal values, it does just fine.

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, '20130101', NULL, NULL) AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(NULL, NULL, 100, '20130101') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, NULL, NULL, '20130101') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(NULL, '20131225', NULL, '20131225') AS ks;

SELECT
    ks.c
FROM dbo.kitchen_sink(22656, NULL, NULL, '20131215') AS ks;

Das Plan


You can run those all yourself and look at the plans. I’m just gonna throw a couple of the more interesting examples in the post, though.

The first two queries do exactly what we’d hope to see.

[Image: index seek plans]

We use the right indexes, we get seeks. Cardinality estimation is about as reliable as ever with the “””””default””””” estimator in place 🙄

And at one point, we even get a really smart index intersection plan where the optimizer uses both of our nonclustered indexes.

[Image: an index intersection plan]

Parameter Problem


The problem is that no one really makes database calls like that.

If you’re using an ORM, you could intentionally not parameterize your queries and get this to “work”, but there are downsides to that around the plan cache. Being honest, most plan caches are useless anyway.

Long Live Query Store, or something.

Most people have their catch all code parameterized, so the query looks like what’s in the function. I’m going to throw the function in a stored procedure now.

CREATE OR ALTER PROCEDURE
    dbo.kitchen_wrapper
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    SELECT
        ks.c
    FROM dbo.kitchen_sink
    (
        @OwnerUserId, 
        @CreationDate, 
        @Score, 
        @LastActivityDate
    ) AS ks;

END;

If we execute the procedure like this, everything goes to hell rather quickly.

EXEC dbo.kitchen_wrapper
    @OwnerUserId = 22656,          
    @CreationDate = '20131215',    
    @Score = NULL,                  
    @LastActivityDate = NULL;

EXEC dbo.kitchen_wrapper
    @OwnerUserId = NULL,
    @CreationDate = NULL, 
    @Score = 100, 
    @LastActivityDate = '20131215';

Baywatch


The first execution uses the “right” index, but we lose our nice index seek into the p1 index.

[Image: the plan without the nice index seek]

We also end up with Predicates on the Key Lookup, just in case those parameters turn out not to be NULL. And boy, when they’re not NULL, we get a really slow query.

[Image: a really slow query plan]

We re-use the execution plan we saw before, because that’s how SQL Server works. But the p1 predicates are NULL now and filter nothing, so we pass all 17 million rows to the key lookup to filter them there. And since it’s a Nested Loops Join, we do it… one row at a time.

Fun.

Floss Too Much


There’s no great fix for this, either. This is a problem we’re stuck with when we write queries this way without using dynamic SQL, or a recompile hint.

I’ve seen people try all sorts of things to “fix” this problem. Case expressions, ISNULL and COALESCE, magic values, and more. They all have this exact same issue.
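
If you do need to fix it, the usual escape hatch is dynamic SQL that only builds the predicates you were actually handed values for. A rough sketch, using the same parameters as the wrapper above (the procedure name is mine):

CREATE OR ALTER PROCEDURE
    dbo.kitchen_dynamic
(
    @OwnerUserId int,
    @CreationDate datetime,
    @Score int,
    @LastActivityDate datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    DECLARE @sql nvarchar(MAX) = N'
    SELECT
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE 1 = 1';

    /*only append predicates that will actually filter something*/
    IF @OwnerUserId IS NOT NULL
        SET @sql += N'
    AND p.OwnerUserId = @OwnerUserId';

    IF @CreationDate IS NOT NULL
        SET @sql += N'
    AND p.CreationDate >= @CreationDate';

    IF @Score IS NOT NULL
        SET @sql += N'
    AND p.Score >= @Score';

    IF @LastActivityDate IS NOT NULL
        SET @sql += N'
    AND p.LastActivityDate >= @LastActivityDate';

    /*each parameter combination gets its own cached plan*/
    EXEC sys.sp_executesql
        @sql,
      N'@OwnerUserId int, @CreationDate datetime, @Score int, @LastActivityDate datetime',
        @OwnerUserId, @CreationDate, @Score, @LastActivityDate;

END;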

And I know, recompile, recompile, recompile.

Thanks for reading!
