DBA Archives

SQL Server IF Branches And Query Performance Part 1: The Problem

Posted on May 23, 2022 (last modified September 15, 2022) by Erik Darling

Manifesto

This is a problem I deal with quite a bit when helping people track down performance problems and, you know, solve them.

The basic scenario is something like this:

CREATE PROCEDURE
    dbo.iffy_kid
(
    @p1 int,
    @p2 int,
    @decider varchar(10)
)
AS
SET NOCOUNT, XACT_ABORT ON;

IF @decider = 'this_table'
BEGIN

    SELECT
        this.*
    FROM dbo.this_table AS this
    WHERE this.this_column = @p1;

END

ELSE IF @decider = 'that_table'
BEGIN

    SELECT
        that.*
    FROM dbo.that_table AS that
    WHERE that.that_column = @p2;

END

ELSE
BEGIN

    /*Do something else; RETURN is only a placeholder so the block compiles*/
    RETURN;

END;

You have some parameter that decides which logical execution path the procedure will take, and different queries run based on that path.

What the parameter does not control, at least not when the procedure is written this way, is which queries get optimized or which values their cardinality estimates are based on.

First Blood

When this stored procedure is executed for the first time, or when some recompilation event happens, both queries will get a query plan generated and cached.

For simplicity, let’s say that when the query plan is cached, the procedure is compiled and executed with:

@p1 = 100
@p2 = NULL
@decider = 'this_table'

SQL Server’s query optimizer will generate a query plan for the entire stored procedure based on cardinality estimation for:

@p1 = 100 as a predicate on this_table
@p2 = NULL as a predicate on that_table

On future executions, if the runtime execution parameters change to:

@p1 = NULL
@p2 = 200
@decider = 'that_table'

The query plan with cardinality estimation for @p2 = NULL will be reused.

You’ve essentially multiplied any parameter sensitivity issue by:

The number of separate IF branched queries
The number of parameters fed into the stored procedure
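One way to see this for yourself is to run the procedure with each set of parameters, then look at what ended up in the plan cache. This is only a sketch: it assumes the dbo.iffy_kid procedure and its tables from above actually exist in the current database, and that you have permission to read the plan cache DMVs.

```sql
/*First execution: compiles and caches plans for every statement in the procedure*/
EXEC dbo.iffy_kid
    @p1 = 100,
    @p2 = NULL,
    @decider = 'this_table';

/*Second execution: reuses the cached plan, including the that_table
  estimate that was compiled for @p2 = NULL*/
EXEC dbo.iffy_kid
    @p1 = NULL,
    @p2 = 200,
    @decider = 'that_table';

/*One cached plan covers the whole procedure*/
SELECT
    procedure_name = OBJECT_NAME(deps.object_id, deps.database_id),
    deps.cached_time,
    deps.execution_count,
    deqp.query_plan
FROM sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
WHERE deps.database_id = DB_ID()
AND   deps.object_id = OBJECT_ID(N'dbo.iffy_kid');
```

If you open the plan XML, the compiled parameter values in the parameter list should be the ones from the first execution, not the second.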

Exploration

Over the rest of the week, I’m going to cover this topic from a few different angles to show you what works and what doesn’t work for fixing the problem.

Clients that I work with are often very surprised by the gotchas, intricacies, and weird details that crop up when writing queries like this.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Week In Other Databases: IO in PostgreSQL: Past, Present, Future

Posted on May 20, 2022 (last modified May 5, 2022) by Erik Darling

Cool Cool Cool

Grand Bonne

I know what you’re thinking: Who cares about that free database?

Well, it’s not necessarily the Postgres part that you might care about, but more the fact that a third party is developing software to do what major vendors aren’t doing.

This sort of thing might come to SQL Server someday, and it probably should. The self-tuning features in Azure are ass.

Thanks for reading.

Indexed Views In SQL Server: No Filtered Indexes Or Filtered Statistics

Posted on May 12, 2022 (last modified May 16, 2022) by Erik Darling

Half Baked

In my quest to love indexed views more, I’m always trying new things with them to solve problems.

Occasionally, I am pleasantly surprised by what can be accomplished with them. Occasionally.

Today was not an occasion. Let’s take an unfortunate look.

CREATE TABLE
    dbo.IndexedViewMe
(
    id int PRIMARY KEY CLUSTERED
);
GO 

CREATE VIEW 
    dbo.TheIndexedView
WITH SCHEMABINDING
AS
SELECT
    ivm.id
FROM dbo.IndexedViewMe AS ivm;
GO 

CREATE UNIQUE CLUSTERED INDEX
    uqi
ON dbo.TheIndexedView
    (id);

INSERT 
    dbo.IndexedViewMe
(
    id
)
SELECT
    x.c
FROM 
(
    SELECT 1 
      UNION ALL 
    SELECT 2
) AS x(c);

This gives us a tiny little table and indexed view. If we try to do either of these things, it doesn’t go well:

CREATE INDEX 
    i
ON dbo.TheIndexedView
    (id)
WHERE 
    id = 2;

Msg 10610, Level 16, State 1, Line 40

Filtered index 'i' cannot be created on object 'dbo.TheIndexedView' because it is not a user table. Filtered indexes are only supported on tables.

If you are trying to create a filtered index on a view, consider creating an indexed view with the filter expression incorporated in the view definition.

CREATE STATISTICS 
    s
ON dbo.TheIndexedView
    (id)
WHERE 
    id = 2;

Msg 10623, Level 16, State 1, Line 47

Filtered statistics 's' cannot be created on object 'dbo.TheIndexedView' because it is not a user table. Filtered statistics are only supported on user tables.

Sort of a bummer, that. And it strikes me that it’s an odd limitation — especially for the statistics — but what can you do?
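What you can do is follow the advice in the first error message and move the filter into the view definition. A minimal sketch, assuming the dbo.IndexedViewMe table from above; the view and index names are made up for illustration:

```sql
/*A second indexed view that only contains the rows the filter would have kept*/
CREATE VIEW
    dbo.TheFilteredIndexedView /*hypothetical name*/
WITH SCHEMABINDING
AS
SELECT
    ivm.id
FROM dbo.IndexedViewMe AS ivm
WHERE ivm.id = 2;
GO

CREATE UNIQUE CLUSTERED INDEX
    uqi_filtered /*hypothetical name*/
ON dbo.TheFilteredIndexedView
    (id);
GO
```

That gets you an index over only the rows where id = 2, at the cost of maintaining a second indexed view. Depending on your edition and query shape, you may need a NOEXPAND hint to get queries to use it.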

Indexed views haven’t changed aside from bug fixes in forever and a day. I doubt there’ll be any real investment in enhancing them anytime soon.

Thanks for reading!

Using Views To Reduce Memory Grants In SQL Server

Posted on May 11, 2022 (last modified May 16, 2022) by Erik Darling

We All Have It

You know those tables, right? The ones where developers went and got lazy or didn’t know any better and decided every string column was going to be gigantic.

They may have read, of course, that SQL Server’s super-smart variable length data types only consume necessary space.

It’s free real estate.

Except it isn’t, especially not when it comes to query memory grants.

The bigger a string column’s defined byte length is, the bigger the optimizer’s memory grant for it will be.

Memory Grant Primer

In case you need some background, the short story version is:

All queries ask for some memory for general execution needs
Sorts, Hashes, and Optimized Nested Loops ask for additional memory grants
Memory grants are decided based on things like number of rows, width of rows, and concurrently executing operators
Memory grants are divided by DOP, not multiplied by DOP
By default, any query can ask for up to 25% of max server memory for a memory grant
Approximately 75% of max server memory is available for memory grants at once
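If you want to watch these numbers on a live server, the memory grant DMV is a reasonable place to start. This is a rough sketch rather than a monitoring script: it only shows queries that hold or are waiting on a grant at the moment you run it, and it assumes you have permission to read server-level DMVs.

```sql
/*Queries currently holding or waiting on a memory grant*/
SELECT
    deqmg.session_id,
    deqmg.dop,
    deqmg.requested_memory_kb,
    deqmg.granted_memory_kb,
    deqmg.used_memory_kb,
    deqmg.max_used_memory_kb,
    deqmg.ideal_memory_kb
FROM sys.dm_exec_query_memory_grants AS deqmg
ORDER BY deqmg.requested_memory_kb DESC;

/*The per-query cap is a workload group setting; 25 is the default*/
SELECT
    wg.name,
    wg.request_max_memory_grant_percent
FROM sys.resource_governor_workload_groups AS wg;
```

A big gap between granted_memory_kb and max_used_memory_kb is the overestimate case described here: memory that was reserved and never needed.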

Needless to say, memory grants are very sensitive to misestimates by the optimizer. Going over can be especially painful, because that memory will most often get pulled from the buffer pool, and queries will end up going to disk more.

Underestimates often mean spills to disk, of course. Those are usually less painful, but can of course be a problem when they’re large enough. In particular, hash spills are worth paying extra attention to.

Memory grant feedback does supply some relief under modern query execution models. That’s a nice way of saying probably not what you have going on.

Query Noogies

Getting back to the point: It’s a real pain in the captain’s quarters to modify columns on big tables, even if it’s reducing the size.

SQL Server’s storage engine has to check page values to make sure you’re not gonna lose any data fidelity in the process. That’s a nice way of saying you’re not gonna truncate any strings.

But if you do something cute like run a MAX(LEN(StringCol)) and see what you’re up against, you can use a view on top of your table to assuage SQL Server’s concerns about such things.

After all, functions are temporary. Data types are forever (usually).
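As a sketch of that length check, here it is against the Body column in the Stack Overflow Posts table (swap in your own table and column):

```sql
/*LEN counts characters and ignores trailing spaces;
  DATALENGTH counts bytes, so nvarchar values come back at two bytes per character*/
SELECT
    max_characters = MAX(LEN(p.Body)),
    max_bytes = MAX(DATALENGTH(p.Body))
FROM dbo.Posts AS p;
```

On a big table this has to read every row, so it’s probably not something to run in the middle of the day.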

An easy way to illustrate what I mean is to look at the details of these two queries:

SELECT TOP (1000)
    p.Body
FROM dbo.Posts AS p
ORDER BY p.Score DESC
OPTION(RECOMPILE);

SELECT TOP (1000)
    Body = 
        CONVERT
        (
            nvarchar(100), 
            p.Body
        )
FROM dbo.Posts AS p
ORDER BY p.Score DESC
OPTION(RECOMPILE);

Whether this works depends partly on the query plan, so let’s look at those.

Pink Belly Plans

You can ignore the execution times here. The Body column is not a good representation of an oversized column.

It’s defined as nvarchar(max), but (if I’m remembering my Stack lore correctly) is internally limited to 30k characters. Many questions and answers are longer than 100 characters anyway, but on to the plans!

In the plan where the Body column isn’t converted to a smaller string length, the optimizer asks for a 16GB memory grant, and in the second plan the grant is reduced to ~3.5GB.

This is dependent on the compute scalar occurring prior to the Top N Sort operator, of course. This is where the convert function is applied to the Body column, and why the grant is reduced.

If you were to build a view on top of the Posts table with this conversion, you could point queries to the view instead. That would get you the memory grant reduction without the pain of altering the column, or moving the data into a new table with the correct definition.
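A minimal sketch of such a view follows. The view name is hypothetical, nvarchar(100) is just the placeholder length from the query above (it will truncate longer values, so size it from your own MAX(LEN()) check), and CREATE OR ALTER assumes SQL Server 2016 SP1 or later.

```sql
CREATE OR ALTER VIEW
    dbo.Posts_Narrow /*hypothetical name*/
AS
SELECT
    p.Id,
    p.Score,
    Body =
        CONVERT
        (
            nvarchar(100), /*placeholder length, see note above*/
            p.Body
        )
FROM dbo.Posts AS p;
GO

/*Same query as before, pointed at the view instead of the table*/
SELECT TOP (1000)
    pn.Body
FROM dbo.Posts_Narrow AS pn
ORDER BY pn.Score DESC
OPTION(RECOMPILE);
```

The grant reduction still depends on the compute scalar landing before the sort, so check the plan rather than assuming the view fixed it.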

Thanks for reading!

SQL Server Performance Mysteries: Why Does Production Get A Bad Plan That Development Doesn’t?

Posted on May 10, 2022 (last modified May 16, 2022) by Erik Darling

Statistical Legacy

A client question that I get quite a bit is why queries in production get a bad query plan that queries in dev, QA, or staging don’t. It’s typically answered by looking at statistics.

Primarily, it’s because of the cardinality estimates that queries get around ascending keys. It usually gets called the ascending key problem, but the gist is that:

You have a pretty big table
You’re using the legacy cardinality estimator
A whole bunch of rows get inserted, but not enough to trigger an auto stats update
You’re not using compatibility level >= 130 or trace flag 2371
Queries that look for values off an available histogram get a one row estimate using the legacy Cardinality Estimator or a 30% estimate using the default Cardinality Estimator

Which is a recipe for potentially bad query plans.
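To check whether a table is sitting in that gap, you can compare each statistic’s modification counter to the auto-update thresholds. This is a sketch using the Users table as the example. The two threshold columns are the documented formulas: the older fixed threshold of 500 + 20% of rows, and the square root based one that compatibility level 130+ or trace flag 2371 brings in (SQL Server uses whichever is smaller).

```sql
SELECT
    stats_name = s.name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter,
    legacy_threshold = 500 + (0.20 * sp.rows),
    dynamic_threshold = SQRT(1000.0 * sp.rows)
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Users');
```

A large modification_counter that is still under legacy_threshold means a lot of new rows the histogram doesn’t know about.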

Reproductive Script

Here’s the full repro script. If you’re using a different Stack Overflow database, you’ll need to adjust the numbers.

USE StackOverflow2013;

/*Figure out the 20% mark for stats updates using legacy compat levels*/
SELECT
    c = COUNT_BIG(*),
    c20 = CEILING(COUNT_BIG(*) * .20)
FROM dbo.Users AS u;

/*Stick that number of rows into a new table*/
SELECT TOP (493143)
    u.*
INTO dbo.Users_Holder
FROM dbo.Users AS u
ORDER BY u.Id DESC;


/*Delete that number of rows from Users*/
WITH 
    del AS
(
SELECT TOP (493143)
    u.*
FROM dbo.Users AS u
ORDER BY u.Id DESC
)
DELETE
FROM del;

/*I'm using this as a shortcut to turn off auto stats updates*/
UPDATE STATISTICS dbo.Users WITH NORECOMPUTE;

/*Put the rows back into the Users Table*/
SET IDENTITY_INSERT dbo.Users ON;

INSERT
    dbo.Users
(
    Id,
    AboutMe,
    Age,
    CreationDate,
    DisplayName,
    DownVotes,
    EmailHash,
    LastAccessDate,
    Location,
    Reputation,
    UpVotes,
    Views,
    WebsiteUrl,
    AccountId
)
SELECT
    uh.Id,
    uh.AboutMe,
    uh.Age,
    uh.CreationDate,
    uh.DisplayName,
    uh.DownVotes,
    uh.EmailHash,
    uh.LastAccessDate,
    uh.Location,
    uh.Reputation,
    uh.UpVotes,
    uh.Views,
    uh.WebsiteUrl,
    uh.AccountId
FROM dbo.Users_Holder AS uh;

SET IDENTITY_INSERT dbo.Users OFF;

/*Figure out the minimum Id we put into the holder table*/
SELECT
    m = MIN(uh.Id)
FROM dbo.Users_Holder AS uh;

/*Compare estimates*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Id > 2623772
OPTION(USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));


SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Id > 2623772
OPTION(USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));

/*Cleanup*/
UPDATE STATISTICS dbo.Users;

TRUNCATE TABLE dbo.Users_Holder;

Query Plans

Here are the plans for the stars of our show:

In these query plans, you can see the legacy cardinality estimator gets a one row estimate, and the default cardinality estimator gets a 30% estimate.

There isn’t necessarily a one-is-better-than-the-other answer here, either. There are times when both can cause poor plan choices.

You can think of this scenario as being fairly similar to parameter sniffing, where one plan choice does not fit all executions well.

Checkout

There are a lot of ways that you can go about addressing this.

In some cases, you might be better off using trace flag 2371 to trigger more frequent auto stats updates on larger tables where the ~20% modification counter doesn’t get hit quickly enough. In others, you may want to force one estimator over the other depending on which gets you a better plan for most cases.
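Roughly, those server and database level options look like the following. Treat it as a sketch to test somewhere safe first, not a recommendation to run all three: the trace flag is global and lasts until the next restart unless you add it as a startup parameter, and the database scoped configuration requires SQL Server 2016 or later.

```sql
/*Check which compatibility level the current database is using*/
SELECT
    d.name,
    d.compatibility_level
FROM sys.databases AS d
WHERE d.database_id = DB_ID();

/*Lower the auto stats update threshold for large tables, server-wide*/
DBCC TRACEON(2371, -1);

/*Or choose an estimator for every query in the current database*/
ALTER DATABASE SCOPED CONFIGURATION
    SET LEGACY_CARDINALITY_ESTIMATION = ON;
```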

Another option is to add hints to the query in question to use the default cardinality estimator (FORCE_DEFAULT_CARDINALITY_ESTIMATION), or to generate quick stats for the index/statistics being used (ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS). Both hints are covered in Microsoft’s documentation for query hints, under USE HINT. Along these lines, trace flags 2389, 2390, or 4139 may be useful as well.
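Here is what the quick stats hint might look like on the query from the repro script, along with a query-level trace flag version. This is a sketch: whether the estimate actually changes depends on the estimator in use and the state of the statistics, QUERYTRACEON normally needs sysadmin permissions, and the documentation notes which of these trace flags apply to which estimator versions.

```sql
/*Hint version*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Id > 2623772
OPTION(USE HINT('ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS'));

/*Trace flag version, scoped to this one query*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Id > 2623772
OPTION(QUERYTRACEON 4139);
```

Compare the estimated rows against the plans from the repro script to see whether either one helps.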

Of course, you could also try to address any underlying query or index issues that may additionally contribute to poor plan choices, or just plan differences. A common problem is a seek + lookup plan chosen for the one row estimate, which stops making sense once the actual number of rows and lookup executions show up at runtime.

Thanks for reading!