Common SQL Server Consulting Advice: Enabling Forced Parameterization

Not The Bad Kind


I need to start this post off by saying something that may not be obvious to you: Not all parameter sniffing is bad.

Sure, every time you hear someone say “parameter sniffing” they want to teach you about something bad, but there’s a lot more to it than that.

Parameter sniffing is normally great. You heard me. Most of the time, you don’t want SQL Server generating new query plans all the time.

And yet I see people go to extreme measures to prevent parameter sniffing from ever happening, like:

  • Local variables
  • Recompiling

What you care about is parameter sensitivity. That’s when SQL Server comes up with totally different execution plans for the same query depending on which parameter value it gets compiled with. In those cases, there’s usually a chance that later executions with different parameter values don’t perform very well using the original query plan.

The thing is, sometimes you need to introduce potentially bad parameter sensitivity in order to fix other problems on a server.

What’s Your Problem?


The problem we’re trying to solve here is application queries being sent in with literal values, instead of parameterized values.

The result is a plan cache that looks like this:

[Image: SQL Server plan cache ("unethical")]

Of course, if you can fix the application, you should do that too. But fixing all the queries in an application can take a long time, if you even have access to make those changes, or a software vendor who will listen.
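
Before you flip the switch, you can get a rough sense of how much single-use clutter is sitting in the cache right now. This is just a quick sketch that counts ad hoc and prepared plans that have only ever been used once:

SELECT
    single_use_plans = COUNT_BIG(*),
    single_use_mb = SUM(CONVERT(bigint, decp.size_in_bytes)) / 1024. / 1024.
FROM sys.dm_exec_cached_plans AS decp
WHERE decp.usecounts = 1
AND   decp.objtype IN (N'Adhoc', N'Prepared');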

The great thing about this setting is, of course, that it applies to everything all at once, unless you’re doing weird things.

You can turn it on for a specific database by running this command:

ALTER DATABASE 
   [YourDatabase] 
SET PARAMETERIZATION FORCED;
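
If you want to double-check where a database stands before and after, the setting is exposed in sys.databases. Same placeholder database name as above:

SELECT
    d.name,
    d.is_parameterization_forced
FROM sys.databases AS d
WHERE d.name = N'YourDatabase';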

Good or Bad?


The argument for doing this is that it drastically reduces the CPU spent constantly compiling query plans, and cuts down on the churn of constantly caching and evicting plans, which leaves you with an unstable plan cache.

Of course, after you turn it on, you now open your queries up to parameter sensitivity issues. The good news is that you can fix those, too.

99% of parameter sniffing problems I see come down to indexing issues.

  • Non-covering indexes that give the optimizer a choice between Seek + Lookup and Clustered Index Scan
  • Lots of single key column indexes that don’t make sense to use across different searches
  • Suboptimal indexes suggested by various tooling that got implemented without any critical oversight

And of course, if you’ve got Query Store enabled, you can pretty easily force a plan.

Speaking of which, I still have to talk a lot of folks into turning that on, too. Let’s talk about that tomorrow.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Enabling Resource Governor To Fix Memory Grants

With Or Without You


One thing I really hate about Standard Edition is the lack of a way to globally reduce the allowed memory grant default.

Out of the box, any query can come along and ask for ~25% of your server’s max server memory setting, and SQL Server is willing to loan out ~75% at once across a bunch of queries.
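
To put rough numbers on that: with max server memory set to 100GB, a single query can ask for a grant of about 25GB, and around 75GB can be out on loan across concurrent queries before new requests start waiting on RESOURCE_SEMAPHORE.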

That impacts Standard Edition users way more than Enterprise Edition users, because of the Standard Edition buffer pool limit of 128GB.

A lot of folks misinterpret that limit — I’ve had several exchanges with big name hardware vendors where they insist 128GB is the overall RAM limit, so you’re likely getting bad advice from everywhere — and they end up with a server that only has 128GB of RAM in it.

Big mistake. Bump that up to 192GB and set Max Server Memory to ~180GB or so.

But I digress. Or whatever the word is for getting back to the real point.

I forget, if I ever knew.

Control Top


For all you high rollers out there on Enterprise Edition, you have an easy way to fight against SQL Server’s bad memory grant habits.

Before we do that, it’s important to make a few notes here:

  • SQL Server introduced batch mode memory grant feedback in 2016
  • SQL Server introduced batch mode on row store in 2019

Depending on your compatibility level, you may not be taking advantage of those things, but in either case the feedback might be kicking in too late. See, it’s not a runtime decision, it’s a decision that takes place after a query runs.

By then, it’s already sucked up 25% of your memory and probably stolen a whole bunch of space from your precious buffer pool. A properly filled buffer pool is important so your queries don’t get dry-docked going out to slowpoke disk all the live long day.

A lot of the time, folks I work with will have a ton of queries asking for bunk memory grants that are way bigger than they should be.

You can use this query to examine your plan cache for those things.

WITH 
    unused AS
(
    SELECT TOP (100)
        oldest_plan = 
            MIN(deqs.creation_time) OVER(),
        newest_plan = 
            MAX(deqs.creation_time) OVER(),
        deqs.statement_start_offset,
        deqs.statement_end_offset,
        deqs.plan_handle,
        deqs.execution_count,
        deqs.max_grant_kb,
        deqs.max_used_grant_kb,
        unused_grant = 
            deqs.max_grant_kb - deqs.max_used_grant_kb,
        deqs.min_spills,
        deqs.max_spills
    FROM sys.dm_exec_query_stats AS deqs
    WHERE (deqs.max_grant_kb - deqs.max_used_grant_kb) > 1024.
    AND   deqs.max_grant_kb > 5242880.
    ORDER BY 
        unused_grant DESC
)
SELECT      
    plan_cache_age_hours = 
        DATEDIFF
        (
            HOUR,
            u.oldest_plan,
            u.newest_plan
        ),
    query_text = 
        (
            SELECT [processing-instruction(query)] =
                SUBSTRING
                (
                    dest.text, 
                    ( u.statement_start_offset / 2 ) + 1,
                    (
                        ( 
                            CASE u.statement_end_offset 
                                 WHEN -1 
                                 THEN DATALENGTH(dest.text) 
                                 ELSE u.statement_end_offset 
                            END - u.statement_start_offset 
                        ) / 2 
                    ) + 1
                )
                FOR XML PATH(''), 
                    TYPE
        ),
    deqp.query_plan,
    u.execution_count,
    u.max_grant_kb,
    u.max_used_grant_kb,
    u.min_spills,
    u.max_spills,
    u.unused_grant
FROM unused AS u
OUTER APPLY sys.dm_exec_sql_text(u.plan_handle) AS dest
OUTER APPLY sys.dm_exec_query_plan(u.plan_handle) AS deqp
ORDER BY 
    u.unused_grant DESC
OPTION (RECOMPILE, MAXDOP 1);

If you have a bunch of those, and you want a quick fix until you can do more meaningful query and index tuning, you can use Resource Governor to reduce the 25% default to a lower number.

Scripted, For Your Pleasure


You can use this script to enable and reconfigure Resource Governor to use a lower memory grant percent.

/*The first time enables Resource Governor*/
ALTER RESOURCE GOVERNOR 
    RECONFIGURE;

/*This reduces the memory grant cap to your chosen percent*/
ALTER WORKLOAD GROUP 
    [default] 
WITH
(
    REQUEST_MAX_MEMORY_GRANT_PERCENT = ?
);

/*This completes the change*/
ALTER RESOURCE GOVERNOR 
    RECONFIGURE;

You’ll have to fill in the question mark yourself, of course. Without looking at your system, I have no idea what it should be.
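
If you want to confirm where the default workload group currently stands, and that your change took, the cap is visible in the Resource Governor DMVs:

SELECT
    wg.name,
    wg.request_max_memory_grant_percent
FROM sys.dm_resource_governor_workload_groups AS wg
WHERE wg.name = N'default';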

If you’d like help with that, hit the link below to schedule a sales call.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Enabling An Optimistic Isolation Level

Shame Shame Shame


One of the very first performance problems that developers will run into when using SQL Server is locking, and often deadlocks.

Though deadlocks have their root in (b)locking behavior, not all blocking leads to deadlocks.

Some (b)locking will just go on forever, ha ha ha.

The reason there are so many SQL Server blog posts about NOLOCK hints, and so much confusion about what it does, is because of defaults.

The worst part is that it’s a default that shouldn’t be, and… somehow people have this sunny view of what NOLOCK does, but a very negative view of better solutions to the problem.

What NOLOCK Really Does


I have this conversation at least twice a week, despite having written about it plenty of times.

And other people writing about it many times.

For years.

It does not mean your query doesn’t take locks, it means your query doesn’t respect locks taken by other queries. It’s not that read committed is so great, it’s that read uncommitted is so bad.

Think of modifications sort of like an Infinite Improbability Drive.

As soon as the drive reaches infinite Improbability, it passes through every conceivable point in every conceivable universe simultaneously. An incredible range of highly improbable things can happen due to these effects.

Perhaps not quite that eccentric, but you get the idea. While a modification is in flight, a query with a NOLOCK (or READ UNCOMMITTED) hint may read those changes while they’re happening.

  • Incomplete inserts
  • Incomplete deletes
  • Incomplete updates

Inserts and deletes are a bit more straightforward. Say you’re inserting or deleting 10 rows, and either one is halfway done when your select query, running with flaming knives and scissors (a NOLOCK hint), comes along.

You would read:

  • The first five inserted rows
  • The remaining five rows to be deleted

For updates, things are a little trickier because you might end up with an in-place update or per-index update.

You can read:

  • Partially changed rows
  • From an index that hasn’t been modified yet
  • Something in between

This is not what you want to happen.

Even if you have a million excuses as to why it works okay for you (it’s just a mobile app; they can refresh, we only need close-enough reports; users make changes and then read them later) I promise you that it’s not something you want to happen, because you can’t fully predict the ramifications of many concurrent modifications and selects running all together.

What You Really Want To Happen Instead


The utter beauty of these solutions is that they give you reliable results. They may not be perfect for every situation, but they work for probably 99% of the cases where you’re using NOLOCK hints everywhere anyway.

Rather than futz about with the Infinite Improbabilities that could be read from modifications that are neither here nor there but are certainly not completed, you read the last known good version of a row or set of rows that are currently locked.

There you have it! No more uncertainty, puzzled users, additional database requests to refresh wonky-looking data, or anything like that.

It’s just you and your crisp, clean data.

If you’re on SQL Server 2019 and using Accelerated Database Recovery, the known-good versions of your precious data will be stored locally, per-database.

In all other scenarios, the row versioning goes off to tempdb.

Your select queries can read that data without being impeded by locks, and without all of the incorrectness.

What options do you have to take advantage of these miraculous functionalities? Read Committed Snapshot Isolation (RCSI), and Snapshot Isolation (SI).

Those sound pretty close, but let’s talk a little bit more about them.

Isolation Levels, Not In Depth


It’s difficult to cover every potential reservation or concern someone may have about isolation levels. If you have an application that depends on certain locking guarantees to correctly process certain events, you may need read committed, or something beyond read committed (like repeatable read or serializable) to hold the correct locks.

If that’s your app design, then your job becomes query and index tuning to make sure that your queries run as quickly as possible to reduce the locking surface area of each one. This post is not for you.

This post is largely geared towards folks who have NOLOCK everywhere like some sort of incantation against performance problems, who hopefully aren’t totally stuck in their ways.

Here are some of the potential downsides of optimistic isolation levels:

  • Prior to SQL Server 2019, you’re going to add some load to tempdb
  • You’re going to add an up to 14-byte pointer to each row for tracking in the version store (but that happens on any table where there’s a trigger anyway)
  • You need to remove local locking hints on queries you want to use row versioning
  • You need to set the isolation level to allow queries to use Snapshot Isolation
  • Read query performance may slow down if the version store gets particularly large
  • You need to more closely monitor long running modifications to make sure they don’t fill tempdb

But this stuff is all worth it, because you can get around lots of weird, transient application issues.

If you’re currently using NOLOCK everywhere, or if someone starts suggesting you use it everywhere for better performance, know that you have better options out there.
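
For the record, here’s what flipping those options on looks like at the database level. This is only a sketch with a placeholder database name, and turning on RCSI changes the behavior of every read committed query in that database, so test it somewhere safe first:

/*Make read committed queries use row versioning*/
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

/*Let queries ask for snapshot isolation explicitly*/
ALTER DATABASE [YourDatabase] SET ALLOW_SNAPSHOT_ISOLATION ON;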

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Indexed Views

I am a heading


In the age of column store indexes, indexed views have a bit less attractiveness about them. Unless of course you’re on Standard Edition, which is useless when it comes to column store.

I think the biggest mark in favor of indexed views over column store in Standard Edition is that there is no DOP restriction on them, where batch mode execution is limited to DOP 2.

https://erikdarling.com/sql-server/how-useful-is-column-store-in-standard-edition/

One of the more lovely coincidences that has happened of late was me typing “SQL Server Stranded Edition” originally up above.

Indeed.

There are some good use cases for indexed views where column store isn’t a possibility, though. What I mean by that is they’re good at whipping up big aggregations pretty quickly.

Here are some things you oughtta know about them before trying to use them, though. The first point is gonna sound really familiar.

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF
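
If you want to be explicit about those in your own scripts or Agent job steps, they’re all plain SET options you can put at the top of a batch:

SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NUMERIC_ROUNDABORT OFF;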

Second, you’ll wanna use the NOEXPAND hint when you touch an indexed view. Not only because that’s the only way to guarantee the view definition doesn’t get expanded by the optimizer, but also because (even in Enterprise Edition) that’s the only way to get statistics generated on columns in the view.

If you’ve ever seen a warning for missing column statistics on an indexed view, this is likely why. Crazy town, huh?
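
In practice, the hint looks like this, borrowing the shabu_shabu view that gets created a little further down:

SELECT
    s.Id,
    s.Dracula
FROM dbo.shabu_shabu AS s WITH (NOEXPAND)
WHERE s.Reputation > 200000;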

Third, indexed views maintain changes behind the scenes automatically, and that maintenance can really slow down modifications if you don’t have indexes that support the indexed view definition.

https://erikdarling.com/sql-server/indexed-view-maintenance-is-only-as-bad-as-your-indexes/

Fourth, you have to be REALLY careful if your indexed view is going to span more than one table.

Locking can get really weird, and as tables get super big maintenance can turn into a nightmare even with good indexes to back the join up.

Fifth, there are a ridiculous number of restrictions. The current docs look like this:

[Image: SQL Server indexed view limitations, from the current docs ("ouch")]

Sixth, you need to be really careful when you alter an indexed view.

When you do that, all of the indexes and statistics get dropped.

Seventh, indexed views can be used a lot like other constructs we’ve talked about this week.

Eighth, if your indexed view has an aggregation in it, you need to have a COUNT_BIG(*) column in the view definition.

Buuuuuut, if you don’t group by anything, you don’t need one.

Ninth, yeah, you can’t use DISTINCT in an indexed view, but you can use GROUP BY, and the optimizer can match queries that use DISTINCT to your indexed view. For example, this setup:

CREATE OR ALTER VIEW 
    dbo.shabu_shabu
WITH SCHEMABINDING
AS 
SELECT
    u.Id,
    u.DisplayName,
    u.Reputation,
    Dracula = 
        COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Reputation > 100000
GROUP BY
    u.Id,
    u.Reputation,
    u.DisplayName;
GO 

CREATE UNIQUE CLUSTERED INDEX 
    cuqadoodledoo
ON dbo.shabu_shabu
(
    Id
);

SELECT DISTINCT   
    u.Id
FROM dbo.Users AS u
WHERE u.Reputation > 100000;

Ends up with this query plan:

[Image: SQL Server query plan ("balance")]

Tenth, the somewhat newly introduced GREATEST and LEAST functions do work in indexed views, which certainly makes things interesting.

I suppose that makes sense, since they’re probably just CASE expressions internally, but after everything we’ve talked about, sometimes it’s surprising when anything works.

Despite It All


When indexed views are the right choice, they can really speed up a lot of annoying aggregations among their other utilities.

This week we talked a lot about different things we can do to tables to make queries faster. This is stuff that I end up recommending pretty often, but there’s even more stuff that just didn’t make the top 5 cut.

Next week we’ll talk about some database and server level settings that can help fix problems that I end up telling clients to flip the switch on.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Computed Columns

Active Blooper


Remember yesterday? Yeah, me either. But I do have access to yesterday’s blog post, so I can at least remember that.

What a post that was.

We talked about filtered indexes, some of the need-to-know points, when to use them, and then a sad shortcoming.

Today we’re going to talk about how to overcome that shortcoming, but… there’s stuff you need to know about these things, too.

We’re gonna start off with some Deja Vu!

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF

Second, computed columns are sort of like regular columns: you can only search them efficiently if you index them.

This may come as a surprise to you, but indexes put data in order so that it’s easier to find things in them.

The second thing you should know about the second thing here is that you don’t need to persist computed columns to add an index to them, or to get statistics generated for the computed values (but there are some rules we’ll talk about later).

For example, let’s say you do this:

ALTER TABLE dbo.Users ADD TotalVotes AS (UpVotes + DownVotes);
CREATE INDEX u ON dbo.Users (TotalVotes) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

The index gets created just fine. This is incredibly handy if you need to add a computed column to a large table, because there won’t be any blocking while adding the column. The index is another matter, depending on if you’re using Enterprise Edition.

Third, SQL Server is picky about them, kind of. The problem is a part of the query optimization process called expression matching that… matches… expressions.

For example, these two queries both have expressions in them that normally wouldn’t be SARGable — meaning you couldn’t search a normal index on (Upvotes, Downvotes) efficiently.

But because we have an indexed computed column, one of them gets a magic power, and the other one doesn’t.

Because it’s backwards.

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.UpVotes + u.DownVotes) > 1000;
SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.DownVotes + u.UpVotes) > 1000;
[Image: SQL Server query plan ("connection")]

See what happens when you confuse SQL Server?

If you have full control of the code, it’s probably safer to reference the computed column directly rather than rely on expression matching, but expression matching can be really useful when you can’t change the code.
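
Referencing it directly just means filtering on the TotalVotes column we added up above, so the order of the underlying expression stops mattering:

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.TotalVotes > 1000;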

Fourth,  don’t you ever ever never ever ever stick a scalar UDF in a computed column or check constraint. Let’s see what happens:

CREATE FUNCTION dbo.suck(@Upvotes int, @Downvotes int)
RETURNS int
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN
    RETURN (SELECT @Upvotes + @Downvotes);
END;
GO 

ALTER TABLE dbo.Users ADD TotalVotes AS dbo.suck(UpVotes, DownVotes);
CREATE INDEX u ON dbo.Users (TotalVotes) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE (u.DownVotes + u.UpVotes) > 1000;

Remember that this is the query that has things backwards and doesn’t use the index on our computed column, but look what happened to the query plan:

[Image: SQL Server query plan ("me without makeup")]

Querying a completely different index results in a plan that SQL Server can’t parallelize because of the function.

Things that won’t fix this:

  • SQL Server 2019 scalar UDF inlining
  • Persisting the computed column
  • Using a different kind of T-SQL function; you can’t use them in computed columns

Things that might fix this: getting rid of the function, and defining the computed column with a plain expression instead, like the first example up above.

Fifth: Column store indexes are weird with them. There’s an odd bit of a matrix, too.

  • Anything before SQL Server 2017, no dice
  • Any nonclustered columnstore index through SQL Server 2019, no dice
  • For 2017 and 2019, you can create a clustered columnstore index on a table with a computed column as long as it’s not persisted
--Works
CREATE TABLE clustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date));
CREATE CLUSTERED COLUMNSTORE INDEX c ON dbo.clustered_columnstore;

--Doesn't work
CREATE TABLE nonclustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date));
CREATE NONCLUSTERED COLUMNSTORE INDEX n ON dbo.nonclustered_columnstore(id, some_date, next_date, diff_date);

--Clean!
DROP TABLE dbo.clustered_columnstore, dbo.nonclustered_columnstore;

--Doesn't work, but throws a misleading error
CREATE TABLE clustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date) PERSISTED);
CREATE CLUSTERED COLUMNSTORE INDEX c ON dbo.clustered_columnstore;

--Still doesn't work
CREATE TABLE nonclustered_columnstore (id int, some_date datetime, next_date datetime, diff_date AS DATEDIFF(MILLISECOND, some_date, next_date) PERSISTED);
CREATE NONCLUSTERED COLUMNSTORE INDEX n ON dbo.nonclustered_columnstore(id, some_date, next_date, diff_date);

--Clean!
DROP TABLE dbo.clustered_columnstore, dbo.nonclustered_columnstore;

General Uses


The most general use for computed columns is to materialize an expression that a query has to filter on, but that wouldn’t otherwise be able to take advantage of an index to locate rows efficiently, like the UpVotes and DownVotes example above.

Even with an index on UpVotes, DownVotes, nothing in your index keeps track of what row values added together would be.

SQL Server has to do that math every time the query runs and then filter on the result. Sometimes those expressions can be pushed to an index scan, and other times they need a Filter operator later in the plan.

Consider a query that inadvisably does one of these things:

  • function(column) = something
  • column + column = something
  • column + value = something
  • value + column = something
  • column = case when …
  • value = case when column…
  • convert_implicit(column) = something

As long as all values are known ahead of time — meaning they’re not a parameter, variable, or runtime constant like GETDATE() — you can create computed columns that you can index and make searches really fast.

Take this query and index as an example:

SELECT c = COUNT_BIG(*) FROM dbo.Posts AS p WHERE DATEDIFF(YEAR, p.CreationDate, p.LastActivityDate) > 9;
CREATE INDEX p ON dbo.Posts(CreationDate, LastActivityDate) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

The best we can do is still to read every row via a scan:

[Image: SQL Server query plan ("still not good")]

But we can fix that by computing and indexing:

ALTER TABLE dbo.Posts ADD ComputedDiff AS DATEDIFF(YEAR, CreationDate, LastActivityDate);
CREATE INDEX p ON dbo.Posts(ComputedDiff) WITH(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE, DROP_EXISTING = ON);

And now our query plan is much faster, without needing to go parallel, or more parallel, to get faster:

[Image: SQL Server query plan ("improvement!")]

SQL Server barely needs to flinch to finish that query, and we get an actually good estimate to boot.

Crappy Limitations


While many computed columns can be created, not all can be indexed. For example, something like this would be lovely to have and to have indexed:

ALTER TABLE dbo.Users ADD RecentUsers AS DATEDIFF(DAY, LastAccessDate, SYSDATETIME());
CREATE INDEX u ON dbo.Users (RecentUsers);

While the column creation does succeed, the index creation fails:

Msg 2729, Level 16, State 1, Line 177

Column ‘RecentUsers’ in table ‘dbo.Users’ cannot be used in an index or statistics or as a partition key because it is non-deterministic.

You also can’t reach out to other tables:

ALTER TABLE dbo.Users ADD HasABadge AS CASE WHEN EXISTS (SELECT 1/0 FROM dbo.Badges AS b WHERE b.UserId = Id) THEN 1 ELSE 0 END;

SQL Server doesn’t like that:

Msg 1046, Level 15, State 1, Line 183

Subqueries are not allowed in this context. Only scalar expressions are allowed.

There are others, but these are the most common disappointments I come across.

Some of the things that computed columns fall flat with are things we can remedy with indexed views, but boy howdy are there a lot of gotchas.

We’ll talk about those tomorrow!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Filtered Indexes

Home Star


For all the good indexes do, sometimes you just don’t need them to cover all of the data in a table for various reasons.

There are a number of fundamentals that you need to understand about them (that any good consultant will tell you about), but I wanna cover them here just in case you had an unfortunate run-in with a less-than-stellar consultant.

First, there are some session-level settings that need to be appropriately applied for them to be considered by the optimizer. This is especially important if you’re putting any logic into a SQL Server Agent job, because it uses the wrong settings for some reason.

Here are the correct settings:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF

Second, you need to be careful about parameterized queries. If your filtered index and query looks like this:

CREATE INDEX u ON dbo.Users(Reputation) WHERE Reputation > 100000;

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > @Reputation;

The optimizer will not use your filtered index, because it has to pick a safe and cache-able plan that works for any input to the @Reputation parameter (this goes for local variables, too).

To get around this, you can use:

  • Recompile hint
  • Literal value
  • Potentially unsafe dynamic SQL
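
The recompile route is the easiest one to sketch out. With the hint in place, the optimizer gets to see the runtime value of @Reputation, and can choose the filtered index when that value qualifies:

SELECT c = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > @Reputation OPTION (RECOMPILE);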

Third, you need to have the columns in your filtering expression be somewhere in your index definition (key or include) to help the optimizer choose your index in some situations.

Let’s say you have a filtered index that looks like this:

CREATE INDEX u ON dbo.Users(DisplayName) WHERE Reputation > 100000;

As things stand, all the optimizer knows is that the index is filtered to Reputation values over 100k. If you need to search within that range, like 100k-200k, or >= 500k, it has to get those values from somewhere, and it has the same options as it does for other types of non-covering indexes:

  • Ignore the index you thoughtfully created and use another index
  • Use a key lookup to go back to the clustered index to filter specific values

General Uses


The most common uses that I see are:

  • Indexing for soft deletes
  • Indexing unique values and avoiding NULLs
  • Indexing for hot data
  • Indexing skewed data to create a targeted histogram
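
To make the first one concrete, the soft delete pattern usually ends up looking something like this sketch, where dbo.Orders, CustomerId, OrderDate, and IsDeleted are made-up names standing in for your own schema:

CREATE INDEX o
ON dbo.Orders (CustomerId, OrderDate) --hypothetical table and columns
WHERE IsDeleted = 0
WITH (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);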

There are others, but one thing to consider when creating filtered indexes is how much of your data will be excluded by the definition.

If more than half of your data is going to end up in there, you may want to think hard about what you’re trying to accomplish.

Another potential use that I sometimes find for them is using the filter to avoid needing an overly-wide index key.

Let’s say you have a super important query that looks like this:

SELECT
    u.DisplayName,
    TotalScore = 
        SUM(p.Score)
FROM dbo.Posts AS p
JOIN dbo.Users AS u
    ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1
AND   p.AcceptedAnswerId > 0
AND   p.ClosedDate IS NULL
AND   p.Score > @Score
AND   p.OwnerUserId = @OwnerUserId
ORDER BY TotalScore DESC;

You’re looking at indexing for one join clause predicate and four additional where clause predicates. Do you really want five key columns in your index? No?

How about this?

CREATE INDEX 
    p 
ON dbo.Posts
    (OwnerUserId, Score)
INCLUDE
    (PostTypeId, AcceptedAnswerId, ClosedDate)
WHERE
    (PostTypeId = 1 AND AcceptedAnswerId > 0 AND ClosedDate IS NULL)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

This will allow your query to pre-seek to the literal value predicates, and then evaluate the parameterized predicates in the key of the index.

Crappy Limitations


There are lots of things you might want to filter an index on, like an expression. But you can’t.

I think one of the best possible use cases for a filtered index that is currently not possible is to isolate recent data. For example, I’d love to be able to create a filtered index like this:

CREATE INDEX u ON dbo.Users(DisplayName) WHERE CreationDate > DATEADD(DAY, -30, CONVERT(date, SYSDATETIME()));

So I could just isolate data in the table that was added in the last 30 days. This would have a ton of applications!

But then the index would have to be self-updating, churning data in and out on its own.

For something like that, you’ll need a computed column. But even indexing those can be tricky, so we’ll talk about those tomorrow.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding Indexes To Replace Multiple Indexes

Buffet Tables


If you’ve ever spent any time reading about relational database theory, you’ve probably read all sorts of things about normal forms and all that good stuff. You’ve also probably immediately forgotten all of it.

At least that’s what I’d guess is what happened based on what I see when I look at database designs out there. Tables are often dozens of columns wide, with all sorts of  things in there that should be in their own table.

  • Columns with a common prefix, like Customer_Name, Customer_Address, etc.
  • Columns with a common suffix, like PhoneNumber_1, PhoneNumber_2, etc.

Even fairly-well designed tables often get somewhat overloaded with information, especially when years of additional development requirements lead to a choice between adding one more column to a table and normalizing several columns out to a new table.

Sound familiar?

Let’s move on.

Indexing


The more columns you have in a table, the more potential column combinations there are for indexes. Much like columns, indexes tend to get added following the path of least resistance.

Very rarely does someone consider current indexes when deciding to add an index. They might be doing something like:

  • Evaluating missing index requests in a query plan
  • Evaluating missing index requests in the DMVs
  • Running DTA like a darn fool and saying yes to everything

I’ll often run into tables that have indexes like these:

CREATE INDEX p1 ON dbo.Posts(OwnerUserId);
CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId);
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate);
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate);
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);
CREATE INDEX p6 ON dbo.Posts(CreationDate, LastActivityDate, Score);

I’m leaving off includes here, because you can combine those in any order without disturbing much.

Key columns are a different story, of course. Order matters quite a bit, because, via the magic of co-dependency, key column order defines how queries can efficiently access data.

Merging


Looking at this list of indexes, we know a couple things immediately:

  • The index on just OwnerUserId isn’t of any obvious value, and we can get rid of it
  • The index that leads on CreationDate  serves a totally different type of query, and we should leave it alone for now

Assuming all of these indexes have a not-crappy read to write ratio, what do we do next?

  • Make sure none of them are unique
  • Make sure none of them are filtered

These indexes all have multiple key columns that sort of line up:

CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); 
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate); 
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);

Of these, three of them are on OwnerUserId and PostTypeId, and one isn’t. We can throw that one out.

CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); 
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);

Okay, now what? Just looking at the definitions here, we can safely get rid of the first index, because any query that uses it can safely use the other two indexes.

But for the remaining two, we can’t do much. It’s not necessarily safe to take the third key column from one and add it to the other as a fourth key column.

We have to look at how they’re used, first. If one index is unused or rarely used, we could make that change.

Things might get more difficult to figure out if we have another index like this:

CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); 
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score);
CREATE INDEX p7 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate, LastActivityDate);

Sure, it means we could get rid of the index that ends with CreationDate, but it makes the picture a little less clear about how we should deal with Score.

Ending


To be safe, I’d probably stop after just getting rid of the ones where the key columns are a match as far as order goes, and are a subset of wider indexes.

CREATE INDEX p1 ON dbo.Posts(OwnerUserId); --Drop this one
CREATE INDEX p2 ON dbo.Posts(OwnerUserId, PostTypeId); --Drop this one
CREATE INDEX p3 ON dbo.Posts(OwnerUserId, CreationDate, LastActivityDate); --Keep this one
CREATE INDEX p4 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate); --Drop this one
CREATE INDEX p5 ON dbo.Posts(OwnerUserId, PostTypeId, Score); --Keep this one 
CREATE INDEX p6 ON dbo.Posts(CreationDate, LastActivityDate, Score); --Keep this one
CREATE INDEX p7 ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate, LastActivityDate); --Keep this one

After that, I’d probably look at usage stats, to see if any of them just aren’t holding up their end of the bargain.
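
If you want a quick way to eyeball that, something like this rough sketch against sys.dm_db_index_usage_stats shows reads versus writes per index. The numbers reset when SQL Server restarts, so check your uptime before trusting them:

SELECT
    table_name = OBJECT_NAME(i.object_id),
    index_name = i.name,
    total_reads = ISNULL(ddius.user_seeks + ddius.user_scans + ddius.user_lookups, 0),
    total_writes = ISNULL(ddius.user_updates, 0)
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ddius
    ON  ddius.object_id = i.object_id
    AND ddius.index_id = i.index_id
    AND ddius.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'dbo.Posts')
ORDER BY total_writes DESC;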

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Adding And Removing Indexes

A Good Thing


Since you’re here reading this, I’m going to assume you know what food is. Further, I’m going to assume that you also know what salt is, because salt plays a pretty big part in how good or bad food tastes.

  • Skimp on salt, and you’ll see a lot of disappointed faces
  • Go overboard on salt, and you’ll see a lot of sour faces

This is a lot like what you’ll see if you do either one of those things with indexes in your database, unless you have Special Medical Exemptions©.

  • If it’s a data warehouse, you’ll probably want fewer indexes, but you’ll also probably want to use column store indexes
  • If it’s normal throughput OLTP, you’ll probably want more indexes than a data warehouse, but with narrower key column choices to support seek and lookup plans
  • If it’s OLTP + Reporting (or analytics or whatever you wanna call it) you’re going to need to make a lot of tough indexing choices for them to live together
  • If it’s high throughput OLTP, you’ll probably want even fewer overall indexes, and you may need to look into in-memory options for the busiest tables

With all that in mind, there might be some local factors that make some of this stuff not align particularly well with your workload. Things like version, edition, query patterns, and hardware might be holding you back. And of course, being at the mercy of a terrible vendor application would also likely put a damper on many of these plans.

If you want a quick and easy way to find indexes to both remove and add, go ahead and grab sp_BlitzIndex.

Practically Common


In my consulting work, the most common type of workload I see is OLTP with reporting on top of it, followed by analytics workloads. Pure OLTP workloads are pretty rare.

The index strategy for these workloads might look something like this:

  • Narrow key indexes with very few includes for OLTP queries
  • Fairly narrow key indexes with wider includes for reporting queries, or…
  • Nonclustered column store indexes for reporting queries

It’s always tricky figuring out what the best set of indexes to have around for your workload is. There are so many missing index requests, new queries, users complaining, and new tables getting added that it’s hard not to feel like you’re drowning.

You can’t have the perfect index for every query, and you shouldn’t try to. You’d end up with the sour-salt face kind of database. But what I need you to do is make a deal with yourself: Any time you want to add an index, you have to look for an index to get rid of.

Why?

Because without fail, I see these patterns all the time:

  • Indexes with a single key column alongside wider indexes that lead with the same column
  • Indexes with the exact same key columns and key column order (sometimes one has more key columns at the end)
  • Indexes that are totally unused by read queries but get modified a lot
  • Indexes with only a few reads and a ton of writes

It’s all very bothersome clutter, and you’ve got to clear these out before you can add in more.

Commonly Practical


There are many good reasons to avoid over-indexing. Everyone says “writes get slow”, but that’s a shabby meme. Your writes are slow because your storage is crappy and possibly because of your poorly thought out Availability Group.

Here’s why having an excessive number of indexes (again, barring Special Medical Exemptions©) can be bad for your database performance:

  • Locking:
    • More objects to lock means more locking generally
    • More objects to lock means a greater chance of lock escalation attempts
  • Buffer pool:
    • More indexes means more objects in the buffer pool
    • Even indexes unused by reads need to be modified when the table changes
  • Transaction logging:
    • More indexes to modify means more to keep track of in the transaction log
    • Even databases in simple recovery log the same amount 99% of the time

So the next time you want to go add an index to a table, I want you to try a few things:

  • Look for indexes on the table that are totally unused
  • Look for indexes on the table with overlapping key columns to merge
  • Look for indexes that could be tweaked rather than creating a whole new one

To do that, run sp_BlitzIndex like so:

EXEC dbo.sp_BlitzIndex
    @DatabaseName = N'YourDatabase',
    @SchemaName = N'ProbablyDBO',  
    @TableName = N'YourIndexingCatastrophe';

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Setting Target Recovery Interval

Everywhere You Look


Finding out about this was probably like when some probably guy probably in a wig or something discovered atoms and was like “ohh wowwe therre are’st atommes ev’rywhislt I doth looketh, happie dais!”.

If that’s not historically accurate, I apologize. I’m a child of divorce and public school systems.

If you want a quick and dirty way to see if your server is having a tough time with checkpoint activity, you can run this to search your most recent error log for relevant messages.

  • If you recently restarted your server, you may need to search additional error logs
  • If you cycle your error logs frequently, you may need to search additional error logs
  • If your server is associated with Cloud Nonsense© these commands may not work
CREATE TABLE 
    #ErikLog
(
    LogDate datetime,
    ProcessInfo varchar(50),
    Text varchar(MAX)
);

INSERT 
    #ErikLog WITH(TABLOCK)
(
    LogDate,
    ProcessInfo,
    Text
)
EXEC sys.xp_readerrorlog 
    0, --Just the first log file
    1, --SQL Server error log
    N'FlushCache'; --Search string

INSERT 
    #ErikLog WITH(TABLOCK)
(
    LogDate,
    ProcessInfo,
    Text
)
EXEC sys.xp_readerrorlog 
    0, --Just the first log file
    1, --SQL Server error log
    N'saturation'; --Search string

SELECT
    el.*
FROM #ErikLog AS el
ORDER BY el.LogDate;

DROP TABLE IF EXISTS
    #ErikLog;

You can adjust the first parameter there to look back at older log files, too, if you need to.

So what happened? Was it everything you imagined?

Did You Find Anything?


If you got some hits searching through the error log, it may be a sign that you need to make a change.

  • Who can change this? You!
  • What change? Target Recovery Interval!
  • Where do I change it? At the database level!
  • When should I change it? Whenever, my friend!
  • Why database level and not at the server level? If your databases are in an AG or something, you want the change to follow them to other servers!
  • How do I change it? Like so!
DECLARE 
    @sql nvarchar(max) = N'';
 
SELECT 
    @sql += 
        N'ALTER DATABASE ' + 
        QUOTENAME(d.name) + 
        N' SET TARGET_RECOVERY_TIME = 60 SECONDS;' + 
        NCHAR(13)
FROM sys.databases AS d 
WHERE d.target_recovery_time_in_seconds <> 60
AND   d.database_id > 4 
AND   d.state = 0 
AND   d.is_in_standby = 0 
AND   d.is_read_only = 0;
 
SELECT 
    cmd = @sql;
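
The script above only builds and prints the commands so you can review them first. Once you’re happy with the output, you can copy and paste it, or execute the string directly; this last step is just a sketch of that:

/*Only run this after reviewing the printed commands*/
EXEC sys.sp_executesql
    @sql;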

Why Are We Doing This?


Long checkpoints can cause all sorts of weirdball issues while they’re happening, and they don’t just happen when you run a checkpoint command.

Log and differential backups cause them, as well as internal stuff around recovery time.

The good news is that this is the default for databases created on SQL Server 2016 or better. The bad news is that most of you out there started creating your databases way before that, and you’ve just been restoring them to newer versions without checking out any of the hot new gadgets and gizmos available.

Not that I blame you. Stuff like this doesn’t exactly make headlines, and you’re busy with important stuff, like hemming and hawing over fill factor.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Setting Instant File Initialization

Did You Forget?


This is another one of those settings that’s so important, Microsoft put a switch for it in the installer. And yet, I end up looking at servers where it’s not enabled.

When I ask why, I usually expect to hear some 2003 wisdom about security. But that doesn’t happen; it’s just blank stares.

How do you know if this is enabled?

SELECT  
    dss.servicename,
    dss.service_account,
    dss.instant_file_initialization_enabled 
FROM sys.dm_server_services AS dss;

Simple as that.

Why Enable Instant File Initialization?


Turning this on allows SQL Server data files to grow without writing a bunch of zeroes out to disk. That can be useful in a number of ways.

Just picture SQL Server giving disk space a big ol’ hug and Windows not even getting jealous.

But Erik, you said data files, what about all those important transaction log files?

Sorry, there’s no helping them.

Put them on the fastest storage you can get and pray.

Enabling Instant File Initialization


You can use dbatools to enable IFI from PowerShell. If you’re not comfortable with that, you can get it done this way:

If for some reason you didn’t turn this on when you installed SQL Server, just take the service account you got from up yonder, and type “secpol.msc” into the search bar.

[Image: Local Security Policy ("tell somebody")]

After that, click on “User Rights Assignment” and look for the “Perform Volume Maintenance Tasks” entry in the Policy tab.

[Image: Local Security Policy ("no mistake")]

From there, it’s a matter of adding the service account here. If you’re using the default SQL Server service account, “NT Service\MSSQLSERVER…” it will likely already have it.

But you know, just in case. G’head and click that.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.