SQL Server 2022 CTP 2.1 Improvements To Parameter Sensitive Plan Optimization

Hop On Your Good Foot


Several weeks back, I blogged about a missed opportunity with the new parameter sensitive plan feature. At the time, I thought that there was indeed sufficient skewness available to trigger the additional plan variants, and apparently some nice folks at Microsoft agreed.

If we step back through the old demo, we’ll get different results.

EXEC dbo.VoteSniffing 
    @VoteTypeId = 4;

EXEC dbo.VoteSniffing 
    @VoteTypeId = 2;

I’m not gonna go into a lot of depth here, because I’m writing this while traveling, but we get the (mostly) desired outcome of two different plans being generated. Each plan is more suitable to the amount of data that the query has to process.

SQL Server Query Plan
home improvement

I’d consider this a success, even if it adds more queries to the unidentifiable corpse-pile.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

SQL Server 2022 Introduces IS [NOT] DISTINCT FROM; World Underwhelmed

Back In Time


Feeling quite old while remembering the first time I ever read this, and this note at the end:

This is the SQL language IS DISTINCT FROM feature —implemented in the query processor, but not yet available in the T-SQL language.

Groovy. Since at least 2011, this has been in the Query Processor, and here in 2022 we finally get the linguistic support.

The thing is, it’s pretty underwhelming, and I’m going to show you why I think so.

First, in the docs for SQL Server, all the examples use a single literal value, like so:

SELECT * FROM #SampleTempTable WHERE id IS DISTINCT FROM 17;

I went looking for other docs examples from vendors who have had the syntax around for 10+ years, and there wasn’t anything all that much more interesting.

Mostly case expressions and whatnot.

Big deal.

Alignment


First, if I try to run either of these queries, I’ll get an error after about 6 seconds.

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.UserId IS DISTINCT FROM 
               (
                   SELECT
                       v.UserId
                   FROM dbo.Votes AS v
               );

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.UserId IS NOT DISTINCT FROM 
               (
                   SELECT
                       v.UserId
                   FROM dbo.Votes AS v
               );

Why does it take 6 seconds to get an error? Because a few parts of the query plan have to do some work, and then finally:

Msg 512, Level 16, State 1, Line 1
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

Sort of like how sometimes you try to SUM a column and after a while you get an error about arithmetic overflow.

This is a bit annoying, because that means we need a way to return a single value to evaluate.

So Yeah…


We can’t even rewrite the queries like this to get around the error, but I do want to show you the plans.

This is why we have to wait several seconds to get an error (unless you change it to IS DISTINCT FROM ALL/ANY):

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.UserId IS DISTINCT FROM 
               (
                   SELECT
                       v.UserId
                   FROM dbo.Votes AS v
                   WHERE v.UserId = c.UserId
               );

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.UserId IS NOT DISTINCT FROM 
               (
                   SELECT
                       v.UserId
                   FROM dbo.Votes AS v
                   WHERE v.UserId = c.UserId
               );

Adding a where clause inside the subquery doesn’t help.

But these query plans are total motherchucking disasters, anyway. We’ll get into indexing later, but right now they both have the same shape and operators, though slightly different semantics to deal with is/is not distinct.

SQL Server Query Plan
is not cool

Both plans run single threaded, and using Nested Loops as the physical join type, which stinks because we’re putting together two pretty big tables.

Not to mention that Eager Index Spool. What a filth.

Adding Indexes


We need these indexes to make things go any faster. Before we do anything else, let’s create these so we’re not just sitting around thumb-twiddling.

CREATE INDEX
    c
ON dbo.Comments
    (UserId)
WITH
(
    SORT_IN_TEMPDB = ON,
    DATA_COMPRESSION = PAGE
);

CREATE INDEX
    v
ON dbo.Votes
    (UserId)
WITH
(
    SORT_IN_TEMPDB = ON,
    DATA_COMPRESSION = PAGE
);

Thinking About It


Okay, so writing the query like we did up there isn’t going to get us anything. Perhaps my expectations are a bit too exotic.

Let’s try something a bit more austere:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
JOIN dbo.Votes AS v
    ON c.UserId IS DISTINCT FROM v.UserId;

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
JOIN dbo.Votes AS v
    ON c.UserId IS NOT DISTINCT FROM v.UserId;

The first thing to be aware of here is that the IS DISTINCT FROM is an inequality predicate, so you’re stuck with Nested Loops as the physical join type:

SQL Server Query Plan
nightmare

I ran out of care-juice waiting for this to finish, so all you’re getting is an estimated plan. The lack of an equality predicate here means you don’t have Hash or Merge join as an option.

Following up on bad ideas, the IS NOT DISTINCT FROM is an equality predicate, but the plan chosen is a serial Merge Join variety, which drags on 14 seconds too long:

SQL Server Query Plan
change up

Drop An Index


If we simulate not having any useful indexes on one table or the other by hinting the clustered index, the performance outlook does not improve.

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c WITH(INDEX = 1)
JOIN dbo.Votes AS v
    ON c.UserId IS NOT DISTINCT FROM v.UserId;

SELECT
    c = COUNT_BIG(*)
FROM dbo.Comments AS c
JOIN dbo.Votes AS v WITH (INDEX = 1)
    ON c.UserId IS NOT DISTINCT FROM v.UserId;

No useful parts of the first query happen in Batch Mode, but the second query is rescued by two hash aggregates happening in batch mode.

SQL Server Query Plan
ouching

An odd point to make on a blog focused on SQL Server performance tuning is that sometimes not having a useful index gets you a better plan.

Anyway, I’m going back to my vacation.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

SQL Server 2022: Introduces the DATETRUNC Function, So You Can Truncate Your Dates And Stuff

Underscoring The Importance


When I first sat down to write about this, I made a funny mistake: I kept writing DATE_TRUNC over and over again.

In SQL Server it’s DATETRUNC.

Why? Because that’s the way it’s implemented in Postgres and DB2. Oracle, of course, just calls it TRUNC.

So, while it’s nice to have (what appears to be) the same behavior, it doesn’t exactly help to not have a 1:1 calling equivalent to other platforms.

I assume most of these additions to T-SQL are for cross-platform development and migrations.

Of course, Microsoft being so gosh darn late to this game means folks have probably been rolling-their-own versions of these functions for years.

If they went and called their system function DATE_TRUNC or even TRUNC, they might have some object naming issues to contend with.

Well, okay. But how does it work?

Childish Games


Here are some quick examples of how you call it.

SELECT TOP (10)
    u.DisplayName,
    year = 
        DATETRUNC(YEAR, u.LastAccessDate),
    quarter = 
        DATETRUNC(QUARTER, u.LastAccessDate),
    month = 
        DATETRUNC(MONTH, u.LastAccessDate),
    dayofyear = 
        DATETRUNC(DAYOFYEAR, u.LastAccessDate),
    day = 
        DATETRUNC(DAY, u.LastAccessDate),
    week = 
        DATETRUNC(WEEK, u.LastAccessDate),
    iso_week = 
        DATETRUNC(ISO_WEEK, u.LastAccessDate),
    hour = 
        DATETRUNC(HOUR, u.LastAccessDate),
    minute = 
        DATETRUNC(MINUTE, u.LastAccessDate),
    second = 
        DATETRUNC(SECOND, u.LastAccessDate),
    millisecond = 
        DATETRUNC(MILLISECOND, u.LastAccessDate),
    microsecond = 
        DATETRUNC(MICROSECOND, u.LastAccessDate) /*Doesn't work with datetime because there are no microseconds*/
FROM dbo.Users AS u;

And here are the results:

SQL Server Query Results
workin’

The thing to note here is that there’s no rounding logic involved. You just go to the start of whatever unit of time you choose. Of course, this doesn’t seem to do anything to the millisecond portion of DATETIME, because it’s not precise enough.

But for anyone out there who was hoping for a SOMONTH function to complement the EOMONTH function, you get this instead.

Works well enough!

But does it perform, Darling?

UnSARGable?


To make any test like this worthwhile, we need an index to make data searchable.

CREATE INDEX
    v
ON dbo.Votes
    (CreationDate)
WITH
(
    SORT_IN_TEMPDB = ON,
    DATA_COMPRESSION = PAGE
);

You know, because that’s what they do. To make searching faster. Hello.

So look, under these perfect circumstances, everything performs well. But we have to do a lot of typing.

SELECT
    c = COUNT_BIG(*)
FROM dbo.Votes AS v
WHERE v.CreationDate >= DATETRUNC(YEAR, CONVERT(datetime, '20130101 00:00:00.000'));

Note here that we’re working on a literal value, not a column value, and we have to tell the datetrunc function which type we want via the convert function so that we get a simple seek plan:

SQL Server Query Plan
grin

Without all that, we get a dynamic seek plan:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Votes AS v
WHERE v.CreationDate >= DATETRUNC(YEAR, '20130101 00:00:00.000');

This has some… obvious performance issues compared to the above plan with correct data types.

SQL Server Query Plan
query pan

Query Pranks


Frequent blog readers will not be surprised that wrapping a table column in the new DATETRUNC function yields old performance problems:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Votes AS v
WHERE DATETRUNC(YEAR, v.CreationDate) >= CONVERT(datetime, '20130101 00:00:00.000');

This is particularly annoying because we’re truncating the column to the beginning of the year, which should be searchable in the index since that’s the sort order of the data in the index.

SQL Server Query Plan
shined out

Like most functions, these are fine in the presentation layer, but terrible in the relational layer. There’s no warning about this performance degradation in the documentation, either at the example of using the function in a where clause, or in the final closing remarks.

But that’s par for the course with any of these built-in functions.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Cost Of Complexity In SQL Server

Value Added


SQL Server can do a whole lot of things, aside from return the results of your queries. The thing is, the more of those you add into the mix the tougher it becomes to scale, and maintain reliable performance.

Of course, there are other design considerations that introduce the same kind of complexity:

  • Lots of tables in your schema leads to more joins
  • Lots of nested views lead to more choice for the optimizer
  • Lots of databases lead to cross-database joins, which can lead to tricky issues if they’re in different compat levels, etc.
  • Lots of functions and triggers that have entire books of business logic in them make everything go to hell

But beyond that, consider some more internal design choices:

  • Change Tracking
  • Change Data Capture
  • Auditing
  • Temporal Tables

Now you’re asking SQL Server to log a bunch of stuff every time you do something in your database. You have the action, and then writing that action down.

On top of that, you’re also opening yourself up to more queries because people are going to want to access and report on that data. This isn’t even touching all the additional internal monitoring that goes into SQL Server to track everything going on in there.

Observer overhead is a real thing.

Now let’s add in data protection and synchronization

  • Availability Groups
  • Replication (God forgive me)

I’m leaving Log Shipping out of the equation because it doesn’t really add much complexity on its own. You may add complexity do it by needing to copy files to strange places and choosing to report off shipped databases. I’m also leaving Failover Clusters out, because there’s one set of data that sits in one place. Only the SQL Server instance moves between nodes, unless you decide to span Failover Clusters and synchronize the data using your SAN or something. That stuff usually has a big enough price tag on it to dissuade most folks from tramping down that Path Of Misery©.

Both Availability Groups and Replication can make life miserable. I work with some very nice people who can’t create indexes unless they switch their Availability Groups to asynchronous mode first.

I also work with some very nice people who use Replication and don’t get much sleep at night. Me? I won’t touch the stuff.

The point, though, is that if you decide to have SQL Server start doing a whole lot of extra stuff every time you run a query, or make your queries really complicated to come up with an execution plan for, performance might suffer.

It’s up to you to make sure that you plan accordingly for adding in new features.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Even If SQL Server Table Variables Were Always In Memory, It Wouldn’t Make Them Better Than Temp Tables

Past Perfect


You know all those pesky developer myths that never go away?

  • CTEs and Views can’t use indexes: WRONG
  • Subqueries are slower than joins: WRONG
  • Temp tables are bad for performance: WRONG
  • Table Variables are always in memory: WRONG

Okay, that’s not a complete list, but a complete list would fill the internet to its very brim and cause all sorts of overflow errors.

My job as a performance tuning consultant is to teach folks when they’ve been lead astray.

To confuse things a bit further, Microsoft allows for in-memory @table variables, but you have do a lot of work to get set up to use them.

They’re not the ones you get out of the box.

Future Mediocre


Let’s say for the sake of argument that @table variables were always more “in-memory” than #temp tables (they’re not). They’d still have a lot of the same fundamental problems:

With all that stuff in mind: who cares if they’re only in memory? There are so many downsides for most query patterns, they could all happen in quantum space and performance would still stink.

I know, they have their uses:

But most of y’all aren’t using them that wisely.

And here’s the other thing: Most of y’all don’t have enough memory to begin with, and the memory you do have you misuse in a million ways. Why do you wanna make that worse?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Some Requests For Authors Of SQL Server Books

Heal Thyself


I dislike criticism without introspection, so anything that I say here I’ll also be doing my best to apply to anything I produce.

But anyway, I’m writing this because I read a lot of SQL Server books, and there are some things about them are pretty frustrating.

This goes for both print and digital copies, though some advice is specific to print copies because I think a lot of folks still buy them.

Anyway, here are some thoughts I have. You can show this post to your editors if you want. Maybe it’s their fault, not yours.

I have no idea.

Code Formatting


While I don’t expect anyone’s coding format to be as highly-evolved to near-perfection as mine, I do prefer consistency.

Many times while reading a book, the code formatting style changes from query to query, page to page, etc.

I understand that with multiple authors, things may change between chapters (but even then, agree to something, would ya?) but it’s jarring to see queries written in several different styles throughout a book.

Most code formatting tools out there will get you at least 70% of the way to code being formatted the way you want, even if you’re particularly persnickety about things.

Code Placement


For the absolute love of Crispin Glover, stop putting code in places where it will span multiple pages.

It sucks for everyone, no matter how they’re reading the book, but all those nice little SQL Server Scrubs who bought (hopefully bought) the digital copy of your book have an awkward time copying and pasting things over to run locally.

For the Print-cesses of the world, we just don’t need that kind of page-turning, cliff-hanging suspense.

I’ve seen this more than anything, and it drives me bonkers every time. Just keep writing and say the code example is on the next page.

If your code example takes up a whole page and then some… Well, look, I don’t have an easy answer for you. You want it there for digital folks to copy, but no one with a print edition is going to write that all out from scratch.

Maybe you should just get yourself a nice GitHub gist and link to longer code examples? Use a URL shortener or GO codes to make memorable URLs?

Either way, this particular issue is in every book I’ve ever read.

Image Quality


We really need to talk about this one.

Now, I get it, you’re probably not gonna talk anyone into printing a full color SQL Server book. At least no one smart.

If you can do that, great. You’re one step closer to your book not looking like a 90’s zine about anarchy or graffiti or whatever.

The next step: use high quality images. Please, please, please, use high quality images.

No one’s going to be able to read your book and follow along without wondering if they have cataracts if they’re staring at washed-out, fuzzy images the whole way through.

Far too often, it looks like someone compressed-to-death and watered down the printer ink for every image in a book.

Tech Review


Get someone you disagree with to tech review your book. Not someone who’s wrong, just someone who has a different point of view.

This isn’t a pitch to get me to do it. Unless I love you to death, I’m probably not gonna wanna read and review your book.

But far too many books seem to suffer from “I got my friend to do this”, which leads to groupthink.

There’s no alternate viewpoint considered or explored, and you end up with one-sided highways of thought.

Doing a tech review shouldn’t be about toeing a line, it should be about making a book more technically accurate.

Concept Introduction


I’m not sure if this is a book or thought  organization issue, but… If you’re in an early chapter of a book, say the first three, and you keep seeing things like “we cover this more in chapter 12” or 13 or 14 or some other far off place, you probably shouldn’t be doing it right then and there.

When you’re going to introduce a concept, you should either be covering it in that chapter, or covering it in the next one.

I see this quite a bit, and wonder if I need to skip way ahead in the book and read about this other thing, and then come back.

That’s a crappy, because it makes you feel lost in the material that’s supposed to be bringing you clarity on a subject.

Try to contain yourself, here. Maybe you’re saying too much too soon.

Auto-Errata


It happens to me every time I write a blog post. I re-read it as I’m going through, I re-read it when I’m done, and I tend to check in on things a day or so before it goes live.

When it goes live, I read through the post, and notice half-a-hundred things that I wanna fix, tweak, or expand on.

I can only imagine what that’s like at Bookscale© where you have 300+ pages of blog posts publishing at once.

As you make changes to the book, you should notify folks who bought it. Just randomly posting corrections is hard to keep track of.

GitHub is, again, a decent way to get and track feedback. It also gives you a way to announce large releases to the content so everyone knows when a new version is available.

Hopefully you’re making digital copies available to folks who purchase a physical copy. Otherwise, you’re gonna have a hard time getting a better product in their hands.

Reinventing The Wheel


Rather than take up a whole page with some reheated DMV query, give some love to community scripts out there that do the same thing, but better.

Anyone who knows better is going to pass right over that stuff anyway.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

SQL Server 2022’s GENERATE_SERIES Doesn’t Suck Anymore

Life Comes At You Fast


The summer intern at Microsoft was hard at work between CTPs. Last time around, there were some serious performance issues with our dear new friend GENERATE_SERIES.

With the release of CTP 2.1, the problems that we saw the first time around are all gone. But there are still a couple small caveats that you should be aware of.

There’s also been a change in the way you call the function, too. You not longer need the START and STOP identifiers in the function.

You just put in whatever numbers you’re into.

That’s probably a good thing.

Caveat #1: Parallelism Restrictions


Loading data into tables that have row store indexes on them can’t go parallel.

DROP TABLE IF EXISTS
    dbo.art_aux;

CREATE TABLE 
    dbo.art_aux
(
    id int NOT NULL PRIMARY KEY CLUSTERED
);

INSERT INTO
    dbo.art_aux WITH(TABLOCK)
(
    id
)
SELECT
    gs.value
FROM GENERATE_SERIES
     (
         1, 
         10000000
     ) AS gs
OPTION
(
    MAXDOP 8, 
    QUERYTRACEON 8649
);

But performance is still pretty good, here. Much better than prior.

SQL Server Query Plan
cooperation

I think most folks out there would be okay waiting a few seconds for 10 million rows to show up in a table for them.

So sequential. Much unique.

But, what if you want to load those rows in faster?

Caveat #2: Use A Heap Or SELECT INTO Instead


If you create a heap, or just SELECT INTO on instead, you’ll get a parallel load into the table.

For 10 million rows, there’s an improvement of about 1.7 seconds on my sort of crappy travel VM with 4 available CPUs.

DROP TABLE IF EXISTS
    dbo.art_aux;

CREATE TABLE 
    dbo.art_aux
(
    id int NOT NULL
);

INSERT INTO
    dbo.art_aux WITH(TABLOCK)
(
    id
)
SELECT
    gs.value
FROM GENERATE_SERIES
     (
         1, 
         10000000
     ) AS gs;

Here’s the plan:

SQL Server Query Plan
hot swap

Yes, dear reader, I even blog on vacation. That’s how much I love and care about you.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Why You Can’t Always Rely On Estimated Query Plans In SQL Server

Simple Enough


Most of the time, if you get the estimated plan for a query, the actual execution plan will look about the same.

You’ll get the additional information of course, but the shape and most of the other attributes will match up.

Other times, things will change between the two.

Here’s a silly example I came across recently while working on demos for some other stuff.

SELECT TOP (100)
     p.Id,
     vs.UpVotes,
     vs.DownVotes
FROM dbo.VoteStats() AS vs
JOIN dbo.Posts AS p
    ON vs.PostId = p.Id
WHERE vs.DownVotes > vs.UpMultipier
AND   p.CommunityOwnedDate IS NULL
AND   p.ClosedDate IS NULL
ORDER BY vs.UpVotes DESC;

Estimated Execution Plan


Here’s the estimated plan I get for this query:

SQL Server Query Plan
estimations

SQL Server estimates that a serial execution plan will work best, here.

Actual Execution Plan


Here’s the actual execution plan for the same query:

SQL Server Query Plan
actuality

At runtime, SQL Server decides that a parallel plan is the better solution.

Wild Nights


SQL Server can totally estimate parallel plans. I’ve seen it a million times.

In this case, though, it doesn’t do that. If you get the estimated plan after executing the query, it’ll come back with the parallel version.

Why does this happen? Well, as SQL Server adds features to the Intelligent Query Processing, more and more decisions are made when a query executes.

Stuff like this can make estimated and actual plans look much different from each other:

  • Adaptive joins
  • Interleaved execution
  • Table variable deferred compilation
  • Batch mode on row store

Wonders of modern query optimizers, eh?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Let’s Stop Calling Queries “Expensive”

Bad Names


When we talk about finding queries to tune, there’s an unfortunate term that gets thrown around: Expensive Queries.

Why is it unfortunate? Well, it reinforces the wrong mindset when it comes to query tuning, and leads people down the wrong path, looking at the wrong metrics.

SQL Server does have a cost-based optimizer, but those costs don’t mean anything to your hardware, or even to how long a query runs for.

Those costs are all estimates, based on two-decade old computer specs. There are many times when they’re wrong, or not properly aligned with reality.

Worse, it leads people to want to do crazy things like sort the plan cache by query cost to find things to tune.

Worse than that, they’ll look at “query cost relative to batch” to compare two queries for efficiency.

Ethically Sourced


There are many sources to find queries eating up your server hardware.

The point of this post isn’t to teach you how to use any of those things, but to teach you how to be smarter about using them in whatever way you’re comfortable.

My two favorite metrics to look at when looking for queries to tune are CPU and RAM. I explain why in the post, but the short story is that they’re reliable, real-life metrics that can be directly measured before and after to gauge progress.

I don’t look at things like reads, because those might go up or down while your query runtime doesn’t change at all.

They’re also pretty misleading if you’re looking at STATISTICS IO in a lot of circumstances, like with lookups.

Sortie


A while back I recorded a bunch of videos that show how cached/estimated plans can be really misleading when it comes to costs and all that stuff.

You can find it here:

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Some New Stuff In SQL Server 2022 CTP 2.1

New Baby


SQL Server 2022 CTP 2.1 dropped recently, and there’s some new stuff in it! What kind of new stuff?

Well, there’s some linguistic stuff:

  • APPROX_PERCENTILE_DISC
  • APPROX_PERCENTILE_CONT
  • DATETRUNC
  • IS [NOT] DISTINCT FROM
  • Bit manipulation:
    • LEFT_SHIFT
    • RIGHT_SHIFT
    • BIT_COUNT
    • GET_BIT
    • SET_BIT

How useful any of that will be depends on how you use SQL Server. I’ve never heard anyone yearning for bit manipulation functions, but perhaps I travel in the wrong circles.

Inside You


As far as “unannounced” stuff, here’s some of what I noticed:

There’s a new database scoped configuration called LEDGER_DIGEST_STORAGE_ENDPOINT. The name seems fairly self-documenting, but I can’t picture myself using ledger enough to explore it further.

There are a bunch of new objects, too:

+------------------------------------------------------------+----------------------------------+
|                            name                            |            type_desc             |
+------------------------------------------------------------+----------------------------------+
| dm_db_xtp_hash_index_approx_stats                          | SQL_INLINE_TABLE_VALUED_FUNCTION |
| dm_tran_distributed_transaction_stats                      | VIEW                             |
| extgov_attribute_sync_objects_synchronizing                | VIEW                             |
| sp_change_feed_configure_parameters                        | SQL_STORED_PROCEDURE             |
| sp_dw_physical_manifest_file_table_insert                  | EXTENDED_STORED_PROCEDURE        |
| sp_dw_physical_upsert                                      | EXTENDED_STORED_PROCEDURE        |
| sp_help_change_feed_table                                  | SQL_STORED_PROCEDURE             |
| sp_manage_distributed_transaction                          | EXTENDED_STORED_PROCEDURE        |
| sp_md_discovery_stats_staleness_detection                  | EXTENDED_STORED_PROCEDURE        |
| sp_reset_dtc_log                                           | EXTENDED_STORED_PROCEDURE        |
| sp_trident_create_credential_to_access_internal_md_storage | EXTENDED_STORED_PROCEDURE        |
+------------------------------------------------------------+----------------------------------+

Again, their usefulness will depend on how you use the product.

Here are some new extended events:

  • query_ce_feedback_telemetry
  • query_feedback_analysis
  • query_feedback_validation
  • resumable_add_constraint_executed

I don’t think I’ll need to add any of these to sp_HumanEvents, but I’m sure you’ll see them in lots of Microsoft demos.

There are also some new wait stats, but I think I’ll be ignoring them.

  • CDC_SCAN_FINISHED
  • CORRUPTED_PAGE_PROCESS
  • DW_DB
  • DW_WS_DB_LIST
  • NATIVE_SHUFFLE_WRITE_BUFFER_DEQUEUE
  • PARQUET_INDEX_BUILD_MANIFEST_SYNC
  • PURVIEW_POLICY_SDK_PREEMPTIVE_SCHEDULING
  • PWAIT_S3_TEMP_CREDENTIAL
  • RBAC_AAD_GROUP_INFO
  • SPINLOCK_EXT
  • WAIT_EXTGOV_PERMCACHE_DECISIONLOCK
  • WAIT_EXTGOV_PERMCACHE_RESOURCELOCK

Nothing all that juicy this time around. Maybe next time!

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.