SQL Server 2022: What The Heck Is sp_copy_data_in_batches?

Make It Or Not


I’m gonna be honest with you, dear reader, because without honesty we’ve got nothing.

Except lies — which you know — those can be comforting sometimes. Hm. I’ll have to think about that one for a bit.

While digging through SQL Server 2022 to find new stuff, this stored procedure caught my eye.

If you try to get the text of it, you get told off. It’s All Internal© as they say on the tubes.

EXEC sp_helptext 
    'sp_copy_data_in_batches';

Well, okay. But we can try to get it working on our own. Usually I use this method to figure out what parameters a new thing requires to run.

Not this time!

Trial And Error


What I usually do is stick NULL or DEFAULT after the EXEC to see what comes back. Sometimes using a number or something makes sense too, but whatever.

This at least helps you figure out:

  • Number of parameters
  • Expected data types
  • Parameter NULLability
  • Etc. and whenceforth
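
To give you an idea, the guessing game looks something like this. This is a sketch of the probing, not the exact errors you’ll get back:

EXEC sp_copy_data_in_batches 
    NULL;       /*probe: how many parameters are there?*/

EXEC sp_copy_data_in_batches 
    NULL, 
    NULL;       /*probe: right count, but what do they need to be?*/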

Eventually, I figured out that sp_copy_data_in_batches requires two strings, and that it expects those strings to exist as tables.

The final command that ended up working was this. Note that there is no third parameter at present to specify a batch size.

sp_copy_data_in_batches 
    N'dbo.art', 
    N'dbo.fart';

Path To Existence


This, of course, depends on two tables existing that match those names.

CREATE TABLE dbo.art(id int NOT NULL PRIMARY KEY);
CREATE TABLE dbo.fart(id int NOT NULL PRIMARY KEY);

One thing to note here is that you don’t need a primary key to do this, but the table definitions do need to match exactly or else you’ll get this error:

Msg 37486, Level 16, State 2, Procedure sp_copy_data_in_batches, Line 1 [Batch Start Line 63]
'sp_copy_data_in_batches' failed because column 'id' does not have the same collation, 
nullability, sparse, ANSI_PADDING, vardecimal, identity or generated always attribute, CLR type 
or schema collection in tables '[dbo].[art]' and '[dbo].[fart]'.

Because GENERATE_SERIES is still a bit rough around the edges, I’m gonna do this the old-fashioned way, which turns out to be a bit faster.

INSERT 
    dbo.art WITH(TABLOCK)
(
    id
)
SELECT TOP (10000000)
    id = 
        ROW_NUMBER() OVER 
        (
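            /*1/0 never gets evaluated; it's just a cheap way to satisfy the ORDER BY requirement*/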
            ORDER BY 1/0
        )
FROM sys.messages AS m
CROSS JOIN sys.messages AS m2;

Behind The Scenes


I sort of expected to run some before and after stuff, and see the count slowly increment, but the query plan for sp_copy_data_in_batches just showed this:

[Image: SQL Server query plan. Caption: wink wink]

I’m not really sure where the batching actually happens here.

Also, this is an online index operation, so perhaps it won’t work in Standard Edition. If there even is a Standard Edition anymore?

Has anyone heard from Standard Edition lately?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server 2022: GENERATE_SERIES Causes Parallel Deadlocks In A Transaction

Many Times!


These table valued functions of the built-in variety have this problem.

This one is no exception. Well, it does throw an exception. But you know.

That’s not exceptional.

DROP TABLE IF EXISTS
    dbo.select_into;

BEGIN TRAN

SELECT
    id = 
        gs.value
INTO dbo.select_into
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 8);

COMMIT;

If you run the above code, you’ll get this error:

Msg 1205, Level 13, State 78, Line 105
Transaction (Process ID 70) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
[Image: SQL Server error message. Caption: pilot program]
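
Until that gets fixed, one workaround sketch: since this is a parallel deadlock, forcing a serial plan seems to sidestep it, at the cost of the parallel speed-up:

BEGIN TRAN

SELECT
    id = 
        gs.value
INTO dbo.select_into
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 1);

COMMIT;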

As with the issues I outlined in yesterday’s post, I do hope these get fixed before go-live.

Thanks for reading!


SQL Server 2022 GENERATE_SERIES: Some Notes At Release Time

Yep, I know

[Image caption: great post, psychic]

It’s the first public CTP. Things will change. Things will get better. Think about the rich history of Microsoft fixing stuff immediately, like with adding an ordinal position to STRING_SPLIT.

That came out in SQL Server 2016, and uh… Wait, they just added the ordinal position in SQL Server 2022. There were two major versions in between that function getting released and any improvements.

With that in mind, I’m extending just as much generosity of spirit to improvements to the function at hand: GENERATE_SERIES.

Quite a while back, I blogged about how much I’d love to have this as a function. We… sort of got it. It doesn’t do dates natively, but you can work around some of that with date functions.
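
For example, here’s a minimal sketch of faking a date series with DATEADD, using the CTP’s named-parameter syntax like the rest of this post:

SELECT
    d = DATEADD(DAY, gs.value, '20220101')
FROM GENERATE_SERIES
     (
         START = 0, 
         STOP = 364
     ) AS gs;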

In this post, I want to go over some of the disappointing performance issues I found when testing this function out.

Single Threaded In A Parallel Plan


First up, reading streaming results from the function is single threaded. That isn’t necessarily bad on its own, but can result in annoying performance issues when you have to distribute a large number of rows.

If you have to ask what the purpose or use case for 10 million rows is, it’s your fault that SQL Server doesn’t scale.

Got it? Yours, and yours alone.

DROP TABLE IF EXISTS
    dbo.art_aux;

CREATE TABLE 
    dbo.art_aux
(
    id int NOT NULL PRIMARY KEY CLUSTERED
);

The first way we’re going to try this is with a simple one column table that has a primary key/clustered index on it.

Of course, we won’t expect a parallel insert into the table itself because of that index, but that’s okay. For now.

INSERT INTO
    dbo.art_aux WITH(TABLOCK)
(
    id
)
SELECT
    gs.value
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 8);

The query plan for this insert looks about like so:

[Image: SQL Server query plan. Caption: sup with that]

I’m only including the plan cost here to compare it to the serial plan later, and to understand the per-operator cost percentage breakdown.

It’s worth noting that the Distribute Streams operator uses Round Robin partitioning to put rows onto threads. That seems an odd choice here, since Round Robin partitioning pushes packets across exchanges.

For a function that produces streaming integers, it would make more sense to use Demand partitioning which only pulls single rows across exchanges. Waiting for Round Robin to fill up packets with integers seems a poor choice, here.

Then we get to the Sort, which Microsoft has promised to fix in a future CTP. Hopefully that happens! But it may not help with the order preserving Gather Streams leading up to the Insert.

[Image: SQL Server query plan. Caption: preservatives]

It seems a bit odd that ordered data from the Sort would hamstring the Gather Streams operator’s ability to do its thing, but what do I know?

I’m just a bouncer, after all.

But The Serial Plan


Using the same setup, let’s make that plan run at MAXDOP 1.

INSERT INTO
    dbo.art_aux WITH(TABLOCK)
(
    id
)
SELECT
    gs.value
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 1);

You might expect this to run substantially slower to generate and insert 10,000,000 rows, but it ends up being nearly three full seconds faster.

Comparing the query cost here (1048.11) vs. the cost of the parallel plan above (418.551), it’s easy to understand why a parallel plan was chosen.

It didn’t work out so well, though, in this case.

[Image: SQL Server query plan. Caption: cereal]

With no need to distribute 10,000,000 rows out to 8 threads, sort the data, and then gather the 8 threads back to one while preserving that sorted order, we can rely on the serial sort operator to produce and feed rows in index-order to the table.

Hopefully that will continue to be the case once Microsoft addresses the Sort being present there in the first place. That would knock a second or so off the overall runtime.

Into A Heap


Taking the index out of the picture and inserting into a heap should change a couple of things.

But it also… Well, let’s just see what happens. And talk about it. Query plans need talk therapy, too. I’m their therapist.

DROP TABLE IF EXISTS
    dbo.art_aux;

CREATE TABLE 
    dbo.art_aux
(
    id int NOT NULL
);
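
The insert here is the same INSERT SELECT WITH (TABLOCK) from above, aimed at the heap this time (repeated here so you can follow along):

INSERT INTO
    dbo.art_aux WITH(TABLOCK)
(
    id
)
SELECT
    gs.value
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 8);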

[Image: SQL Server query plan. Caption: hmmmmm]

The Eager Table Spool here is for Halloween Protection, I’d wager. Why we need it is a bit of a mystery, since we’re guaranteed to get a unique, ever-increasing series of numbers from the function. On a single thread.

Performance is terrible here because spooling ten million rows is an absolute disaster under any circumstance.

With this challenge in mind, I tried to get a plan here that would go parallel and avoid the spool.

Well, mission accomplished. Sort of.

Crash And Burn


One thing we can do is use SELECT INTO rather than relying on INSERT SELECT WITH (TABLOCK) to try to get a parallel insert. There are many restrictions on the latter method.

SELECT
    id = 
        gs.value
INTO dbo.select_into
FROM GENERATE_SERIES
     (
         START = 1, 
         STOP = 10000000
     ) AS gs
OPTION(MAXDOP 8);

This doesn’t make things better:

[Image: SQL Server query plan. Caption: four minutes!]

This strategy clearly didn’t work out.

Bummer.

Again, I’d say most of the issue is from Round Robin partitioning on the Distribute Streams.

Finish Him


The initial version of GENERATE_SERIES is a bit rough around the edges, and I hope some of these issues get fixed.

And, like, faster than the issues with STRING_SPLIT got fixed, because it took a really long time to get those worked on.

And that was with a dozen or so MVPs griping about it the whole time.

But there’s an even bigger problem with it that we’ll look at tomorrow, so it doesn’t get lost in all this stuff.

Thanks for reading!


Desperately Seeking SQL: Advice For Small Companies Trying To Hire Quality Developers

On Sight


I’m gonna take a quick break from (mostly) technical content to blog about a problem I see day in and day out when working with clients: Bad SQL©.

The stuff I normally blog about — why some things are Bad In SQL© — is the result of seeing real performance pains. But Bad SQL© isn’t just anti-patterns; it’s also terribly formatted queries that hide anti-patterns.

You can normally eyeball a query to find things that generally don’t agree with performance out of the box, like:

  • Functions (inline ones aside)
  • Table variables
  • Stacked Common Table Expressions
  • Non-SARGable predicates
  • Overly complicated queries
  • Insert a million other things here

But of course, the more complicated queries are, or the more layers of abstraction exist in a query, the harder this stuff is to spot quickly. Particularly with views, and nested views, bad ideas can be buried many layers deep in the sediment.

I call it sediment because code often looks like geologic layers, where you can tell who wrote what and when based on style and techniques that got used.

And to vendors who encrypt their god-awful code: you know who you are.

The Great Untangling


Getting through that untangling can be a costly and time consuming process, depending on the level of damage done over the years, and the desired outcome. Sometimes it’s easier to rewrite everything from scratch than to do in-place rewrites of existing objects.

It’s obviously worth exploring enhancements in newer versions of SQL Server that may power things across the finish line:

  • Perhaps the new cardinality estimator does more good than harm
  • Batch Mode On Row Store does a lot of good with bad code
  • Scalar UDF Inlining can solve a lot of function problems

There are many other general and targeted improvements that might help your workload without code changes. Hopefully that continues with SQL Server 2022.

On top of the workload improvements, new versions also provide improved insights into problems via dynamic management views, Query Store, logging, and more.

If you’re not on at least SQL Server 2016 right now, you’re leaving a whole lot on the table as far as this goes.

Hiring Issues


It’s tough for smaller companies to attract full time talent to fix huge backlogs of issues across SQL Server stored procedures, functions, views, index and table design, and all that.

Or, even harder, converting ORM queries into sensible stored procedures when you start hitting performance limitations in the single-query model.

I asked on Twitter what folks out there consider attractive employment offerings and got a TON of great feedback.

First, I need to acknowledge that not everyone wants to work for a huge company. Second, I need to acknowledge that salary isn’t everything to everyone.

But let’s assume that a smaller company wants to hire someone in competition with a larger company. What can they offer when they run out of salary runway, and can’t match equity?

  • Clear career paths/Upward mobility
  • Flexible schedules
  • Paid time off for training
  • Covering the costs of training and certifications
  • Focusing on employee growth (not just sticking them in a corner to monkey with the same thing for years)
  • Quality of company culture (meeting overload was something I got a lot of DMs about)
  • Conference travel budgets
  • Meaningful company mission
  • Introducing tech savvy folks to the business side of things
  • Recognizing that not every employee wants to be an On-callogist

There were more, but these were the things I got the most hits from folks on. Having these doesn’t mean you can expect someone to take 20-30% less on the salary front, of course, but if you’re close to another offer these things might sway folks to your side.

Far and away, what I took from responses is that folks want to feel effective; like they can make a difference without a lot of bureaucracy and red tape. Get the hell out of my way, to coin a phrase.

Finder’s Fee


When it comes to attracting people to your company — think of it as your employer SEO — the SQL Server community is a great place to start.

If you want to try something for free, keep an eye out for when Brent posts to find out Who’s Hiring In The Database Community. It doesn’t cost you anything, but you have to keep on top of the posts and replies, and make sure you have a good job description that sticks out.

If you have any location-based requirements for your candidates, try sponsoring a local SQL Server user group’s meetings for a few months. There may be a nominal fee if it’s entirely virtual; if it’s in-person, you’ll foot the bill for a dozen or so pizza pies for attendees. That usually gets you an announcement before and after whatever speaker is presenting. It’s totally fair to ask for attendance numbers.

Keeping on with that, consider sponsoring a SQL Saturday event. These typically have a deeper reach than a monthly user group, since there are more attendees in a concentrated area. You may get a booth, or your logo on slides, and whatever else you can negotiate with the event planners.

If you’re okay with spending more money for a lot of eyeballs, larger events like PASS Summit and SQLBits are annual conferences with thousands of attendees. As an FYI, these are the types of conferences whoever you hire is probably going to want to attend, too.

Imagine that.

Askance


I have clients ask me to help them find quality employees for roles from time to time, or to help them interview folks they’ve farmed themselves.

Normally I’m happy to help on either front, and leave sealing the deal to them. I think from now on I’m gonna point them to this post, so they have some better ideas about how to put a stamp on things.

Not every company can offer everything, but as large companies continue to gobble up smaller ones, and Microsoft in particular keeps fishing folks out of the MVP pool, it’s going to be harder for those who remain to stay competitive. At least I think so: I haven’t quite been persuaded that there will be a kumbaya moment where everyone gets sick of the MegaCorp grind and goes back to mom and pop shops to reclaim their lost souls.

After all, a lot of folks do have their sights set on retirement. High salaries and generous equity (well, maybe not equity as the market is currently behaving) certainly help get them there faster.

That’s part of the picture that you can’t easily ignore, along with the oft-proffered wisdom that the only way to stay on a competitive salary track is to change jobs every 2-3 years.

Retention is going to get more difficult for everyone across the board, but the revolving door will largely let out with the bigger players who can afford to keep it spinning.

Thanks for reading!


Finding Query Store Queries With Missing Index Requests In SQL Server 2019

All The Pretty Little Caveats


I’ve said quite a bit about the utility of missing index requests in SQL Server generally, even as recently as last week!

But then I got a user question about using Query Store to do something similar, so here goes.

If you need a pre-2019 way to do this with Query Store, Kendra Little has a blog post about that here.

WITH
    queries AS
(
    SELECT TOP (100)
        parent_object_name = 
            ISNULL
            (
                OBJECT_NAME(qsq.object_id),
                'No Parent Object'
            ),
        qsqt.query_sql_text,
        query_plan = 
            TRY_CAST(qsp.query_plan AS xml),
        qsrs.first_execution_time,
        qsrs.last_execution_time,
        qsrs.count_executions,
        qsrs.avg_duration,
        qsrs.avg_cpu_time,
        qsp.query_plan_hash,
        qsq.query_hash
    FROM sys.query_store_runtime_stats AS qsrs
    JOIN sys.query_store_plan AS qsp
        ON qsp.plan_id = qsrs.plan_id
    JOIN sys.query_store_query AS qsq
        ON qsq.query_id = qsp.query_id
    JOIN sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
    WHERE qsrs.last_execution_time >= DATEADD(DAY, -7, SYSDATETIME())
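    /*avg_cpu_time is stored in microseconds: 10 * 1000 = 10 milliseconds*/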
    AND   qsrs.avg_cpu_time >= (10 * 1000)
    AND   qsq.is_internal_query = 0
    AND   qsp.is_online_index_plan = 0
    ORDER BY qsrs.avg_cpu_time DESC
)
SELECT
    qs.*
FROM queries AS qs
CROSS APPLY
(
    SELECT TOP (1)
        gqs.*
    FROM sys.dm_db_missing_index_group_stats_query AS gqs
    WHERE qs.query_hash = gqs.query_hash
    AND   qs.query_plan_hash = gqs.query_plan_hash
    ORDER BY
        gqs.last_user_seek DESC,
        gqs.last_user_scan DESC
) AS gqs
ORDER BY qs.avg_cpu_time DESC
OPTION(RECOMPILE);

I don’t love this query, because I don’t love querying Query Store views. That’s why I wrote sp_QuickieStore to make it a whole lot easier.

But anyway, this will get you a similar bunch of information.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Comparing Query CPU And Logical Reads Using The Plan Cache In SQL Server

Let’s Not Kid Ourselves


There are lots of things that can go wrong with SQL Server, performance-wise. Just about anything can fly off the handle.

Recently, I was handing out a passing thought on Twitter that caught the attention of the beloved and venerable Michael J Swart (b|t):

We got to Twalking a little bit about why I said that, too, and he found what I usually find when I start comparing things.

Scripty Kid


I decided to expand on some scripts to look at how queries use CPU and perform reads, and found some really interesting stuff. I’ll talk through some results and how I’d approach tuning them afterwards.

Here are the queries:

/*Queries that do no logical reads, but lots of CPU work*/
SELECT TOP (100)
    total_logical_reads = 
        FORMAT(qs.total_logical_reads, 'N0'), 
    total_worker_time_ms = 
        FORMAT(qs.total_worker_time / 1000., 'N0'),
    execution_count = 
        FORMAT(qs.execution_count, 'N0'),
    query_text = 
        SUBSTRING
        (
            st.text,
            qs.statement_start_offset / 2 + 1,
            (CASE qs.statement_end_offset
                  WHEN -1 
                  THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2 + 1
        ),
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_reads = 0
AND   qs.total_worker_time > 5000
ORDER BY qs.total_worker_time DESC;

/*Queries that do 2x more reads than CPU work*/
SELECT TOP (100)
    total_logical_reads = 
        FORMAT(qs.total_logical_reads, 'N0'), 
    total_worker_time_ms = 
        FORMAT(qs.total_worker_time / 1000., 'N0'),
    execution_count = 
        FORMAT(qs.execution_count, 'N0'),
    query_text = 
        SUBSTRING
        (
            st.text,
            qs.statement_start_offset / 2 + 1,
            (CASE qs.statement_end_offset
                  WHEN -1 
                  THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2 + 1
        ),
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_reads > (qs.total_worker_time * 2)
ORDER BY qs.total_logical_reads DESC;

/*Queries that do 4x more CPU work than reads*/
SELECT TOP (100)
    total_logical_reads = 
        FORMAT(qs.total_logical_reads, 'N0'), 
    total_worker_time_ms = 
        FORMAT(qs.total_worker_time / 1000., 'N0'),
    execution_count = 
        FORMAT(qs.execution_count, 'N0'),
    query_text = 
        SUBSTRING
        (
            st.text,
            qs.statement_start_offset / 2 + 1,
            (CASE qs.statement_end_offset
                  WHEN -1 
                  THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2 + 1
        ),
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_worker_time > (qs.total_logical_reads * 4)
ORDER BY qs.total_worker_time DESC;

Resultant


A quick note about these: the comparison between CPU and logical reads happens in the where clause, and I convert CPU time to milliseconds in the select list.

That might make the numbers look a little funny, but it makes them somewhat easier to understand than microseconds in the grand scheme of things.

First, queries that do no logical reads but use CPU time:

[Image: SQL Server query results. Caption: ouchies]

A lot of these were functions that processed input but didn’t touch data. Assembling and splitting string lists, XML, other variable assignment tasks, and occasionally DMV queries.

The “easy” button here is to stop using scalar and multi-statement functions so much. Those execution counts are hideous.

Second, queries that do 2x more reads than CPU work:

[Image: SQL Server query results. Caption: telling myself]

I only found six of these, while the other two categories easily hit the 100 row goal.

The queries in here largely seemed to either be:

  • Insert queries to temporary objects
  • Queries with parameter sniffing issues

Looking at these, the problem was largely the optimizer choosing Nested Loops joins when it really shouldn’t have. The worst part was that it wasn’t an indexing issue — every single one of these queries was doing seeks across the board — they were just happening in a serial plan, and happening way more than the optimizer estimated they would. Perhaps this is something that Adaptive Joins or Batch Mode more generally could have intervened in.

Third, queries that do 4x more CPU work than reads:

[Image: SQL Server query results. Caption: we are something]

These queries were far more interesting from a tuning perspective, because there were obvious inefficiencies:

  • No good indexes to use
  • Large scans because of non-SARGable predicates
  • Predicate Key Lookups

But the important thing here is that these queries were able to do a lot of logical reads quickly — data they needed was already in memory — and just push the hell out of CPUs.

These are the queries you can have a field day fixing and making people happy.

Residuals


This selection of query results is why I tend to ignore logical reads and focus on CPU. I do still look at things like physical reads, and select queries that do suspicious amounts of writes.

  • Physical reads mean going to disk, and disk is your mortal enemy
  • Select queries doing writes often indicate spools and spills, which can also be pretty bad (there’s a sketch for finding those below)
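
For that last one, a rough sketch in the same style as the scripts above:

/*Select-ish queries that write: spools and spills are the usual suspects*/
SELECT TOP (100)
    total_logical_writes = 
        FORMAT(qs.total_logical_writes, 'N0'),
    query_text = st.text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_writes > 0
ORDER BY qs.total_logical_writes DESC;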

You may not like it, but this is what peak performance tuner looks like.

Thanks for reading!


How SQL Server 2019 Helps You Find Queries That Have Missing Index Requests

bigtime


In my quest to get people to upgrade to a not-old-and-busted version of SQL Server, this is one feature I talk about a lot, because it really helps folks who don’t have all the time in the world to tune queries and indexes.

Here’s a quick helper query to get you started:

SELECT TOP (50)
    query_text = 
        SUBSTRING
        (
            st.text,
            qs.statement_start_offset / 2 + 1,
            (CASE qs.statement_end_offset
                  WHEN -1 
                  THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2 + 1
        ),
    qp.query_plan,
    qs.creation_time,
    qs.last_execution_time,
    qs.execution_count,
    qs.max_worker_time,
    avg_worker_time = 
        (qs.total_worker_time / qs.execution_count),
    qs.max_grant_kb,
    qs.max_used_grant_kb,
    qs.total_spills
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY
(
    SELECT TOP (1)
        gqs.*
    FROM sys.dm_db_missing_index_group_stats_query AS gqs
    WHERE qs.query_hash = gqs.query_hash
    AND   qs.query_plan_hash = gqs.query_plan_hash
    AND   qs.sql_handle = gqs.last_sql_handle
    ORDER BY
        gqs.last_user_seek DESC,
        gqs.last_user_scan DESC
) AS gqs
ORDER BY qs.max_worker_time DESC
OPTION(RECOMPILE);

This should help you find queries that use a lot of CPU and might could oughtta use an index.

Note that this script does not assemble the missing index definition for you. That stuff is all readily available in the query plans that get returned here, and of course the missing index feature has many caveats and limitations to it.

You should, as often as possible, execute the query and collect the actual execution plan to see where the time is spent before adding anything in.
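
If you’re doing that from SSMS, one low-tech way is SET STATISTICS XML, shown here with a hypothetical procedure standing in for whatever you’re vetting:

SET STATISTICS XML ON;

EXEC dbo.your_procedure;

SET STATISTICS XML OFF;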

Thanks for reading!


What Do Missing Index Requests Really Mean In SQL Server?

Scripts Ahoy


There are a million scripts out there that will give you all of the missing index requests for a database (or even a whole server).

Some will even try to prioritize based on the metrics logged along with each request.

Right now, most of you get:

  • Uses: How many times a query compiled that could have used the index
  • Average Query Cost: A unit-less cost used by the optimizer for choosing a plan
  • Impact: A metric relative to the unit-less cost of the operator the index will help
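
For reference, a minimal sketch of where those numbers come from in the DMVs:

SELECT
    mid.statement,
    migs.user_seeks,           /*Uses*/
    migs.avg_total_user_cost,  /*Average Query Cost*/
    migs.avg_user_impact       /*Impact*/
FROM sys.dm_db_missing_index_groups AS mig
JOIN sys.dm_db_missing_index_group_stats AS migs
    ON migs.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details AS mid
    ON mid.index_handle = mig.index_handle
ORDER BY
    migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks DESC;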

Breaking each of those down, the only one that has a concrete meaning is Uses, but that of course doesn’t mean that a query took a long time or is even terribly inefficient.

That leaves us with Average Query Cost, which is the sum of each operator’s estimated cost in the query plan, and Impact.

But where does Impact come from?

Impactful


Let’s look at a query plan with a missing index request to figure out what the Impact metric is tied to.

Here’s the relevant part of the plan:

[Image: SQL Server query plan. Caption: sticky kid]

And here’s the missing index request:

/*
The Query Processor estimates that implementing the following index could improve the query cost by 16.9141%.
*/

/*
USE [StackOverflow2013]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Comments] ([Score])
INCLUDE ([PostId])
GO
*/

Here’s the breakdown:

  • The optimizer estimates that hitting the Comments table will cost 762 query bucks, which is 17% of the total plan cost
  • The optimizer further estimates that hitting the Comments table with the suggested index will reduce the total plan cost by 16.9%

Here are the relevant properties from the scan of the Comments table:

[Image: SQL Server query plan properties. Caption: Indecisive]

What I want you to take away from this is that, while hitting the Comments table may be 17% of the plan’s total estimated cost, the time spent scanning that index is not 17% of the plan’s total execution time, either in CPU or duration.

You can see in the screenshot above that it takes around 450ms to perform the full scan of 24,534,730 rows.

Doubtful


In full, this query runs for around 23 seconds:

[Image: SQL Server query plan. Caption: outta here]

The estimated cost of hitting the Comments table is not 17% of the execution time. That time lives elsewhere, which we’ll get to.

In the meantime, there are two more egregious problems to deal with:

  1. The optimizer severely miscalculates the cost of scanning the Posts table at 70% (note the 860ms time here):

[Image: SQL Server query plan. Caption: oh no no no]

  2. It buries other missing index requests in the properties of the root operator:

[Image: SQL Server missing index request. Caption: train tracks]

Now, there are two other missing index requests listed here that are a) of higher “impact” and b) not ordered by that impact number. And c) even if we added both, we know those indexes would not substantially reduce the overall runtime of the stored procedure.

Assuming that we added every single missing index here, at best we would reduce the estimated cost of the plan by 87%, while only reducing the actual execution time of the plan by about 1.3 seconds out of 23 seconds.

Not a big win, here.

Hurtful


Examining where time is spent in this plan, this branch will stick out as the dominating factor:

[Image: SQL Server query plan. Caption: baby don’t do it]

Some interesting things to note here, while we’re talking about interesting things:

  • The scan of the Badges table takes 1.4 seconds, and has an estimated cost of 1%
  • The estimated cost of the eager index spool is 11%, but accounts for 20 seconds of elapsed time (less the 1.4 seconds for the scan of Badges)
  • There was no missing index request generated for the Badges table, despite the optimizer creating one on the fly

This is a bit of the danger in creating indexes from missing index requests without first validating which queries generated them, and where the benefit in having them would be.

In tomorrow’s post, we’ll look at how SQL Server 2019 makes figuring this stuff out easier.

Thanks for reading!


Proving That SQL Server T-SQL Scalar UDFs Execute Once Per Row

Scripted Content


If you use this script, you can follow along with the results.

The sys.dm_exec_function_stats DMV is only available in SQL Server 2016+, so if you’re on an earlier version than that, you should focus your energies on upgrading.

I’m sure you’ve just been busy for the last five years.

/*Context is everything*/
USE master;
GO 

/*Piece of crap function*/
CREATE OR ALTER FUNCTION 
    dbo.useless_function(@dbid int)
RETURNS sysname
AS
BEGIN
    DECLARE /*This is to ruin scalar UDF inlining*/
        @gd datetime = GETDATE();

    RETURN DB_NAME(@dbid);
END;
GO 

/*Just in the select list*/
SELECT
    database_name = 
        dbo.useless_function(d.database_id),
    total_databases = 
        (SELECT c = COUNT_BIG(*) FROM sys.databases AS d2)
FROM sys.databases AS d;
GO 

/*Executions here should match the count above*/
SELECT
    object_name =
        OBJECT_NAME(defs.object_id),
    defs.execution_count
FROM sys.dm_exec_function_stats AS defs
WHERE defs.object_id = OBJECT_ID('dbo.useless_function');

/*Reset the counters*/
DBCC FREEPROCCACHE;
GO 

/*Now in the where clause*/
SELECT
    mf.name,
    total_database_files = 
        (SELECT c = COUNT_BIG(*) FROM sys.master_files AS mf2)
FROM sys.master_files AS mf
WHERE mf.name = dbo.useless_function(mf.database_id);
GO 

/*Executions here should match the count above*/
SELECT
    object_name =
        OBJECT_NAME(defs.object_id),
    defs.execution_count
FROM sys.dm_exec_function_stats AS defs
WHERE defs.object_id = OBJECT_ID('dbo.useless_function');

Data Driven


On my server, I have 9 total databases and 42 total database files.

The results, therefore, look precisely and consistently like so:

[Image: SQL Server query results. Caption: You Probably Won’t Like This About Functions]

Assumption


For the select list, T-SQL scalar UDFs will execute once per row projected by the query, i.e. the final resulting row count, under… every circumstance I’ve ever seen.

In SQL Server. Of course.

As a couple of easy-to-digest examples, let’s say you execute a query that returns 100 rows:

  • Your T-SQL scalar UDF is referenced once in the select list, so it’ll run 100 times
  • Your T-SQL scalar UDF is referenced twice in the select list, so it’ll run 200 times

For T-SQL scalar UDFs in other parts of a query, like:

  • Where Clause
  • Join Clause

They will execute for as many rows as need to be filtered when those parts of the query are executed, for as many individual references to the function as there are in the query.

The results here may vary, depending on if there are any other predicates involved that may filter out other rows.

As a couple easy-to-digest examples:

  • If you use a T-SQL scalar UDF as a predicate on a million row table, it’ll execute a million times to produce a result and apply the predicate
  • If you do the same thing as above, but there’s another part of the where clause that filters out 500k rows, the function will only execute 500k times

All sorts of other things might change this, like if the other predicate(s) can be pushed to when the data is accessed, and if there are multiple invocations of the function.

You can see an edge case where that’s not true in this post:

https://erikdarling.com/sql-server/a-hidden-value-of-apply/

Thanks for reading!


Putting Spinlocks In Perspective In SQL Server

Get Around


The more people want to avoid fixing what’s really wrong with their server, the more they go out and find all the weird stuff that they can blame on something else (usually the product), or keep doing the same things that aren’t fixing the problem.

Spinlocks are one of those things. People will measure them, stare at them, Google them, and have no idea what to make of them, all while being sure there’s something going on with them.

I don’t want to discount when spinlocks can actually cause a problem, but you shouldn’t treat every performance problem like it’s a bridge too far from what you can solve.

Before you go ignoring all the other things that are going wrong, here’s a simple script to give you some idea if you need to follow the spinlock trail any further. Note that it might also be worth replacing the s.spins sort order with s.spins_per_collision, too.

Which sort order you choose long-term will depend on which yields numbers of interest on your system, which I can’t predict. Sorry about that.

SELECT TOP (20)
    days_uptime = 
        CONVERT(decimal(38,2), d.seconds_uptime / 86400.),
    rundate = 
        SYSDATETIME(),
    s.name,
    s.collisions,
    collisions_per_second = 
        CONVERT(bigint, s.collisions / d.seconds_uptime),
    s.spins,
    spins_per_second = 
        CONVERT(bigint, s.spins / d.seconds_uptime),
    s.spins_per_collision,
    spins_per_collision_per_second = 
        CONVERT(decimal(38,6), s.spins_per_collision / d.seconds_uptime),
    s.sleep_time,
    sleep_time_per_second = 
        CONVERT(bigint, s.sleep_time / d.seconds_uptime),
    s.backoffs,
    backoffs_per_second = 
        CONVERT(bigint, s.backoffs / d.seconds_uptime)
FROM sys.dm_os_spinlock_stats AS s 
CROSS JOIN
(
    SELECT 
        seconds_uptime = 
            DATEDIFF
            (
                SECOND, 
                d.sqlserver_start_time, 
                SYSDATETIME()
            )
    FROM sys.dm_os_sys_info AS d
) AS d
ORDER BY s.spins DESC;

Telltale


I understand that some spinlocks tend to happen in storms, and that this isn’t going to help to illuminate many situations when run in isolation. Bursty workloads, or workloads that only hit some crazy amount of spinlocks during shorter periods of high activity might escape it.

It can help you put the number of spinlocks you’re hitting in perspective compared to uptime, though.

If you see any numbers in the results that still make you say the big wow at your screen, you can easily log the output to a table every X minutes to gather more detail on when it’s happening.
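
A rough sketch of that logging, with hypothetical names (wrap the query above in the insert, and schedule it however you schedule things):

CREATE TABLE dbo.spinlock_log
(
    rundate datetime2 NOT NULL
        DEFAULT SYSDATETIME(),
    name nvarchar(256) NOT NULL,
    collisions bigint NOT NULL,
    spins bigint NOT NULL,
    spins_per_collision real NOT NULL,
    sleep_time bigint NOT NULL,
    backoffs bigint NOT NULL
);

INSERT
    dbo.spinlock_log
(name, collisions, spins, spins_per_collision, sleep_time, backoffs)
SELECT
    s.name,
    s.collisions,
    s.spins,
    s.spins_per_collision,
    s.sleep_time,
    s.backoffs
FROM sys.dm_os_spinlock_stats AS s;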

Once you figure out when any potentially large spikes in spinlocks are occurring, you can match that up with:

  • Any independent query logging you’re doing
  • The plan cache, if it’s reliable
  • Query Store, if you’re smart enough to turn it on
  • Your monitoring tool data

Which should tell you which queries were executing at the time. I’d probably look for any high CPU effort queries, since those tend to be the spinlockiest in my experience.

Thanks for reading!
