These table valued functions of the built-in variety have this problem.
This one is no exception. Well, it does throw an exception. But you know.
That’s not exceptional.
DROP TABLE IF EXISTS
dbo.select_into;
BEGIN TRAN
SELECT
id =
gs.value
INTO dbo.select_into
FROM GENERATE_SERIES
(
START = 1,
STOP = 10000000
) AS gs
OPTION(MAXDOP 8);
COMMIT;
If you run the above code, you’ll get this error:
Msg 1205, Level 13, State 78, Line 105
Transaction (Process ID 70) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Like the issues I outlined in yesterday’s post, I do hope these get fixed before go-live.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
It’s the first public CTP. Things will change. Things will get better. Think about the rich history of Microsoft fixing stuff immediately, like with adding an ordinal position to STRING_SPLIT.
That came out in SQL Server 2016, and uh… Wait, they just added the ordinal position in SQL Server 2022. There were two major versions in between that function getting released and any improvements.
With that in mind, I’m extending as much generosity of spirit to improvements to the function at hand: GENERATE_SERIES.
In this post, I want to go over some of the disappointing performance issues I found when testing this function out.
Single Threaded In A Parallel Plan
First up, reading streaming results from the function is single threaded. That isn’t necessarily bad on its own, but can result in annoying performance issues when you have to distribute a large number of rows.
If you have to ask what the purpose or use case for 10 million rows is, it’s your fault that SQL Server doesn’t scale.
Got it? Yours, and yours alone.
DROP TABLE IF EXISTS
dbo.art_aux;
CREATE TABLE
dbo.art_aux
(
id int NOT NULL PRIMARY KEY CLUSTERED
);
The first way we’re going to try this is with a simple one column table that has a primary key/clustered index on it.
INSERT INTO
dbo.art_aux WITH(TABLOCK)
(
id
)
SELECT
gs.value
FROM GENERATE_SERIES
(
START = 1,
STOP = 10000000
) AS gs
OPTION(MAXDOP 8);
The query plan for this insert looks about like so:
I’m only including the plan cost here to compare it to the serial plan later, and to understand the per-operator cost percentage breakdown.
It’s worth noting that the Distribute Streams operator uses Round Robin partitioning to put rows onto threads. That seems an odd choice here, since Round Robin partitioning pushes packets across exchanges.
For a function that produces streaming integers, it would make more sense to use Demand partitioning which only pulls single rows across exchanges. Waiting for Round Robin to fill up packets with integers seems a poor choice, here.
Then we get to the Sort, which Microsoft has promised to fix in a future CTP. Hopefully that happens! But it may not help with the order preserving Gather Streams leading up to the Insert.
It seems a bit odd that ordered data from the Sort would hamstring the Gather Streams operator’s ability to do its thing, but what do I know?
I’m just a bouncer, after all.
But The Serial Plan
Using the same setup, let’s make that plan run at MAXDOP 1.
INSERT INTO
dbo.art_aux WITH(TABLOCK)
(
id
)
SELECT
gs.value
FROM GENERATE_SERIES
(
START = 1,
STOP = 10000000
) AS gs
OPTION(MAXDOP 1);
You might expect this to run substantially slower to generate and insert 10,000,000 rows, but it ends up being nearly three full seconds faster.
Comparing the query cost here (1048.11) vs. the cost of the parallel plan above (418.551), it’s easy to understand why a parallel plan was chosen.
It didn’t work out so well, though, in this case.
With no need to distribute 10,000,000 rows out to 8 threads, sort the data, and then gather the 8 threads back to one while preserving that sorted order, we can rely on the serial sort operator to produce and feed rows in index-order to the table.
Hopefully that will continue to be the case once Microsoft addresses the Sort being present there in the first place. That would knock a second or so off the overall runtime.
Into A Heap
Taking the index out of the picture and inserting into a heap does two things:
But it also… Well, let’s just see what happens. And talk about it. Query plans need talk therapy, too. I’m their therapist.
DROP TABLE IF EXISTS
dbo.art_aux;
CREATE TABLE
dbo.art_aux
(
id int NOT NULL
);
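For reference, the insert here is presumably the same shape as before, just aimed at the heap:
INSERT INTO
dbo.art_aux WITH(TABLOCK)
(
id
)
SELECT
gs.value
FROM GENERATE_SERIES
(
START = 1,
STOP = 10000000
) AS gs
OPTION(MAXDOP 8);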
The Eager Table Spool here is for Halloween protection, I’d wager. Why we need it is a bit of a mystery, since we’re guaranteed to get a unique, ever-increasing series of numbers from the function. On a single thread.
Performance is terrible here because spooling ten million rows is an absolute disaster under any circumstance.
With this challenge in mind, I tried to get a plan here that would go parallel and avoid the spool.
Well, mission accomplished. Sort of.
Crash And Burn
One thing we can do is use SELECT INTO rather than relying on INSERT SELECT WITH (TABLOCK) to try to get a parallel insert. There are many restrictions on the latter method.
SELECT
id =
gs.value
INTO dbo.select_into
FROM GENERATE_SERIES
(
START = 1,
STOP = 10000000
) AS gs
OPTION(MAXDOP 8);
This doesn’t make things better:
This strategy clearly didn’t work out.
Bummer.
Again, I’d say most of the issue is from Round Robin partitioning on the Distribute Streams.
Finish Him
The initial version of GENERATE_SERIES is a bit rough around the edges, and I hope some of these issues get fixed.
And, like, faster than the issues with STRING_SPLIT got fixed, because it took a really long time to get that worked on.
And that was with a dozen or so MVPs griping about it the whole time.
But there’s an even bigger problem with it that we’ll look at tomorrow, where it won’t get lost in all this stuff.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
You can normally eyeball a query to find things that generally don’t agree with performance out of the box, like:
Functions (inline ones aside)
Table variables
Stacked Common Table Expressions
Non-SARGable predicates
Overly complicated queries
Insert a million other things here
But of course, the more complicated queries are, or the more layers of abstraction exist in a query, the harder this stuff is to spot quickly. Particularly with views, and nested views, bad ideas can be buried many layers deep in the sediment.
I call it sediment because code often looks like geologic layers, where you can tell who wrote what and when based on style and techniques that got used.
And don’t even get me started on vendors who encrypt their god-awful code.
The Great Untangling
Getting through that untangling can be a costly and time consuming process, depending on the level of damage done over the years, and the desired outcome. Sometimes it’s easier to rewrite everything from scratch than to do in-place rewrites of existing objects.
It’s obviously worth exploring enhancements in newer versions of SQL Server that may power things across the finish line:
Perhaps the new cardinality estimator does more good than harm
Batch Mode On Row Store does a lot of good with bad code
Scalar UDF Inlining can solve a lot of function problems
There are many other general and targeted improvements that might help your workload without code changes. Hopefully that continues with SQL Server 2022.
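Most of those features key off the database’s compatibility level rather than just the server version, so it’s worth a quick check before assuming you’re getting them. A minimal sketch, with a placeholder database name:
/*Check where each database sits; batch mode on rowstore and
  scalar UDF inlining generally need compatibility level 150+*/
SELECT
    d.name,
    d.compatibility_level
FROM sys.databases AS d;

/*Placeholder database name; bump it once you've tested your workload*/
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 150;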
On top of the workload improvements, new versions also provide improved insights into problems via dynamic management views, Query Store, logging, and more.
If you’re not on at least SQL Server 2016 right now, you’re leaving a whole lot on the table as far as this goes.
Hiring Issues
It’s tough for smaller companies to attract full time talent to fix huge backlogs of issues across SQL Server stored procedures, functions, views, index and table design, and all that.
Or, even harder, to convert ORM queries into sensible stored procedures and the like when you start hitting performance limitations in the single-query model.
First, I need to acknowledge that not everyone wants to work for a huge company. Second, I need to acknowledge that salary isn’t everything to everyone.
But let’s assume that a smaller company wants to hire someone in competition with a larger company. What can they offer when they run out of salary runway and can’t match equity?
Clear career paths/Upward mobility
Flexible schedules
Paid time off for training
Covering the costs of training and certifications
Focusing on employee growth (not just sticking them in a corner to monkey with the same thing for years)
Quality of company culture (meeting overload was something I got a lot of DMs about)
Conference travel budgets
Meaningful company mission
Introducing tech savvy folks to the business side of things
Recognizing that not every employee wants to be an On-callogist
There were more, but these were the things I got the most hits from folks on. Having these doesn’t mean you can expect someone to take 20-30% less on the salary front, of course, but if you’re close to another offer these things might sway folks to your side.
Far and away, what I took from responses is that folks want to feel effective; like they can make a difference without a lot of bureaucracy and red tape. Get the hell out of my way, to coin a phrase.
Finder’s Fee
When it comes to attracting people to your company — think of it as your employer SEO — the SQL Server community is a great place to start.
If you want to try something for free, keep an eye out for when Brent posts to find out Who’s Hiring In The Database Community. It doesn’t cost you anything, but you have to keep on top of the posts and replies, and make sure you have a good job description that sticks out.
If you have any location-based requirements for your candidates, try sponsoring a local SQL Server user group’s meetings for a few months. There may be a small, nominal fee if it’s entirely virtual. If it’s in-person, you’ll foot the bill for a dozen or so pizza pies for attendees. That usually gets you an announcement before and after whatever speaker is presenting. It’s totally fair to ask for attendance numbers.
Keeping on with that, consider sponsoring a SQL Saturday event. These typically have a deeper reach than a monthly user group, since there are more attendees in a concentrated area. You may get a booth, or your logo on slides, and whatever else you can negotiate with the event planners.
If you’re okay with spending more money for a lot of eyeballs, larger events like PASS Summit and SQLBits are annual conferences with thousands of attendees. As an FYI, these are the types of conferences whoever you hire is probably going to want to attend, too.
Imagine that.
Askance
I have clients ask me to help them find quality employees for roles from time to time, or to help them interview folks they’ve farmed themselves.
Normally I’m happy to help on either front, and leave sealing the deal to them. I think from now on I’m gonna point them to this post, so they have some better ideas about how to put a stamp on things.
Not every company can offer everything, but as large companies continue to gobble up smaller ones, and Microsoft in particular keeps fishing folks out of the MVP pool, it’s going to be harder for those who remain to stay competitive. At least I think so: I haven’t quite been persuaded that there will be a kumbaya moment where everyone gets sick of the MegaCorp grind and goes back to mom and pop shops to reclaim their lost souls.
After all, a lot of folks do have their sights set on retirement. High salaries and generous equity (well, maybe not equity as the market is currently behaving) certainly help get them there faster.
That’s part of the picture that you can’t easily ignore, along with the oft-proffered wisdom that the only way to stay on a competitive salary track is to change jobs every 2-3 years.
Retention is going to get more difficult for everyone across the board, but the revolving door will largely let out with the bigger players who can afford to keep it spinning.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
WITH
queries AS
(
SELECT TOP (100)
parent_object_name =
ISNULL
(
OBJECT_NAME(qsq.object_id),
'No Parent Object'
),
qsqt.query_sql_text,
query_plan =
TRY_CAST(qsp.query_plan AS xml),
qsrs.first_execution_time,
qsrs.last_execution_time,
qsrs.count_executions,
qsrs.avg_duration,
qsrs.avg_cpu_time,
qsp.query_plan_hash,
qsq.query_hash
FROM sys.query_store_runtime_stats AS qsrs
JOIN sys.query_store_plan AS qsp
ON qsp.plan_id = qsrs.plan_id
JOIN sys.query_store_query AS qsq
ON qsq.query_id = qsp.query_id
JOIN sys.query_store_query_text AS qsqt
ON qsqt.query_text_id = qsq.query_text_id
WHERE qsrs.last_execution_time >= DATEADD(DAY, -7, SYSDATETIME())
AND qsrs.avg_cpu_time >= (10 * 1000)
AND qsq.is_internal_query = 0
AND qsp.is_online_index_plan = 0
ORDER BY qsrs.avg_cpu_time DESC
)
SELECT
qs.*
FROM queries AS qs
CROSS APPLY
(
SELECT TOP (1)
gqs.*
FROM sys.dm_db_missing_index_group_stats_query AS gqs
WHERE qs.query_hash = gqs.query_hash
AND qs.query_plan_hash = gqs.query_plan_hash
ORDER BY
gqs.last_user_seek DESC,
gqs.last_user_scan DESC
) AS gqs
ORDER BY qs.avg_cpu_time DESC
OPTION(RECOMPILE);
I don’t love this query, because I don’t love querying Query Store views. That’s why I wrote sp_QuickieStore to make it a whole lot easier.
But anyway, this will get you a similar bunch of information.
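For comparison, a call to sp_QuickieStore covering roughly the same ground might look something like this. I’m going from memory on the parameter names, so treat them as assumptions and check the procedure’s documentation:
/*Parameter names from memory; run with @help = 1 to see the real list*/
EXEC dbo.sp_QuickieStore
    @database_name = N'StackOverflow2013',
    @sort_order = 'cpu',
    @top = 100;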
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I decided to expand on some scripts to look at how queries use CPU and perform reads, and found some really interesting stuff. I’ll talk through some results and how I’d approach tuning them afterwards.
Here are the queries:
/*Queries that do no logical reads, but lots of CPU work*/
SELECT TOP (100)
total_logical_reads =
FORMAT(qs.total_logical_reads, 'N0'),
total_worker_time_ms =
FORMAT(qs.total_worker_time / 1000., 'N0'),
execution_count =
FORMAT(qs.execution_count, 'N0'),
query_text =
SUBSTRING
(
st.text,
qs.statement_start_offset / 2 + 1,
(
CASE qs.statement_end_offset
WHEN -1
THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset
) / 2 + 1
),
qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_reads = 0
AND qs.total_worker_time > 5000
ORDER BY qs.total_worker_time DESC;
/*Queries that do 2x more reads than CPU work*/
SELECT TOP (100)
total_logical_reads =
FORMAT(qs.total_logical_reads, 'N0'),
total_worker_time_ms =
FORMAT(qs.total_worker_time / 1000., 'N0'),
execution_count =
FORMAT(qs.execution_count, 'N0'),
query_text =
SUBSTRING
(
st.text,
qs.statement_start_offset / 2 + 1,
(
CASE qs.statement_end_offset
WHEN -1
THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset
) / 2 + 1
),
qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_reads > (qs.total_worker_time * 2)
ORDER BY qs.total_logical_reads DESC;
/*Queries that do 4x more CPU work than reads*/
SELECT TOP (100)
total_logical_reads =
FORMAT(qs.total_logical_reads, 'N0'),
total_worker_time_ms =
FORMAT(qs.total_worker_time / 1000., 'N0'),
execution_count =
FORMAT(qs.execution_count, 'N0'),
query_text =
SUBSTRING
(
st.text,
qs.statement_start_offset / 2 + 1,
(
CASE qs.statement_end_offset
WHEN -1
THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset
) / 2 + 1
),
qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_worker_time > (qs.total_logical_reads * 4)
ORDER BY qs.total_worker_time DESC;
Resultant
A quick note about these: the comparison between CPU and logical reads happens in the where clause, and I convert CPU time from microseconds to milliseconds in the select list.
That might make the numbers look a little funny, but it makes them somewhat easier to understand than raw microseconds in the grand scheme of things.
First, queries that do no logical reads but use CPU time:
A lot of these were functions that processed input but didn’t touch data. Assembling and splitting string lists, XML, other variable assignment tasks, and occasionally DMV queries.
The “easy” button here is to stop using scalar and multi-statement functions so much. Those execution counts are hideous.
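As a hypothetical example of what that rewrite can look like (all object names here are made up), per-row formatting logic in a scalar UDF can usually be re-expressed as an inline table valued function and applied with CROSS APPLY, which the optimizer can expand into the calling query:
/*The scalar pattern to move away from*/
CREATE OR ALTER FUNCTION
    dbo.format_name(@first nvarchar(50), @last nvarchar(50))
RETURNS nvarchar(102)
AS
BEGIN
    RETURN CONCAT(@last, N', ', @first);
END;
GO

/*The same logic as an inline table valued function*/
CREATE OR ALTER FUNCTION
    dbo.format_name_inline(@first nvarchar(50), @last nvarchar(50))
RETURNS TABLE
AS
RETURN
    SELECT formatted_name = CONCAT(@last, N', ', @first);
GO

/*Hypothetical table and columns, just to show the shape of the call*/
SELECT
    u.Id,
    fn.formatted_name
FROM dbo.Users AS u
CROSS APPLY dbo.format_name_inline(u.FirstName, u.LastName) AS fn;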
Second, queries that do 2x more reads than CPU work:
I only found six of these, while the other two categories easily hit the 100 row goal.
The queries in here largely seemed to either be:
Insert queries to temporary objects
Queries with parameter sniffing issues
Looking at these, the problem was largely the optimizer choosing Nested Loops joins when it really shouldn’t have. The worst part was that it wasn’t an indexing issue — every single one of these queries was doing seeks across the board — they were just happening in a serial plan, and happening way more than the optimizer estimated they would. Perhaps this is something that Adaptive Joins or Batch Mode more generally could have intervened in.
Third, queries that do 4x more CPU work than reads:
These queries were far more interesting from a tuning perspective, because there were obvious inefficiencies:
No good indexes to use
Large scans because of non-SARGable predicates
Predicate Key Lookups
But the important thing here is that these queries were able to do a lot of logical reads quickly — data they needed was already in memory — and just push the hell out of CPUs.
These are the queries you can have a field day fixing and making people happy.
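To make the non-SARGable variety concrete, here’s a sketch against the StackOverflow Comments table from earlier (the column choice is an assumption): wrapping the column in a function forces the predicate to run against every row, while leaving the column alone lets an index on it support a seek.
/*Non-SARGable: the function on the column has to run against every row*/
SELECT
    c.Id
FROM dbo.Comments AS c
WHERE YEAR(c.CreationDate) = 2013;

/*SARGable rewrite: a range predicate can seek an index on CreationDate*/
SELECT
    c.Id
FROM dbo.Comments AS c
WHERE c.CreationDate >= '20130101'
AND   c.CreationDate < '20140101';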
Residuals
This selection of query results is why I tend to ignore logical reads and focus on CPU. I do still look at things like physical reads, and select queries that do suspicious amounts of writes.
Physical reads means going to disk, and disk is your mortal enemy
Select queries doing writes often indicate spools and spills, which can also be pretty bad
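If you want to go hunting for that second bucket, here’s a rough sketch along the same lines as the queries above. It will also catch plain old modification queries, so you’ll have to eyeball the text:
/*Queries that log writes, which often points at spools and spills*/
SELECT TOP (20)
    total_logical_writes =
        FORMAT(qs.total_logical_writes, 'N0'),
    total_logical_reads =
        FORMAT(qs.total_logical_reads, 'N0'),
    total_worker_time_ms =
        FORMAT(qs.total_worker_time / 1000., 'N0'),
    query_text = st.text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_logical_writes > 0
ORDER BY qs.total_logical_writes DESC;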
You may not like it, but this is what peak performance tuner looks like.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
In my quest to get people to upgrade to a not-old-and-busted version of SQL Server, this is one that I talk about a lot, because it really helps folks who don’t have all the time in the world to tune queries and indexes.
Here’s a quick helper query to get you started:
SELECT TOP (50)
query_text =
SUBSTRING
(
st.text,
qs.statement_start_offset / 2 + 1,
(
CASE qs.statement_end_offset
WHEN -1
THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset
) / 2 + 1
),
qp.query_plan,
qs.creation_time,
qs.last_execution_time,
qs.execution_count,
qs.max_worker_time,
avg_worker_time =
(qs.total_worker_time / qs.execution_count),
qs.max_grant_kb,
qs.max_used_grant_kb,
qs.total_spills
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY
(
SELECT TOP (1)
gqs.*
FROM sys.dm_db_missing_index_group_stats_query AS gqs
WHERE qs.query_hash = gqs.query_hash
AND qs.query_plan_hash = gqs.query_plan_hash
AND qs.sql_handle = gqs.last_sql_handle
ORDER BY
gqs.last_user_seek DESC,
gqs.last_user_scan DESC
) AS gqs
ORDER BY qs.max_worker_time DESC
OPTION(RECOMPILE);
This should help you find queries that use a lot of CPU and might could oughtta use an index.
Note that this script does not assemble the missing index definition for you. That stuff is all readily available in the query plans that get returned here, and of course the missing index feature has many caveats and limitations to it.
You should, as often as possible, execute the query and collect the actual execution plan to see where the time is spent before adding anything in.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
There are a million scripts out there that will give you all of the missing index requests for a database (or even a whole server).
Some will even try to prioritize based on the metrics logged along with each request.
Right now, most of you get:
Uses: How many times a query that could have used the index was compiled
Average Query Cost: A unit-less cost used by the optimizer for choosing a plan
Impact: A metric relative to the unit-less cost of the operator the index will help
Breaking each of those down, the only one that has a concrete meaning is Uses, but that of course doesn’t mean that a query took a long time or is even terribly inefficient.
That leaves us with Average Query Cost, which is the sum of each operator’s estimated cost in the query plan, and Impact.
But where does Impact come from?
Impactful
Let’s look at a query plan with a missing index request to figure out what the Impact metric is tied to.
Here’s the relevant part of the plan:
And here’s the missing index request:
/*
The Query Processor estimates that implementing the following index could improve the query cost by 16.9141%.
*/
/*
USE [StackOverflow2013]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Comments] ([Score])
INCLUDE ([PostId])
GO
*/
Here’s the breakdown:
The optimizer estimates that hitting the Comments table will cost 762 query bucks, which is 17% of the total plan cost
The optimizer further estimates that hitting the Comments table with the suggested index will reduce the total plan cost by 16.9%
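To put rough numbers on that: if 762 query bucks is 17% of the plan, the whole plan weighs in somewhere around 4,500 query bucks, and a 16.9% improvement works out to roughly 760 query bucks, essentially the entire estimated cost of that scan.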
Here’s the relevant properties from the scan of the Comments table:
What I want you to take away from this is that, while hitting the Comments table may be 17% of the plan’s total estimated cost, the time spent scanning that index is not 17% of the plan’s total execution time, either in CPU or duration.
You can see in the screenshot above that it takes around 450ms to perform the full scan of 24,534,730 rows.
Doubtful
In full, this query runs for around 23 seconds:
The estimated cost of hitting the Comments table is not 17% of the execution time. That time lives elsewhere, which we’ll get to.
In the meantime, there are two more egregious problems to deal with:
The optimizer severely miscalculates the cost of scanning the Posts table at 70% (note the 860ms time here):
It buries other missing index requests in the properties of the root operator:
Now, there are two other missing index requests listed here that are a) of higher "impact", b) not ordered by that impact number, and c) not going to help much anyway: even with both of them added, we know those indexes would not substantially reduce the overall runtime of the stored procedure.
Assuming that we added every single missing index here, at best we would reduce the estimated cost of the plan by 87%, while only reducing the actual execution time of the plan by about 1.3 seconds out of 23 seconds.
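For perspective, 1.3 seconds out of 23 is a bit under 6% of the actual runtime, bought with an 87% reduction in estimated cost.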
Not a big win, here.
Hurtful
Examining where time is spent in this plan, this branch will stick out as the dominating factor:
Some interesting things to note here, while we’re talking about interesting things:
The scan of the Badges table takes 1.4 seconds, and has an estimated cost of 1%
The estimated cost of the eager index spool is 11%, but accounts for 20 seconds of elapsed time (less the 1.4 seconds for the scan of Badges)
There was no missing index request generated for the Badges table, despite the optimizer creating one on the fly
This is a bit of the danger in creating missing index requests without first validating which queries generated them, and where the benefit in having them would be.
In tomorrow’s post, we’ll look at how SQL Server 2019 makes figuring this stuff out easier.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
If you use this script, you can follow along with the results.
The sys.dm_exec_function_stats DMV is only available in SQL Server 2016+, so if you’re on an earlier version than that, you should focus your energies on upgrading.
I’m sure you’ve just been busy for the last five years.
/*Context is everything*/
USE master;
GO
/*Piece of crap function*/
CREATE OR ALTER FUNCTION
dbo.useless_function(@dbid int)
RETURNS sysname
AS
BEGIN
DECLARE /*This is to ruin scalar UDF inlining*/
@gd datetime = GETDATE();
RETURN DB_NAME(@dbid);
END;
GO
/*Just in the select list*/
SELECT
database_name =
dbo.useless_function(d.database_id),
total_databases =
(SELECT c = COUNT_BIG(*) FROM sys.databases AS d2)
FROM sys.databases AS d;
GO
/*Executions here should match the count above*/
SELECT
object_name =
OBJECT_NAME(defs.object_id),
defs.execution_count
FROM sys.dm_exec_function_stats AS defs
WHERE defs.object_id = OBJECT_ID('dbo.useless_function');
/*Reset the counters*/
DBCC FREEPROCCACHE;
GO
/*Now in the where clause*/
SELECT
mf.name,
total_database_files =
(SELECT c = COUNT_BIG(*) FROM sys.master_files AS mf2)
FROM sys.master_files AS mf
WHERE mf.name = dbo.useless_function(mf.database_id)
GO
/*Executions here should match the count above*/
SELECT
object_name =
OBJECT_NAME(defs.object_id),
defs.execution_count
FROM sys.dm_exec_function_stats AS defs
WHERE defs.object_id = OBJECT_ID('dbo.useless_function');
Data Driven
On my server, I have 9 total databases and 42 total database files.
The results, therefore, look precisely and consistently like so:
Assumption
For the select list, T-SQL scalar UDFs will execute once per row projected by the query, e.g. the final resulting row count, under… every circumstance I’ve ever seen.
In SQL Server. Of course.
As a couple of easy-to-digest examples, let’s say you execute a query that returns 100 rows:
Your T-SQL scalar UDF is referenced once in the select list, so it’ll run 100 times
Your T-SQL scalar UDF is referenced twice in the select list, so it’ll run 200 times
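If you want to see the doubling for yourself, here’s a sketch that reuses the dbo.useless_function from above, still in master. With my 9 databases, two references in the select list should land at an execution_count of 18:
/*Reset the counters*/
DBCC FREEPROCCACHE;
GO
/*Two references in the select list*/
SELECT
    database_name_one =
        dbo.useless_function(d.database_id),
    database_name_two =
        dbo.useless_function(d.database_id)
FROM sys.databases AS d;
GO
/*Executions here should be double the row count*/
SELECT
    object_name =
        OBJECT_NAME(defs.object_id),
    defs.execution_count
FROM sys.dm_exec_function_stats AS defs
WHERE defs.object_id = OBJECT_ID('dbo.useless_function');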
For T-SQL scalar UDFs in other parts of a query, like:
Where Clause
Join Clause
They will execute once for every row that needs to be filtered when those parts of the query are evaluated, multiplied by however many individual references to the function there are in the query.
The results here may vary, depending on if there are any other predicates involved that may filter out other rows.
As a couple of easy-to-digest examples:
If you use a T-SQL scalar UDF as a predicate on a million row table, it’ll execute a million times to produce a result and apply the predicate
If you do the same thing as above, but there’s another part of the where clause that filters out 500k rows, the function will only execute 500k times
All sorts of other things might change this, like if the other predicate(s) can be pushed to when the data is accessed, and if there are multiple invocations of the function.
You can see an edge case where that’s not true in this post:
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
The more people want to avoid fixing what’s really wrong with their server, the more they go out and find all the weird stuff that they can blame on something else (usually the product), or keep doing the same things that aren’t fixing the problem.
Spinlocks are one of those things. People will measure them, stare at them, Google them, and have no idea what to make of them, all while being sure there’s something going on with them.
I don’t want to discount when spinlocks can actually cause a problem, but you shouldn’t treat every performance problem like it’s a bridge too far from what you can solve.
Before you go ignoring all the other things that are going wrong, here’s a simple script to give you some idea if you need to follow the spinlock trail any further. Note that it might also be worth replacing the s.spins sort order with s.spins_per_collision, too.
Which sort order you choose long-term will depend on which yields numbers of interest on your system, which I can’t predict. Sorry about that.
SELECT TOP (20)
days_uptime =
CONVERT(decimal(38,2), d.seconds_uptime / 86400.),
rundate =
SYSDATETIME(),
s.name,
s.collisions,
collisions_per_second =
CONVERT(bigint, s.collisions / d.seconds_uptime),
s.spins,
spins_per_second =
CONVERT(bigint, s.spins / d.seconds_uptime),
s.spins_per_collision,
spins_per_collision_per_second =
CONVERT(decimal(38,6), s.spins_per_collision / d.seconds_uptime),
s.sleep_time,
sleep_time_per_second =
CONVERT(bigint, s.sleep_time / d.seconds_uptime),
s.backoffs,
backoffs_per_second =
CONVERT(bigint, s.backoffs / d.seconds_uptime)
FROM sys.dm_os_spinlock_stats AS s
CROSS JOIN
(
SELECT
seconds_uptime =
DATEDIFF
(
SECOND,
d.sqlserver_start_time,
SYSDATETIME()
)
FROM sys.dm_os_sys_info AS d
) AS d
ORDER BY s.spins DESC;
Telltale
I understand that some spinlocks tend to happen in storms, and that this isn’t going to help to illuminate many situations when run in isolation. Bursty workloads, or workloads that only hit some crazy amount of spinlocks during shorter periods of high activity might escape it.
It can help you put the number of spinlocks you’re hitting in perspective compared to uptime, though.
If you see any numbers in the results that still make you say the big wow at your screen, you can easily log the output to a table every X minutes to gather more detail on when it’s happening.
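A bare-bones sketch of that logging, with made-up table and column names, that you could schedule from an Agent job or whatever scheduler you like:
/*Logging table; adjust names and types to taste*/
CREATE TABLE dbo.spinlock_stats_log
(
    rundate datetime2(0) NOT NULL DEFAULT SYSDATETIME(),
    name nvarchar(256) NOT NULL,
    collisions bigint NOT NULL,
    spins bigint NOT NULL,
    spins_per_collision real NOT NULL,
    sleep_time bigint NOT NULL,
    backoffs bigint NOT NULL
);
GO

/*Run this every X minutes*/
INSERT dbo.spinlock_stats_log
    (name, collisions, spins, spins_per_collision, sleep_time, backoffs)
SELECT
    s.name,
    s.collisions,
    s.spins,
    s.spins_per_collision,
    s.sleep_time,
    s.backoffs
FROM sys.dm_os_spinlock_stats AS s;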
Once you figure out when any potentially large spikes in spinlocks are occurring, you can match that up with:
Any independent query logging you’re doing
The plan cache, if it’s reliable
Query Store, if you’re smart enough to turn it on
Your monitoring tool data
Which should tell you which queries were executing at the time. I’d probably look for any high CPU effort queries, since those tend to be the spinlockiest in my experience.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I’ve had it with presenting online, especially full-day precons. Talking to yourself for eight hours while standing in front of a camera is about as enjoyable as it sounds.
It’s a lot harder to build up rapport with the audience
There’s no social break where you get to talk to people
— Erik Darling Data (@erikdarlingdata) March 9, 2022
And there will be more MAXDOP shirts. I can’t leave my dear American friends behind.
Right On Time
This November 15-18, I’ll be presenting live and in person in Seattle at the PASS Data Community Summit. It’s available online for folks who can’t make it there in-person, too.
For a precon, I’ll be doing my world famous Blueprint Performance Tuning:
The Professional Performance Tuning Blueprint
Searching the internet for every problem isn’t cutting it. You need to be more proactive and efficient when it comes to finding and solving database performance fires.
I work with consulting customers around the world to put out SQL Server performance fires. In this day of learning, I will teach you how to find and fix your worst SQL Server problems using the same modern tools and techniques which I use every week.
You’ll learn tons of new and effective approaches to common performance problems, how to figure out what’s going on in your query plans, and how indexes really work to make your queries faster. Together, we’ll tackle query rewrites, batch mode, how to design indexes, and how to gather all the information you need to analyze performance.
This day of learning will teach you cutting edge techniques which you can’t find in training by folks who don’t spend time in the real world tuning performance. Performance tuning mysteries can easily leave you stumbling through your work week, unsure if you’re focusing on the right things. You’ll walk out of this class confident in your abilities to fix performance issues once and for all.
If you want to put out SQL Server performance fires, this is the precon you need to attend. Anyone can have a plan; it takes a professional to have a blueprint.
Regular sessions haven’t been announced yet, so keep your eyes peeled!
Hope to see you there, one way or another!
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.