The locking flavor of Read Committed shouldn’t be the default isolation level in a major database. No self-respecting database would do that to itself, or to end users.
Imagine you’re some poor developer with no idea what SQL Server is doing, just trying to get your application to work correctly, and your queries end up in asinine blocking chains because of this dimwitted default.
I’d switch to Postgres, too. Adding in NOLOCK hints and maybe getting wrong data is probably the least of their concerns.
Ahzooray
In Azure SQL DB, the default isolation level is Read Committed Snapshot Isolation (RCSI). Optimism for me but not for thee is the message here.
Imagine a problem so goofy that Microsoft didn’t want to deal with it in its own cloud product. You’ve got it right here.
For the next version of SQL Server, the default isolation level for new databases should also be RCSI.
Especially because databases can have a local version store via Accelerated Database Recovery. Why not make the most of it?
And solve the dumbest problem that most databases I see deal with.
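In the meantime, nothing stops you from flipping the switch yourself. A minimal sketch, with a placeholder database name:

ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE; /* rolls back in-flight transactions so the change doesn't wait forever */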
Thanks for reading!
There are two angles to Eager Spools. One is that spools are just crappy temp tables. The mechanism used for loading data, especially into Eager Spools, is terribly inefficient when compared to the various advances in tempdb efficiency over the years.
The second angle is that Eager Index Spools should provide feedback to users that there’s a potentially useful index that could be created.
There are checks for Eager Table and Index Spools in sp_BlitzCache. Heck, it’ll even unravel them and tell you which index to create in the case of Eager Index Spools.
It still bothers me that there’s no built-in tooling or warning about queries that rely on Spools. Not because the Spools themselves are necessarily bad, but because they’re usually a sign that something is deficient with the query or indexing. Spools in modification queries are often necessary and not as worthy of your scorn, but can often be substituted with manual phase separation.
You may have lots of very fast queries with small Spools in them. You may have them right now.
You may also have lots of very slow queries with Spools in them, and there’s nothing telling you about it.
Spool Building
In a perfect world, when spools are being built, they’d emit a specific wait stat. Having that information would help all of us who do performance tuning work know what to look for and focus on during investigations.
Right now you can get part of the way there by looking at EXECSYNC waits, but even that’s unreliable. They show up in parallel plans with Eager Index Spools, but plenty of other things produce them too. Usually, your best bet is to look at your top resource-consuming queries and examine their plans for Spools.
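Here’s roughly what that plan cache search looks like. It’s a sketch, and shredding plan XML across a big cache can be slow, so treat it as an occasional diagnostic, not a monitoring query:

SELECT TOP (50)
    qs.execution_count,
    qs.total_worker_time,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('declare namespace p = "http://schemas.microsoft.com/sqlserver/2004/07/showplan"; //p:RelOp[@LogicalOp = "Eager Spool"]') = 1
ORDER BY qs.total_worker_time DESC;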
Step 1: Give us better waits to identify Spools being built
Index Building
The optimizer will complain about missing indexes all the live-long day. Even in cases where an index would barely help.
I don’t know about you, but in general ~200ms isn’t a performance emergency.
But like, three minutes?
I’d wanna know about this. I’d wanna create an index to help this query.
Step 2: Give us missing index requests when Eager Index Spools get built.
End users need better feedback when new features get turned on and performance gets worse.
Sure, if you read this blog and know what to look for, you can find and fix things quickly. But there are a whole lot of people who don’t have things that easy, and I get a lot of calls from them to fix this sort of issue.
Thanks for reading!
Of course, I’d love it if everyone lived in the same ivory tower as me and always wrote perfect queries with clear predicates that the optimizer understands and lovingly embraces.
But in the real world, out in the nitty gritty, queries are awful. It doesn’t matter if it’s un-or-under-trained developers writing SQL, or that exact same person designing queries in an ORM. They turn out horrible and full of nonsense, like drunk stomachs at IHOP.
One of the most common problems I see is people getting lazy with checking for NULLs, or overly-protective about it. In “””””real””””” programming languages, NULLs get you errors. In databases, they’re just sort of whatever.
Thing Is
When you create a row store index on a column, whether it’s ascending or descending, clustered or nonclustered, the data is put in order. In SQL Server, that means NULLs are sorted together. Despite that, ISNULL still creates problems.
DROP TABLE IF EXISTS #t;

/* 11 rows: one NULL, plus the numbers 1-10 */
SELECT
    x.n
INTO #t
FROM
(
    SELECT
        CONVERT(int, NULL) AS n

    UNION ALL

    SELECT TOP (10)
        ROW_NUMBER() OVER
        (
            ORDER BY 1/0 /* never evaluated; just satisfies the syntax */
        ) AS n
    FROM sys.messages AS m
) AS x;

CREATE UNIQUE CLUSTERED INDEX c
    ON #t (n) WITH (SORT_IN_TEMPDB = ON);
In this table we have 11 rows. One of them is NULL, and the other 10 are the numbers 1-10.
Odor By
If we select an ordered result, we get a simple query plan that scans the clustered index and returns 11 rows with no Sort operator.
However, if we want to replace that NULL with a 0, things get goofy.
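My reconstruction of the two queries, not the exact ones, but enough to reproduce the plans:

/* no Sort: the clustered index already has things in order */
SELECT t.n FROM #t AS t ORDER BY t.n;

/* Sort appears: the optimizer won't reason through ISNULL */
SELECT ISNULL(t.n, 0) AS n FROM #t AS t ORDER BY ISNULL(t.n, 0);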
Wheredo
Something similar occurs when ISNULL is applied to the where clause.
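Again reconstructed, something like this shows it:

/* can seek straight to the NULL */
SELECT COUNT_BIG(*) FROM #t AS t WHERE t.n IS NULL;

/* wrapping the column in a function means scanning all 11 rows */
SELECT COUNT_BIG(*) FROM #t AS t WHERE ISNULL(t.n, 0) = 0;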
There’s one NULL. We know where it is. But we still have to scan 10 other rows. Just in case.
Conversion
The optimizer should be smart enough to figure out simple use of ISNULL, like in both of these cases.
I’m sure wiser people can figure out deeper cases, too, and even apply them to more functions that involve some types of date math, etc.
Thanks for reading!
In my session on using dynamic SQL to defeat parameter sniffing, I walk through how we can use what we know statistically about data to execute slightly different queries to get drastically better query plans. What makes the situation quite frustrating is that SQL Server has the ability to do the same thing.
In fact, it does it every time it freshly compiles a plan based on parameter values. It makes a whole bunch of estimates and comes up with what it thinks is a good plan based on them.
But it sort of stops there.
Why am I talking about this now? Well, since Greatest and Least got announced for Azure SQL, it kind of got my noggin’ joggin’ that perhaps a new build of the box-product might make its way to the CTP phase soon.
I Dream Of Histograms
When SQL Server builds execution plans, the estimates (mostly) come from statistics histograms, and those histograms are generally pretty solid even with low sampling rates. I know that there are times when they miss crucial details, but that’s a different problem than the one I think could be solved here.
You see, when a plan gets generated, the entire histogram is available. It would be neat if there were a process that could, for starters, mark objects with significant skew in them for additional attention.
Take the VoteTypeId column in the Votes table. Lots of these values could safely share a plan. Lots of others have catastrophe in mind when plans are shared, like 2 and… well, most others.
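If you want to stare at the skew yourself, here’s a sketch against the StackOverflow database that the other examples use; sys.dm_db_stats_histogram needs SQL Server 2016 SP1 CU2 or later:

SELECT
    s.name,
    h.range_high_key,
    h.equal_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.Votes') /* look for the VoteTypeId histogram */
ORDER BY h.equal_rows DESC;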
Sort of like how optimistic isolation levels take care of basic reader/writer blocking that sucks to deal with and leads to all those crappy NOLOCK hints. Save the hard problems for young handsome consultants.
Ahem.
Overboard
I know, this sounds crazy-ambitious, and it could get out of hand quickly. Not to mention confusing! We’re talking about queries having multiple cached and usable plans, here. Who used what and when would be crucial information.
You’d need a lot of metadata about something like this, so you can tweak:
The number of plans
The number of buckets
Which plan is used by which buckets
Which code and parameters should be considered
I’m fine with auto-pilot for this most of the time, just to get folks out of the doldrums. Sort of like how Query Store was “good enough” with a bunch of default options, I think you’d find a lot of preconceived notions about what works best would pretty quickly be relieved of their position.
Anyway, I have a bunch of other posts about similar solvable problems. I have no idea when the next version of SQL Server will come out, or what improvements or features might be in it, but I hear that blog posts are basically wishes that come true. I figure I might as well throw some coins in the fountain.
Thanks for reading!
I’ve written at length about what local variables do to queries, so I’m not going to go into it again here.
What I do want to talk about are better alternatives to what you currently have to do to fix issues:
RECOMPILE the query
Pass the local variable to a stored procedure
Pass the local variable to dynamic SQL
It’s not that I hate those options, they’re just tedious. Sometimes I’d like the benefit of recompiling with local variables without all the other strings that come attached to recompiling.
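For reference, the dynamic SQL route looks something like this, with a table borrowed from the same StackOverflow database as the other examples:

DECLARE @r int = 1;
DECLARE @sql nvarchar(MAX) = N'
SELECT COUNT_BIG(*)
FROM dbo.Users AS u
WHERE u.Reputation = @r;'; /* inside the dynamic SQL, @r is a real parameter */
EXEC sys.sp_executesql @sql, N'@r int', @r;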
Hint Me Baby One More Time
Since I’m told people rely on this behavior to fix certain problems, you would probably need a few different places and ways to alter it:
Database level setting
Query Hint
Variable declaration
Database level settings are great for workloads you can’t alter, either because the queries come out of a black box, or you use an ORM and queries… come out of a nuclear disaster area.
Query hints are great if you want all local variables to be treated like parameters. But you may not want that all the time. I mean, look: you all do wacky things and you’re stuck in your ways. I’m not kink shaming here, but facts are facts.
You have to draw the line somewhere and that somewhere is always “furries”.
And then local variables.
It may also be useful to allow local variables to be declared with a special property that will allow the optimizer to treat them like parameters. Something like this would be easy enough:
DECLARE @p int PARAMETER = 1;
Hog Ground
Given that in SQL Server 2019 table variables got deferred compilation, I think this feature is doable.
Of course, it’s doable today if you’ve got a debugger and don’t mind editing memory space.
Thanks for reading!
One line I see over and over again — I’ve probably said it too when I was young and needed words to fill space — is that CTEs make queries more readable.
Personally, I don’t think they make queries any more readable than derived tables, but whatever. No one cares what I think, anyway.
Working with clients I see a variety of query formatting styles, ranging from quite nice ones that have influenced the way I format things, to completely unformed primordial blobs. Sticking the latter into a CTE does nothing for readability even if it’s commented to heck and back.
nope nope nope
There are plenty of options out there for formatting code. Formatting your code nicely doesn’t just help others read it; it can also help people understand how it works.
Take this example from sp_QuickieStore that uses the STUFF function to build a comma separated list the crappy way.
If STRING_AGG were available in SQL Server 2016, I’d just use that. Darn legacy software.
The labels probably make things less readable, but formatting the code this way helps me make sure I have everything right.
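Here’s the general shape of it, a reconstruction rather than the exact sp_QuickieStore code:

SELECT
    STUFF
    (
        ( /* first input: the XML-generating nonsense */
            SELECT N', ' + c.name
            FROM sys.columns AS c
            WHERE c.object_id = OBJECT_ID(N'sys.objects')
            FOR XML PATH(N''), TYPE
        ).value('./text()[1]', 'nvarchar(max)'),
        1,   /* start */
        2,   /* length */
        N''  /* replacement text */
    ) AS comma_list;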
The opening and closing parens for the STUFF function
The first input to the function is the XML generating nonsense
The last three inputs to the STUFF function that identify the start, length, and replacement text
I’ve seen and used this specific code a million times, but it wasn’t until I formatted it this way that I understood how all the pieces lined up.
Compare that with another time I used the same code fragment in sp_BlitzCache. I wish I had formatted a lot of the stuff I wrote in there better.
With things written this way, it’s really hard to understand where things begin and end, and which arguments belong to which part of the code.
Maybe someday I’ll open an issue to reformat all the FRK code.
Thanks for reading!
Whoever decided to give “memory bank” its moniker was wise beyond their years, or maybe they just made a very apt observation: all memory is on loan.
Even in the context we’ll be talking about, when SQL Server has lock pages in memory enabled, the pages that are locked in memory may not have permanent residency.
If your SQL Server doesn’t have enough memory, or if various workload elements are untuned, you may hit one of these scenarios:
Query Memory Grant contention (RESOURCE_SEMAPHORE)
Buffer Cache contention (PAGEIOLATCH_XX)
A mix of the two, where both are fighting over finite resources
It’s probably fair to note that not all query memory grant contention will result in RESOURCE_SEMAPHORE waits. There are times when you’ll have just enough queries asking for memory grants to knock a significant number of pages out of the buffer pool, causing an over-reliance on disk, without ever hitting the point where you’ve exhausted the amount of memory that SQL Server will loan out to queries.
To help you track down any of these scenarios, you can use my stored procedure sp_PressureDetector to see what’s going on with things.
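If you’d rather eyeball the raw numbers yourself, these are the kinds of DMVs it leans on — a sketch, not the procedure’s actual logic:

/* queries waiting on, or holding, memory grants */
SELECT
    mg.session_id,
    mg.requested_memory_kb,
    mg.granted_memory_kb, /* NULL here means the query is still waiting */
    mg.wait_time_ms
FROM sys.dm_exec_query_memory_grants AS mg
ORDER BY mg.requested_memory_kb DESC;

/* accumulated memory and disk waits since startup */
SELECT
    ws.wait_type,
    ws.waiting_tasks_count,
    ws.wait_time_ms
FROM sys.dm_os_wait_stats AS ws
WHERE ws.wait_type = N'RESOURCE_SEMAPHORE'
OR ws.wait_type LIKE N'PAGEIOLATCH%';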
Black Friday
Most servers I see have a mix of the two issues. Everyone complains about SQL Server being a memory hog without really understanding why. Likewise, many people are very proud of how fast their storage is without really understanding how much faster memory is. It’s quite common to hear someone say they recently got a whole bunch of brand new shiny flash storage, but performance is still terrible on their server with 64GB of RAM and 1TB of data.
I recently had a client migrate some infrastructure to the cloud, and they were complaining about how queries got 3x slower. As it turned out, the queries were accruing 3x more PAGEIOLATCH waits with the same amount of memory assigned to SQL Server. Go figure.
If you’d like to see those waits in action, and how sp_PressureDetector can help you figure out which queries are causing problems, check out this video.
Market Economy
The primary driver of how much memory you need is how much control you have over the database. The less control you have, the more memory you need.
Here’s an example: One thing that steals control from you is using an ORM. When you let one translate code into queries, Really Bad Things™ can happen. Even with Perfect Indexes™ available, you can get some very strange queries and subsequently very strange query plans.
One of the best ways to take some control back isn’t even available in Standard Edition.
there are a lot of bad defaults in sql server. most of them you know about and change. here’s another bad one: pic.twitter.com/555xPmgwRW
— Erik Darling Data (@erikdarlingdata) May 8, 2021
If you do have control, the primary drivers of how much memory you need are how effective your indexes are, and how well your queries are written to take advantage of them. You can get away with less memory in general because your data footprint in the buffer pool will be a lot smaller.
You can watch a video I recorded about that here.
Thanks for reading (and watching)!
“Wow look at all these missing index requests. Where’d they come from?”
In SQL Server 2019, sys.dm_db_missing_index_group_stats_query can tell you. So this is neat! And it’s better than nothing, but there are some quirks.
And what’s a quirk, after all, but a twerk that no one enjoys.
Columnar
The first thing to note about this DMV is that there are two columns purporting to have sql_handles in them. No, not that sql_handle.
One of them can’t be used in the traditional way to retrieve query text. If you try to use last_statement_sql_handle, you’ll get an error.
SELECT
    ddmigsq.group_handle,
    ddmigsq.query_hash,
    ddmigsq.query_plan_hash,
    ddmigsq.avg_total_user_cost,
    ddmigsq.avg_user_impact,
    query_text =
        SUBSTRING
        (
            dest.text,
            (ddmigsq.last_statement_start_offset / 2) + 1,
            (
                (
                    CASE ddmigsq.last_statement_end_offset
                        WHEN -1
                        THEN DATALENGTH(dest.text)
                        ELSE ddmigsq.last_statement_end_offset
                    END
                    - ddmigsq.last_statement_start_offset
                ) / 2
            ) + 1
        )
FROM sys.dm_db_missing_index_group_stats_query AS ddmigsq
CROSS APPLY sys.dm_exec_sql_text(ddmigsq.last_statement_sql_handle) AS dest;
Msg 12413, Level 16, State 1, Line 27
Cannot process statement SQL handle. Try querying the sys.query_store_query_text view instead.
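The workaround it suggests looks something like this, assuming Query Store is enabled in the database:

SELECT
    ddmigsq.group_handle,
    ddmigsq.avg_user_impact,
    qsqt.query_sql_text
FROM sys.dm_db_missing_index_group_stats_query AS ddmigsq
JOIN sys.query_store_query_text AS qsqt
    ON qsqt.statement_sql_handle = ddmigsq.last_statement_sql_handle;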
Is Vic There?
One other “issue” with the view is that entries are evicted from it if they’re evicted from the plan cache. That means that queries with recompile hints may never produce an entry in the table.
Is this the end of the world? No, and it’s not the only index-related DMV that behaves this way: dm_db_index_usage_stats does something similar with regard to cached plans.
SELECT
COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND p.Score < 0;
GO
SELECT
COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND p.Score < 0
OPTION(RECOMPILE);
GO
Italic Stallion
You may have noticed that “may” was italicized when talking about whether or not plans with recompile hints would end up in here.
Some of them may, if they’re part of a larger batch. Here’s an example:
SELECT
COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND p.Score < 0
OPTION(RECOMPILE);
SELECT
COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
ON c.PostId = p.Id
WHERE u.Reputation = 1
AND p.PostTypeId = 3
AND c.Score = 0;
Most curiously, if I run that batch twice, the missing index request for the recompile plan shows two uses.
Multiplicity
You may have noticed something else odd in that output: one query has produced three entries. That’s because the query has three missing index requests.
Another longstanding gripe with SSMS is that it only shows you the first missing index request in green text, and that it might not even be the “most impactful” one.
That’s the case here, just in case you were wondering. Neither the XML, nor the SSMS presentation of it, attempt to order the missing indexes by potential value.
You can use the properties of the execution plan to view all missing index requests, like I blogged about here, but you can’t script them out easily like you can for the green text request at the top of the query plan.
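If you want them in one sortable list, you can join the new DMV back to the index details. A sketch using the classic back-of-napkin impact math, not sp_BlitzIndex’s exact logic:

SELECT
    mid.statement AS table_name,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    ddmigsq.avg_total_user_cost * ddmigsq.avg_user_impact *
        (ddmigsq.user_seeks + ddmigsq.user_scans) AS rough_impact
FROM sys.dm_db_missing_index_group_stats_query AS ddmigsq
JOIN sys.dm_db_missing_index_groups AS mig
    ON mig.index_group_handle = ddmigsq.group_handle
JOIN sys.dm_db_missing_index_details AS mid
    ON mid.index_handle = mig.index_handle
ORDER BY rough_impact DESC;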
At least this way, it’s a whole heck of a lot easier for you to order them in a way that may be more beneficial.
EZPZ
Of course, I don’t expect you to write your own queries to handle this. If you’re the type of person who enjoys Blitzing things, you can find the new 2019 goodness in sp_BlitzIndex, and you can find all the missing index requests for a single query in sp_BlitzCache in a handy-dandy clickable column that scripts out the create statements for you.
Thanks for reading!
I see people do things like this fairly often with UDFs. I don’t know why. It’s almost like they read a list of best practices and decided the opposite was better.
This is a quite simplified function, but it’s enough to show off the buggy behavior.
While writing this, I learned that you can’t create a recursive (self-referencing) scalar UDF with the schemabinding option. I don’t know why that is either.
Please note that this behavior has been reported to Microsoft and will be fixed in a future update, though I’m not sure which one.
Swallowing Flies
Let’s take this thing. Let’s take this thing and throw it directly in the trash where it belongs.
CREATE OR ALTER FUNCTION dbo.how_high
(
    @i int,
    @h int
)
RETURNS int
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    SELECT
        @i += 1;

    IF @i < @h
    BEGIN
        SET @i = dbo.how_high(@i, @h);
    END;

    RETURN @i;
END;
GO
Seriously. You’re asking for a bad time. Don’t do things like this.
Unless you want to pay me to fix them later.
Froided
In SQL Server 2019, under compatibility level 150, this is what the behavior looks like currently:
/*
Works
*/
SELECT
dbo.how_high(0, 36) AS how_high;
GO
/*
Fails
*/
SELECT
dbo.how_high(0, 37) AS how_high;
GO
The first execution returns 36 as the final result, and the second query fails with this message:
Msg 217, Level 16, State 1, Line 40
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).
A bit odd that it took 37 loops to exceed the nesting limit of 32.
This is the bug.
Olded
With UDF inlining disabled, a more obvious number of loops is necessary to encounter the error.
/*
Works
*/
SELECT
dbo.how_high(0, 32) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO
/*
Fails
*/
SELECT
dbo.how_high(0, 33) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO
The first run returns 32, and the second run errors out with the same error message as above.
Does It Matter?
It’s a bit hard to imagine someone relying on that behavior, but I found it interesting enough to ask some of the nice folks at Microsoft about, and they confirmed that it shouldn’t happen. Again, it’ll get fixed, but I’m not sure when.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.