Things SQL Server vNext Should Address: Default Isolation Level

You’re Gonna Miss Me


I deal with blocking problems all the time. All the damn time. Deadlocks, too.

Why do I have to deal with these problems? Read Committed is the default isolation level in SQL Server.

It is an utterly broken isolation level, and it shouldn’t be the default anymore.

Can’t Fix It


Any isolation level that lets readers and writers block each other, and deadlock, over queries that could otherwise happily coexist shouldn’t be the default in a major database. No self-respecting database would do that to itself, or to end users.

Imagine you’re some poor developer with no idea what SQL Server is doing, just trying to get your application to work correctly, and your queries end up in asinine blocking chains because of this dimwitted default.

I’d switch to Postgres, too. Adding in NOLOCK hints and maybe getting wrong data is probably the least of their concerns.

Ahzooray


In Azure SQL DB, the default isolation level is Read Committed Snapshot Isolation (RCSI). Optimism for me but not for thee is the message here.

Imagine a problem so goofy that Microsoft didn’t want to deal with it in its cloud product. You’ve got it right here.

For the next version of SQL Server, the default isolation level for new databases should also be RCSI.

Especially because databases can have a local version store via Accelerated Database Recovery. Why not make the most of it?

And solve the dumbest problem that most databases I see deal with.
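
If you don’t want to wait around for Microsoft, flipping a single database over is a one-liner today. A minimal sketch, with YourDatabase standing in for a real database name; the ROLLBACK IMMEDIATE bit is there because the ALTER briefly needs to be the only thing running in the database:

/* YourDatabase is a placeholder. Run this during a quiet moment,
   because it kicks out active transactions to take effect. */
ALTER DATABASE YourDatabase
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;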

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Spools

Per Capita


There are two angles to this. One is that spools are just crappy temp tables. The mechanism used for loading data, especially into Eager Spools, is terribly inefficient, especially when compared to the various advances in tempdb efficiency over the years.

The second angle is that Eager Index Spools should provide feedback to users that there’s a potentially useful index that could be created.

There are checks for Eager Table and Index Spools in sp_BlitzCache. Heck, it’ll even unravel them and tell you which index to create in the case of Eager Index Spools.

It still bothers me that there’s no built-in tooling or warning about queries that rely on Spools. Not because the Spools themselves are necessarily bad, but because they’re usually a sign that something is deficient with the query or indexing. Spools in modification queries are often necessary and not as worthy of your scorn, though they can sometimes be replaced with manual phase separation.

You may have lots of very fast queries with small Spools in them. You may have them right now.

You may also have lots of very slow queries with Spools in them, and there’s nothing telling you about it.

Spool Building


In a perfect world, when spools are being built, they’d emit a specific wait stat. Having that information would help all of us who do performance tuning work know what to look for and focus in on during investigations.

Right now you can get part of the way there by looking at EXECSYNC waits, but even that’s unreliable. They show up in parallel plans with Eager Index Spools, but they show up from other things too. Usually, your best bet is to just look for top resource consuming queries and examine the plans for Spools.

[Screenshot: SQL Server query plan (“syncer”)]
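
Until there’s a dedicated wait, about the closest thing to built-in tooling is digging through the plan cache yourself. A rough sketch; the XQuery bit catches any operator with “Spool” in its name, and it can be slow on servers with a lot of cached plans:

WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (50)
    qs.execution_count,
    qs.total_worker_time,
    qs.total_elapsed_time,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('//RelOp[contains(@PhysicalOp, "Spool")]') = 1
ORDER BY qs.total_worker_time DESC;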

Step 1: Give us better waits to identify Spools being built

Index Building


The optimizer will complain about missing indexes all the live-long day. Even in cases where an index would barely help.

[Screenshot: SQL Server query plan (“banjo”)]

I don’t know about you, but in general ~200ms isn’t a performance emergency.

But like, three minutes?

[Screenshot: SQL Server query plan (“bank account”)]

I’d wanna know about this. I’d wanna create an index to help this query.

Step 2: Give us missing index requests when Eager Index Spools get built.

Buy A Telescope


This kind of stuff is already super-important now, and will become even more important as Scalar Function Inlining becomes more widely used.

End users need better feedback when new features get turned on and performance gets worse.

Sure, if you read this blog and know what to look for, you can find and fix things quickly. But there are a whole lot of people who don’t have things that easy, and I get a lot of calls from them to fix this sort of issue.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: ISNULL SARGability

Preferential


Of course, I’d love if everyone lived in the same ivory tower as me and always wrote perfect queries with clear predicates that the optimizer understands and lovingly embraces.

But in the real world, out in the nitty-gritty, queries are awful. It doesn’t matter if it’s un- or under-trained developers writing SQL, or that exact same person designing queries in an ORM. They turn out horrible and full of nonsense, like drunk stomachs at IHOP.

One of the most common problems I see is people getting lazy with checking for NULLs, or overly-protective about it. In “””””real””””” programming languages, NULLs get you errors. In databases, they’re just sort of whatever.

Thing Is


When you create a row store index on a column, whether it’s ascending or descending, clustered or nonclustered, the data is put in order. In SQL Server, that means NULLs are sorted together. Despite that, ISNULL still creates problems.

DROP TABLE IF EXISTS #t;

SELECT
    x.n
INTO #t
FROM
(
SELECT
    CONVERT(int, NULL) AS n

UNION ALL

SELECT TOP (10)
    ROW_NUMBER() OVER
    (
        ORDER BY
            1/0
    ) AS n
FROM sys.messages AS m
) AS x;


CREATE UNIQUE CLUSTERED INDEX c 
ON #t (n) WITH (SORT_IN_TEMPDB = ON);

In this table we have 11 rows. One of them is NULL, and the other 10 are the numbers 1-10.

Odor By


If we select an ordered result, we get a simple query plan that scans the clustered index and returns 11 rows with no Sort operator.

[Screenshot: SQL Server query plan (“pleased to meet you”)]

However, if we want to replace that NULL with a 0, things get goofy.

[Screenshot: SQL Server query plan (“dammit”)]
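
Since the plans don’t survive the trip here, a sketch of the sort of queries behind those two screenshots (assumed, but close enough to demonstrate the point): ordering by the bare column uses the index’s order, while wrapping it in ISNULL buys you a Sort operator, even though replacing the lone NULL with 0 wouldn’t change the order of these rows at all.

/* No Sort operator: the clustered index already provides the order. */
SELECT
    t.n
FROM #t AS t
ORDER BY
    t.n;

/* A Sort operator shows up, even though swapping the single NULL
   for a 0 wouldn't change the output order of this particular data. */
SELECT
    n = ISNULL(t.n, 0)
FROM #t AS t
ORDER BY
    ISNULL(t.n, 0);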

Wheredo


Something similar occurs when ISNULL is applied to the where clause.

[Screenshots: SQL Server query plans (“happy” and “unhappy”)]

There’s one NULL. We know where it is. But we still have to scan 10 other rows. Just in case.
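
Again, a sketch of the two shapes being compared (assumed, since the screenshots are gone): the ISNULL version reads every row, while writing the same logic without the function at least gives the optimizer a chance to seek.

/* Not SARGable: the whole index gets scanned. */
SELECT
    t.n
FROM #t AS t
WHERE ISNULL(t.n, 0) = 0;

/* Same logic, written so the optimizer can use the index order. */
SELECT
    t.n
FROM #t AS t
WHERE
(
    t.n = 0
    OR t.n IS NULL
);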

Conversion


The optimizer should be smart enough to figure out simple use of ISNULL, like in both of these cases.

I’m sure wiser people can figure out deeper cases too, and even apply the same logic to other functions, like certain kinds of date math.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Parameter Sniffing

Olympus


In my session on using dynamic SQL to defeat parameter sniffing, I walk through how we can use what we know statistically about data to execute slightly different queries to get drastically better query plans. What makes the situation quite frustrating is that SQL Server has the ability to do the same thing.

In fact, it does it every time it freshly compiles a plan based on parameter values. It makes a whole bunch of estimates and comes up with what it thinks is a good plan based on them.

But it sort of stops there.

Why am I talking about this now? Well, since GREATEST and LEAST got announced for Azure SQL, it kind of got my noggin’ joggin’ that perhaps a new build of the box-product might make its way to the CTP phase soon.

I Dream Of Histograms


When SQL Server builds execution plans, the estimates (mostly) come from statistics histograms, and those histograms are generally pretty solid even with low sampling rates. I know that there are times when they miss crucial details, but that’s a different problem than the one I think could be solved here.

You see, when a plan gets generated, the entire histogram is available. It would be neat if there were a process to do one of a few things:

  • Mark objects with significant skew in them for additional attention
  • Bucket values by similar populations
  • Explore other plan choices for values per bucket

If you team this up with other features in the Intelligent Query Processing family, it could really help solve some of the most basic parameter sniffing issues. That’s what you want these features for: to take care of the low-hanging nonsense. Just like the cardinality estimation feedback that got announced at PASS Summit.

Take the VoteTypeId column in the Votes table. Lots of these values could safely share a plan. Lots of others have catastrophe in mind when plans are shared, like 2 and… well, most others.

[Screenshot (“ribbit”)]
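
In the meantime, you can fake the per-bucket treatment yourself with dynamic SQL, which is roughly what the session I mentioned up top walks through. A sketch, not the real thing: dbo.VoteTypeCounts is a made-up helper table standing in for whatever you’d actually consult (the histogram, a summary table you maintain), and the row-count threshold is invented.

CREATE OR ALTER PROCEDURE dbo.VotesByType
(
    @VoteTypeId int
)
AS
BEGIN
    DECLARE @sql nvarchar(MAX) = N'
    SELECT
        COUNT_BIG(*) AS records
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = @VoteTypeId';

    /* Big values get a comment tacked on, so they hash to a different
       cached plan than the small ones do. */
    IF EXISTS
    (
        SELECT
            1/0
        FROM dbo.VoteTypeCounts AS vtc
        WHERE vtc.VoteTypeId = @VoteTypeId
        AND   vtc.total_rows >= 1000000
    )
    BEGIN
        SET @sql += N' /* big */';
    END;

    EXEC sys.sp_executesql
        @sql,
        N'@VoteTypeId int',
        @VoteTypeId;
END;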

Sort of like how optimistic isolation levels take care of basic reader/writer blocking that sucks to deal with and leads to all those crappy NOLOCK hints. Save the hard problems for young handsome consultants.

Ahem.

Overboard


I know, this sounds crazy-ambitious, and it could get out of hand quickly. Not to mention confusing! We’re talking about queries having multiple cached and usable plans, here. Who used what and when would be crucial information.

You’d need a lot of metadata about something like this, so you can tweak:

  • The number of plans
  • The number of buckets
  • Which plan is used by which buckets
  • Which code and parameters should be considered

I’m fine with auto-pilot for this most of the time, just to get folks out of the doldrums. Sort of like how Query Store was “good enough” with a bunch of default options, I think you’d find a lot of preconceived notions about what works best would pretty quickly be relieved of their position.

Anyway, I have a bunch of other posts about similar solvable problems. I have no idea when the next version of SQL Server will come out, or what improvements or features might be in it, but I hear that blog posts are basically wishes that come true. I figure I might as well throw some coins in the fountain.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Local Variable Estimates

And Then The World


I’ve written at length about what local variables do to queries, so I’m not going to go into it again here.

What I do want to talk about are better alternatives to what you currently have to do to fix issues:

  • RECOMPILE the query
  • Pass the local variable to a stored procedure
  • Pass the local variable to dynamic SQL

It’s not that I hate those options, they’re just tedious. Sometimes I’d like the benefit of recompiling with local variables without all the other strings that come attached to recompiling.
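
For reference, the first option looks something like this (the Stack Overflow Users table is just a stand-in): without the hint, the optimizer guesses off the density vector; with it, the optimizer gets to see the variable’s runtime value, at the cost of compiling on every execution.

DECLARE @Reputation int = 2;

/* Estimate comes from the density vector; the runtime value is invisible. */
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
WHERE u.Reputation = @Reputation;

/* Estimate comes from the histogram, because the recompile lets the
   optimizer sniff the variable's runtime value. */
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
WHERE u.Reputation = @Reputation
OPTION (RECOMPILE);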

Hint Me Baby One More Time


Since I’m told people rely on this behavior to fix certain problems, you would probably need a few different places and ways to alter this behavior:

  • Database level setting
  • Query Hint
  • Variable declaration

Database level settings are great for workloads you can’t alter, either because the queries come out of a black box, or you use an ORM and queries… come out of a nuclear disaster area.

Query hints are great if you want all local variables to be treated like parameters. But you may not want that all the time. I mean, look: you all do wacky things and you’re stuck in your ways. I’m not kink shaming here, but facts are facts.

You have to draw the line somewhere and that somewhere is always “furries”.

And then local variables.

It may also be useful to allow local variables to be declared with a special property that will allow the optimizer to treat them like parameters. Something like this would be easy enough:

DECLARE @p int PARAMETER = 1;

Hog Ground


Given that in SQL Server 2019 table variables got deferred compilation, I think this feature is doable.

Of course, it’s doable today if you’ve got a debugger and don’t mind editing memory space.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

CTEs Don’t Make Queries More Readable, Formatting Does

Kākāpō!


One line I see over and over again — I’ve probably said it too when I was young and needed words to fill space — is that CTEs make queries more readable.

Personally, I don’t think they make queries any more readable than derived tables, but whatever. No one cares what I think, anyway.

Working with clients I see a variety of query formatting styles, ranging from quite nice ones that have influenced the way I format things, to completely unformed primordial blobs. Sticking the latter into a CTE does nothing for readability even if it’s commented to heck and back.

[Screenshot (“nope nope nope”)]

There are a number of options out there for formatting code.

Good Standing


Formatting your code nicely doesn’t just help others read it, it can also help people understand how it works.

Take this example from sp_QuickieStore that uses the STUFF function to build a comma separated list the crappy way.

If STRING_AGG were available in SQL Server 2016, I’d just use that. Darn legacy software.

[Screenshot: SQL Server query (“parens”)]

The text I added probably made things less readable, but formatting the code this way helps me make sure I have everything right.

  1. The opening and closing parens for the STUFF function
  2. The first input to the function is the XML generating nonsense
  3. The last three inputs to the STUFF function that identify the start, length, and replacement text

I’ve seen and used this specific code a million times, but it wasn’t until I formatted it this way that I understood how all the pieces lined up.
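
If you’ve never pulled that pattern apart, here’s a generic version with the same shape, formatted to match the numbered list above. It’s not the actual sp_QuickieStore code, just an illustration that builds a comma separated list of table names from sys.tables.

SELECT
    table_list =
        STUFF
        (
            (
                /* 2. the XML-generating nonsense is the first input */
                SELECT
                    N', ' + t.name
                FROM sys.tables AS t
                FOR XML PATH('')
            ),
            1,  /* 3. start position                  */
            2,  /* 3. number of characters to replace */
            N'' /* 3. replacement text                */
        );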

Compare that with another time I used the same code fragment in sp_BlitzCache. I wish I had formatted a lot of the stuff I wrote in there better.

[Screenshot: SQL Server query (“carry the eleventy”)]

With things written this way, it’s really hard to understand where things begin and end, and which arguments belong to which part of the code.

Maybe someday I’ll open an issue to reformat all the FRK code?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

The Three Kinds Of Memory Contention In SQL Server

Savings and Loans


Whoever decided to give “memory bank” its moniker was wise beyond their years, or maybe they just made a very apt observation: all memory is on loan.

Even in the context we’ll be talking about, when SQL Server has lock pages in memory enabled, the pages that are locked in memory may not have permanent residency.

If your SQL Server doesn’t have enough memory, or if various workload elements are untuned, you may hit one of these scenarios:

  • Query Memory Grant contention (RESOURCE_SEMAPHORE)
  • Buffer Cache contention (PAGEIOLATCH_XX)
  • A mix of the two, where both are fighting over finite resources

It’s probably fair to note that not all query memory grant contention will result in RESOURCE_SEMAPHORE waits. There are times when you’ll have just enough queries asking for memory grants to knock a significant number of pages out of the buffer pool, causing an over-reliance on disk, without ever hitting the point where you’ve exhausted the amount of memory that SQL Server will loan out to queries.
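
For a quick first look at whether either flavor of contention is piling up, the wait stats DMV is enough to get started. A rough sketch; remember these numbers are cumulative since the last restart (or the last time someone cleared them), so they’re better trended than read once.

/* Rough check of the two wait families called out above. */
SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    wait_time_seconds =
        dows.wait_time_ms / 1000.,
    avg_wait_ms =
        dows.wait_time_ms * 1. /
        NULLIF(dows.waiting_tasks_count, 0)
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type = N'RESOURCE_SEMAPHORE'
OR    dows.wait_type LIKE N'PAGEIOLATCH%'
ORDER BY dows.wait_time_ms DESC;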

To help you track down any of these scenarios, you can use my stored procedure sp_PressureDetector to see what’s going on with things.

Black Friday


Most servers I see have a mix of the two issues. Everyone complains about SQL Server being a memory hog without really understanding why. Likewise, many people are very proud of how fast their storage is without really understanding how much faster memory is. It’s quite common to hear someone say that they recently got a whole bunch of brand new shiny flashy storage, but performance is still terrible on their server with 64GB of RAM and 1TB of data.

I recently had a client migrate some infrastructure to the cloud, and they were complaining about how queries got 3x slower. As it turned out, the queries were accruing 3x more PAGEIOLATCH waits with the same amount of memory assigned to SQL Server. Go figure.

If you’d like to see those waits in action, and how sp_PressureDetector can help you figure out which queries are causing problems, check out this video.

Market Economy


The primary driver of how much memory you need is how much control you have over the database. The less control you have, the more memory you need.

Here’s an example: One thing that steals control from you is using an ORM. When you let one translate code into queries, Really Bad Things™ can happen. Even with Perfect Indexes™ available, you can get some very strange queries and subsequently very strange query plans.

One of the best ways to take some control back isn’t even available in Standard Edition.

If you do have control, the primary drivers of how much memory you need are how effective your indexes are, and how well your queries are written to take advantage of them. You can get away with less memory in general because your data footprint in the buffer pool will be a lot smaller.

You can watch a video I recorded about that here:

Thanks for reading (and watching)!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How SQL Server 2019 Helps You Link Queries To Missing Index Requests

Not Another Upgrade


When dm_db_missing_index_group_stats_query got documented, I was really happy. After all, this has been a peeve of mine for ages.

“Wow look at all these missing index requests. Where’d they come from?”

So this is neat! And it’s better than nothing, but there are some quirks.

And what’s a quirk, after all, but a twerk that no one enjoys.

Columnar


The first thing to note about this DMV is that there are two columns purporting to have sql_handles in them. No, not that sql_handle.

One of them can’t be used in the traditional way to retrieve query text. If you try to use last_statement_sql_handle, you’ll get an error.

SELECT
    ddmigsq.group_handle,
    ddmigsq.query_hash,
    ddmigsq.query_plan_hash,
    ddmigsq.avg_total_user_cost,
    ddmigsq.avg_user_impact,
    query_text = 
        SUBSTRING
        (
            dest.text, 
            (ddmigsq.last_statement_start_offset / 2) + 1,
            (
                ( 
                    CASE ddmigsq.last_statement_end_offset 
                        WHEN -1 
                        THEN DATALENGTH(dest.text) 
                        ELSE ddmigsq.last_statement_end_offset 
                    END
                    - ddmigsq.last_statement_start_offset 
                ) / 2 
            ) + 1
        )
FROM sys.dm_db_missing_index_group_stats_query AS ddmigsq
CROSS APPLY sys.dm_exec_sql_text(ddmigsq.last_statement_sql_handle) AS dest;

Msg 12413, Level 16, State 1, Line 27
Cannot process statement SQL handle. Try querying the sys.query_store_query_text view instead.
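
If Query Store is turned on for the database, you can take the error message’s advice and resolve the handle there instead. A sketch, with the caveat that it only works for statements Query Store actually captured in the current database:

SELECT
    ddmigsq.group_handle,
    ddmigsq.query_hash,
    ddmigsq.avg_total_user_cost,
    ddmigsq.avg_user_impact,
    qsqt.query_sql_text
FROM sys.dm_db_missing_index_group_stats_query AS ddmigsq
JOIN sys.query_store_query_text AS qsqt
    ON qsqt.statement_sql_handle = ddmigsq.last_statement_sql_handle;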

Is Vic There?


One other “issue” with the view is that entries are evicted from it if they’re evicted from the plan cache. That means that queries with recompile hints may never produce an entry in the table.

Is this the end of the world? No, and it’s not the only index-related DMV that behaves this way: dm_db_index_usage_stats does something similar with regard to cached plans.

As a quick example, if I execute these two nearly-identical queries, the DMV only records one potential use of the index:

SELECT 
    COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND   p.Score < 0;
GO 

SELECT 
    COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND   p.Score < 0
OPTION(RECOMPILE);
GO
[Screenshot: SQL Server missing index request (“grizzly”)]

Italic Stallion


You may have noticed that may was italicized when talking about whether or not plans with recompile hints would end up in here.

Some of them may, if they’re part of a larger batch. Here’s an example:

SELECT 
    COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
AND   p.Score < 0
OPTION(RECOMPILE);

SELECT
    COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
   ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
    ON c.PostId = p.Id
WHERE u.Reputation = 1
AND   p.PostTypeId = 3
AND   c.Score = 0;

Most curiously, if I run that batch twice, the missing index request for the recompile plan shows two uses.

[Screenshot (“computer”)]

Multiplicity


You may have noticed something else odd in the above screenshot, too. One query has produced three entries. That’s because…

The query has three missing index requests. Go ahead and click on that.

[Screenshot (“lovecraft, baby”)]

Another longstanding gripe with SSMS is that it only shows you the first missing index request in green text, and that it might not even be the “most impactful” one.

That’s the case here, just in case you were wondering. Neither the XML, nor the SSMS presentation of it, attempt to order the missing indexes by potential value.

You can use the properties of the execution plan to view all missing index requests, like I blogged about here, but you can’t script them out easily like you can for the green text request at the top of the query plan.

[Screenshot (“something else”)]

At least this way, it’s a whole heck of a lot easier for you to order them in a way that may be more beneficial.

EZPZ


Of course, I don’t expect you to write your own queries to handle this. If you’re the type of person who enjoys Blitzing things, you can find the new 2019 goodness in sp_BlitzIndex, and you can find all the missing index requests for a single query in sp_BlitzCache in a handy-dandy clickable column that scripts out the create statements for you.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

A Bug With Recursive UDFs When Inlined In SQL Server 2019

Enough Already


I see people do things like this fairly often with UDFs. I don’t know why. It’s almost like they read a list of best practices and decided the opposite was better.

This is a quite simplified function, but it’s enough to show the bug behavior.

While writing this, I learned that you can’t create a recursive (self-referencing) scalar UDF with the schemabinding option. I don’t know why that is either.

Please note that this behavior has been reported to Microsoft and will be fixed in a future update, though I’m not sure which one.

Swallowing Flies


Let’s take this thing. Let’s take this thing and throw it directly in the trash where it belongs.

CREATE OR ALTER FUNCTION dbo.how_high
(
    @i int,
    @h int
)
RETURNS int
WITH
    RETURNS NULL ON NULL INPUT
AS
BEGIN

    SELECT 
       @i += 1;
    
    IF @i < @h
    BEGIN
        SET 
            @i = dbo.how_high(@i, @h);
    END;

    RETURN @i;

END;
GO

Seriously. You’re asking for a bad time. Don’t do things like this.

Unless you want to pay me to fix them later.

Froided


In SQL Server 2019, under compatibility level 150, this is what the behavior looks like currently:

/*
Works
*/
SELECT 
    dbo.how_high(0, 36) AS how_high;
GO 

/*
Fails
*/
SELECT 
    dbo.how_high(0, 37) AS how_high;
GO

The first execution returns 36 as the final result, and the second query fails with this message:

Msg 217, Level 16, State 1, Line 40
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

A bit odd that it took 37 loops to exceed the nesting limit of 32.

This is the bug.

Olded


With UDF inlining disabled, a more obvious number of loops is necessary to encounter the error.

/*
Works
*/
SELECT 
    dbo.how_high(0, 32) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO 

/*
Fails
*/
SELECT 
    dbo.how_high(0, 33) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO

The first run returns 32, and the second run errors out with the same error message as above.

Does It Matter?


It’s a bit hard to imagine someone relying on that behavior, but I found it interesting enough to ask some of the nice folks at Microsoft about, and they confirmed that it shouldn’t happen. Again, it’ll get fixed, but I’m not sure when.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

sp_QuickieStore Improved Performance Troubleshooting

Things Are Getting Better



Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.