A Little More About Isolation Levels In SQL Server

A Little More About Isolation Levels In SQL Server



In this video, I talk about some of the misguided expectations that Read Committed has associated with it, and query patterns to watch out for under Read Committed Snapshot Isolation.

If you’d like to see my full day performance tuning session, here are upcoming dates:

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Join me at DataTune in Nashville, March 8-9 2024

Spring Training


This March, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning.

All attendees will get free access for life to my SQL Server performance tuning training. That’s about 25 hours of great content.

Get your tickets here for this event, taking place Friday, March 8th-9th 2024 at Belmont University – Massey Center 1900 Belmont Blvd, Nashville, TN 37212

Here’s what I’ll be presenting:

The Foundations Of SQL Server Performance Tuning

Session Abstract:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.

Event Details:

Get your tickets here for this event!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Indexing SQL Server Queries For Performance: Fixing A Non-SARGable Predicate

Nausea


Okay, deep breath. Deep, deep breath.

Would you believe I still have to fix queries with a bunch of functions in joins and where clauses? Like every day? All day?

Where things get tough is when there’s some third party vendor app where code can’t be changed, but customers have become so fed up with performance that they’re willing to make other changes to help things along.

This isn’t a big fireworks demo. I could have spent a lot more time finding a worse scenario, and I’ve hinted it to exacerbate the issue a bit.

Sometimes my hardware is too good for bad demos, probably because it’s not in the cloud.

That whole “cloud” thing has some real problems.

Setup


Anyway, let’s say we have this index (because we do, I just created it).

CREATE INDEX
    p
ON dbo.Posts
    (CommunityOwnedDate, Score)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO

Let’s further presume that we have this stored procedure.

This stored procedure could have been a query.

CREATE OR ALTER PROCEDURE
    dbo.FixNonSargable
(
    @CommunityOwnedDate datetime,
    @Score integer
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT
        p.*
    FROM dbo.Posts AS p
    WHERE COALESCE(p.CommunityOwnedDate, '17530101') >= @CommunityOwnedDate
    AND   p.Score >= @Score
    ORDER BY
        p.Id DESC
    OPTION
    (
        USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'), /*No batch mode*/
        MAXDOP 1 /*No parallelism*/
    );
END;
GO

When executed with such great gusto, it’s a bit slow.

EXEC dbo.FixNonSargable
    @CommunityOwnedDate = '20130101',
    @Score = 10;

At ~1.4 seconds, we’re very upset with performance. Throw the whole thing in the trash. Install DuckDB.

sql server query plan

Index Reversal


If we change the key column order of our index, so the column with a seekable predicate can go first, we can get faster query execution.

CREATE INDEX
    p
ON dbo.Posts
    (Score, CommunityOwnedDate)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE, DROP_EXISTING = ON);
GO

Now you don’t have to wait 6-never months for your vendor to fix their garbage code.

sql server query plan
i could be happy

In this case, changing the order of key columns was a more beneficial arrangement for this particular query.

Results may vary. Not query results! I mean like, if you go do this in your database. On “your” hardware. In the “cloud”.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Creating Uncacheable Stored Procedures In SQL Server

Creating Uncacheable Stored Procedures In SQL Server



Here’s the demo script from today’s video:

/*Create a table if you need to*/
CREATE TABLE
    dbo.DinnerPlans
(
    id bigint IDENTITY,
    name nvarchar(40) NOT NULL,
    seat_number tinyint NULL,
    is_free bit NOT NULL,
);
GO

/*First example, with an object that doesn't exist*/
CREATE OR ALTER PROCEDURE
    dbo.i_live
(
    @decider bit = NULL
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
   
    IF @decider = 'true'
    BEGIN
        SELECT
            dp.*
        FROM dbo.DinnerPlans AS dp;
    END;
   
    IF @decider = 'false'
    BEGIN
        SELECT
            whatever.*
        FROM dbo.AnObjectThatDoesntEvenPretendToExist AS whatever;

        --SELECT result = 'false!';
    END;
   
    IF @decider IS NULL
    BEGIN
        SELECT
            result =
                'please make a decision.'
    END;
END;
GO

/*Say goodbye!*/
DBCC FREEPROCCACHE;

/*This runs without an error*/
EXEC dbo.i_live
    @decider = 'false';

/*But there's no query plan!*/
SELECT
    object_name =
       OBJECT_NAME(deps.object_id, deps.database_id),  
    deps.type_desc,
    deps.last_execution_time,
    deps.execution_count,
    dest.text,
    query_plan =
        TRY_CAST(detqp.query_plan AS xml)
FROM sys.dm_exec_procedure_stats AS deps
OUTER APPLY sys.dm_exec_sql_text(deps.plan_handle) AS dest
OUTER APPLY sys.dm_exec_text_query_plan(deps.plan_handle, 0, -1) AS detqp;
GO

Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

sp_QuickieStore: Find Your Worst Performing Queries During Working Hours

sp_QuickieStore: Find Your Worst Performing Queries During Working Hours



Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Catch the latest episode of the @sqldatapartners podcast, guest starring yours truly

Ziggy Played Guitar


I was recently honored to be a guest on the SQL Data Parters podcast. We talked about SQL Server performance tuning scripts, consulting, and responsible levels of meat.

Remember to rate and like and subscribe and all that jazz.

Thanks for listening!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Join me at DataTune in Nashville, March 8-9 2024

Spring Training


This March, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning.

All attendees will get free access for life to my SQL Server performance tuning training. That’s about 25 hours of great content.

Get your tickets here for this event, taking place Friday, March 8th-9th 2024 at Belmont University – Massey Center 1900 Belmont Blvd, Nashville, TN 37212

Here’s what I’ll be presenting:

The Foundations Of SQL Server Performance Tuning

Session Abstract:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.

Event Details:

Get your tickets here for this event!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Indexing SQL Server Queries For Performance: Fixing Predicate Key Lookups

Make Things Fast


Our very lofty goal as query tuners is to make the time from request to result shorter. Sometimes that requires a bit of sacrifice on the server side:

  • Queries going parallel, using more CPU to reduce wall clock time
  • Queries using more memory to avoid spilling to disk
  • Queries using space in tempdb via a #temp table to store results
  • Queries having better indexes available to them for searching

Just about every query tuning technique requires sacrificing some resource on the server (and sometimes more than one resource) to achieve our very lofty goal.

The only exclusion is fixing some miserably-written query to bring it back to sensibility. That requires only a sacrifice on your side.

Indexes are not free rides, of course, though much of the cost attributed to them is overblown. I’m not here to give you license to add 100 indexes to a table, but I am giving you permission to judge indexes by quality (read vs write usage) over quantity.

Anyway, let’s look at some of my favorite index design patterns when I’m tuning queries for clients.

Many of the issues are going to contextually framed by issues with parameter sensitivity in order to exacerbate them. You may see similar bad judgement without parameter sensitivity, of course. Either the optimizer screws up some cardinality estimation, or you do something that screws up cardinality estimation for the optimizer. Common causes for that are local variables, table variables, non-SARGable predicates, overly complicated queries, etc.

I don’t write queries that do stupid things, except to show you when things are stupid.

In this post, I’m going to cover the fixing predicate key lookups! Because why not? We’re already here. Stuck together.

Forever.

Key Lookups In General


Key lookups represent a choice made by the optimizer between:

  • Clustered index scan
  • Nonclustered index seek/scan + Clustered index lookup

Lookups can be quite useful to avoid creating quite wide nonclustered index, but the optimizer is heavily biased against them. It does not like random I/O.

Typically lookups get chosen when a relatively small number of rows are expected to be retrieved from the nonclustered index.

Bad estimates in general, and self-inflicted bad estimates from the list above, can contribute to lookups being chosen inappropriately.

Also like I said above, parameter sensitivity is a very easy way to demonstrate the problem.

Before we go on though, let’s talk about the two things lookups can be assigned to do:

  • Retrieve columns from the clustered index that aren’t part of the nonclustered index definition
  • Evaluate predicates in the clustered index that aren’t part of the nonclustered index definition

Often, fixing lookups only to avoid retrieving columns is a last resort for me, because it can mean greatly expanding the number of columns included in a nonclustered index.

But fixing lookups that have predicates in them is quite appealing to me, because a good index should support locating and filtering as much data is possible for important queries.

Okay, now we can go on.

Predicate Key Lookups: Good


Let’s start with this index:

CREATE INDEX
    p
ON dbo.Posts
    (Score, CreationDate, PostTypeId)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO

And let’s start with this procedure:

CREATE OR ALTER PROCEDURE
    dbo.PredicateKeyLookup
(
    @Score integer,
    @CreationDate datetime,
    @PostTypeId integer,
    @OwnerUserId integer
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
   
    SELECT TOP (100)
        p.*
    FROM dbo.Posts AS p
    WHERE p.Score >= @Score
    AND   p.CreationDate >= @CreationDate
    AND   p.PostTypeId = @PostTypeId
    AND   p.OwnerUserId = @OwnerUserId
    ORDER BY
        p.Score DESC,
        p.CreationDate DESC,
        p.Id DESC;
END;
GO

I know, selecting star is a bad practice, but I’m going to share something shocking with you (and I’ll blog about this later, too): Lookups are assigned the same optimizer cost unit, whether you select one column, or 1000 columns.

The thing to note right here, is that the stored procedure filters on all the columns in our index, plus it has an additional filter on the OwnerUserId column. That (along with the select list columns) is the work our lookup will have to do.

Starting with a query that selects a small number of rows:

EXEC dbo.PredicateKeyLookup
    @Score = 0,
    @CreationDate = '20131101',
    @PostTypeId = 1,
    @OwnerUserId = 39677;

The query plan looks like this, and is very fast. Yay. We smart.

sql server query plan
easy lookup

Zooming in on the lookup a bit, here’s what it’s doing:

sql server query plan

  1. The clustered primary key on the Posts table is Id, which the nonclustered index inherits as a final key column in the nonclustered index (seek predicate), so this is how we match rows across the two
  2. The clustered index has all these columns in it that are not part of the nonclustered index definition, so we can retrieve any rows matched via the seek predicate
  3. The clustered index has OwnerUserId in it as well, which is also not part of the nonclustered index definition, so we can additionally filter any rows matched via the seek predicate

Wordy, I know, for bullet points, but the goal of these posts is to describe things to developers who may have never encountered such things.

Predicate Key Lookups: Bad


If we change our execution parameters to ones that are not selective at all, we’ll use the same execution plan, but speed will get a bit worse.

EXEC dbo.PredicateKeyLookup
    @Score = 0,
    @CreationDate = '20080101',
    @PostTypeId = 2,
    @OwnerUserId = 22656;

To run down what changed:

  • CreationDate will fetch several more years of posts (2008 vs 2013)
  • We’re looking for answers (PostTypeId 2) from questions (PostTypeId 1)
  • The OwnerUserId is now a much more active user on the site

Now, rather than finishing in ~300ms, the query takes ~3.5 seconds.

sql server query plan
degrading

The majority of the time is spent looping and looking up, which is not what we want at all. Again, indexes should help us find and filter as much data as possible in one place.

You can think of each index as a separate data source. If you need two data sources on one table to filter things, you’re leaving yourself open to performance issues.

Predicate Key Lookups: Fixing


In this case, we can fix things well-enough by changing the index to this:

CREATE INDEX
    p
ON dbo.Posts
    (Score, CreationDate, PostTypeId, OwnerUserId)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE, DROP_EXISTING = ON);
GO

Now, either execution, in either execution order, is fast (enough).

sql server query plan
good enough plan

Now, for sure, with enough domain knowledge, and enough understanding of the queries, we could do things differently.

  • If we know most/all queries will filter on OwnerUserId/PostTypeId
  • We’re allowed to create a completely new index to help this query if it’s important enough
CREATE INDEX
    p
ON dbo.Posts
    (OwnerUserId, PostTypeId, Score, CreationDate)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE, DROP_EXISTING = ON);
GO

In this index, OwnerUserId and PostTypeId are the leading columns. The way b-tree indexes work, equality predicates on leading column(s) maintain sorting of any following columns.

Since we’re ordering by Score and CreationDate, this is important (for this query, anyway). Now, both plans look like this.

sql server query plan
faster!

Predicate Key Lookups: Thoughts


Often, fixing a predicate key lookup by adding the filtering column to the nonclustered index your query is already using is good enough.

Other times, to squeeze every last bit of performance out of things, we may need to make additional adjustments. Changing existing indexes in a way that rearranges key column order can be perilous.

While the query you’re working on may work way better, others may not be so happy with your tinkering. Again, this is where you need additional domain knowledge.

If you find yourself wanting to really change things like that, you may be safer creating a new index instead.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Indexing SQL Server Queries For Performance: Common Questions

Over And Over


Some of the best questions I get some clients, conference attendees, and random email, are about how to design indexes.

A lot of developers out there have a rather foggy picture of exactly how indexes work. They’re all seen phone books, and drawings of B-Tree indexes, but some common things still escape them.

In this post, I’m going to talk about a few things like I’m speaking to someone who has never created a table before.

Even if you’ve created lots of tables and indexes, you may find some interesting points in here, if you pay close attention.

The topics I’m going to go over are:

  • Heaps
  • Clustered indexes
  • Nonclustered indexes
  • Index design concepts

Hopefully it helps to clear some things up!

What’s A Heap?


A heap is a table that doesn’t have a clustered index on it. It may have a nonclustered primary key, nonclustered indexes, constraints of all variety, foreign keys, etc.

Heaps are not indexes. Heaps are unordered data structures.

There are good reasons for some tables to be heaps, but most tables shouldn’t be. A good example is a staging table for ETL loads.

Dumping a bunch of data into a heap typically works faster because data doesn’t need to be sorted to conform to index key ordering. More indexes also means more transaction log writes.

I don’t want to get into minimal logging, because that can be a really tricky thing to get in many circumstances. If you want to read more about that, you can follow this link: Minimal Logging with INSERT…SELECT into Heap Tables

Heaps behave different in some ways that are surprising, though when you update or delete (not truncate) rows from them:

  • Updates move rows around and leave pointers that need to be followed in their place
  • Deletes may remove all rows from data pages but not deallocate them from the table (unless you use a TABLOCK hint, or the delete is escalated to use a table lock naturally)

This isn’t something that happens in tables with clustered indexes on them, and they can have surprising impacts on query performance.

If you update variable length data in your heaps a whole bunch, you can end up with a lot of row movement, which will make later scans of the heap do a lot more reads to follow all those pointers to the location of the new rows.

Same thing with deletes. When you scan through a heap (because there’s no indexed data you can seek to), you may end up reading lots of empty pages to satisfy a complete scan of the data.

I don’t have a good explanation for why heaps are designed that way for updates. Perhaps it’s an optimization for large updates to not automatically split pages, which adds a bit of logging overhead. For deletes, though, I believe it’s so pages are at the ready for new data being loaded in without having to allocate new pages. Since heaps have no order, you can stick data anywhere.

One additional note here is that even with a TABLOCK hint, deletes may not immediately deallocate pages if you’re using Read Committed Snapshot Isolation, or Snapshot Isolation.

In general, there’s not a lot of sense to leaving tables as heaps in SQL Server, at least for OLTP workloads.

What’s A Clustered Index?


A clustered index logically orders the data in your table by the clustered index key column(s). Physical ordering is a separate issue.

I often hear people ask if clustered indexes are a copy of the table, and the answer is… kind of, but only in the way that nonclustered indexes are also copies of the data. Adding nonclustered indexes to a heap would likewise be a copy of data in the heap.

However, a clustered index is not a copy of a heap.

In other words, a single table cannot exist as a clustered and heaped version of itself simultaneously. You are either clustered or heaped.

A good clustered index (often on an identity column) does not necessarily have to provide optimized search access to the data stored in it. Often, the choice is to optimize the insert workload.

To pick a good clustered index, you should usually follow the NUDES acronym:

  • Narrow: Numbers, dates, not string-ish
  • Unique: Identities or sequences are good for this
  • Distinct: If not unique, then as distinct as possible
  • Ever: Increasing (append only is the goal, here
  • Static: Definitely not a column you ever update

The goal is to append unique, static values to the end of the list, that take up very few bytes so that intermediate (not leaf) pages can be more densely packed.

If you have a non-surrogate (natural) key that accomplishes this, great. If not, surrogate keys (again, identity columns) are a good choice to check these boxes. Clustered indexes do not have to be unique, nor do they have to be primary keys, but they are most often tightly coupled in SQL Server relational designs.

In most OLTP workloads, lacking a natural key to cluster on is fine, because you can create nonclustered indexes to optimize searches, joins, and other relational activities that benefit from ordering (group by, order by, partition by, etc.). If your OLTP queries end up scanning clustered indexes a lot, it’s usually a good sign that you haven’t designed nonclustered indexes well.

This assumes they have a where clause on them, of course.

What’s A Nonclustered Index?


Nonclustered indexes are separate copies of the table data, whether it’s clustered or heaped. On disk, in memory, in the transaction log, with different statistics, column ordering, and data pages assigned to them.

We’ll talk about some design strategies next, but the ideal usage for nonclustered indexes is to optimize data access, so your query’s where clause can locate data quickly. There are many other considerations and uses (for instance, supporting foreign keys, or backing up unique constraints), but if you’re very new to indexing, focus on helping your queries get to the data they’re searching for quickly.

A common misconception is that just because a column is in the key of an index, it will help the query be more efficient, regardless of what position in key column order it is, but that’s absolutely not true.

I like to think of key column order as sort of like following a recipe. Let’s say you’re following one where the start is a big piece of ~whatever~, and the end is cooked and cubed pieces of ~whatever~

The initial steps in the recipe are:

  • Defrost ~whatever~
  • Cube ~whatever~
  • Cook ~whatever~

If you don’t defrost ~whatever~ first, cubing it is going to take a long time (forget skipping to cooking it). If you don’t cube ~whatever~, cooking it is going to take a long time.

This is a bit like the way index key column order works (though not a perfect metaphor). If you don’t access the leading key, accessing any following keys is less efficient.

Each column is a gatekeeper to the next.

Nonclustered Index Design Patterns


Clustered indexes are the foundation and frame of a table. Nonclustered indexes are where we go for specific purposes, to find specific things.

If we didn’t have things organized in smaller chunks, we’d end up running around the whole place looking for toilet paper after we found a toilet in the dining room.

It’s a situation no one wants to encounter. Your queries are no different. The less running around looking for things they have to do, the happy they are.

In most cases, the fastest way to reduce the time they spend looking for things is to create indexes that match the goals of your where clause.

There is a persistent and rather idiotic strain of advice that the most selective column should always come first. That’s likely true if the key of the index is unique, and you’re searching equality predicates for a single row.

It’s just that most queries I see aren’t doing that — queries that do that (something like a primary key lookup) already perform well — the types of queries I deal with are usually dealing with a ton of rows.

There’s a second persistent piece of goofiness that inequality predicates should always be indexed for after equality predicates. That’s certainly not true when a small range (say the last hour of data) matches very few rows, but an equality predicate (say a bit column) would match tons of rows.

One important thing to remember about nonclustered indexes is that they inherit things from both clustered and heaped tables: identifiers.

  • If your table is a heap, there’s a row identifier (RID) that ends up in your nonclustered indexes
  • If your table is clustered, the clustered index key(s) end up in your nonclustered indexes
    • For non-unique nonclustered indexes, they end up as a final key column
    • For unique nonclustered indexes, they end up as an included column

Tomorrow, we’ll look at some index design patterns in practice.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Some Questions I’ve Answered Recently On Database Administrators Stack Exchange

Good Citizen


Here are some interesting questions that I’ve answered on dba.stackexchange.com:

I should probably get a real job.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.