SQL Server Archives | Page 160 of 164

Is The SQL Server Really Slower Than It Used To Be?

Posted on January 15, 2019May 14, 2022 by Erik Darling

Economy Of Words

When someone says that something is slower than it was before, whether it’s a query, a whole SQL Server, a website, or an app screen, it’s almost never while the perceived slowness is happening, nor is it reproducible (especially when a consultant is watching).

There are some basic things you need to have historical record of if you wanna figure it out:

What queries were running
What queries were waiting on
What was different from last time those queries ran

Microsoft has taken some steps to help us figure this out with Query Store, but really only for SQL Server 2017, when aggregated wait stats were added to the family of Query Store views.

But if you’re like most people, you’re not on SQL Server 2017, and even further into that segment, you don’t have Query Store enabled.

I think there’s more Microsoft could do to improve data that’s collected, but right now it’s just a collection of votes.

Right now, the GUI is so limited in what you can search for that I wrote a sp_QuickieStore to make working with the data easier.

Compared To What?

If you’re on older versions of SQL Server, including those about to be taken off life support, what are your options?

For Free!

You can log sp_BlitzFirst output to tables, and use PowerBi to view the data.
You can use Open Query Store to produce a similar data set to the Query Store feature.
You can use sp_WhoIsActive, and log those results to a table

For Money!

There’s a whole landscape of SQL Server monitoring tools out there, as well as things people get confused with SQL Server monitoring tools.

Ultimately, the best monitoring tool for you is one you:

Will actually use
Will enable you to find problems
Will enable you to solve problems

Getting overwhelmed with meaningless metrics (there were how many logouts per second?!), standalone charts that don’t allow you to correlate system activity to resource usage (save as image?!), or long lists of queries that may or may not run when anyone cares (yes, CHECKDB did a lot of reads, thanks) doesn’t help anyone. If that sounds like your monitoring tool, it might be time to trade it for a carton of Gatorade.

You’ve got a glorified FitBit strapped onto your SQL Server.

Here And Now

What’s currently happening on your SQL Server is often only a symptom of what’s been happening on your SQL Server for a long time.

There are very few problems I’ve seen that are truly “sudden”, unless someone recently made an ill-advised change to the server, like dropping an important index, etc.

The longer you let things like aging hardware, growing data, and ignoring query and index problems go, the worse things get.

Monitoring your server is a good first step, but it’s still up to you to address the problems, and address the right problems.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Fixing Your SQL Server: Prioritizing Tasks

Posted on January 14, 2019May 14, 2022 by Erik Darling

Gym Shorts

If you said to me “Erik, I wanna run a mile in 10 minutes”, my first question would be “how fast can you run one now?”

Most everyone wants their SQL Server to go faster, but getting it to go faster isn’t always the first thing you should worry about.

Let’s say our intrepid runner doesn’t know how fast they can run a mile now because they smoke a pack a day, are 200lbs overweight, and just had a heart attack.

Obviously, there are issues to address before they can even walk a mile, nevermind run a mile, nevermind run one in 10 minutes.

Chest Hair

The point is, it doesn’t do you any good to have the fastest SQL Server in the world, if:

You can lose all your data because you’re not taking backups
You can lose more data than the business wants because you’re not taking backups often enough
You can lose data to corruption you’re not checking for
You can lose all your backups because they’re on the same disks your server is
You can lose more data than necessary if you’re not getting notified when problems occur

This list could go on for a long time.

The nice part about most of these things is that they’ve been automated for years. For free. Ola Hallengren has made it ridiculously easy for you.

Once this stuff is set up, you have the mostly passive job of making sure they’re running and finishing, which you can make even more passive with email alerts.

Undertrained

If you’re not sure where to start, start with your boss. Ask how much data you’re allowed to lose. Ask sales people if they’re telling people they’ll only lose a certain amount of data.

If you really wanna find out, tell them how much data you can lose right now. You can get this number by looking at your most frequent type of backups.

Only taking full backups once a day? That’s 24 hours.

Only taking log backups every 2 hours? That’s 2 hours.

And of course, if you’ve never tested restoring your backups, it’s ∞∞∞Infinity∞∞∞

Thanks for reading!

Going Further

Candy Crush and SQL Server’s Missing Index Requests

Posted on January 11, 2019May 14, 2022 by Erik Darling

It’s My Blog It’s My Rules

Since I know this’ll come out Friday, I get to have fun.

I have a feeling most of you know what Candy Crush is. It’s a mindless, infinite thumb scroll replacement.

When you’re playing, Candy Crush will suggest moves to you. It thinks they’re good ideas. Though I don’t know what the algorithm is, it seems to recommend some goofy stuff.

Take this example. It wants me to move the orange piece. It’s not a bad move, but it’s not the best possible move.

Savvy Candy Crush players will see the obvious choice. Make five in a row with the purple pieces, get the cookie thing, cookie thing drops down next to the explodey thing, and when you combine it with the wrapped candy thing, all the purple pieces turn into explodey pieces.

Level over, basically.

But Candy Crush isn’t thinking that far ahead. Neither are missing index requests.

SQL Server Does The Same Thing

Let’s take this query, for example. It’s very Post-centric.

SELECT   p.OwnerUserId, p.Score, p.Title
FROM     dbo.Comments AS c
JOIN     dbo.Posts AS p
    ON p.OwnerUserId = c.UserId
WHERE    p.PostTypeId = 1
AND      p.ClosedDate >= '2018-06-01'
ORDER BY p.Score DESC;

Right now, the only indexes I have are clustered, and they’re not on any columns that help this query. Sure, they help other things, and having clustered indexes is usually a good idea.

This is the home run use-case for nonclustered indexes. You know. Organized copies of other parts of your data that users might query.

This is such an obvious move that SQL Server is all like “hey, got a minute?”

This is where things get all jumpy-blinky. Just like in Candy Crush.

Hints-a-Hints

This is the index SQL Server wants:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Posts] ([PostTypeId],[ClosedDate])
INCLUDE ([OwnerUserId],[Score],[Title])

Would this be better than no index? Yes.

Is it the best possible index? No. Not by a long shot.

Let’s start with our predicates. SQL Server picked PostTypeId as the leading key column.

SELECT SUM(CASE p.PostTypeId WHEN 1 THEN 1 ELSE 0 END) AS [count_type_one],
       SUM(CASE WHEN p.ClosedDate >= '2018-06-01' THEN 1 ELSE 0 END) AS count_closed_date
FROM dbo.Posts AS p

Is it selective?

That ain’t good

Regardless of selectivity, the missing index request mechanism will always put equality predicates first.

What I’m getting at is that the missing index request isn’t as well thought out as a lot of people hope. It’s just one possible index for a query weighted to helping us find data in the where clause.

With a human set of eyes on it, you may discover one or more better possible indexes. You may even discover one on for the Comments table, too.

Other Issues

There’s also the issue of the included columns it chose. We’re ordering by Score. We’re joining on OwnerUserId.

Those may be helpful as key columns, depending on how much data we end up joining, and how much data we end up sorting.

A SQL Server query plan — Guesses. Just guesses.

Complicated Game

If you don’t have anyone doing regular index tuning, missing index hints are worth following because they’re better than nothing.

They’re like… acceptable. Most of the time.

The big things you have to watch out for are the incredibly wide requests, duplicative requests, and ones that want big string columns.

Thanks for reading!

Going Further

What SQL Server Indexes Don’t Do With Data

Posted on January 10, 2019May 14, 2022 by Erik Darling

You Fine And All

You can read a lot about how indexes might improve queries.

But there are certain query mechanics that indexes don’t solve for, and you can not only get stuck with an index that isn’t really helping, but all the same query problems you were looking to solve.

Understanding how indexes and queries interact together is a fundamental part of query tuning.

In this post, we’re going to look at some query patterns that indexes don’t do much to fix.

Part Of The Problem

Indexes don’t care about your queries. They care about storing data in order.

There are very definite ways that you can encourage queries to use them efficiently, and probably even more definite ways for you to discourage them from using them efficiently.

Even with syntax that seems completely transparent, you can get into trouble.

Take the simple example of two date columns: there’s nothing in your index that tracks how two dates in a row relate to each other.

Nothing tracks how many years, months, weeks, days, hours, minutes, seconds, milliseconds, or whatever magical new unit Extended Events has to make itself less usable.

Heck, it doesn’t even track if one is greater or less than another.

Soggy Flautas

Ah, heck, let’s stick with Stack Overflow. Let’s even create an index.

CREATE INDEX whatever 
ON dbo.Posts(CreationDate, ClosedDate);

Now let’s look at these super important queries.

    SELECT   COUNT(*) AS records
    FROM     dbo.Posts AS p
    WHERE    p.CreationDate < p.ClosedDate;


    SELECT   COUNT(*) AS records
    FROM     dbo.Posts AS p
    WHERE    p.CreationDate > p.ClosedDate;

How much work will the optimizer think it has to do, here? How many rows will it estimate? How will it treat different queries, well, differently? If you said “it won’t”, you’re a smart cookie. Aside from the “Actual rows”, each plan has the same attributes across.

Treachery

Neither of these query plans is terrible on its own.

The problem is really in larger plans, where bad decisions like these have their way with other parts of a query plan.

Nasty little lurkers they are, because you expect things to get better when creating indexes and writing SARGable predicates.

Yet for both queries, SQL Server does the same thing, based on the same guesses on the perceived number of rows at play. It’s one of those quirky things — if it’s a data point we care about, then it’s one we should express somehow.

A computed column might work here:

    ALTER TABLE dbo.Posts 
        ADD created_less_closed AS 
            CONVERT(BIT, CASE WHEN CreationDate < ClosedDate 
                              THEN 1
                              WHEN CreationDate > ClosedDate
                              THEN 0
                         END)


    CREATE INDEX apathy 
        ON dbo.Posts (created_less_closed);

Which means we’d have to change our queries a bit, since expression matching has a hard time reasoning this one out:

    SELECT   COUNT(*) AS records
    FROM     dbo.Posts AS p
    WHERE    1 = CONVERT(BIT, CASE WHEN CreationDate < ClosedDate 
                              THEN 1
                              WHEN CreationDate > ClosedDate
                              THEN 0
                         END)
    AND 1 = (SELECT 1);


    SELECT   COUNT(*) AS records
    FROM     dbo.Posts AS p
    WHERE    0 = CONVERT(BIT, CASE WHEN CreationDate < ClosedDate 
                              THEN 1
                              WHEN CreationDate > ClosedDate
                              THEN 0
                         END)
    AND 1 = (SELECT 1);

What Would Be Better?

Practically speaking, it isn’t the job of an index (or statistic) to track things like this. We need to have data that represents things that are important to users.

Though neither of these plans is terrible in isolation, bad guesses like these flow all through query plans and can lead to other bad decisions and oddities.

It’s one of those terrible lurkers just waiting to turn an otherwise good query into that thing you regard with absolute loathing every time it shows up in [your favorite monitoring tool].

Knowing this, we can start to design our data to better reflect data points we care about. A likely hero here is a computed column to return some value based on which is greater, or a DATEDIFF of the two columns.

Going Further

Wanna Pelt Me With Random Questions For Old Times’ Sake?

Posted on January 10, 2019May 14, 2022 by Erik Darling

It’s No Secret, I Miss You All Dearly

I even miss your random questions. Well, not the ones about replication or security, but the rest of them were pretty good.

Mostly.

If you feel like watching me evade responsibility for my physical appearance again, join me on YouTube for a live SQL Sever Q&A.

It starts Friday at noon EST, and I’ll go until you run of out stuff to ask me.

Or I run out of champagne.

See you there!

Going Further

What SQL Server Indexes Don’t Do

Posted on January 9, 2019May 16, 2022 by Erik Darling

Moved!

This post has been moved here.

Root cause: This one is haunted.

Enjoy the rest of your (now cursed) day.

Carrying On

Posted on January 8, 2019May 14, 2022 by Erik Darling

Trying to figure out what you’re going to do after years of knowing exactly what you were going to do is tough. There’s also a big variable, in that I don’t know exactly how my time is going to break down just yet.

In some form or another, I’d like to commit to doing the community things that I love doing.

Blogging

This’ll continue 100%, probably with more videos. All performance-focused stuff. I haven’t decided on a schedule yet, but since I’m basically starting from scratch, there’s a lot of topics that I’ll probably cover again to have the content locally.

Presenting

Heck yeah! If you’d like to have me at your User Group, PASS VC Chapter, Elks Lodge, Ladies Auxiliary, or anything else, drop me a line.

Open Source Scripts

This is a tough one. After “owning” the Blitz script releases for so long, I probably won’t be cracking those open to work on them again immediately. While I totally believe in the work I did on them, I don’t feel it would be appropriate to dedicate the same time to them.

I have some other ideas that I’m still fleshing out. Maybe something that more clearly ties performance metrics together. I’m not crazy about reinventing wheels, so don’t look for a Xerox copy.

Office Hours

I’m looking at doing a YouTube live stream thing, if there’s any interest. It’ll be a little different from Office Hours, since right now I’ll be by myself. I might mix in some presentation stuff, to get started.

Other Stuff

I have a few other ideas that I’ll get to over time, as they make sense. Stay tuned, as they say.

Thanks for reading!

Going Further

Introductory

Posted on January 7, 2019May 16, 2022 by Erik Darling

Testing, Yes, Yes

About a year and a half ago, when I started writing and presenting The Junior DBA Workout Plan, I was also thinking about the eventual inevitability of someday not working for Brent, and exactly what I’d wanna do afterwards.

It was a tough question at the time, because I felt like I’d get my butt handed to me trying to do the same thing, consulting-wise. One thing we always lamented was a general lack of follow-through on recommendations, which got me thinking about how I could do things differently, so as not to be handed my butt in the most thorough of ways.

A Bridge Appears

Let’s face it, you’re having trouble finding your SQL Server’s weak points.

Reading the wait stats hieroglyphics, understanding index crop circles, and those query plans that go on for days.

You’re gonna have a hard time fixing all that stuff on your own.

Even with explicit instructions and scripts, you’re gonna get gun shy. You’re gonna have other stuff to do, and your SQL Server tasks are gonna fall off.

I wanna bridge that gap, and make sure you stay on track.

Coaching, Not Consulting

I want to give you in-depth, personalized advice and training when I assess your server, but then follow up with you on a consistent basis to make sure you’re progressing and hitting milestones.

Look, there’s a million places to get a boilerplate. You need someone who can cut through the decades of meme advice out there, tell you what to do and how to do it, and make sure you stick to the plan.

That way, if you hit a wall, something doesn’t work, something stops working, you’re not stranded. We can adjust the plan together.

Going Further

Swingo

Posted on January 3, 2019May 16, 2022 by Erik Darling

This is gonna be fun.

Going Further

The Resource Governor MAXDOP Setting Can Lead to Poor Plan Choices In SQL Server

Posted on October 10, 2018May 16, 2022 by Joe Obbish

Resource Governor can be used to enforce a hard cap on query MAXDOP, unlike the sp_configure setting. However, query plan compilation does not take such a MAXDOP limit into account. As a result, limiting MAXDOP through Resource Governor can lead to unexpected degradations in performance due to suboptimal query plan choices.

Create Your Tables

We start with the not often seen here three table demo. I’d rather not explain how I came up with this sample data, so I’m not going to. I did my testing on a server with max server memory set to 10000 MB. The following tables take about half a minute to create and populate and only take up about 1.5 GB of space:

DROP TABLE IF EXISTS dbo.SMALL;
CREATE TABLE dbo.SMALL (ID_U NUMERIC(18, 0));

INSERT INTO dbo.SMALL WITH (TABLOCK)
SELECT TOP (100) 5 * ROW_NUMBER()
	OVER (ORDER BY (SELECT NULL))
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

DROP TABLE IF EXISTS dbo.MEDIUM;
CREATE TABLE dbo.MEDIUM (ID_A NUMERIC(18, 0));

INSERT INTO dbo.MEDIUM WITH (TABLOCK)
SELECT TOP (600000) ROW_NUMBER()
	OVER (ORDER BY (SELECT NULL))
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
CROSS JOIN master..spt_values t3
OPTION (MAXDOP 1);

DROP TABLE IF EXISTS dbo.LARGE;
CREATE TABLE dbo.LARGE (
	ID_A NUMERIC(18, 0),
	ID_U NUMERIC(18, 0),
	FILLER VARCHAR(100)
);

INSERT INTO dbo.LARGE WITH (TABLOCK)
SELECT 2 * ( RN / 4), RN % 500, REPLICATE('Z', 100)
FROM
(
	SELECT TOP (8000000) ROW_NUMBER()
		OVER (ORDER BY (SELECT NULL)) RN
	FROM master..spt_values t1
	CROSS JOIN master..spt_values t2
	CROSS JOIN master..spt_values t3
) q
OPTION (MAXDOP 1)

CREATE INDEX IA ON LARGE (ID_A);
CREATE INDEX IU ON LARGE (ID_U);

The Long-Awaited Demo

I thought up the theory behind this demo on a car ride back from a SQL Saturday, but wasn’t able to immediately figure out a way to get the query plan that I wanted. I ended up finally seeing it in a totally different context and am now happy to share it with you. Consider the following query:

SELECT LARGE.ID_U
FROM dbo.SMALL
INNER JOIN dbo.LARGE ON SMALL.ID_U = LARGE.ID_U
INNER JOIN dbo.MEDIUM ON LARGE.ID_A = MEDIUM.ID_A
OPTION (MAXDOP 1);

The MAXDOP 1 hints results in a serial plan with two hash joins:

a36_MAXDOP1_plan

This is a perfectly reasonable plan given the size and structure of the tables. There are no bitmap filters because row mode bitmap filters are only supported for parallel plans. Batch mode is not considered for this query because I’m testing on SQL Server 2017 and there isn’t a columnstore index on any of the tables referenced in the query. On my machine a single query execution uses 2422 of CPU time and 2431 ms of elapsed time.

A parallel plan at MAXDOP 4 is able to run more quickly but with a much higher CPU time. A single execution of the MAXDOP 4 query uses 5875 ms of CPU time and 1617 ms of elapsed time. There are multiple bitmap filters present. I zoomed in on the most interesting part of the plan because I haven’t figured out how images work with WordPress yet:

a36_MAXDOP4_plan

Instead of doing a scan of the LARGE table, SQL Server instead chooses an index intersection plan. The cost of the additional hash join is reduced by multiple bitmap filters. There are only 2648396 and 891852 rows processed on the build and probe side instead of 8 million for each side, which is a significant gain.

Worse Than A Teen Running for Governor

Some end users really can’t be trusted with the power to run parallel plans. I thought about making a joke about an “erik” end user but I would never subject my readers to the same joke twice. After enforcing a MAXDOP of 1 at the Resource Governor level, you will probably not be shocked to learn that the query with the explicit MAXDOP 1 hint gets the same query plan as before and runs with the same amount of CPU and elapsed time.

If you skipped or forget the opening paragraph, you may be surprised to learn that the query with a MAXDOP 4 hint also gets the same query plan as before. The actual execution plan even has the parallel racing arrows. However, the query cannot execute in parallel. The parallelism and bitmap operators are skipped by the query processor and all of the rows are processed on one thread:

a36_MAXDOP4_with_RG

I uploaded the query plan here if you want to look at it. This type of scenario can happen even without Resource Governor. For example, a compiled parallel query may be downgraded all the way to MAXDOP 1 if it can’t get enough parallel threads.

The query performs significantly worse than before, which hopefully is not a surprise. A single execution took 12860 ms of CPU time and 13078 ms of elapsed time. Nearly all of the query’s time is spent on the hash join for the index intersection, with a tempdb spill and the processing of additional rows both playing a role. The tempdb spill occurs because SQL Server expected the build side of the hash join to be reduced to 1213170 rows. The bitmap filtering does not occur so 8 million rows were sent to the build side instead.

In this case, adding a MAXDOP 1 hint to the query will improve performance by about 5X. Larger differences in run times can be easily seen on servers with more memory than my desktop.

Final Thoughts

If you’re using using Resource Governor to limit MAXDOP to 1, consider adding explicit MAXDOP 1 hints at the query level if you truly need the best possible performance. The MAXDOP 1 hint may at first appear to be redundant, but it gives the query optimizer additional information which can result in totally different, and sometimes significantly more efficient, query plans. I expect that this problem could be avoided if query plan caching worked on a Resource Governor workload group level. Perhaps that is one of those ideas that sounds simple on paper but would be difficult for Microsoft to implement. Batch mode for row store can somewhat mitigate this problem because batch mode bitmap filters operate even under MAXDOP 1, but you can still get classic row mode bitmaps even on SQL Server 2019.

Thanks for reading!