SQL Server 2019 – Page 4

Unclogging SQL Server Performance

Posted on January 14, 2020May 14, 2022 by Erik Darling

Oft Evil

I had a client recently with, wait for it, a performance problem. Or rather, two problems.

The OLTP part was working fine, but there was a reporting element that was dog slow, and would cause all sorts of problems on the server.

When we got into things, I noticed something rather funny: All of their reporting queries had very high estimated costs, and all the plans were totally serial.

The problem came down to two functions that were used in the OLTP portion, which were reused in the reporting portion.

Uh Ohs

I know what you’re thinking: 2019 would have fixed it.

Buuuuuuuuuuut.

No.

As magnificent and glorious as FROID is, there are a couple limitations that are pretty big gotchas:

The UDF does not invoke any intrinsic function that is either time-dependent (such as GETDATE()) or has side effects³ (such as NEWSEQUENTIALID()).

And

¹ SELECT with variable accumulation/aggregation (for example, SELECT @val += col1 FROM table1) is not supported for inlining.

Which is what both were doing. One was doing some date math based on GETDATE, the other was assembling a string based on some logic, and not the kind of thing that STRING_AGG would have helped with, unfortunately.

They could both be rewritten with a little bit of work, and once we did that and fixed up the queries using them, things looked a lot different.

Freeee

For these plans, it wasn’t just that they were forced to run on one CPU that was harming performance. In some cases, these functions were in WHERE clauses. They were being used to filter data from tables with many millions of rows.

Yes, there was a WHERE clause that looked like AND dbo.function(somecol) LIKE ‘%thing%’, which was… Brave?

Getting rid of those bottlenecks relieved quite a lot of pain.

If you want to find stuff like this on your own, here’s what you can do:

Looking at the execution plan, hit get the properties of the select operator and look for a “NonParallelPlanReason”
Run sp_BlitzCache and look for “Forced Serialization” warnings
Inspect Filter operators in your query plans (I’m almost always suspicious of these things)
Review code for scalar valued function calls

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

In SQL Server 2019 Memory Grants Are Higher With Batch Mode For Row Store Queries

Posted on December 12, 2019May 16, 2022 by Erik Darling

Because I Got

Going Further

No Column Store Indexes With In-Memory Tempdb For SQL Server? No Problem!

Posted on November 19, 2019May 16, 2022 by Erik Darling

Old Habits

Let’s say you’re on SQL Server 2019. No, seriously. It’s been out for a couple weeks now.

You could be.

I say that you could be because you’re the kind of brave person who tries new things and experiments with their ~~body~~ server.

You may even do crazy things like this.

Stone Cold

CREATE TABLE #t ( id INT, INDEX c CLUSTERED COLUMNSTORE );

SELECT    COUNT_BIG(*) AS records
FROM      dbo.Users AS u
JOIN      dbo.Posts AS p
    ON u.Id = p.OwnerUserId
JOIN      dbo.Comments AS c
    ON u.Id = c.UserId
LEFT JOIN #t AS t ON 1 = 0;

Woah ho ho. What happened there? A #temp table with a clustered column store index on it left joined on 1 = 0?

Yes. People do this.

People do this because it’s getting some batch mode operations “for free”, which have the nasty habit of making big reporting queries run a lot faster.

Yonder Problem

When you enable 2019’s new in memory tempdb, which can really help with stuff tempdb needs help with, you may find yourself hitting errors.

Msg 11442, Level 16, State 1, Line 14
Columnstore index creation is not support in tempdb when memory-optimized metadata mode is enabled.
Msg 1750, Level 16, State 1, Line 14
Could not create constraint or index. See previous errors.

The good news is that this works with *real* tables, too.

CREATE TABLE dbo.t ( id INT, INDEX c CLUSTERED COLUMNSTORE );

SELECT    COUNT_BIG(*) AS records
FROM      dbo.Users AS u
JOIN      dbo.Posts AS p
    ON u.Id = p.OwnerUserId
JOIN      dbo.Comments AS c
    ON u.Id = c.UserId
LEFT JOIN dbo.t AS t ON 1 = 0;

And you can get plans with all sorts of Batchy goodness in them.

SQL Server Query Plan — Long way from home

Yeah, you’re gonna have to change some code, but don’t worry.

You’re the kind of person who enjoys that.

Right?

Thanks for reading!

Going Further

SQL Server 2019: Please Increase the RAM Limit For Standard Edition

Posted on October 24, 2019May 16, 2022 by Erik Darling

C’monnnn

It’s that time of the release cycle again! Microsoft has published the stuff that different editions of SQL Server 2019 will do or not do.

First up, I don’t disagree with any of the feature choices. They’re fine.

Second up, the RAM limit on Standard Edition is still a cruel, cruel mockery.

Here’s Why

I could point out that I have a desktop with 128 GB of RAM, or that current gen laptops allow up to 128 GB of RAM.

Sure, I could point out that many larger instance sizes in Azure and AWS make no sense for Standard Edition (even with constrained CPUs in Azure).

I could trot out charts and graphs and talk about how the limit is only for the buffer pool.

But the bottom line is this: If the automatic and intelligent query tuning features are that powerful, then giving Standard Edition either a 256GB buffer pool cap, or a buffer pool cap that scales with CPU licensing spend still won’t make it competitive, because hardware isn’t solving the kind of problems that those features do.

If they’re not, they shouldn’t be Enterprise only. Clearly hardware is the differentiator.

Increasing the buffer pool limit is a free choice

Microsoft gave away most of the important programmability features with 2016 SP1 to Standard Edition, and they’re giving up TDE in 2019.

Those are features that cost real human time to develop, support, and maintain. A larger buffer pool costs nothing.

Thanks for reading!

Going Further

Why Does My Serial Query Plan Have Parallel Wait Stats Like CXPACKET And CXCONSUMER?

Posted on September 23, 2019May 16, 2022 by Erik Darling

Harkening

In dramatic fashion, I’m revisiting something from this post about stats updates.

It’s a quick post, because uh… Well. Pick a reason.

Get In Gear

Follow along as I repeat all the steps in the linked post to:

Load > 2 billion rows into a table
Create a stats object on every column
Load enough new data to trigger a stats refresh
Query the table to trigger the stats refresh

Except this time, I’m adding a mAxDoP 1 hint to it:

SELECT COUNT(*)
FROM dbo.Vetos
WHERE UserId = 138
AND   PostId = 138
AND   BountyAmount = 138
AND   VoteTypeId = 138
AND   CreationDate = 138
OPTION(MAXDOP 1);

Here’s Where Things Get Interesting

Our MaXdOp 1 query registers nearly the same amount of time on stats updates and parallelism.

But our plan is indeed serial. Because we told it to be.

By setting maxDOP to 1.

Not Alone

So, if you’re out there in the world wondering why this crazy kinda thing goes down, here’s one explanation.

Are there others? Probably.

But you’ll have to find out by setting MAXdop to 1 on your own.

Thanks for reading!

Going Further

Can SQL Server 2019’s Scalar UDF Inlining Fix This Performance Problem?

Posted on August 27, 2019May 14, 2022 by Erik Darling

Snakey TOP

Going Further

SQL Server 2019: WAIT_ON_SYNC_STATISTICS_REFRESH

Posted on August 19, 2019May 16, 2022 by Erik Darling

Well Have You Ever

I mean ever really wondered just how long a query of yours waited on stats to automatically update before running?

Have you ever been absolutely, positively sure that those gosh dang darn stats updates were putting your query on the trolley to Slowsville?

Your wishes will be 100% granted in SQL Server 2019.

Now, this wait doesn’t show up when stats are created the first time.

So if you run a query with a where clause on a column that doesn’t already have stats, the initial gather won’t show.

This is only for updates. Not creates.

Got it?

In Action

The following script will take the Votes table, and make a copy of it called Vetos.

Then it’ll create a PK/CX (for some reason?), and run a query against a couple columns that are mostly NULL.

Voting data gets cleaned out of the SO data dump.

SELECT ISNULL(Id, 0) AS Id,
       PostId,
       UserId,
       BountyAmount,
       VoteTypeId,
       CreationDate
INTO dbo.Vetos
FROM dbo.Votes;

ALTER TABLE dbo.Vetos
 ADD CONSTRAINT PK_Vetos_Id 
    PRIMARY KEY CLUSTERED(Id);

SELECT TOP 10 * 
FROM dbo.Vetos 
WHERE UserId > 0 
AND BountyAmount > 0;

The last query is important because it generates the initial stats on both of those columns.

Now let’s put some work into it!

UPDATE v
SET v.BountyAmount = 50000
FROM dbo.Vetos AS v
WHERE v.BountyAmount IS NULL;

UPDATE v
SET v.UserId = v.VoteTypeId
FROM dbo.Vetos AS v
WHERE v.UserId IS NULL;

This table has 52,928,720 rows in it. Not the biggest, but a decent size to maybe have to wait on stats to update.

Ready Steady

In separate windows, I’ll run these:

SELECT COUNT(*) AS records
FROM dbo.Vetos AS v
WHERE v.BountyAmount > 500;

SELECT COUNT(*) AS records
FROM dbo.Vetos AS v
WHERE v.UserId < 16;

They’ll trigger the stats refresh.

Fun. Yes.

Checking in on each session’s wait stats using dm_exec_session_wait_stats, our wild wait appears.

So there you have it. 52 million row stats refreshes take about half a second.

That wasn’t very exciting. Let’s try something else.

Tricks, Kids

If we start from scratch, but instead of letting SQL Server create stats automatically by running a query, let’s create statistics with some funny options, and then update the columns.

CREATE STATISTICS s_b ON dbo.Vetos(BountyAmount) 
WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;

CREATE STATISTICS s_u ON dbo.Vetos(UserId) 
WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;

Now, when we run our select queries, things really slow down.

Quite a bit longer on those.

Actionable?

What would one do if they encountered 15-16 waits on this in real life?

Well, you have some options.

Update stats asynchronously
Create stats with no recompute and handle stats updates yourself
Update statistics more often than you currently do, trying to stay ahead of automatic updates

It’s hard to see this being a really big issue outside of very large tables, and perhaps only on under-powered servers.

Or if someone created statistics with some rather impolite settings.

Thanks for reading!

Going Further

The Fastest Ways To Get The Highest Value In SQL Server Part 3

Posted on July 25, 2019August 29, 2023 by Erik Darling

Silent and Grey

In yesterday’s post, we looked at plans with a good index. The row number queries were unfortunate, but the MAX and TOP 1 queries did really well.

Today, I wanna see what the future holds. I’m gonna test stuff out on SQL Server 2019 CTP 3.1 and see how things go.

I’m only going to hit the interesting points. If plans don’t change, I’m not gonna go into them.

Query #1

With no indexes, this query positively RIPS. Batch Mode For Row Store kicks in, and this finishes in 2 seconds.

SELECT u.Id,
       u.DisplayName,
	   u.Reputation,
	   ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT MAX(Score) AS Score
	FROM dbo.Posts AS p
	WHERE p.OwnerUserId = u.Id

) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

Query #2

Our TOP 1 query should be BOTTOM 1 here. It goes back to its index spooling ways, and runs for a minute.

SELECT u.Id,
       u.DisplayName,
	   u.Reputation,
	   ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT TOP (1) p.Score
	FROM dbo.Posts AS p
	WHERE p.OwnerUserId = u.Id
	ORDER BY p.Score DESC

) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Query #3

This is our first attempt at row number. It’s particularly disappointing when we see the next query plan.

SELECT u.Id,
       u.DisplayName,
	   u.Reputation,
	   ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT p.Score,
	       ROW_NUMBER() OVER (ORDER BY p.Score DESC) AS n
	FROM dbo.Posts AS p
	WHERE p.OwnerUserId = u.Id
) AS ca
WHERE u.Reputation >= 100000
AND ca.n = 1
ORDER BY u.Id;

On its own, it’s just regular disappointing.

Serial. Spool. 57 seconds.

With an index, we use the same plan as in SQL Server 2017, and it finishes in around 200 ms.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Myself.

Query #4

Why this plan is cool, and why it makes the previous plans very disappointing, is because we get a Batch Mode Window Aggregate.

SELECT u.Id,
       u.DisplayName,
	   u.Reputation,
	   ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT * 
	FROM 
	(
        SELECT p.OwnerUserId,
	           p.Score,
	           ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId 
			                      ORDER BY p.Score DESC) AS n
	    FROM dbo.Posts AS p
	) AS p
	WHERE p.OwnerUserId = u.Id
	AND p.n = 1
) AS ca
WHERE u.Reputation >= 100000
ORDER BY u.Id;

It finishes in 1.7 seconds. This is nice. Good job, 2019.

With the index we get a serial Batch Mode plan, which finishes in about 1.4 seconds.

If you’re confused about where 1.4 seconds come from, watch this video.

Why Aren’t You Out Yet?

SQL Server 2019 did some interesting things, here.

In some cases, it made fast queries faster.

In other cases, queries stayed… exactly the same.

When Batch Mode kicks in, you may find queries like this speeding up. But when it doesn’t, you may find yourself having to do some good ol’ fashion query and index tuning.

No big surprise there. Not worth the picture.

I feel like I’m repeating myself.

Myself.