SQL Server 2019: If You Want Adaptive Joins, You Need Wider Indexes

What You Make Of It


We’ve got this query. Handsome devil of a query.

You can pretend it’s in a stored procedure, and that the date filter is a parameter if you want.

SELECT u.DisplayName, p.Score
FROM dbo.Users AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
WHERE u.CreationDate >= '20131201';

A long time ago, when we migrated Stack Overflow from Access to SQL Server 2000, we created indexes.

This one has worked alright.

CREATE INDEX ix_whatever 
    ON dbo.Posts(OwnerUserId);

But Now We’re On 2019


And we’ve, like, read a lot about Adaptive Joins, and we think this’ll be cool to see in action.

Unfortunately, our query doesn’t seem to qualify.

[Query plan: Shame shame shame]

Now, there’s an Extended Event that… Used to work.

These days it just stares blankly at me. But since I’ve worked with this before, I know the problem.

It’s that Key Lookup — I’ll explain more in a minute.

Index Upgrade


First, let’s get rid of the Lookup so we can see the Adaptive Join happen.

CREATE INDEX ix_adaptathy 
    ON dbo.Posts(OwnerUserId, Score);

[Query plan: New new new]

As We Proceed


Let’s think about what Adaptive Joins need:

  • An index on the column(s) you’re joining

This gives us a realistic choice between using a Nested Loops join to do efficient Seeks, or an easy scan for a Hash Join.

  • That index has to cover the query

Without a covering index, there’s too much for the optimizer to think about.

It’s not just making a choice between a Nested Loops or Hash Join, it’s also factoring in the cost of a Lookup.

This used to trigger the XE on eajsrUnMatchedOuter, meaning the outer table didn’t have an index that matched the query.
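
If you want to go looking for those skips yourself, this is roughly the session I poke at. Consider it a sketch, since (as mentioned) the event has been giving me the silent treatment lately.

CREATE EVENT SESSION skipped_joins
    ON SERVER
    /* assumes adaptive_join_skipped still fires on your build */
    ADD EVENT sqlserver.adaptive_join_skipped
    ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION skipped_joins
    ON SERVER
    STATE = START;
GO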

Why Revisit This?


When SQL Server 2019 comes out, people are gonna have really high hopes for their workloads automagickally getting faster.

While there are lots of things that it’ll likely help, it’s going to take a lot of work on your part to make sure your queries and indexes allow for the automagick to kick in.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Running Query Confusion In SQL Server

Somewhat Solved


This bit of confusion is largely solved in SQL Server 2019 under compatibility level 150, when FROID (scalar udf inlining) kicks in.

But, you know, we’re a ways off from 2019 dropping, being adopted, and compat level 150 being the operating norm.

So here goes!

Functional Querying


I’ve got a scalar valued function. What it does is unimportant, but I’m calling it in a query like this:

SELECT u.DisplayName, 
       dbo.TotalScore(u.Id) AS TotalScore --<function
FROM dbo.Users AS u
WHERE u.Reputation >= 200000
ORDER BY u.Id;

When I run this in SSMS, it’s obvious to us what’s going on.

But if I’m watching what’s happening on a server using sp_WhoIsActive, what’s going on might not be obvious.

I’m doing all this with just my query running to show how confusing things can get.

First Confusion: Query Text

[sp_WhoIsActive output: Foggy]

This doesn’t look at all like the text of our query. We can guess that it’s the function running in the select list since we know what we’re doing, but, you know…

We can bring some clarity by running sp_WhoIsActive like this:

sp_WhoIsActive @get_plans = 1, 
               @get_outer_command = 1;

The outer command parameter will show us the query calling the function, which’ll look more familiar.

[sp_WhoIsActive output: Headline News]

Second Confusion: Phantom Parallelism

We’re hitting more of those harmless, silly little CXCONSUMER waits.

But how? Our query plan is serial!

[Query plan: Glam Chowder]

This part is a little less obvious, but if we get an estimated plan for our query, or track down the query plan for the function, it becomes more obvious.

[Query plan: Questionable Taco]

The query plan for the function is parallel — a cute ~nuance~ about scalar udfs is that they only prevent the query calling them from going parallel.

The function itself can go parallel. So that’s… nice.

I guess.
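
If you’d rather not fish around in SSMS for the function’s plan, you can also pull it from the plan cache. A quick sketch, assuming you’re on 2016 or better where sys.dm_exec_function_stats exists:

SELECT OBJECT_NAME(fs.object_id) AS function_name,
       fs.execution_count,
       qp.query_plan
FROM sys.dm_exec_function_stats AS fs
CROSS APPLY sys.dm_exec_query_plan(fs.plan_handle) AS qp
WHERE fs.database_id = DB_ID();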

They Walked Inlined


In compat level 150, things are more clear.

[sp_WhoIsActive output: CLRLY]

The inner and outer text are the same. There’s more of that CXCONSUMER, though. Hoowee.

[Query plan: Might as well jump.]

But at least now we have a query plan that matches the parallel waits, right?

In the next post, we’re gonna talk more about those wait stats, though.

Thanks for reading!

Does SQL Server 2019 Help With Multiple Scalar Function Calls?

Waybad Machine


In yesterday’s post, we looked at a clever way to reduce calls to a scalar UDF using APPLY.

Today, we’re going to see if 2019 changes anything, and if our old trick still tricks.

Because, you know, what else do you do when you need to write 5 blog posts a week?

LOVE YOU!

Don’t Be A Donkey


I’m going to abridge this a little bit, since all the code is referenced at the link up there.

I’m also going to show you some stuff using Plan Explorer.

Why?

Because SSMS kept opening the plan XML as XML, and that makes for crap screenshots.

Here’s the results for the plan with two function references. It runs for ~2.2 seconds.

[Query plan: Honesty, at last.]

If you remember yesterday’s post (and why wouldn’t you, hm?) the query plans didn’t show us touching other tables at all.

Just seeking into the Users table and then magically computing scalars and filtering.

One of the nice things about scalar UDF inlining: honesty.

But, you know, the two where clause references end up expanding. We’re hitting pretty big tables, here, too.
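
For reference, the double-dipped query from yesterday’s post looks like this:

SELECT u.DisplayName,
       u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= 100000
AND dbo.TotalScore(u.Id) >= 10000
AND dbo.TotalScore(u.Id) < 20000
ORDER BY u.Id;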

Apply-ish-ness


Using APPLY has a similar effect here. The function is only referenced and filtered once, and the duration is cut roughly in half.

Now, I know you’re probably thinking, because YOU REMEMBER YESTERDAY’S POST!

[Query plan: Ming the Merciless]

How come these queries are so much slower with the functions inlined?

Well, they’re not. With query plans turned off, the first one runs in ~900ms, and the second one runs in ~500ms.

Yesterday’s queries ran for 1.6 seconds and 600ms respectively with plans turned off.

Apparently observation has overhead. If only there were a clever phrase for that.

Not All Functions


The idea behind FROID is that it removes some restrictions around scalar valued functions.

  1. They can be inlined into the query, not run per-row returned
  2. They don’t force serial execution, so you can get a parallel plan

If your functions already run pretty quickly over a small number of rows, and the calling query doesn’t qualify for parallelism, you may not see a remarkable speedup.

That’s fine, though, because inlining has other benefits:

  • Query plans are honest about the work they do
  • Measuring the query will show you work that used to be hidden behind the function call(s)

Even if every query doesn’t magically finish before you run it, you’ll see pretty good gains.
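
If you’re wondering whether your functions will even get the FROID treatment, 2019 will tattle on them. A small sketch, leaning on the is_inlineable column that shows up in sys.sql_modules:

SELECT OBJECT_NAME(m.object_id) AS udf_name,
       m.is_inlineable
FROM sys.sql_modules AS m
JOIN sys.objects AS o
    ON o.object_id = m.object_id
WHERE o.type = 'FN'; --scalar UDFs only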

Thanks for reading!


A Hidden Value Of Apply With SQL Server Scalar UDFs

Look, Functions Suck


That’s why smart people have been working on making them suck less.

The things I see people doing with them range from “you know there’s a system function that does that” to “oh wow, you wrote an entire program in here”.

I’m not kidding. I once saw a function that was a wrapper for ISNULL that returned the results of ISNULL. I have no idea why.

If I had to think of a DBA prank, writing scalar UDFs that are just wrappers for system functions would be pretty high up there.

Especially if they had the same names as the system functions.

Turning Down The Suck


A while back, Jonathan Kehayias blogged about a way to speed up UDFs that might see NULL input.

Which is great, if your functions see NULL inputs.

But what if… What if they don’t?

And what if they’re in your WHERE clause?

And what if they’re in your WHERE clause multiple times?

Oh my.

Tick, Tick, Tick


Here’s our function.

CREATE FUNCTION dbo.TotalScore(@UserId INT)
RETURNS BIGINT
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
    DECLARE @TotalScore BIGINT;

    SELECT @TotalScore =
    (
        SELECT ISNULL(SUM(p.Score), 0)
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = @UserId
    ) +
    (
        SELECT ISNULL(SUM(c.Score), 0)
        FROM dbo.Comments AS c
        WHERE c.UserId = @UserId
    );

    RETURN @TotalScore;
END;
GO

What it does is go out to the Posts and Comments tables and sums up the Score columns for a user.

We’ll use it in our query like this:

SELECT u.DisplayName, 
       u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= 100000
AND dbo.TotalScore(u.Id) >= 10000
AND dbo.TotalScore(u.Id) < 20000
ORDER BY u.Id;

We want to find people with a total score between 10 and 20 thousand.

Right on.

When we run the query, the plan looks like this, showing 2 seconds of runtime.

[Query plan: Two seconds for 260 rows is kinda wack tho]

Tock, Tock, Tock


I know, I know. Get to the point. Make it faster, bouncer-man.

Our goal is to get the function to run fewer times, so we’ll replace multiple calls to it with one call.

SELECT u.DisplayName,
       u.Reputation
FROM dbo.Users AS u
CROSS APPLY
    (
        VALUES (dbo.TotalScore(u.Id))
    ) AS t (Score)
WHERE u.Reputation >= 100000
AND   t.Score >= 10000
AND   t.Score < 20000
ORDER BY u.Id;

Using this technique, the query runs for about 780ms.

[Query plan: Check you out.]

Tale of the XE


What happens that makes this faster is more evident if we use the XE session from Jonathan’s post for similar reasons, and look at how many times the function was called.
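
I won’t reproduce Jonathan’s session verbatim (it’s in his post), but the rough shape is an event per module execution with causality tracking turned on. A sketch, not a copy:

CREATE EVENT SESSION track_udf
    ON SERVER
    ADD EVENT sqlserver.module_end
    ADD TARGET package0.event_file
        (SET filename = N'track_udf')
    WITH (TRACK_CAUSALITY = ON); /* causality gives us the activity sequence */
GO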

If we look at the activity sequence, it goes up to 1060 for the first query:

[Extended Events: Moved Out The Hood]

And only 615 for the second query:

[Extended Events: Thinner~]

Exeunt


Right now, if we want scalar UDFs to run faster, we can:

  • Tune the underlying query (if there is one)
  • Have them run fewer times
  • Wait for SQL Server 2019

In tomorrow’s post, I’ll look at the same scenario using CTP 3 of SQL Server 2019.

Thanks for reading!


The Coming Froidpocalypse In SQL Server 2019

Legal Notice


I’ve trademarked: Froidrage, Froidulent, and Froidpocalypse.

If you want to use them, you have to pay me $10,000.

Alright, I’m being told by my lawyer that writing them on cocktail napkins and showing them to confused bartenders doesn’t actually register a trademark.

Nevermind.

Here’s What’s Gonna Happen


And it’s not a problem that you need SQL Server 2019 to see. All you have to do is try to rewrite a function.

Here’s our Villain, a scalar UDF.

CREATE FUNCTION dbo.Villain (@UserId INT)
RETURNS INT
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN

    DECLARE @Score INT;

    SELECT TOP (1)
             @Score = p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @UserId
    AND   p.PostTypeId = 1
    ORDER BY p.Score DESC;

    RETURN @Score;

END;
GO

Here’s the query that’s gonna call it:

SELECT      TOP (10)
            u.DisplayName,
            u.Reputation,
            dbo.Villain(u.Id)
FROM        dbo.Users AS u
ORDER BY    u.Reputation DESC;
GO

I’m Going To Show You Two Things


The estimated plan, and the actual plan.

I need to show you the estimated plan so you can see what the function does, because that’s not included in the actual plan.

Yes, the estimated plan is more accurate than the actual plan.

Marinate on that.
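
If you want to marinate along at home, the estimated plan is a SET option away (or Ctrl+L in SSMS), and the function’s plan shows up in it as a separate statement. A sketch:

SET SHOWPLAN_XML ON;
GO

SELECT      TOP (10)
            u.DisplayName,
            u.Reputation,
            dbo.Villain(u.Id)
FROM        dbo.Users AS u
ORDER BY    u.Reputation DESC;
GO

SET SHOWPLAN_XML OFF;
GO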

[Query plan: On The Dancefloor]

The important thing is the second plan, which is the function’s execution plan. Notice that it generated a missing index request, and doesn’t spool anything at all.

It handles the query logic with a Top N Sort.

Here’s the actual plan:

[Query plan: A NONCLUSTERED INDEX SCAN!!!]

Let’s talk about a couple things:

  • A nonclustered index scan that costs 100% and runs for 0.000s
  • A compute scalar that costs 0% and runs for ~3s

The compute scalar thing is well documented by… Well, not by official Microsoft documentation.

But it has been blogged about by Cookies Cunningham and Paul White.

Thanks, you two.

Any Reasonable Person


Would say “I can rewrite that function and make things better”.

Because of course an inline function is always better than a scalar function.

Enter our Hero.

CREATE FUNCTION dbo.Hero (@UserId INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN

    SELECT TOP (1)
             p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @UserId
    AND   p.PostTypeId = 1
    ORDER BY p.Score DESC;

GO

Here’s the query that’s gonna call it:

SELECT      TOP (10)
            u.DisplayName,
            u.Reputation,
            h.*
FROM        dbo.Users AS u
CROSS APPLY dbo.Hero(u.Id) AS h
ORDER BY    u.Reputation DESC;

I Only Need To Show You One Thing


Since the function is an inline type, the query processor is honest with us about the full query plan.

[Query plan: Spiced Ham]

Two things happened here:

  • The “function body” no longer goes parallel
  • The TOP (1) is run against an eager index spool rather than the clustered index

What’s The Point?


This is what FROID does for you without a rewrite. It’ll inline the scalar UDF.

The plan may be better, or it may be worse.

The scalar UDF plan ran for 3 seconds, and the inline version ran for almost 13 seconds.
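
If FROID hands you the 13 second version on 2019, you’re not stuck with it. There’s a documented USE HINT to turn inlining off for a single query, and an INLINE = OFF option for the function itself. A sketch of the hint version, with the usual “check what your build supports” caveat:

SELECT      TOP (10)
            u.DisplayName,
            u.Reputation,
            dbo.Villain(u.Id)
FROM        dbo.Users AS u
ORDER BY    u.Reputation DESC
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));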

Stay tuned for tomorrow’s post. I have a couple suggestions for how the SQL Server team can help end users stay on top of these problems in SQL Server 2019.

Thanks for reading!


SQL Server 2019: Compatibility Level’s Day Of Reckoning

Skin Tags


Stop me if you’ve heard this one:

“We recently upgraded to SQL Server 201(4, 6, 7), and performance is AWFUL…”

And the problem was pretty easily solved by flipping the compatibility level back to 110, which fixed (most of) the issues?

(Or just went back to having the issues they knew that they had before, which is often far less scary.)

In those versions, flipping the compatibility level up switches you over to the new Cardinality Estimator (CE). That new Cardinality Estimator is real hit or miss.

The worst part is that there’s practically no gain to be realized for using higher compatibility levels — that changes with SQL Server 2019.

Feature Creature


There are two things that are pretty cool in SQL Server 2019: Scalar UDF Inlining (FROID), and Batch Mode for Row Store (BMFRS?).

FROID potentially solves a big problem that’s been plaguing SQL Server users for decades. Scalar UDFs are just straight up performance poison.

This fixes the problems with them (I mean, sure, not every UDF is eligible, and you can run into other problems, but still…).

BMFRS does a bunch of stuff: it makes Batch Mode processing available for Row Store indexes (duh), and it also makes Adaptive Joins and Memory Grant Feedback available for them.

Those two things were introduced in 2017, but only available if you used column store (which is what Batch Mode was originally created for).

These things have the potential to fix some very big workload problems for people.

But there’s a thing.

Monkey Paw


In order to use them, you gotta be in compatibility level 150. That also brings along the new CE.

You could be trading one set of problems for another, here. That makes flipping the switch a hard sell.

It all depends on where your biggest problems are, and the time and resources you have to fix regressions.
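
There is a half-measure worth knowing about, though: you can move to compat level 150 for the shiny new stuff and keep the legacy CE with a database scoped configuration. A sketch, not a blanket recommendation, because you still have to test it:

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 150;
GO

ALTER DATABASE SCOPED CONFIGURATION
    SET LEGACY_CARDINALITY_ESTIMATION = ON;
GO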

For most people, it’s not realistic to test their entire workload. You can test your most important queries, as long as they’re reliable.

This is a good place to plug Workload Tools by Gianluca Sartori, which can make this easier.

You can also flip the switch during a low usage time and see if monitoring freaks out.

If it doesn’t, great. If it does, you have a lot of work to do.

Of course, if you’re on SQL Server Standard Edition, this might not matter. As of this writing, I have no idea if these two features will be available there.

Whack-A-Query


The addition of these two features is pretty neat. I’m excited for them.

I’m also very interested to see how customers react, both from the point of view of adopting SQL Server 2019, and adopting compatibility level 150.

I bet a lot of people are gonna want UDF inlining without having to buy the cow.

Thanks for reading!


SQL Server Scalar UDF Inlining And Security Functions

You’ve Got No Security


Despite being a bouncer for many years, I have no interest at all in security.

Users, logins, roles, grant, deny. Not for me. I’ve seen those posters, and they’re terrifying.

Gimme 3000 lines of dynamic SQL any day.

This is a slightly different take on yesterday’s post, which is also a common problem I see in queries today.

Someone wrote a function to figure out if a user is trusted, or has the right permissions, and stuck it in a predicate — it could be a join or where clause.

High Finance


Stack Overflow isn’t exactly a big four accounting firm, but for some reason big four accounting firms don’t make their databases public under Creative Commons licensing.

So uh. Here we are.

And here’s our query.

DECLARE @UserId INT = 22656, --2788872, 22656
        @SQL NVARCHAR(MAX) = N'';

SET @SQL = @SQL + N'
SELECT    p.Id,
          p.AcceptedAnswerId,
          p.AnswerCount,
          p.CommentCount,
          p.CreationDate,
          p.FavoriteCount,
          p.LastActivityDate,
          p.OwnerUserId,
          p.Score,
          p.ViewCount,
          v.BountyAmount,
          c.Score
FROM      dbo.Posts AS p
LEFT JOIN dbo.Votes AS v
    ON p.Id = v.PostId
    AND dbo.isTrusted(@iUserId) = 1
LEFT JOIN dbo.Comments AS c
    ON p.Id = c.PostId
WHERE     p.PostTypeId = 5;
';

EXEC sys.sp_executesql @SQL,
                       N'@iUserId INT',
                       @iUserId = @UserId;

There’s a function in that join to the Votes table. This is what it looks like.

CREATE OR ALTER FUNCTION dbo.isTrusted ( @UserId INT )
RETURNS BIT
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
    BEGIN
        DECLARE @Bitty BIT;

        SELECT @Bitty = CASE WHEN u.Reputation >= 10000
                             THEN 1
                             ELSE 0
                        END
        FROM   dbo.Users AS u
        WHERE  u.Id = @UserId;

        RETURN @Bitty;

    END;
GO

Bankrupt


There’s not a lot of importance in the indexes, query plans, or reads.

What’s great about this is that you don’t need to do a lot of analysis — we can look purely at runtimes.

It also doesn’t matter if we run the query for a trusted (22656) or untrusted (2788872) user.
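
The timings below come from STATISTICS TIME, so if you’re playing along at home, flip that on first:

SET STATISTICS TIME ON;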

In compat level 140, the runtimes look like this:

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 140;

 SQL Server Execution Times:
   CPU time = 7219 ms,  elapsed time = 9925 ms.

 SQL Server Execution Times:
   CPU time = 7234 ms,  elapsed time = 9903 ms.

In compat level 150, the runtimes look like this:

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 150;

 SQL Server Execution Times:
   CPU time = 2734 ms,  elapsed time = 781 ms.

 SQL Server Execution Times:
   CPU time = 188 ms,  elapsed time = 142 ms.

In both runs, the trusted user is first, and the untrusted user is second.

Sure, the trusted user query ran half a second longer, but that’s because it actually had to produce data in the join.

One important thing to note is that the query was able to take advantage of parallelism when it should have (CPU time is higher than elapsed time).

In older versions (or even lower compat levels), scalar valued functions would inhibit parallelism. Now they don’t when they’re inlined.

Thanks for reading!


How Bad Cardinality Estimates Lead To Bad Query Plan Choices: Does SQL Server 2019 Batch Mode On Rowstore Help?

Bad Fit


In the last post, I tried to play some tricks on the optimizer to have Batch Mode make a query faster.

It didn’t quite work out the way I wanted.

That isn’t to knock Batch Mode. It’s not supposed to fix everything, and I played a silly trick to get some operators to use it.
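
If you haven’t seen that trick before, the usual shape of it is an empty filtered nonclustered columnstore index that exists only to make the table eligible for Batch Mode. This is a sketch with an assumed table and filter, not the exact index from that post:

CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_bait
    ON dbo.Votes (Id)
    WHERE Id = -1 AND Id = -2; /* contradiction keeps it empty */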

But it did make me curious if 2019’s Batch Mode for Row Store would improve anything.

After all, no dirty tricks are involved, which means more operators can potentially use Batch Mode, and maybe we’ll avoid that whole skewed parallelism problem.

I mean, what are the odds that I’ll get the exact same plans and everything?

Spoiler


[Query plan: 50%]

The Merge Join plan is still wack as hell.

The forced hash join plan got better, though.

Still Skewy?


The skewed parallelism isn’t as contagious going across without the separate Filter/Bitmap operator.

[Query plan: Uggo]

With both the Hash Join and the Scan of Votes occurring in Batch Mode, the Bitmap is able to be pushed to the storage engine.

Note that there’s no explicit Bitmap filter anywhere in the plan, but one still shows up in the predicate of the Scan of the Votes table.

[Query plan: Something new]

With 3 out of 4 threads doing… vaguely even work, we end up in a better place, especially compared to only one thread doing any work.

At least now I can show you the Batch Mode Sort lie about being parallel.

[Query plan: Crud]

Three threads with rows at the Hash Join go into the Sort on a single thread.

You Know…


As much as I love these new bells and whistles, I’m gonna go back to good ol’ fashion query tuning next.

The problem here is still that a bad estimate causes a Merge Join to be chosen in the first place.

When we force the Hash Join, query runtime is reduced at the expense of a quite large memory grant.

This is likely preferable for most people (especially with Batch Mode Memory Grant Feedback), but it doesn’t address the root cause.

Thanks for reading!


SQL Server 2019 ELEVATE ONLINE Index Operations

I Still Don’t Like Index Rebuilds



(Except columnstore)

Thanks for reading!


SQL Server 2019’s In Memory tempdb Improves Performance

And I’m Pins And I’m Needles



Thanks for watching!
