Indexes are good for so much more than what they’re given credit for by the general public.
One example where indexes can be useful is with the oft-maligned table variable.
Now, they won’t help you get a better estimate from a table variable. In versions prior to the upcoming 2019 release, table variables will only net you a single-row estimate.
Yes, you can recompile to get around that. Yes, there’s a trace flag that occasionally helps with that.
Those defenses are inadequate, and you know it.
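For the record, here’s roughly what those bandaids look like. This is a sketch against the @waypops table variable we’ll declare later in the post, not an endorsement, and trace flag 2453 is instance-wide, so treat it with respect.
/* Statement-level recompile: the optimizer sees the real row count
   at recompile time instead of guessing one row. */
SELECT w.UserId, w.Date
FROM @waypops AS w
OPTION (RECOMPILE);

/* Trace flag 2453: makes table variable row count changes trigger
   recompiles, similar to temp tables. Global, so be careful. */
DBCC TRACEON (2453, -1);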
Help How?
Let’s say we have this query against a table variable.
SELECT u.DisplayName, b.Date
FROM dbo.Users AS u
CROSS APPLY
(
SELECT TOP 1 *
FROM @waypops AS w
WHERE u.Id = w.UserId
ORDER BY w.Date DESC
) AS b
WHERE u.Reputation >= 100000;
With an unindexed table variable, the plan looks like this:
You can see by the helpful new operator time stats in SSMS 18 that this query runs for 13.443 seconds.
Of that, 13.333 seconds is spent scanning the table variable. Bad guess? You bet.
If we change the table variable definition to include an index, the plan changes, and runs much faster.
The query no longer goes parallel, but it runs for 226ms.
A significant change aside from parallelism is that the Top operator is no longer a Top N Sort.
The clustered index has put the table variable data in useful order for our query.
Insertions
The table variable insert looks like this:
DECLARE @waypops TABLE
(
UserId INT NOT NULL,
Date DATETIME NOT NULL
--, INDEX c CLUSTERED(UserId, Date DESC)
);
INSERT @waypops
(UserId, Date)
SELECT b.UserId, b.Date
FROM dbo.Badges AS b
WHERE b.Name IN (N'Popular Question')
UNION ALL
SELECT b.UserId, b.Date
FROM dbo.Badges AS b
WHERE b.Name IN (N'Notable Question');
Right now, I’ve got the index definition commented out. The insert runs for 662ms.
The insert with the index in place runs for 967ms:
Given the 13 second improvement to the final query, I’ll take the ~300ms hit on this one.
Wierda
If you’re wondering why I’ve got the insert query broken up with a UNION ALL, it’s because the alternative really sucks:
DECLARE @waypops TABLE
(
UserId INT NOT NULL,
Date DATETIME NOT NULL
, INDEX c CLUSTERED(UserId, Date DESC)
);
INSERT @waypops
(UserId, Date)
SELECT b.UserId, b.Date
FROM dbo.Badges AS b
WHERE b.Name IN (N'Popular Question', N'Notable Question');
This insert takes 1.4 seconds, and introduces a spilling sort operator.
So uh, don’t do that IRL.
Thanks for reading!
A lot has been written about “kitchen sink” queries. A couple of my favorites are by Aaron Bertrand and Gail Shaw.
Both articles have similar start and end points. But I’m going to start at an even worse point.
Catch All Parameters
This is the worst possible idea.
CREATE OR ALTER PROCEDURE dbo.AwesomeSearchProcedure (@SearchString NVARCHAR(MAX))
AS
SET NOCOUNT, XACT_ABORT ON;
SET STATISTICS TIME, IO OFF;
BEGIN
DECLARE @AltString NVARCHAR(MAX) = N'%'
SELECT TOP (1000) p.OwnerUserId, p.Title, p.CreationDate, p.LastActivityDate, p.Body
FROM dbo.Posts AS p
WHERE p.OwnerUserId LIKE ISNULL(N'%' + @SearchString + N'%', @AltString)
OR p.Title LIKE ISNULL(N'%' + @SearchString + N'%', @AltString)
OR p.CreationDate LIKE ISNULL(N'%' + @SearchString + N'%', @AltString)
OR p.LastActivityDate LIKE ISNULL(N'%' + @SearchString + N'%', @AltString)
OR p.Body LIKE ISNULL(N'%' + @SearchString + N'%', @AltString);
END;
GO
It doesn’t get any better if you do this, either.
SELECT TOP (1000) p.OwnerUserId, p.Title, p.CreationDate, p.LastActivityDate, p.Body
FROM dbo.Posts AS p
WHERE (p.OwnerUserId LIKE N'%' + @SearchString + N'%' OR @SearchString IS NULL)
OR (p.Title LIKE N'%' + @SearchString + N'%' OR @SearchString IS NULL)
OR (p.CreationDate LIKE N'%' + @SearchString + N'%' OR @SearchString IS NULL)
OR (p.LastActivityDate LIKE N'%' + @SearchString + N'%' OR @SearchString IS NULL)
OR (p.Body LIKE N'%' + @SearchString + N'%' OR @SearchString IS NULL);
Sample Run
In both cases, just searching for a single value results in a query that runs for >2 minutes.
EXEC dbo.AwesomeSearchProcedure @SearchString = N'35004';
GO
The problem is that we’re just searching for an OwnerUserId, but SQL Server doesn’t know that.
The query plan looks like this:
See that Filter? That’s where we do all of our search work. We scan the whole Posts table, and push every row across the pipe to the Filter.
Irritable
This pattern might work on a small amount of data, but like most things that are fine in small doses, it falls apart quickly once your database reaches a mature size.
My example is pretty simple, too, just hitting one table. In real life, you monsters are doing this across joins, throwing in row numbers, distincts, and ordering by the first 10 columns.
It only gets worse as it gets more complicated.
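The usual way out, and the direction both of those articles point in, is dynamic SQL that only builds the predicates you were actually given. Here’s a minimal sketch that handles just the Title search to keep things short; the procedure name is made up.
CREATE OR ALTER PROCEDURE dbo.LessAwfulSearchProcedure (@SearchString NVARCHAR(100))
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

DECLARE @sql NVARCHAR(MAX) = N'
SELECT TOP (1000) p.OwnerUserId, p.Title, p.CreationDate, p.LastActivityDate, p.Body
FROM dbo.Posts AS p
WHERE 1 = 1';

/* Only add predicates for values we were actually given. */
IF @SearchString IS NOT NULL
    SET @sql += N'
AND p.Title LIKE N''%'' + @SearchString + N''%''';

EXEC sys.sp_executesql @sql,
                       N'@SearchString NVARCHAR(100)',
                       @SearchString;
END;
GO
Each combination of supplied parameters gets its own plan, instead of one plan trying to be all things to all searches.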
Thanks for reading!
I recently came across a bad locking problem. It was one of those things that happened “suddenly” and “for no reason”.
When I looked at the stored procedure, the code was something like this:
CREATE PROCEDURE dbo.bad_news(@PostTypeId INT)
AS
BEGIN
DECLARE @PTID INT
SELECT @PTID = CASE WHEN @PostTypeId < 1
OR @PostTypeId > 8
THEN 4
ELSE @PostTypeId
END
UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = @PTID;
END
I mean, it would look like that if we were using Stack Overflow.
We weren’t, but I’ll leave the rest to your imagination.
Outside The Proc
We have this index:
CREATE INDEX ix_spanky ON dbo.Posts(PostTypeId);
Let’s run this code, and look at what happens.
BEGIN TRAN
UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = 3;
ROLLBACK
We not only have exclusive locks on pages, but on the entire table now.
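If you want to see that for yourself rather than trusting a screenshot, this is a quick way to eyeball the locks. Run it from a second session while the transaction is still open:
/* Summarize who is locking what in the current database. */
SELECT dtl.resource_type,
       dtl.request_mode,
       dtl.request_status,
       COUNT_BIG(*) AS lock_count
FROM sys.dm_tran_locks AS dtl
WHERE dtl.resource_database_id = DB_ID()
GROUP BY dtl.resource_type, dtl.request_mode, dtl.request_status
ORDER BY lock_count DESC;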
Overestimaters Anonymous
The execution plans for these two queries are much different.
Likewise, the estimates are much different.
Oh So Loco
This is a fairly well-documented outcome of using a “declared variable”, or the OPTIMIZE FOR UNKNOWN hint.
Rather than using the histogram, the optimizer guesses from the density vector, and that guess is usually bad.
In this case, the stored procedure had been modified to account for bad values passed in from the application.
The outcome was severe blocking because modification queries were taking far more intrusive locks than necessary.
So, you know, don’t do that.
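If you’re stuck validating input inside the proc, here’s one sketch of a less harmful pattern: keep the predicate on the parameter, and recompile so the histogram drives the estimate. The procedure name is made up.
CREATE OR ALTER PROCEDURE dbo.less_bad_news (@PostTypeId INT)
AS
BEGIN
/* Sanitize the value, but keep filtering on the parameter itself. */
IF @PostTypeId < 1 OR @PostTypeId > 8
    SET @PostTypeId = 4;

UPDATE p
SET p.Score += 1
FROM dbo.Posts AS p
WHERE p.PostTypeId = @PostTypeId
OPTION (RECOMPILE); /* with a recompile, the runtime value of
                       @PostTypeId gets embedded like a literal */
END;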
Thanks for reading!
There’s a wonderful explanation from Paul White about spill levels here. At the risk of offending the copyright godses, I’m going to quote it here.
Consider the task of sorting 4000MB of data, when we only have 500MB of memory available. Obviously, we cannot sort the whole set in memory at once, but we can break the task down:
We first read 500MB of data, sort that set in memory, then write the result to disk. Performing this a total of 8 times consumes the entire 4000MB input, resulting in 8 sets of sorted data 500MB in size. The second step is to perform an 8-way merge of the sorted data sets. Note that a merge is required, not a simple concatenation of the sets since the data is only guaranteed to be sorted as required within a particular 500MB set at the intermediate stage.
Alright, interesting! That’d make me think that the number of pages involved in the spill would increase the spill level. In real life, I saw spill level 11 once. I wish I had that plan saved.
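Before the pictures, here’s the back-of-envelope version of Paul’s example. This is textbook external merge sort arithmetic, not whatever SQL Server actually uses to compute spill level, and the fanout number is made up purely for illustration.
/* Back-of-envelope external merge sort math. */
DECLARE @data_mb BIGINT = 4000,   -- input size from Paul's example
        @memory_mb BIGINT = 500,  -- memory grant from Paul's example
        @fanout BIGINT = 128;     -- assumed max merge width

DECLARE @runs BIGINT = CEILING(1.0 * @data_mb / @memory_mb);

/* One merge pass can combine up to @fanout sorted runs. Because the
   fanout is large, the run count has to grow enormously before an
   extra pass is needed. */
SELECT @runs AS initial_sorted_runs,
       CEILING(LOG(@runs) / LOG(@fanout)) AS merge_passes_needed;
With a wide merge like that, you can pile on a staggering number of extra runs before another pass is required, which at least rhymes with what the row counts below show.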
Rowed To Nowhere
Here’s a pretty big spill. But it’s only at level two.
That’s 8,115,283,973 rows.
Here’s a much larger spill that’s still only at level two.
That’s um… hang on.
231,424,059,820 rows. It’s crazy that we can add 223,308,775,847 rows to a spill and not need more passes than before.
But hey, okay. That’s cool. I get it.
I found level three! With uh 97,142 pages. That’s weird. I don’t get it.
But whatever, it probably can’t get much weirder than
SPILL LEVEL 8 FOR 43,913 PAGES 1-800-COME-ON-YO!
This Is Gonna Take More Digging
I’m gonna have my eye on weird spill levels from now on, and I’ll try to follow up as I figure stuff out.
Thanks for reading!
Let’s say we’ve got a simple update query. When we run it, it finishes instantly, and the query plan has no surprises.
BEGIN TRAN
UPDATE u
SET u.Reputation = 2147483647
FROM dbo.Users AS u
WHERE u.Id = 22656;
ROLLBACK
Then one day DevOps comes along and says that every time Reputation gets updated in the Users table, we have to check a bunch of conditions and then do a bunch of stuff based on the value.
One of those checks is to see if anyone has hit “INT MAX”, and then insert a row into Badges.
Because I’m lazy (Agile?), I’m going to stick a waitfor in the trigger to simulate all the other checks and actions.
CREATE OR ALTER TRIGGER dbo.one_time
ON dbo.Users
AFTER UPDATE
AS
BEGIN
IF EXISTS ( SELECT 1/0
FROM Inserted
WHERE Inserted.Reputation = 2147483647 )
INSERT dbo.Badges ( Name, UserId, Date )
SELECT N'INT MAX OMG', Id, GETDATE()
FROM Inserted
WAITFOR DELAY '00:00:10.000'
END;
GO
Less Simpler Times
Now when we run our update, the plan looks like this.
What’s important here is that we can see the work associated with the triggers.
What sucks is when we look at the plan cache.
Back To Easy
I’m gonna stick that update in a stored procedure to make life a little easier when we go looking for it.
CREATE PROCEDURE dbo.update_reputation
AS
BEGIN
BEGIN TRAN
UPDATE u
SET u.Reputation = 2147483647
FROM dbo.Users AS u
WHERE u.Id = 22656;
ROLLBACK
END;
After running the proc, here’s what we get back from the plan cache.
SELECT OBJECT_NAME(deps.object_id) AS proc_name,
deps.last_elapsed_time / 1000. / 1000. AS last_elapsed_time_seconds,
deqp.query_plan,
dest.text
FROM sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(deps.plan_handle) AS dest
WHERE deps.object_id = OBJECT_ID('dbo.update_reputation');
We have a procedure reporting that it ran for 10 seconds (which it did, sort of…).
But no mention of the trigger. Hm.
Of course, we can get this information from trigger stats, but we’d have to know to go looking:
SELECT OBJECT_NAME(dets.object_id) AS trigger_name,
dets.last_elapsed_time / 1000. / 1000. AS last_elapsed_time_seconds,
deqp.query_plan,
dest.text
FROM sys.dm_exec_trigger_stats AS dets
CROSS APPLY sys.dm_exec_query_plan(dets.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(dets.plan_handle) AS dest
WHERE dets.object_id = OBJECT_ID('dbo.one_time');
Lying Liars
When seemingly simple modification queries take a long time, things may not be as simple as they appear.
Blocking and triggers might be at play. Unfortunately, there’s not a great way of linking any of that together right now.
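The closest you can get today is querying both DMVs and eyeballing the numbers side by side. A rough sketch, assuming both modules are still cached:
SELECT 'procedure' AS module_type,
       OBJECT_NAME(deps.object_id) AS module_name,
       deps.last_elapsed_time / 1000. / 1000. AS last_elapsed_time_seconds
FROM sys.dm_exec_procedure_stats AS deps
WHERE deps.database_id = DB_ID()
UNION ALL
SELECT 'trigger',
       OBJECT_NAME(dets.object_id),
       dets.last_elapsed_time / 1000. / 1000.
FROM sys.dm_exec_trigger_stats AS dets
WHERE dets.database_id = DB_ID()
ORDER BY last_elapsed_time_seconds DESC;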
Thanks for reading!
This isn’t quite about the same thing, just about some behavior that I thought was interesting, and how it changes between cardinality estimator versions.
Bad Robot
If you’ve been query tuning for a while, you probably know about SARGability, and that wrapping columns in functions is generally a bad idea.
But just like there are slightly different rules for CAST and CONVERT with dates, the repercussions of the function also vary.
The examples I’m going to look at are for YEAR() and MONTH().
If you want a TL;DR, here you go.
If you wanna keep going, follow me!
USING
The takeaway here isn’t that doing either of these is okay. You should fully avoid wrapping columns in functions in general.
One of the main problems with issuing queries with non-SARGable predicates is that the people who most often write them are the same people who rely on missing index requests to direct tuning efforts. Non-SARGable queries can prevent those requests from surfacing, or ask for an even more sub-optimal index than usual.
If you have a copy of the StackOverflow2013 database, you can replicate the results pretty easily on SQL Server 2017.
Your numbers may be slightly different depending on how the histogram is generated, but the overarching theme is the same.
Yarly
If you run these queries, and look at the estimated and actual rows in the Clustered Index scan tooltip, you’ll see they change for every query.
DECLARE @blob_eater DATETIME;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2008;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2009;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2010;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2011;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2012;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2013;
GO
Here’s a sample from the 2008 and 2009 queries.
ED: I took a break from writing this and “went to brunch”.
Any logical inconsistencies will work themselves out eventually.
Cash Your Checks And Come Up
Alright, let’s try that again by month.
If you hit yourself in the head with a hammer and forgot the TL;DR, here’s what happens:
DECLARE @blob_eater DATETIME;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 1;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 2;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 3;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 4;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 5;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 6;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 7;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 8;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 9;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 10;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 11;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 12;
If you run these, they’ll all have the same guess on the clustered index scan.
To keep things simple, let’s look at the first couple:
The difference here is that now every query gets the same row estimate: 205,476.
Lesson learned: The optimizer can make a decent statistical guess at the year portion of a date, but not the month portion.
In a way, you can think of this like a LIKE query.
The optimizer can make a decent guess at ‘YEAR%’, but not at ‘%MONTH%’.
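For completeness: if what you actually want is “everything from 2012”, the SARGable way to ask gives the optimizer a range it can check against the histogram, and against any index on CreationDate:
SELECT u.CreationDate
FROM dbo.Users AS u
WHERE u.CreationDate >= '20120101'
AND   u.CreationDate <  '20130101';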
Actual Facts To Snack On And Chew
The same thing happens for both new and old cardinality estimators.
DECLARE @blob_eater DATETIME;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2008
OPTION(USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2008
OPTION(USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
GO
DECLARE @blob_eater DATETIME;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 12
OPTION(USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 12
OPTION(USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
GO
Wouldn’t Get Far
But if we combine predicates, something really different happens between Linda Cardellini estimators.
DECLARE @blob_eater DATETIME;
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 12
AND YEAR(u.CreationDate) = 2012
OPTION(USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));
SELECT @blob_eater = u.CreationDate
FROM dbo.Users AS u
WHERE MONTH(u.CreationDate) = 12
AND YEAR(u.CreationDate) = 2012
OPTION(USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
GO
In this case, the old CE (on the right) makes a very bad guess of 1 row.
The new CE (on the left) makes a slightly better, but still not great guess.
Ended
Neither of these is a good way to query date or time data.
You can see in every tooltip that, behind the scenes, the predicates were rewritten to use the DATEPART function, which means reaching for DATEPART yourself doesn’t help, either.
The point of this post is that someone might use a function on the year portion of a date, see SQL Server make a decent guess, and assume it does just as well on any other portion, which isn’t the case.
None of these queries are SARGable, and at no point is a missing index request raised on the CreationDate column, even though if you add one it gets used and reduces reads.
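For reference, the index I tested with was nothing fancy; the name is made up:
CREATE INDEX ix_creationdate ON dbo.Users (CreationDate);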
Thanks for reading!
As a consultant, people sometimes send me query plans. They’re usually estimated, or cached plans.
That’s not bad! You can get a sense of some important things based on them, but there’s a ton of detail in actual plans that makes life easier.
One example is with parameter sniffing: estimated and cached plans look like they did something completely reasonable.
Getting an actual plan is tough, though, especially if it’s a long running query, or the query runs modifications.
Containers Are All The Rage
What if query plan XML had enough information in it for you to “execute” the query locally without returning any results?
What if you could press play, fast forward, and rewind on a query plan?
What if you could try things like using the new or old CE or other hints on the query?
What if parameters could be masked (but differentiated internally) to test parameter sniffing?
This might be possible with the right information collected, even if some of it is imperfect. In newer versions of SQL Server, plans even gather information about the statistics that were used.
With the direction Microsoft is finally going in collecting runtime information about queries, I wouldn’t be surprised if something like this became possible.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
sql_variant can have a maximum length of 8016 bytes. This includes both the base type information and the base type value. The maximum length of the actual base type value is 8,000 bytes.
Since the optimizer needs to plan for your laziness, indecisiveness, lack of respect for human life, and inexperience, you can end up getting some rather enormous memory grants, regardless of the type of data you store in variant columns.
Ol’ Dirty Demo
Here’s a table with a limited set of columns from the Users table.
CREATE TABLE dbo.UserVariant
(
Id SQL_VARIANT,
CreationDate SQL_VARIANT,
DisplayName SQL_VARIANT,
Orderer INT IDENTITY
);
INSERT dbo.UserVariant WITH(TABLOCKX)
( Id, CreationDate, DisplayName )
SELECT u.Id, u.CreationDate, u.DisplayName
FROM dbo.Users AS u;
In all, about 2.4 million rows end up in there. In the real table, the Id column is an INT, the CreationDate column is a DATETIME, and the DisplayName column is an NVARCHAR(40).
Sadly, no matter which column we select, the memory grant is the same:
SELECT TOP (101) uv.Id
FROM dbo.UserVariant AS uv
ORDER BY uv.Orderer;
SELECT TOP (101) uv.CreationDate
FROM dbo.UserVariant AS uv
ORDER BY uv.Orderer;
SELECT TOP (101) uv.DisplayName
FROM dbo.UserVariant AS uv
ORDER BY uv.Orderer;
SELECT TOP (101) uv.Id, uv.CreationDate, uv.DisplayName
FROM dbo.UserVariant AS uv
ORDER BY uv.Orderer;
It’s also the maximum memory grant my laptop will allow: about 9.6GB.
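If you want to watch the grants happen for yourself, this is a quick way to check from another session while the queries run:
SELECT deqmg.session_id,
       deqmg.requested_memory_kb,
       deqmg.granted_memory_kb,
       deqmg.used_memory_kb
FROM sys.dm_exec_query_memory_grants AS deqmg
ORDER BY deqmg.granted_memory_kb DESC;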
Get’em!
As if there aren’t enough reasons to avoid sql_variant, here’s another one.
Thanks for reading.
There’s a lot of excitement (alright, maybe I’m sort of in a bubble with these things) about SQL Server 2019 being able to inline most scalar UDFs.
But there’s a sort of weird catch with them. It’s documented, but still.
If you use GETDATE in the function, it can’t be inlined.
Say What?
Let’s look at three examples.
Numero Uno
CREATE OR ALTER FUNCTION dbo.YearDiff(@d DATETIME)
RETURNS INT
WITH SCHEMABINDING,
RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @YearDiff INT;
SET @YearDiff = DATEDIFF(HOUR, @d, GETDATE())
RETURN @YearDiff
END;
GO
This function can’t be inlined. It uses the GETDATE function directly in a calculation.
I’m not bothered by that! After all, it’s documented.
In writing.
Numero Dos
CREATE OR ALTER FUNCTION dbo.i_YearDiff(@d DATETIME)
RETURNS INT
WITH SCHEMABINDING,
RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @YearDiff INT;
DECLARE @i DATETIME = GETDATE()
SET @YearDiff = DATEDIFF(HOUR, @d, @i)
RETURN @YearDiff
END;
GO
I was thinking that maybe if we calculated the date once in a variable and then used that, we’d be able to inline the function.
But no.
No we can’t.
Numero Tres
CREATE OR ALTER FUNCTION dbo.NothingToSeeHere(@d DATETIME)
RETURNS INT
WITH SCHEMABINDING,
RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @YearDiff INT;
DECLARE @i DATETIME = GETDATE()
SET @YearDiff = 1;
RETURN @YearDiff
END;
GO
What if we don’t even touch GETDATE? Hm?
No.
Still no.
Kinda Weird, Right?
If you’re using SQL Server 2019 and want to find functions that can’t be inlined, start here:
SELECT OBJECT_NAME(m.object_id) AS object_name,
m.is_inlineable
FROM sys.sql_modules AS m
JOIN sys.objects AS o
ON o.object_id = m.object_id
WHERE o.type = 'FN'
AND m.is_inlineable = 0;
None of these functions can be inlined:
Unfortunately, the only real solution here is to rewrite the function entirely as an inline table-valued function.
CREATE OR ALTER FUNCTION dbo.InlineYearDiff(@d DATETIME)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
SELECT DATEDIFF(HOUR, @d, GETDATE()) AS TimeDiff
GO
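Calling it looks a little different from the scalar version, but it gets folded into the calling query’s plan:
SELECT u.DisplayName, yd.TimeDiff
FROM dbo.Users AS u
CROSS APPLY dbo.InlineYearDiff(u.CreationDate) AS yd;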
Thanks for reading!
I wanted to show you two situations with two different kinds of spools, and how they differ with the amount of work they do.
I’ll also show you how you can tell the difference between the two.
Two For The Price Of Two
I’ve got a couple queries. One generates a single Eager Index Spool, and the other generates two.
SELECT TOP (1)
u.DisplayName,
(SELECT COUNT_BIG(*)
FROM dbo.Badges AS b
WHERE b.UserId = u.Id
AND u.LastAccessDate >= b.Date) AS [Whatever],
(SELECT COUNT_BIG(*)
FROM dbo.Badges AS b
WHERE b.UserId = u.Id) AS [Total Badges]
FROM dbo.Users AS u
ORDER BY [Total Badges] DESC;
GO
SELECT TOP (1)
u.DisplayName,
(SELECT COUNT_BIG(*)
FROM dbo.Badges AS b
WHERE b.UserId = u.Id
AND u.LastAccessDate >= b.Date ) AS [Whatever],
(SELECT COUNT_BIG(*)
FROM dbo.Badges AS b
WHERE b.UserId = u.Id
AND u.LastAccessDate >= b.Date) AS [Whatever],
(SELECT COUNT_BIG(*)
FROM dbo.Badges AS b
WHERE b.UserId = u.Id) AS [Total Badges]
FROM dbo.Users AS u
ORDER BY [Total Badges] DESC;
GO
The important parts of the plans are here:
The important thing to note here is that both index spools have the same definition: the two COUNT(*) subqueries have identical logic.
Fire Sale
The other type of plan is a delete, but with a different number of indexes.
/*Add these first*/
CREATE INDEX ix_whatever1 ON dbo.Posts(OwnerUserId);
CREATE INDEX ix_whatever2 ON dbo.Posts(OwnerUserId);
/*Add these next*/
CREATE INDEX ix_whatever3 ON dbo.Posts(OwnerUserId);
CREATE INDEX ix_whatever4 ON dbo.Posts(OwnerUserId);
BEGIN TRAN
DELETE p
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ROLLBACK
Differences?
Using Extended Events to track batch completion, we can look at how many writes each of these queries will do.
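If you want to rig that up yourself, here’s a minimal session sketch. The session name and SPID are made up; point it at your own:
CREATE EVENT SESSION batch_writes ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (WHERE (sqlserver.session_id = 58)) /* your SPID here */
ADD TARGET package0.event_file (SET filename = N'batch_writes');
GO
ALTER EVENT SESSION batch_writes ON SERVER STATE = START;
GO
The writes, logical_reads, and duration fields on sql_batch_completed are what the comparisons below come from.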
The select query with two spools does twice as many reads (and generally twice as much work) as the query with one spool
The delete query with four spools does identical writes as the one with two spools, but more work overall (twice as many indexes need maintenance)
Looking at the details of each select query, we can surmise that the two eager index spools were populated and read from separately.
In other words, we created two indexes while this query ran.
For the delete queries, we can surmise that a single spool was populated, and read from either two or four times (depending on the number of indexes that need maintenance).
Another way to look at it is that in the select query plans, each spool has a child operator (the clustered index scan of Badges). In the delete plans, three of the spool operators have no child operator. Only one does, which signals that it was populated once and reused (for Halloween protection).
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.