A SQL Server Memory Grant Helper Query For The Sentry One Repository

SEMA4


While working with a client, I came up with a query against the SentryOne repository.

The point of it is to find queries that waited more than a second to get a memory grant. I wrote it because this information is logged but not exposed in the GUI yet.

It will show you basic information about the collected query, plus:

  • How long it ran in seconds
  • How long it waited for memory in seconds
  • How long it ran for after it got memory

SELECT   HostName,
         CPU,
         Reads,
         Writes,
         Duration,
         StartTime,
         EndTime,
         TextData,
         TempdbUserKB,
         GrantedQueryMemoryKB,
         DegreeOfParallelism,
         GrantTime,
         RequestedMemoryKB,
         GrantedMemoryKB,
         RequiredMemoryKB,
         IdealMemoryKB,
         Duration / 1000. AS DurationSeconds,
         DATEDIFF(SECOND, StartTime, GrantTime) AS SecondsBetweenQueryStartingAndMemoryGranted,
         (Duration - DATEDIFF(MILLISECOND, StartTime, GrantTime)) / 1000. AS HowFastTheQueryRanAfterItGotMemory
FROM     PerformanceAnalysisTraceData
WHERE DATEDIFF(SECOND, StartTime, GrantTime) > 1
ORDER BY SecondsBetweenQueryStartingAndMemoryGranted DESC

The results I saw were surprising! Queries that waited 10+ seconds for memory, but finished instantly when they finally got memory.

If you’re a SentryOne user, you may find this helpful. If you find queries waiting a long time for memory, you may also want to look into whether you’re hitting RESOURCE_SEMAPHORE waits.
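If you want a quick, rough check for that outside the SentryOne repository, the DMVs below are one place to look. This is just a sketch, not part of the query above:

```sql
/* Cumulative RESOURCE_SEMAPHORE waits since the last restart */
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms / 1000. AS wait_time_seconds
FROM sys.dm_os_wait_stats
WHERE wait_type = N'RESOURCE_SEMAPHORE';

/* Queries waiting on a memory grant right now */
SELECT session_id,
       requested_memory_kb,
       wait_time_ms / 1000. AS wait_time_seconds
FROM sys.dm_exec_query_memory_grants
WHERE grant_time IS NULL;
```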

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

What A Difference A SQL Server CTP Makes: Improvements To Scalar UDF Inlining

FROID Rage


To test FROID, which is the codename for Microsoft’s initiative to inline those awful scalar valued function things that people have been griping about for like 20 years, I like to take functions I’ve seen used in real life and adapt them a bit to work in the Stack Overflow database.

The funny thing is that no matter how many times I see the same function doing the same thing in a different way, someone tells me it’s unrealistic.

Doesn’t matter what it does: Touch data. Not touch data. Do simple formatting. Create a CSV list. Parse a CSV list. Pad data. Remove characters. Proper case names.

“I would never use a function for that.”

Okay, Spanky?

Too Two!


In CTP 2.2, I had a function that ended up with this query plan:

SQL Server Query Plan
Tell Moses to get the baseball bat.

The important detail about it is that it runs for 11 seconds in nested loops hell.

SQL Server Query Plan
Watch Out Now

For reader reference: The non-inlined version runs for about 6 seconds and gets an adaptive join plan.

The plan is forced serial with inlining turned off, naturally.

SQL Server Query Plan
You’re cool.

I sent the details over to my BESS FRENS at Microsoft, and it looks like it’s been fixed.

To Three!


In CTP 2.3, when we turn on function inlining and do the same thing:

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;

SQL Server Query Plan
Fad Gadget

No more nested loops hell. Now the function gets an adaptive join plan with parallelism, and finishes immediately.
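My actual repro function isn’t shown here, but if you want to play along, here’s a made-up scalar UDF in the Stack Overflow database spirit, plus a way to check which functions SQL Server 2019 thinks it can inline:

```sql
/* A hypothetical scalar UDF -- not my original repro */
CREATE OR ALTER FUNCTION dbo.TotalScore (@UserId INT)
RETURNS BIGINT
AS
BEGIN
    DECLARE @TotalScore BIGINT;

    SELECT @TotalScore = SUM(p.Score)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @UserId;

    RETURN ISNULL(@TotalScore, 0);
END;
GO

/* See which scalar UDFs are candidates for inlining */
SELECT OBJECT_NAME(sm.object_id) AS udf_name,
       sm.is_inlineable
FROM sys.sql_modules AS sm
WHERE sm.is_inlineable = 1;
```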

Thanks, frens.

And thanks for reading!


Last Week’s Almost Definitely Not Office Hours: March 15

ICYMI


Last week’s thrilling, stunning, flawless episode of whatever-you-wanna-call-it.

Thanks for watching!


SQL Server’s Accelerated Database Recovery: Forcing Cleanup

Forcing Cleanup


There’s a stored procedure called sp_persistent_version_cleanup we can use to clean out PVS data.

There’s some helpful information about it in sp_helptext:

create procedure sys.sp_persistent_version_cleanup  
(  
 @dbname sysname = NULL,  -- name of the database.  
 @scanallpages BIT = NULL  -- whether to scan all pages in the database.  
)  
as  
begin  
 set @dbname = ISNULL(@dbname, DB_NAME())  
 set @scanallpages = ISNULL(@scanallpages, 0)  
  
 declare @returncode int  
  
 EXEC @returncode = sys.sp_persistent_version_cleanup_internal @dbname, @scanallpages  
  
 return @returncode  
end

We can pass in a database name, and whether we want to scan all pages during cleanup.

Unfortunately, those get passed to sp_persistent_version_cleanup_internal, and running sp_helptext against that one only throws an error.
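Calling the outer procedure is straightforward; the defaults cover the current database, so something like this works (database name here is from my setup):

```sql
/* Clean up PVS data for a specific database */
EXEC sys.sp_persistent_version_cleanup
    @dbname = N'StackOverflow',
    @scanallpages = 0;
```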

Locking?


While the procedure runs, it generates a wait called PVS_CLEANUP_LOCK.

This doesn’t seem to actually lock the PVS so that other transactions can’t put data in there, though.

While it runs (and boy does it run for a while), I can successfully run other modifications that use PVS, and roll them back instantly.

If we look at the locks it’s taking out using sp_WhoIsActive…

sp_WhoIsActive @get_locks = 1;

<Database name="StackOverflow">
  <Locks>
    <Lock request_mode="S" request_status="GRANT" request_count="1" />
  </Locks>
</Database>

It doesn’t seem to do anything terribly invasive.

Better Than Yours

Watching the session with XE also doesn’t reveal locking, but it may be cleverly hidden away from us.

In all, it took about 21 minutes to clean up the 37MB of data I had in there.

I don’t think this is my fault, either. It’s not like I’m using a clown shoes VM here.

Thanks for reading!


SQL Server’s Accelerated Database Recovery: How Big Is My Version Store?

Size Matters


When Query Store rolled out, there were a lot of questions about controlling the size, and placement of it.

To date, there’s still not a way to change where Query Store data ends up, but you can manage the size pretty well.

Of course, there’s a lot more going on there — query text and plan XML heavily inflate the size of things — so there’s naturally more concern.
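For instance, the knobs for managing Query Store size look something like this — the numbers here are placeholders, not recommendations:

```sql
/* Cap Query Store size and let it clean up after itself */
ALTER DATABASE StackOverflow
SET QUERY_STORE
(
    MAX_STORAGE_SIZE_MB = 1024,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 7),
    SIZE_BASED_CLEANUP_MODE = AUTO
);
```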

What About Version Stores?


When talking about traditional row versioning in SQL Server, via Read Committed Snapshot Isolation, or Snapshot Isolation, there’s always a warning to keep an eye on the size of your version store.

Rightfully so, too. They put their data in tempdb, not locally the way the Persistent Version Store does. That means tempdb size can quickly get out of hand with multiple databases storing their Version Store data in there.

Storing the data locally, there’s far less chance of system-wide impact. I wouldn’t say it’s ZERO, depending on where you put your data, but it’s close to it.
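If you want to see how much tempdb space each database’s version store is chewing up under RCSI/SI, there’s a DMV for that side of the house too:

```sql
/* tempdb version store usage, per database */
SELECT DB_NAME(database_id) AS database_name,
       reserved_space_kb / 1024. AS reserved_space_mb
FROM sys.dm_tran_version_store_space_usage
ORDER BY reserved_space_kb DESC;
```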

Let’s Make Some Indexes


Books about SQL Server
Classics tho

I wanted to make this realistic — and after years of looking at your tables, I know you get rid of indexes at the same rate I get rid of SQL Server books.

So I went ahead and created 20 of them, and made sure they all had a column in common — in this case the Age column in the Users table.

CREATE INDEX [IX_Id_Age_AccountId] ON [dbo].[Users] ([Id], [Age], [AccountId]);
CREATE INDEX [IX_CreationDate_Age_AccountId] ON [dbo].[Users] ([CreationDate], [Age], [AccountId]);
CREATE INDEX [IX_DisplayName_Age_AccountId] ON [dbo].[Users] ([DisplayName], [Age], [AccountId]);
CREATE INDEX [IX_DownVotes_Age_AccountId] ON [dbo].[Users] ([DownVotes], [Age], [AccountId]);
CREATE INDEX [IX_EmailHash_Age_AccountId] ON [dbo].[Users] ([EmailHash], [Age], [AccountId]);
CREATE INDEX [IX_LastAccessDate_Age_AccountId] ON [dbo].[Users] ([LastAccessDate], [Age], [AccountId]);
CREATE INDEX [IX_Location_Age_AccountId] ON [dbo].[Users] ([Location], [Age], [AccountId]);
CREATE INDEX [IX_Reputation_Age_AccountId] ON [dbo].[Users] ([Reputation], [Age], [AccountId]);
CREATE INDEX [IX_UpVotes_Age_AccountId] ON [dbo].[Users] ([UpVotes], [Age], [AccountId]);
CREATE INDEX [IX_Views_Age_AccountId] ON [dbo].[Users] ([Views], [Age], [AccountId]);
CREATE INDEX [IX_WebsiteUrl_Age_AccountId] ON [dbo].[Users] ([WebsiteUrl], [Age], [AccountId]);
CREATE INDEX [IX_Age_CreationDate_AccountId] ON [dbo].[Users] ([Age], [CreationDate], [AccountId]);
CREATE INDEX [IX_Age_DisplayName_AccountId] ON [dbo].[Users] ([Age], [DisplayName], [AccountId]);
CREATE INDEX [IX_Age_DownVotes_AccountId] ON [dbo].[Users] ([Age], [DownVotes], [AccountId]);
CREATE INDEX [IX_Age_EmailHash_AccountId] ON [dbo].[Users] ([Age], [EmailHash], [AccountId]);
CREATE INDEX [IX_Age_Id_AccountId] ON [dbo].[Users] ([Age], [Id], [AccountId]);
CREATE INDEX [IX_Age_LastAccessDate_AccountId] ON [dbo].[Users] ([Age], [LastAccessDate], [AccountId]);
CREATE INDEX [IX_Age_Location_AccountId] ON [dbo].[Users] ([Age], [Location], [AccountId]);
CREATE INDEX [IX_Age_Reputation_AccountId] ON [dbo].[Users] ([Age], [Reputation], [AccountId]);
CREATE INDEX [IX_Age_UpVotes_AccountId] ON [dbo].[Users] ([Age], [UpVotes], [AccountId]);

Now We Need To Modify Them


BEGIN TRAN
UPDATE u
SET u.Age = 100
FROM dbo.Users AS u
WHERE u.Age IS NULL

ROLLBACK

This’ll run for a bit, obviously.

While it runs, we can use this query to look at how big the version store is.

SELECT DB_NAME(database_id) AS database_name,
       (persistent_version_store_size_kb / 1024.) AS persistent_version_store_size_mb
FROM sys.dm_tran_persistent_version_store_stats
WHERE persistent_version_store_size_kb > 0;

Not bad.

The only thing is that it stays the same size after we roll that back.

I mean, the ROLLBACK is instant, but cleanup isn’t.

In the next post, we’ll look at forcing cleanup.

Thanks for reading!


Accelerated Database Recovery: What It Could Mean For SQL Server

Thinking Ahead


I love the idea of Accelerated Database Recovery, because it solves some really cool problems in SQL Server.

There was always this big Wompson & Wompson around rollback, in that it was single threaded. If you had a process get a parallel plan to do some modifications, the rollback could take much longer than the modifications themselves did.

ADR doesn’t solve concurrency issues around multiple modification queries. They both still need the same locks, and other transactions aren’t reading from the Persistent Version Store (PVS from here on out).

But they could. Which would allow for some interesting stuff down the line.

Flashsomething

Oracle has a feature called Flashback that lets you view data as it existed in various points in time. You sort of have this with Temporal Tables now, but not database wide. It’s feasible to think that not only would the PVS let us look at data at previous points in time, but also to restore objects to that point in time.

Yep. Single objects.

AlwaysOptimistic

With PVS up and running, we’ve got row versioning in place.

That means SQL Server could feasibly join the rest of the civil database world by using optimistic locking by default.

It could totally be used in the way that RCSI and SI are used today to let readers and writers (and maybe even writers and writers!) get along peaceably.
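If you want to see what that optimistic world looks like today, flipping a database to RCSI is a one-liner (take the usual care doing this on a busy server):

```sql
/* Turn on Read Committed Snapshot Isolation */
ALTER DATABASE StackOverflow
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;
```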

SQL Server Query Plan
Avoidable

Happy Halloween

You know those spool things that I hate? This could be used to make some of them disappear.

The way PVS works now, we have a record of rows modified, which means we’ve effectively spooled those rows out somewhere already.

With those rows recorded, we could skip using spools altogether and just read the rows we need to modify from there.

I’m Excited!


This is a very cool step forward for SQL Server.

I mean, aside from the fact that it took 15 minutes to clean up a 75MB version store.

But still! This is gonna help a lot of people, and has potential to go in a few new directions to really improve the product.

Thanks for reading!


Extra Documentation for KB Article 4462481

SQL Server KB 4462481 is a bit light on details:

Assume that you use Microsoft SQL Server data warehousing. When you run parallel insert queries to insert data into a clustered columnstore index, the query operation freezes.

To quote one of the Top Men of SQL Server product support:

Hope this helps!

How to freeze your query


Consider a query execution that meets all of the following criteria:

  1. A parallel INSERT INTO… SELECT into a columnstore table is performed
  2. The SELECT part of the query contains a batch mode hash join
  3. The query can’t immediately get a memory grant, hits the 25 second memory grant timeout, and executes with required memory

The query may appear to get stuck. It no longer uses CPU and the parallel worker threads appear to wait on each other. Let’s go through an example on SQL Server 2017 RTM. First create all of the following tables:

CREATE PARTITION FUNCTION PART_FUN_REP_SGM (BIGINT)
AS RANGE LEFT
FOR VALUES (1, 2, 3);

CREATE PARTITION SCHEME PART_SCHEME_REP_SGM
AS PARTITION PART_FUN_REP_SGM
ALL TO ( [PRIMARY] );

DROP TABLE IF EXISTS dbo.SOURCE_DATA_FOR_CCI;

CREATE TABLE dbo.SOURCE_DATA_FOR_CCI (
    PART_KEY BIGINT NOT NULL,
    ID BIGINT NOT NULL,
    FAKE_COLUMN VARCHAR(4000) NULL
) ON PART_SCHEME_REP_SGM (PART_KEY);

INSERT INTO dbo.SOURCE_DATA_FOR_CCI WITH (TABLOCK)
SELECT TOP (1048576) 1,
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 16000,
       NULL
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

INSERT INTO dbo.SOURCE_DATA_FOR_CCI WITH (TABLOCK)
SELECT TOP (1048576) 2,
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 16000,
       NULL
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

DROP TABLE IF EXISTS dbo.LARGE_BATCH_MODE_MEMORY_REQUEST;

CREATE TABLE dbo.LARGE_BATCH_MODE_MEMORY_REQUEST (
    ID VARCHAR(4000),
    INDEX CCI CLUSTERED COLUMNSTORE
);

INSERT INTO dbo.LARGE_BATCH_MODE_MEMORY_REQUEST WITH (TABLOCK)
SELECT TOP (2 * 1048576)
       CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(8000)) + 'DYEL'
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

DROP TABLE IF EXISTS dbo.CCI_SLOW_TO_COMPRESS_TARGET_1;

CREATE TABLE dbo.CCI_SLOW_TO_COMPRESS_TARGET_1 (
    ID BIGINT NULL,
    INDEX CCI CLUSTERED COLUMNSTORE
);

DROP TABLE IF EXISTS dbo.CCI_SLOW_TO_COMPRESS_TARGET_2;

CREATE TABLE dbo.CCI_SLOW_TO_COMPRESS_TARGET_2 (
    ID BIGINT NULL,
    INDEX CCI CLUSTERED COLUMNSTORE
);

Consider the following query:

INSERT INTO dbo.CCI_SLOW_TO_COMPRESS_TARGET_1 WITH (TABLOCK)
SELECT LEFT(t1.ID, 1)
FROM LARGE_BATCH_MODE_MEMORY_REQUEST t1
INNER JOIN LARGE_BATCH_MODE_MEMORY_REQUEST t2
    ON t1.ID = t2.ID
UNION ALL
SELECT ID
FROM dbo.SOURCE_DATA_FOR_CCI
OPTION (MAXDOP 2);

Here’s what the plan looks like:

SQL Server Query Plan

I set Max Server Memory to 8000 MB and executed two queries with a maximum allowed memory grant of 60% via Resource Governor. The first query finished in about 40 seconds. The second query hasn’t finished after 30 minutes. During those 30 minutes the second query has only used 1184 ms of CPU time. The COLUMNSTORE_BUILD_THROTTLE wait type shows up in sys.dm_os_waiting_tasks:

SQL Server query results

Execution context id 2 is waiting on execution context id 1 with a wait type of HTDELETE. Execution context id 1 has a wait type of COLUMNSTORE_BUILD_THROTTLE. I don’t think that this wait is supposed to show up for parallel inserts. It can show up by design when creating or rebuilding a columnstore index in parallel:

When a columnstore index is being built, the memory grant estimate is based on a segment size of one million rows. The first segment of the index is built using a single thread so the real, required per-thread memory grant is found. Then the memory grants are given per thread and the other segments are built multi-threaded. Although all the threads for the operation are allocated at the start of the build, only one thread is used for the first segment and all the others incur a COLUMNSTORE_BUILD_THROTTLE wait.

The important point is that a wait type of COLUMNSTORE_BUILD_THROTTLE means that worker thread is waiting on another thread to do something. But so does a wait time of HTDELETE. There are only two worker threads and both of them are waiting on another thread to do something. As a result, the query makes no progress. I’ve let similar queries run for 72 hours before giving up and killing them.
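If you suspect you’re hitting this, a query along these lines against sys.dm_os_waiting_tasks will show the pair of threads waiting on each other:

```sql
/* Look for the stuck pair of worker threads */
SELECT session_id,
       exec_context_id,
       wait_type,
       wait_duration_ms,
       blocking_exec_context_id
FROM sys.dm_os_waiting_tasks
WHERE wait_type IN (N'HTDELETE', N'COLUMNSTORE_BUILD_THROTTLE');
```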

How to unfreeze your query


Upgrading to SQL Server 2017 CU11 or higher fixes the issue. With that said, if you’re seeing this issue it means that you have columnstore insert queries waiting at least 25 seconds for a memory grant. That should be considered a problem even if the queries didn’t get stuck. If you find yourself in this situation, I strongly recommend increasing the memory available to SQL Server or limiting memory grants via Resource Governor or some other method.

Final Thoughts


If you have parallel insert queries into CCIs that appear to be stuck with no CPU usage and long COLUMNSTORE_BUILD_THROTTLE wait times, check your version of SQL Server. Upgrading to at least CU11 will resolve the immediate issue, but a more complete fix will include addressing the RESOURCE_SEMAPHORE waits that must be present on your system.

Thanks for reading!


Lock Promotion In SQL Server Parallel Query Plans

I Don’t Always Talk About Locks


But when I do, it’s usually to tell people they should use RCSI, and then disappear in a cloud of smoke.

Recently I was thinking about lock promotion, because that’s what happens when I get lonely.

While digging around, I found some interesting stuff.

This is the part where I share it with you.

Without Five I Couldn’t Count To Six


The first thing I wanted was a table that I wouldn’t care about messing up, so I made a copy of the Users table.

SELECT *
INTO dbo.IDontCareAboutUsers
FROM dbo.Users AS u

ALTER TABLE dbo.IDontCareAboutUsers 
    ADD CONSTRAINT pk_IDontCareAboutUsers_id PRIMARY KEY CLUSTERED (Id);

Then I picked on a Reputation that only has one entry in the table: 20720.

BEGIN TRAN
UPDATE idcau
SET idcau.Reputation = 0
FROM dbo.IDontCareAboutUsers AS idcau
WHERE idcau.Reputation = 20720
OPTION(MAXDOP 4)

ROLLBACK

What followed was a full morning of wishing I paid more attention in internals class.

Number One


The first thing I found is that there were 16 attempts at promotion, and four successful promotions.

SQL Server query results in SQL Server Management Studio
4×4
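For reference, those counts come from sys.dm_db_index_operational_stats; if you want to check your own table, something like this (table name from the setup above) will show them:

```sql
/* Promotion attempts vs. successful promotions */
SELECT OBJECT_NAME(ios.object_id) AS table_name,
       ios.index_lock_promotion_attempt_count,
       ios.index_lock_promotion_count
FROM sys.dm_db_index_operational_stats
     (DB_ID(), OBJECT_ID(N'dbo.IDontCareAboutUsers'), NULL, NULL) AS ios;
```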

Why did this seem weird? I dunno.

Why would there be only 4 successful attempts with no competing locks from other queries?

Why wouldn’t all 16 get promotions?

Number Two


Well, that’s a parallel plan. It’s running at DOP 4.

I added the hint in the update query above so I wouldn’t have to, like, do more to prove it.

A SQL Server Query Plan
Plantar

Okay, maybe this makes a little more sense. Four threads.

If each one tried four times, maybe another thread was like “nah, yo”, and then got by on the fifth try.

Number Three


Looking at perfmon counters before and after running showed… exactly four!

SQL Server Perfmon Counters
Divisible
SQL Server Perfmon Counters
Still nowhere to go

Number Four


sp_WhoIsActive only showed single locks.

SQL Server sp_WhoIsActive Locks
Hrm.

This isn’t wrong, necessarily. This is how things look in the DMVs it touches after the update runs, but the transaction is still open.

I’m not mad, but I am curious. I wanna know what happened in the middle.

Number Five


I set up a couple Extended Event sessions, one to capture locks acquired, and one to capture lock escalations.
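If you want to rig up something similar, here’s a minimal sketch of the escalation session — the session name and ring buffer target are my choices, nothing special:

```sql
/* Capture lock escalations to a ring buffer */
CREATE EVENT SESSION LockEscalations ON SERVER
ADD EVENT sqlserver.lock_escalation
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION LockEscalations ON SERVER
STATE = START;
```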

This was neat.

SQL Server Extended Events Locks
Tell’em, picture

The red rectangle comes from locks acquired during the course of the update. You can see four separate threads going through and grabbing locks.

Each thread got the okay to escalate at 6,249 page locks.

Number Six


Lock promotion isn’t only denied when competing locks on the table are held by other queries.

Modification queries taking locks will attempt promotion every 1,250 locks.

Documentation regarding lock promotion points to at least 5,000 locks needing to be held before it occurs, as one factor (incompatible locks not being present are another).

If we have four threads asking every 1,250 locks (in this case on pages), they all will have made four attempts before finally escalating at 6,249.

6,249 / 1,250 is right around 5, for those who don’t have a calculator installed.

Don’t freak out if your monitoring tool tells you there’s a lot of attempts at escalation, and very few are successful.

It’s not always a sign that there’s blocking, though you may be able to correlate that with lock waits if both are present.

Thanks for reading!


Last Week’s Almost Definitely Not Office Hours: March 8

ICYMI


Last week’s thrilling, stunning, flawless episode of whatever-you-wanna-call-it.

Thanks for watching!


Building SQL Server Tools vs Building SQL Server Knowledge

Time Served


I’m gonna be honest with you: I’m not gonna build another First Responder Kit.

It’s established, it’s open source, and it’s badass. I poured a ton of work into it over the years, and starting from scratch seems like a bad use of my time.

I totally encourage you to help continue to build it. I learned a ton working on that stuff, and it was an incredibly valuable and rewarding experience.

I am going to build tooling that I think would be useful, but that isn’t covered in there.

Any Tool Can See


Using any tool that returns a given set of information about wait stats, query plans, indexes, or whatever about SQL Server will show you roughly the same problems.

Roughly. Some better than others. Some I have no idea. Some not so much.

The point is, I can teach you to find problems with the tool you’re using, or help you find a tool that can.

I can also teach you how to solve them.

Nothing New But The Name


My thing over here is coaching. Helping you become better at whatever it is you wanna do.

Every tool is a wrapper for what’s inside SQL Server. Tools are interchangeable, mostly.

Knowing how to use and interpret them in a meaningful way is not.

Thanks for reading!
