Blog

Software Vendor Mistakes With SQL Server: Not Letting Customers Add Indexes

I have ESPN


If your developers could intuit every possible way that your application could be used, and were really good at performance turning SQL Server, you probably wouldn’t be reading things like this.

But here I am, and here you are, and here we are, staring at each other desperately searching for a reason to go on.

Many times when I’m trying to help people achieve better performance, indexes are a big part of my analysis. Unless you have a brood of queries that are just so awful they’d thwart any and every index you could possibly throw under them, you can’t avoid this inevitability.

At the end of these calls, what I often get met with is: This is great, we just have to run it by our vendor.

Lemme explain why this is wrong, because I’ve seen that end result. Things don’t get better when you make then wait.

Sycamore Tree


You are a vendor. You vend software. The way clients use that software is up to them. They may customize it in some weird way, or they may have way more data stuck in their local site than other people.

It’s sort of like being a parent: you vend life to your kids, and you can teach them how you think they should use it, but there’s a really good chance they’re gonna make different choices.

When this happens, you need to be okay with the fact that your definition of how the software gets used is no longer applicable. This goes double for vendors who don’t bring much SQL Server performance tuning expertise to the table to begin with, because the indexes you had for your ideal usage pattern probably weren’t great either.

Running the new indexes they need by you for their usage of the software is a lot like you running what you wanna have for breakfast by me. I have no idea how you plan to spend your day. A feedbag of eggs and bacon might be totally reasonable.

The one exception to this is that the vendor might be able to take my suggestions and apply them to other installations — but this stinks of a different problem than hyper-specific customization — it means you have a lot of unhappy customers out there and you just got lucky that one was willing to pay for real help.

Plastic Surgeon


I’m going to spend a number of post talking about index follies I see all the time when looking at software vendor design patterns. For now, watch this video from my paid training to learn more about why you need nonclustered indexes:

Things like not cleaning up old indexes, not adding sufficient new indexes, general index design patterns, clustered indexes, and all the silly index settings I see people changing hoping to make a difference.

This post is to prepare you for the fact that indexes need to change over time, as your application grows. Index tuning is something you need to stay actively engaged with, otherwise you’re leaving a lot of performance on the table.

Especially for folks in the cloud, where hardware size is a monthly bill, this can be an expensive situation to end up in.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Overly Complicated Triggers

Necessary Evils


Triggers can be quite useful to perform functions that cascading foreign keys are too simple for, but you can very easily overload them. I’ve run into cases where people had written what looked like an entire application worth of business logic into triggers.

Triggers that hit tables that fire other triggers that call stored procedures across servers in a while loop. You know. Developers 🐿

One very important thing to understand is that triggers always happen in a transaction, and will roll everything back unless you explicitly SET XACT_ABORT OFF; inside them. I’m not saying you should do that, at all; just that it’s an option.

Bail Reform


There are a few things you should do early on in your triggers to let them bail out as soon as possible.

  • Check if ROWCOUNT_BIG() = 0
  • Check if there are rows in the inserted pseudo-table
  • Check if there are rows in the deleted psuedo-table

You’ll wanna do the check against ROWCOUNT_BIG() before any SET statements, because they’ll reset the counter to 0.

DECLARE @i int;
SELECT @i = COUNT_BIG(*) FROM (SELECT x = 1) AS x;
PRINT ROWCOUNT_BIG();
SET NOCOUNT ON;
PRINT ROWCOUNT_BIG();

The first will print 1, the second will print 0. Though I suppose messing that up would be an interesting performance tuning bug for your triggers.

One bug I see in plenty of triggers, though…

Multiplicity


Make sure your triggers are set up to handle multiple rows. Triggers don’t fire per-row, unless your modifications occur for a single row. So like, if your modification query is run in a cursor or loop and updates based on a single unique value, then sure, your trigger will fire for each of those.

But if your modifications might hit multiple rows, then your trigger needs to be designed to handle them. And I don’t mean with a cursor or while loop. I mean by joining to the inserted or deleted pseudo-tables, depending on what your trigger needs to do.

Note that if your trigger is for an update or merge, you may need to check both inserted and deleted. Complicated things are complicated.

One more thing to ponder as we drift along through our trigger-writing extravaganza, is that we need to be careful where we OUTPUT rows to. If you return them to a table variable or directly to the client, you’ll end up with a fully single-threaded execution plan.

You’ll wanna dump them to a #temp table or a real table to avoid that, if your triggers are being asked to handle a deluge of rows. For smaller numbers of rows, you’re unlikely to notice that being an issue.

Know When To Say END;


The longer and more complicated your trigger becomes, the harder it will be to troubleshoot performance issues with it. Since triggers are “part” of whatever your modification queries do, you can end up with locks being taken and held for far longer than intended if there’s a lot of busy work done in them.

In much the same way Bloggers Of The World™ will warn you to index your foreign keys appropriately, you need to make sure that any actions performed in your triggers are appropriately indexed for, too. They’re not so different, in that regard.

Separating triggers into specific functions and duties can be helpful, but make sure that you set the correct order of execution, if you need them to happen in a specific order.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Using Integers Instead Of Big Integers For Identity Columns

Schemathematics


I’ve had far too many clients get stuck running out of identity values for a table that grew a lot faster than they thought it would. The change is an ugly one for many reasons.

Though you can setup a process to make the change easier, and some changes are available as metadata-only, most people either have way more complications involved (foreign keys, triggers, etc.) or aren’t practically set up to use the metadata-only solution by having compressed indexes, and all the whatnot outlined in the linked post.

I was hoping that vNext would take care of the change being painful, but so far I haven’t heard of anything.

The integer maximum is, of course, 2,147,483,647 (2147483647). The big integer maximum is, of course 9,223,372,036,854,775,807 (9223372036854775807). The big integer maximum is around 4294967298 times larger. That gives you a lot more runway before you run out.

Of course, it comes with a trade off: you’re storing 8 bytes instead of 4. But my favorite way of explaining why that’s worth it is this: by the time you notice that 4 extra bytes of storage space, you’re probably about to run out of integers anyway.

Masters Of My Domain Knowledge


You don’t need to do this for static lists, or for things with an otherwise limited population. For example, if you were going to make a table of every one in your country, you could still use an integer. Even in the most populous countries on earth, you could probably survive a while with an integer.

The problem comes when you start tracking many to one relations.

An easy thing to imagine is transactions, where each user will likely have many credits and debits. Or if you’re more keen on the Stack Overflow database, each user will ask many questions and post many answers.

Hopefully, anyway. In reality, most users ask one terrible question and never come back, even if their terrible questions gets a really good answer.

The point is that once enough users have some degree of frequent activity, that identity column/sequence object will start racking up some pretty high scores. Forget the last page contention issues, there are much easier ways of dealing with those. Your problem is hitting that integer wall.

Aside from using a big integer, you could fiddle with resetting the identity or sequence value to the negative max value, but that makes people queasy for an entirely different set of reasons.

Wizzed’em


Any table in your database that’s set to represent individual actions by your users should use a big integer as an identity value, assuming you’re going the surrogate key route that utilizes an identity column, or a column based on a sequence object.

If you use a regular integer, you’re asking for problems later. Choose the form of your destructor:

  • Recycling identity values
  • Changing to a big integer
  • Deforming your table to use a GUID
  • Let’s call the whole thing off

It’s not easy making data layer changes once things have grown to the point where you’re starting to hit hard limits and boundaries. Anything you can do to protect yourself from the get-go is a must.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Using MAX Datatypes Unnecessarily

Max For The Minimum


In an earlier post, we talked about how strings change the way SQL Server’s optimizer gives memory to queries under certain conditions. In that post, I talked about why MAX datatypes have the MAX problems.

In this post we’re going to look at a couple other issues with them:

  • You can’t put them in the key of an index
  • You can’t push predicated down to them

I know what you’re thinking, here. You’d never do that; you’re much smarter. But someday you might have to explain to someone all the reasons why they shouldn’t do that, and you might want some more in-depth reasons other than “it’s bad”.

Trust me, I have to explain this to people all the time, and I wish I had a few great resources for it.

Like these posts, I guess.

Maxamonium


First, we have have this Very Useful™ query.

SELECT c = COUNT_BIG(*) FROM dbo.Posts AS P WHERE P.Body LIKE N'SQL%';
SQL Server Query Plan
grouch

The plan stinks and it’s slow as all get out, so we try to create an index.

CREATE INDEX not_gonna_happen
    ON dbo.Posts(Body);

But SQL Server denies us, because the Body column is nvarchar(max).

Msg 1919, Level 16, State 1, Line 7
Column 'Body' in table 'dbo.Posts' is of a type that is invalid for use as a key column in an index.

Second Chances


Our backup idea is to create this index, which still won’t make things much better:

CREATE INDEX not_so_helpful
    ON dbo.Posts(Id) INCLUDE(Body);

MAX columns can be in the includes list, but includes aren’t very effective for searching, unless they’re part of a filtered index. Since we don’t know what people will search for, we can’t create an explicit filter on the index either.

SQL Server Query Plan
ehh no

Even with a smaller index to read from, we spend a full two minutes filtering data out, because searching for N'SQL%' in our where clause can’t be pushed to when we scan the index.

And Sensibility


Let’s contrast that with a similar index and search of a column that’s only nvarchar(150). Sure, it’s not gonna find the same things. I just want you to see the difference in the query plan and time when we’re not hitting a (max) column.

This isn’t gonna help you if  you genuinely do need to store data up to ~2GB in size in a single column, but it might help people who used a max length “just to be safe”.

CREATE INDEX different_world
    ON dbo.Posts(Id) INCLUDE(Title);

SELECT c = COUNT_BIG(*) FROM dbo.Posts AS P WHERE P.Title LIKE N'SQL%';
SQL Server Query Plan
helicopter team

But if you fumbled around and found out, you might be able to downsize your columns to a byte length that actually fits the data, and do a lot better performance-wise. This search only takes about 460 milliseconds, even if we scan the entire index.

You may not like it, but this is what better performance looks like.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Using Scalar UDFs In Computed Columns Or Check Constraints

Days Ahead


In yesterday’s post, I taught you about the good things that come from using computed columns. In today’s post, I want to show you something terrible that can happen if you put scalar UDFs in them. The same issues arise if you use scalar UDFs in check constraints, so you can apply anything you see here to those as well.

And this isn’t something that SQL Server 2019’s UDF inlining feature, FROID, can fix for you. At least as this writing and recording.

To make things quick and easy for you to digest, here’s a training video that’s normally part of my paid classes available for free.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Not Using Computed Columns

Forgery


One of the most common things I see when working with vendor apps are queries that need to filter or join on some expression that traditional indexing can’t make efficient.

B-tree indexes only organize the data as it currently exists. As soon as you perform a runtime manipulation on it in a join or where clause, SQL Server’s optimizer and storage engine have far fewer good choices to make.

This comes back to a couple rules I have when it comes to performance:

  • Anything that makes your job easier makes the optimizer’s job harder
  • Store data the way you query it, and query data the way you store it

Presentation Is Everything


SQL Server has many built in functions that help you easily manipulate data for presentation. It also lets you write a variety of user defined functions if the existing set don’t do exactly what you want, or you have different needs.

None of these functions have any relational meaning. The built in ones don’t generally have any additional side effects, but user defined functions (scalar and multi-statement) have many additional performance side effects that we’ll discuss later in the series.

This practice violates both of the above rules, because you did something out of convenience that manipulated data at runtime.

You will be punished accordingly.

Snakes


These are the situations you want to avoid:

  • function(column) = something
  • column + column = something
  • column + value = something
  • value + column = something
  • column = @something or @something IS NULL
  • column like ‘%something’
  • column = case when …
  • value = case when column…
  • Mismatching data types

For a lot of these things, though, you can use a computed column to materialize the expression you want to use. They’ve been around forever, and I still barely see anyone using them.

There are a lot of misconceptions around them, usually that:

  • They cause blocking when you add them (only sometimes)
  • You can’t index them unless you persist them (you totally can!)

Known


There are some interesting things you can do with computed columns to make queries that would otherwise have a tough time go way faster. To make it quick and easy for you to learn about them, I’m making videos from my paid training available here for you to watch.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Software Vendor Mistakes With SQL Server: Explicit And Implicit Transactions

Holler And Cuss


There’s a time and a place for everything, except scalar functions. In a lot of the locking and deadlocking issues I help clients with, developers either:

  • Didn’t understand the scope of their transaction
  • Didn’t need an explicit transaction to begin with (ha ha ha)
  • Didn’t realize how god derned awful Implicit Transactions can be

In this post, I’m gonna give you access to some more of my training videos about locking and blocking for free. Holiday spirit, or something.

There’s a bunch of stuff in there that’ll help you generally with these issues, and one that covers the topic of this post specifically. Enjoy!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Why Some Types of Parallel Scans can be Slow on Cloud Storage

Upon reading the title, you may be thinking that of course parallel scans will be slow in the cloud. Cloud storage storage simply isn’t very fast. I would argue that there’s a bit more to it.

The Timeout

A query timed out the other day in production after running for 600 seconds. Of note, one of the tables used by the stored procedure is in a database with data files hosted by Azure blob storage. I grabbed the wait stats for the timed out query using query store:

According to the documentation, the latch category maps to LATCH_% wait types and the preemptive category maps to PREEMPTIVE_% wait types. I wasn’t able to reproduce the timeout when executing the same query text, even if I forced the exact same query plan with a USE PLAN. Admittedly, I was puzzled by the wait stats distribution for a while.

Decoding the Categories

I eventually realized that typically the first execution of the stored procedure for the business day tended to be the slowest. Sorting query store runtime DMV results by time:

I also noticed that the timed out executions tended to have higher physical I/O than the other executions. Sorting query store runtime DMV results by I/O:

I now finally had a method to reproduce the poor performance of the stored procedure: being the first one to run the code in the morning, presumably when the buffer pool didn’t have anything helpful for this stored procedure. Early in the morning, I was able to capture an actual plan that took 13 minutes at MAXDOP 8. Nearly all of the execution time is spent on an index scan for a table that is hosted on cloud storage:

Here are the top wait stats from the actual plan:

<Wait WaitType=”LATCH_EX” WaitTimeMs=”5423928″ WaitCount=”1971″ />
<Wait WaitType=”PAGEIOLATCH_SH” WaitTimeMs=”977288″ WaitCount=”29210″ />
<Wait WaitType=”CXPACKET” WaitTimeMs=”809678″ WaitCount=”700″ />
<Wait WaitType=”PREEMPTIVE_HTTP_REQUEST” WaitTimeMs=”43801″ WaitCount=”50344″ />

I used the task DMVs during query execution to get information about the latch type:

 

Now I know that the query spends most of its time trying to acquire exclusive latches of the ACCESS_METHODS_DATASET_PARENT type. There is also a fair amount of I/O wait time compared to the mysterious PREEMPTIVE_HTTP_REQUEST wait type.

The ACCESS_METHODS_DATASET_PARENT Latch

I asked the community if this latch was considered to be interesting. The winning option was “What?”. As usual, twitter is useless:

For my query, at times I observed seven out of eight worker threads all waiting for an exclusive latch at the same time. This isn’t surprising considering that the query averaged 6.7 LATCH_EX wait seconds per second. That the other thread was doing an I/O wait while the others were waiting for the latch, so perhaps the worker threads were doing I/Os while holding the latch resource. On a development environment (with unfortunately significantly better I/O), I mocked up a roughly similar table and enabled the latch_acquired, latch_released, file_read, and file_read_completed extended events. The results were quite educational:

There were two different patterns in the XE files. Sometimes the worker threads would acquire the latch, perform one or more I/Os, and release the latch. Otherwise they would acquire the latch and quickly release it without performing any I/Os. Paul Randal recently wrote the following about this latch:

When either a heap or an index is being accessed, internally there’s an object called a HeapDataSetSession or IndexDataSetSession, respectively. When a parallel scan is being performed, the threads doing the actual work of the scan each have a “child” dataset (another instance of the two objects I just described), and the main dataset, which is really controlling the scan, is called the “parent.”

When one of the scan worker threads has exhausted the set of rows it’s supposed to scan, it needs to get a new range by accessing the parent dataset, which means acquiring the ACCESS_METHODS_DATASET_PARENT latch in exclusive mode. While this can seem like a bottleneck, it’s not really, and there’s nothing you can do to stop the threads performing a parallel scan from occasionally showing a LATCH_EX wait for this latch.

At a very high level, I suspect that the fast latch releases (marked in blue in the above picture) occur when the worker thread can acquire an already available range of rows from the parent object. The slow latch releases (marked in red) occur when the worker thread tries to acquire a new range, can’t find one, and has to perform I/Os to add more ranges to the parent object. In both cases the parent object needs to be modified, so an exclusive latch is always taken. I don’t know how it actually works behind the scenes, but the theory matches the observed behavior.

In summary, the query does a parallel scan which is protected by the ACCESS_METHODS_DATASET_PARENT latch. The table getting scanned is hosted on cloud storage with high I/O latency. Latency being high contributes to the exclusive latch on ACCESS_METHODS_DATASET_PARENT getting held for a long time which can block all of the other parallel worker threads from proceeding.

The PREEMPTIVE_HTTP_REQUEST Wait Type

I already have enough information to be confident in a query tuning fix for the stored procedure, but for completeness, I also investigated the PREEMPTIVE_HTTP_REQUEST wait type. I was not able to find any useful documentation for this wait type. The official documentation only says “Internal use only.” Thanks, Microsoft.

A common technique to learn more about undocumented waits is to get callstacks associated with the wait type. I’ve seen the PREEMPTIVE_HTTP_REQUEST wait occur in Azure SQL databases, Azure SQL Managed Instances, and for databases hosted in Azure blob storage. It is possible to get callstacks for managed instances, but it is not possible to translate them because Microsoft does not release public symbols for managed instances. However, the blob storage scenario did allow me to get and translate call stacks for the wait. Below are a few of the stacks that I captured:

Performing I/Os to read data not in the buffer pool:

sqldk!XeSosPkg::wait_info_external::Publish
sqldk!SOS_Task::PushWait
sqlmin!SOS_Task::AutoSwitchPreemptive::AutoSwitchPreemptive
sqlmin!BlobAccess::AsyncHttpIOInternal
sqlmin!BlobAccess::AsyncIOInternal
sqlmin!XFCB2::ScatterReadInternal
sqlmin!FCB::ScatterRead
sqlmin!RecoveryUnit::ScatterRead
sqlmin!BPool::GetFromDisk
sqlmin!BPool::Get
sqlmin!BlobBase::FixNode
sqlmin!BlobManager::LocateBlob
sqlmin!BlobManager::OpenInternal
sqlmin!BlobManager::Open
sqlmin!LockBytesSS::Open
sqlmin!OpenLockBytesNew
sqlmin!OpenIlbssWithSeLobHandleRO
sqlmin!IndexDataSetSession::OpenStreamWithSeLobHandle
sqlmin!RowsetNewSS::OpenWithSeLobHandle
sqlmin!CBlobHandleBaseTableFactory::PilbGet
sqlTsEs!CompareLegacyLobNull
sqlTsEs!CTEsCompare<35,35>::BlCompareXcArgArgImpl
sqlTsEs!CEsExec::GeneralEval4
sqlmin!CQScanStreamAggregateNew::GetRowHelper

Writing LOB data to disk:

sqldk!XeSosPkg::wait_info_external::Publish
sqldk!SOS_Task::PushWait
sqlmin!SOS_Task::AutoSwitchPreemptive::AutoSwitchPreemptive
sqlmin!BlobAccess::AsyncHttpIOInternal
sqlmin!BlobAccess::AsyncIOInternal
sqlmin!XFCB2::ScatterReadInternal
sqlmin!FCB::ScatterRead
sqlmin!RecoveryUnit::ScatterRead
sqlmin!BPool::GetFromDisk
sqlmin!BPool::Get
sqlmin!BlobBase::FixNode
sqlmin!Blob::RWBlob
sqlmin!Blob::RWBlob
sqlmin!Blob::ReadWriteAt
sqlmin!BlobManager::ReadAtInternal
sqlmin!BlobManager::ReadAt
sqlmin!LockBytesSS::ReadAt
sqllang!write_LOB_chunked
sqllang!`anonymous namespace’::TSendRowClassNoCount<7,0>::TSendUnlimitedVartype<0>
sqllang!CTds74::SendRowImpl
sqlTsEs!CEsExec::GeneralEval
sqllang!CXStmtQuery::ErsqExecuteQuery
sqllang!CXStmtSelect::XretExecute
sqllang!CMsqlExecContext::ExecuteStmts<1,1>

Transaction log writes:

sqldk!XeSosPkg::wait_info_external::Publish
sqldk!SOS_Scheduler::UpdateWaitTimeStats
sqldk!SOS_Task::PopWait
sqlmin!SOS_ExternalAutoWait::~SOS_ExternalAutoWait
sqlmin!SOS_Task::AutoSwitchPreemptive::~AutoSwitchPreemptive
sqlmin!BlobAccess::AsyncHttpIOInternal
sqlmin!BlobAccess::AsyncIOInternal
sqlmin!XFCB2::AsyncWriteInternal
sqlmin!FCB::AsyncWrite
sqlmin!SQLServerLogMgr::FlushLCOld
sqlmin!SQLServerLogMgr::LogWriter
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint
KERNEL32.DLL+0x0000000000017974
ntdll.dll+0x000000000006A271

I’m guessing this one is renewing the lease as noted in the documentation: “the implementation of SQL Server Data Files in Azure Storage feature requires a renewal of blob lease every 45 to 60 seconds implicitly”

sqldk!XeSosPkg::wait_info_external::Publish
sqldk!SOS_Task::PushWait
sqlmin!SOS_Task::AutoSwitchPreemptive::AutoSwitchPreemptive
sqlmin!BlobAccess::AsyncHttpIOInternal
sqlmin!BlobAccess::AsyncIOInternal
sqlmin!LeaseRenewalManager2::RenewLeases
sqlmin!XStoreLeaseRenewalTask2
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint

Not surprisingly, all of them have to do with accessing azure storage. I was able to observe an interesting pattern during a table scan: every PAGEIOLATCH_SH wait was immediately preceded by a PREEMPTIVE_HTTP_REQUEST wait.

Getting stacks for the waits that are close together, with shared code removed and the I/O wait on top with the preemptive wait on the bottom:

sqldk!XeSosPkg::wait_info::Publish
sqldk!SOS_Task::PreWait
sqldk!WaitableBase::Wait
sqlmin!LatchBase::Suspend
sqlmin!LatchBase::AcquireInternal
sqlmin!BUF::AcquireLatch
sqlmin!BPool::Get
<<SHARED STACKS REMOVED>>

sqldk!XeSosPkg::wait_info_external::Publish
sqldk!SOS_Task::PushWait
sqlmin!SOS_Task::AutoSwitchPreemptive::AutoSwitchPreemptive
sqlmin!BlobAccess::AsyncHttpIOInternal
sqlmin!BlobAccess::AsyncIOInternal
sqlmin!XFCB2::ScatterReadInternal
sqlmin!FCB::ScatterRead
sqlmin!RecoveryUnit::ScatterRead
sqlmin!BPool::GetFromDisk
sqlmin!BPool::Get
sqlmin!BlobBase::FixNode
sqlmin!BlobManager::LocateBlob
sqlmin!BlobManager::OpenInternal
sqlmin!BlobManager::Open
sqlmin!LockBytesSS::Open
sqlmin!OpenLockBytesNew
sqlmin!OpenIlbssWithSeLobHandleRO
sqlmin!IndexDataSetSession::OpenStreamWithSeLobHandle
sqlmin!RowsetNewSS::OpenWithSeLobHandle
sqlmin!CBlobHandleBaseTableFactory::PilbGet
sqlTsEs!CompareLegacyLobNull

My understanding is that SQL Server makes an HTTP request to the blob storage in preemptive mode in order to queue an asynchronous I/O. The preemptive wait ends and the PAGEIOLATCH_SH wait begins shortly after. The traditional I/O wait ends when the I/O is received by SQL Server. That’s just a guess though.

I think a reasonable summary is that you should expect to see PREEMPTIVE_HTTP_REQUEST waits if you’re performing I/O against Azure storage. They are a normal part of SQL Server’s operations.

Reducing ACCESS_METHODS_DATASET_PARENT Latch Wait Time

In some situations, a query that is slow due to I/O performance can run faster with a higher DOP. Instead of one thread waiting for I/O you can have many threads waiting and overall runtime will decrease. Of course, this won’t work in all scenarios, such as if you’re hitting an IOPs limit. For my scenario, the DOP 8 query had 6.7 LATCH_EX wait seconds per second of runtime. Latch contention is so bad that it nearly steals all of the theoretical benefits achieved with query parallelism.

The obvious solution is to reduce the amount of I/O that is done by the query. That is what was indeed done to resolve the production issue. Another obvious solution is to improve I/O performance. More on that in the next section. However, in some cases the query may simply need to scan a lot of data and you can’t do anything about storage performance. In that situation, it may sometimes be practical to change the type of parallel scan performed by SQL Server.

Consider the following parallel plan against a partitioned table on the probe side:

For clarity, here is what the T-SQL looks like:

SELECT SUM(Case WHEN OffPageFiller IS NOT NULL THEN 1 ELSE 0 END)
FROM (
      VALUES (1),(2),(3),(4),(5),(6)
) v(v)
CROSS APPLY(
       SELECT OffPageFiller
       FROM dbo.SmallTable s
       INNER JOIN dbo.BigTablePart b ON s.JoinId = b.JoinId
       WHERE b.PartId = v.v
) q;

Each thread of the parallel nested loop operator gets assigned an entire partition and reads all of the data for that partition. There is no longer a need for the ACCESS_METHODS_DATASET_PARENT latch because the shared state between worker threads has been removed. I checked using extended events and found that the ACCESS_METHODS_DATASET_PARENT latch wait count was significantly decreased as expected. I suspect that the remaining latch acquisitions were from the hash build side scan. This type of access pattern requires a partitioned table, can be a pain to express in T-SQL, and may not work well if partitions are unevenly sized or if there aren’t enough partitions compared to DOP to get good demand-based distribution.

Perhaps a more reasonable option is to switch to columnstore instead. The query will do less I/O and I assume that the latch pattern is quite different, but I didn’t test this. Testing things in the cloud costs money. Maybe I should start a Patreon.

Reducing PREEMPTIVE_HTTP_REQUEST Wait Time

As I said earlier, I believe that seeing this wait is normal if you’re using azure storage. In my limited experience with it, any noticeable wait time was always tied to even longer PAGEIOLATCH waits. Tuning the query using traditional methods to reduce I/O wait time was always sufficient and I never had a need to specifically focus on PREEMPTIVE_HTTP_REQUEST. However, if you really need to focus on this wait type in particular for some reason, I’ll theorize some ways to reduce it:

  1. For all platforms, reduce I/O performed by the query using query tuning, schema changes, and so on.
  2. For managed instances, move your I/O from user databases to tempdb (it uses locally attached storage).
  3. For azure blob storage databases, make sure that you are using the right geographical region for your storage.

Here are a few more options which may not specifically reduce PREEMPTIVE_HTTP_REQUEST wait time but may improve I/O performance overall:

  1. For managed instances and azure blob storage databases, try increasing the size of your data files.
  2. For managed instances, try increasing the number of data files.
  3. For all platforms, switch to a more expensive tier of storage.

Final Thoughts

The query timed out in production because it performed a large parallel scan against a table that had little to no data in the buffer pool. PAGEIOLATCH waits were not the dominant wait type due to high latency from Azure blob storage. That high latency caused latch contention on the ACCESS_METHODS_DATASET_PARENT latch. My parallel scan was slow because cloud storage isn’t very fast. Maybe it isn’t so complicated after all?