There are, unfortunately, some times when developers refuse to put the ORM down.
I mean, it’s mostly unfortunate for them, because they’ll continue to have performance problems.
Me? I’ll be okay.
The items in this post are issues I’ve run into constantly when working with people who use ORMs, but don’t spend any time looking at the queries they generate.
I expect this list to grow. Heck, maybe it’ll even get some good comments that I can add to the list.
I know, I know. Good comments.
Unreasonable
Here’s the stuff I see quite frequently causing issues with ORM code, in no particular order, and they’re quite often signs that you’d be better off with a stored procedure.
1. Your select list doesn’t fit in the cached plan
Developers using ORMs often have to learn the hard way that when they poke an object, all the columns come out. Not only does the lesson seem hard to learn, the behavior seems hard to change. I’ve worked with people months and years apart and found the same bad habits over and over again, and this isn’t an exception. Not only can this hurt query performance for a number of reasons, but it also makes reproducing any issues really difficult because you can’t get the full query text easily.
2. Your queries generates long IN lists
It’s bad enough that most ORMs don’t deal gracefully with this by parameterizing the IN clause values. Even if you do parameterize them all, SQL Server might have different ideas about how best to apply that IN clause. Scroll down to the “Complexity” section in this blog post. You can get wildly different plans depending on how many search arguments you pass in. This is one of those times where a table valued parameter, temp table, or other materialization of the list is a way better idea.
3. You don’t understand the query it generates
Because you-know-who, did you-know-what, with you-know-who, but let’s keep that between me and you isn’t a good way to send queries to SQL Server. The number of far overly-complicated queries that generate unrecognizable logic that I’ve seen have, at this point, probably generated physical weight in the world beyond mere electrons. The 20 lines of code you wrote to explain what you want have turned into a 200 line query full of derived left joins to every single imaginable relation in the database. Why? I don’t know. You don’t know either.
4. You can’t get a good query plan
You see that paragraph up there? That’s why you can’t get a good query plan. The optimizer spent a reasonable amount of time assessing all of the crazy semantic needs of your query and came up with a reasonable plan as quickly as possible. But somewhere along the line, it misjudged something, or didn’t have time to explore that one last join reordering that would have made everything okay. Or maybe, like a query that generates a long IN clause, this monster would benefit from breaking the million-layer-dip of logic up by dumping some initial results into a #temp table.
5. You can’t get developers to fully parameterize queries
When you write queries that take parameters, whether it’s a stored procedure or dynamic SQL, you get better plan re-use. When you throw literal values into the mix, the optimizer is far less charitable, and will treat each query like it has never seen it before and go about compiling a brand new execution plan for it. Even if you turn on Forced Parameterization, your semi-parameterized queries won’t be parameterized. And of course, Optimize For Ad Hoc Workloads won’t help once the plan moves beyond stub-status.
Move On
There are many ways to tune a query, but unfortunately a good portion of them are unavailable while using ORMs in their natural code-only state. Sure, you can write custom queries in the code, but that has a lot of potential downsides, too. Depending on how the code is constructed, and if parameters are strongly typed, you may not get consistent plan re-use.
I’m all for application developers using tooling that enables them to work on new features quickly and in a way that they’re comfortable with. But at some point, SQL developers or DBAs need to step in and enforce coding standards. At some point, mom and dad have to the keys away and implement something that performs beyond just what “works”.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Sometimes I have a need to run a quick CPU comparison test between two different SQL Server instances. For example, I might be switching from old hardware to new hardware and I want to immediately see a faster query to know that I got my money’s worth. Sometimes I get a spider sense while working with virtualized SQL Server instances and want to check for problems. Yesterday, I was doing a sort of basic health check on a few servers that I hadn’t worked with much and I wanted to verify that they got the same performance for a very simple query.
The Test Code
To get a single CPU core to 100% within SQL Server with minimal setup, the best method known to me uses a temporary procedure that does a simple operation within a WHILE loop:
CREATE OR ALTER PROCEDURE #p AS
BEGIN
SET NOCOUNT ON;
DECLARE @i BIGINT = 0, @time DATETIME2 = SYSUTCDATETIME();
WHILE @i < 1000000
BEGIN
SET @i = @i + 1;
END;
SELECT cpu_time, DATEDIFF(MILLISECOND, @time, SYSUTCDATETIME()) elapsed_time
FROM sys.dm_exec_requests
WHERE session_id = @@SPID;
END;
GO
EXEC #p;
The code is extremely efficient (from a certain point of view) and CPU time will closely match elapsed time as as long as there’s no other processes on the lucky CPU that runs my beautiful code. It’s designed to spend as little time on waits and latches as possible. A temporary stored procedure is used to avoid ASYNC_NETWORK_IO waits. Paul White explains why the ASYNC_NETWORK_IO waits appear when the same code is run as part of a batch in this stack exchange answer.
CPU Problems
Going back to the intro, you can probably guess that I did not see identical CPU times when comparing the two servers. Otherwise, there would be no reason to write a blog post. The first server that I tested took about 2400 CPU ms to execute the code. The second server took about 300 CPU ms. There are of course a lot of factors in play here, but I would expect a healthy machine to take somewhere around 200-800 ms of CPU work. I definitely wouldn’t expect an 8X difference for two servers that were supposed to have identical performance!
To be clear, the only thing that you can safely conclude in this kind of situation is that there’s some kind of unknown configuration difference between the two servers. It does not necessarily mean that there’s some kind of severe issue with the server that takes more CPU time to perform the same work. With that said, it’s definitely suspicious and not something that you want to unexpectedly see. Running down the root cause of the issue can be difficult and time consuming because there are a lot of possible causes. Here is an incomplete list:
Hardware differences
Power plan and other OS configuration settings
OS patching, especially side channel vulnerability fixes
Enabling hyperthreading (CPU reporting within SQL server doesn’t have the same meaning)
Disabling Intel Turbo Boost
VM CPU oversubscription
VMware’s Enhanced vMotion Compatibility (EVC) – force a CPU to run on an older instruction set for compatibility reasons
Other VM issues
SQL Server version
Expensive monitoring configured within SQL Server including certain extended events, trace flags, profiling, or traces
In my case, the culprit ended up being a popular SQL Server third party monitoring solution (I will decline to name the vendor). Again, we can’t conclude that the vendor is doing something wrong. Different types of monitoring will have different overheads for different types of work within SQL Server. In some cases, simply asking for a SQL Server actual plan can more than double the execution time of a query. That doesn’t mean that we shouldn’t use actual plans as performance diagnosis tool!
Digging In
There was a fair amount of circumstantial evidence that the monitoring tool was responsible. It was running on the server with the slower code and it wasn’t running on the server with the faster code. Disabling the monitoring tool made both CPU times match. However, how can we really prove that the monitoring tool is the culprit? Where exactly is that extra CPU time going?
ETW tracing can provide a summary of call stacks during a sampled time period. Common choices are Windows Performance Recorder or PerfView. As usual, I will use PerfView. I was able to temporarily add third party monitoring to the server with the fast code. Fortunately, there wasn’t any other SQL Server work occurring during my testing so PerfView was quite effective at showing the difference between the two servers. In the image below, call stacks for slightly modified code without the monitoring tool are on the left. Call stacks for the same code with the monitoring tool present on the left are on the right:
It doesn’t take a SQL Server expert to see that there are some notable differences here. I’m not an expert in SQL Server call stacks, but a lot of the stuff on the left is about what I would expect for extremely efficient code in a loop. It’s a similar feeling to those people who can identify photoshopped pictures by looking at the pixels. A lot of the stuff on the right is what I would expect if SQL Server is spending a relatively large percentage of CPU time doing monitoring, such as ntdll!RtlQueryPerformanceCounter and sqllang!TraceUtil::GetStatementInfoBase. Here’s a diff view if that makes it easier to see some of the differences:
The “exc” column is the difference of the number of CPU ms spent by the method itself (not including called methods). A positive number means that the slow code spent more CPU time on that method and a negative number (not shown in this picture) means that the fast code spend more CPU time on that method.
I would describe the PerfView results as fairly conclusive evidence: the third party SQL Server monitoring tool is responsible for the 8X difference in CPU time for the WHILE loop between the two servers.
Final Thoughts
The real production workload does something more important (and complicated) than simply incrementing a variable. Seeing an 8X performance difference in that simple code wasn’t a good or welcome sign, but I expect to see a dramatic reduction in monitoring overhead when benchmarking the real workload after disabling the monitoring tool. There may not be anything to fix here, but it’s always helpful to be reminded of everything that’s running on your servers. Thanks for reading!
Over the years, there have been a lot of requests to get sp_BlitzCache to sort results by query cost. I understand why. It’s assumed that the optimizer is never wrong and that cost is directly associated with poor performance.
There are also rather misguided efforts to figure out parallelism settings based on plan costs. The main problem with that being that if you currently have a lot of parallel queries, all that means is that the estimated cost of the serial plan was higher than your current Cost Threshold For Parallelism setting, and the cost of the parallel plan was less than the cost of the serial plan.
If you increase Cost Threshold For Parallelism, you may very well still end up with a parallel plan, because the serial version was still more expensive. If you eventually change Cost Threshold For Parallelism to the point where some queries are no longer eligible for parallelism, you may eventually find yourself unhappy with the performance of the serial version of the query plan.
Albeit with less overall wait time on CX* doodads.
Next you’ll be complaining about all the SOS_SCHEDULER_YIELD waits you’ve got.
Insteads
Rather than look at estimated metrics, you should be looking at how queries actually perform. For most servers I look at, that means looking at queries with high average CPU time, and large memory grants. Those metrics typically represent tunable aspects of the query.
In other cases, you might look at wait stats to direct the type of queries you want to go after. Reads, writes, and executions are also valuable metrics at times.
One danger of looking at totals rather than averages is that you may find things that do a little bit of something a whole lot of times, and there’s no real way to tune the small bit of activity they generate other than to run the query less.
What’s A Cost For?
In general, I only tend to look at costs to figure out plan choices within a query, or when comparing two different plans for “the same” query.
This is where experimenting with hints to change the plan shapes and choices can show you why you got the plan you did, and what you might have to do to get the plan you want naturally.
Let’s say you want to figure out why you got a specific join type. You hint the type of join you want, and there’s a missing index request now. Adding the index gets you the plan shape you want without the hint. Everyone lived happily ever after.
Until the index got fragmented ???
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Whomever decided to give “memory bank” its moniker was wise beyond their years, or maybe they just made a very apt observation: all memory is on loan.
Even in the context we’ll be talking about, when SQL Server has lock pages in memory enabled, the pages that are locked in memory may not have permanent residency.
If your SQL Server doesn’t have enough memory, or if various workload elements are untuned, you may hit one of these scenarios:
Query Memory Grant contention (RESOURCE_SEMAPHORE)
Buffer Cache contention (PAGEIOLATCH_XX)
A mix of the two, where both are fighting over finite resources
It’s probably fair to note that not all query memory grant contention will result in RESOURCE_SEMAPHORE. There are times when you’ll have just enough queries asking for memory grants to knock a significant pages out of the plan cache to cause an over-reliance on disk without ever hitting the point where you’ve exhausted the amount of memory that SQL Server will loan out to queries.
To help you track down any of these scenarios, you can use my stored procedure sp_PressureDetector to see what’s going on with things.
Black Friday
Most servers I see have a mix of the two issues. Everyone complains about SQL Server being a memory hog without really understanding why. Likewise, many people are very proud about how fast their storage is without really understanding how much faster memory is. It’s quite common to hear someone say they they recently got a whole bunch of brand new shiny flashy storage but performance is still terrible on their server with 64GB of RAM and 1TB of data.
I recently had a client migrate some infrastructure to the cloud, and they were complaining about how queries got 3x slower. As it turned out, the queries were accruing 3x more PAGEIOLATCH waits with the same amount of memory assigned to SQL Server. Go figure.
If you’d like to see those waits in action, and how sp_PressureDetector can help you figure out which queries are causing problems, check out this video.
Market Economy
The primary driver of how much memory you need is how much control you have over the database. The less control you have, the more memory you need.
Here’s an example: One thing that steals control from you is using an ORM. When you let one translate code into queries, Really Bad Things™ can happen. Even with Perfect Indexes™ available, you can get some very strange queries and subsequently very strange query plans.
One of the best ways to take some control back isn’t even available in Standard Edition.
there are a lot of bad defaults in sql server. most of them you know about and change. here’s another bad one: pic.twitter.com/555xPmgwRW
— Erik Darling Data (@erikdarlingdata) May 8, 2021
If you do have control, the primary drivers of how much memory you need are how effective your indexes are, and how well your queries are written to take advantage of them. You can get away with less memory in general because your data footprint in the buffer pool will be a lot smaller.
You can watch a video I recorded about that here:
Thanks for reading (and watching)!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into an interesting scenario comparing the performance of a stored procedure using a temp table versus one utilizing a table variable under high concurrency in SQL Server 2019. Surprisingly, despite the usual criticisms of table variables, my tests showed that they outperformed the temp table by nearly 30 seconds. However, I also explore the new feature introduced in SQL Server 2019—tempDB metadata being stored in memory—which was supposed to enhance performance further. Unfortunately, this feature didn’t provide a significant boost for either method during my tests. The video concludes with a hopeful note, suggesting that once everyone upgrades to SQL Server 2019, the use of temp tables could become more widespread and efficient, potentially leading to an end to my current dead-end job at Erik Darling Data. Who knows, maybe I’ll find myself sipping cocktails on a beach in Bora Bora instead!
Full Transcript
Erik Darling here, still stuck in this dead-end job at Darling Data, or Erik Darling Data, whatever that idiot named the company. And I think, was it a week ago? Maybe it was a week ago. It could very well have been a week ago. I recorded a video showing a race between a store procedure that uses a temp table and a store procedure that uses a table variable under very high concurrency, and found, interestingly, that the much maligned table variable did indeed get off to, did indeed cross the finish line much, much faster. And it’s full 30 seconds faster. But of course, you know, table variables do have a lot of downsides. And I covered them in that video. But you know, there’s stuff that, there’s stuff about them that can be unattractive at times. And SQL Server 2019 offers us this lovely new feature feature, where we can put some tempDB metadata in memory. By in memory, I mean using the Hecatonish in memory file group stuff for some of the views in tempDB. There’s lots of information about which ones are in there. And I have a feeling that more are going to end up there. So I’m not going to list them all now because that list will quickly be inaccurate. And I don’t want I don’t I don’t need any more inaccuracies in my in my video.
So I’m just going to leave that alone. You can you can read whatever current documentation or whatever documentation is current when you watch this, because you are going to watch this because you have no choice because you are also stuck in your dead end job. Just not a darling, Eric, darling, darling, Eric data. Dead. So just to recap a little bit what we do here, we have our first store procedure up here that uses a temp table, we create a temp table, we insert some stuff into the temp table, and then that’s it. We just create it jumping in. That’s all. And then one down here, where we declare a table variable, and insert into the table variable. And then that’s it. That’s that’s all we do.
It’s the end of the road. To wrap those up nice and neat and make calling them from an outside application easier. I have this store procedure that wraps both of them in up and gets us a number to pass into them. And then executes one or the other based on whatever we pass in here. So we’re going to So if we do that, we do that. If we do that, we do that. Good. So what I have SQL Server 2019 is of course, this lovely, lovely new setting. Tim DB metadata memory optimized, our value in use is one that means it’s been enabled. And now let’s see how things turn out this time around.
This time, though, I want to run the table variable test first, because I want to see if we can do any better. So remember, remember last time, this finished in about eight seconds. So let’s see if the in memory stuff helps the temp table variable. So I’m going to hit that. I’m going to go over to this window when SP who is active and just kind of watch stuff go through in the weight info column is still going to be remarkably empty. And we are done. How long did that take? That took 7.599. So I don’t know what is that a 400 millisecond or so difference. I don’t know. I don’t think that that was very helpful.
So it looks like table variables are still still fast, but the new feature doesn’t really help them spin any faster. Now let’s go back and let’s test temp tables because remember last time temp tables took about 38 seconds, nearly 40 seconds to run. So let’s get who is active ready. Let’s clear that screen. So we have no bias or judgment going on there.
And we’ll kick that off and let’s see what SP who is active tells us. Ooh, the weight info column here is no, we are not seeing all those page latch EX weights. And we’re going, we’re going, and we’re done. And that was not 38 seconds, was it?
That was 11.7 seconds. So pretty cool that under real high concurrency, we can get much better. Tempt table performance from SQL Server 2019 with our fancy in memory, temp DB metadata feature. So that’s nice. I like that. I enjoy that.
And I look forward to, you know, being able to use temp tables all willy nilly as soon as everyone just goes ahead and goes, goes ahead and upgrades 2019 or all your problems are solved. And you probably won’t even, probably won’t even need me anymore.
I can go, go pursue my dreams of getting out of this dead end job. And I don’t know, maybe I’ll, maybe I’ll open a bar on the beach in Bora Bora. And we’ll look into that.
And we’ll look into that. we’ll look into that. So,
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the often misunderstood world of table variables in SQL Server, exploring their uses and benefits, particularly when dealing with high concurrency and frequent execution. Using a series of stored procedures to test both temp tables and table variables, I demonstrate how these objects perform under stress using RML Utilities’ O Stress tool. The results show that while temp tables can lead to significant contention and slower performance due to their serialization issues, table variables shine in scenarios requiring quick, parallel query execution. This video aims to provide clarity on when it’s appropriate to use table variables over other temporary storage options, highlighting the importance of choosing the right tool for the job based on your specific workload requirements.
Full Transcript
Erik Darling here with Erik Darling Data, named one of the top 1001 sexiest SQL Server consulting companies in the world by BeerGut Magazine. And on this lovely Sunday, and it is Sunday, and so am I, I am enjoying a GinFizz of sorts. Mmm. Mmm. What a delicacy. Today we’re going to talk about when you should use table variables. Table variables get sort of an understandably bad rap for a lot of reasons. You know, you get the forced serialization when you modify them, you get the crappy estimates unless you recompile or throw some hokey trace flags at them. You get the the, um, what do you call it? What was the other one? There was another good one I had. Oh yeah. SQL Server has no idea what’s in them. Not only does it have no idea how much is in them, but it has no idea what the contents of that much is. And that’s not good. Often not good when we care about things like queries that have good cardinality estimates and perform well and plan shape makes a, makes a huge difference. But, uh, they’re there for a reason. They, they do not exist just to make consultants money finger wagging at you. Poor people who are there for a reason. They do not exist just to make consultants money finger wagging at you. Poor people who are there for a reason. They do not exist just to make consultants money finger wagging at you.
Who, who, who keep using them for no particular reason. Uh, they do exist for a good reason. And, uh, I’ve, I’ve talked a little bit about how they help when you, uh, have very high frequency execution for queries. Uh, but what I didn’t explore then was, uh, what happens when you also have very high concurrency along with the high frequency execution. And that’s what we’re going to look at today. Now I am going to use a, uh, a couple store procedures. that I used, uh, in the past to look at how these two different types of objects perform. Uh, one of them is a temp table test. And, um, you know, I know it’s not maximally, maximally realistic, but that’s fine. Cause all it has to do is show you behavior. Not everything is going to be a hundred percent true to life, real world. This is what happened demo this, but it just shows you what happens in the real world when you use different objects for different purposes. And this is the whole point. Thanks for watching. I’m just going to leave now, go enjoy my, my gin fizz. So this, uh, this one here is a temp table tester and we create a temp table. I don’t know how I missed a semi colon. There’s a sloppy of me. It’s been gone. It’s going to kill me next time I see him and I get assassinated. Uh, actually no, it’s not even my fault. Cause I, I use SQL prompt and I hit, uh, insert semi colons. And I don’t know what, what was going on with that. I don’t know. Maybe SQL prompt is dumb, but all this does is create a temp table. Uh, insert into the temp table and then we’re done with it. Uh, this one down here, look, see this, the semi colons here. I don’t know what, what the H E double, double doglegs happened.
No, but this one declares a table variable and then does the exact same thing. Now, of course, we have to pass something in to these store procedures and, uh, I don’t want to have to do that in a weird way. So what I did is I wrote a wrapper store procedure, uh, for, for these things. And, uh, it’s called the temp test wrapper. You know, I’ll have all this code somewhere. I mean, probably, probably in the blog post. I’m not going to stick all the code in the YouTube description. That’s annoying.
Uh, but it’ll be, it’ll, it’ll accompany the blog post along with some links to other things that I’ve written about table variables in the past. Uh, including the miraculous video where I, I test running these store procedures in a loop. Won’t that be fun for you? Uh, so yeah, so this thing, uh, declares a, uh, declares a variable and, uh, we set that variable to some random number between, uh, uh, one and the max ID in the post database, which is first exact overflow 2013 is currently 2119501.
In case you had trouble reading that number. And then, uh, you know, with the test procedure is, oh, you know, uh, temp, temp table test, then we’ll run temp table test. And if the test proc that we pass in up here is that, then we’ll run that. And I guess I should, I mean, I don’t really need to make it unicode. I guess I, I guess I should for completeness, right?
And make sure I don’t go against any of my, any of my own best practices mismatching data types. Not that this is going to matter, but whatever. I don’t want to get hoisted by my own petard or whatever that’s called unitard. Um, but you know, if we run these things, we get query plans turned on over here.
If we, uh, if we test these things out and run them, uh, we will get an execution plan that, uh, accurately portrays what we wanted to do. We have an insert into our pound sign temp table and then insert into our at sign table variable. We are all good. So what I’m going to do next is run use O stress from the lovely and talented RML utilities.
And I’m going to use that to, uh, run a bunch of concurrent threads 1000 times. So 300 sessions for 1000 iterations. And I just want to make a lot of busy work for the server. There’s no other point to it. Just plain busy work.
And let’s go over here and let’s, uh, plug that into. Oh, stress and get off that off and running. And let’s come over to here to SP who is active.
And we’re going to use this get task info parameter to get, uh, additional weight stat information. We see a whole bunch of page latch EX weights. And if we run this a whole bunch of times, uh, that’s what we’re just going to see piling up, right? Maybe they’re not terribly long.
Maybe this is no, not the worst, but you know, we can obviously see the, the tempDB contention occurring. We can see SQL Server trying to figure out where to put stuff, where things go, where do you belong? Um, I don’t know.
I don’t know where to go anymore. So if we keep running this, we’re just going to keep seeing it, right? It’s obvious signs of contention. Now, if, uh, we have code where it matters that, uh, that like, you know, what, what we’re putting into our, our temp table, we might just have to deal with some of that contention.
We might just have to, you know, live with it because we might need some of the, uh, the benefits of temp tables that table variables don’t have. You can put data into them in a, in a parallel query. You can, uh, you can create indexes on them that actually make a big difference because, you know, you get statistical information, you get all accurate.
Well, I mean, you know, usually accurate row counts. So there are some, you know, kind of iffies around, um, uh, temp tables getting cached and stored procedures. But as I always say, if that is your biggest problem, you should just go on vacation and find a new job where you will have interesting problems to solve again. Uh, but if you look, that whole thing took about 40 seconds to complete.
39.368, which is close enough to 40 for me. And I’m going to order a 40 later in remembrance of this fine demo. Uh, but next let’s go and look at table variables.
All right. Let’s, for good luck, we’re going to clear that out. And now we’re going to run the table variable version. Come back over here and let’s start running who is active.
And we’ll start seeing that wait info column. It doesn’t have anything in it. We’re not really waiting on anything. And, uh-oh, we’re done in eight seconds. So, if we were measuring this workload, right?
And I know it’s not the most realistic thing in the world, but, you know, let’s just say you have a lot of user sessions that come in, do something quick, and get out. So, well, if, you know, the idea is to make those user sessions as quick as possible because you don’t want them hanging around doing things, well, then we need to maybe consider the type of temporary object that we’re using to push those queries through. So, table variables do have very good uses in this case.
In this case, you know, we had an 8.3 second workload versus, what was that, almost 40 seconds if I’m remembering correctly. And I should remember correctly because now I have to order a 40 later and remembrance of that fine query. So, I’ll let you know how that smells in another video.
So, anyway, you know, table variables do get a lot of flack, and rightly so. They have probably messed up performance for people more times than they have solved a big performance problem. But this is a case where I would absolutely use one.
It’s a little bit of a niche case, but there are definitely industries, and there are definitely workloads where this kind of thing is, you know, the norma. The norma. The normal.
The norma. This is the norma gene. But, yeah, if you have that type of workload where you have very, very high frequency execution, and you need to push a bunch of tiny little queries through very, very quickly, table variables just might be the solution for you.
Now, of course, like, you know, one step further is, you know, the in-memory table variable, which I blogged about recently. I suppose I should put a link to that somewhere, too. I don’t know where it’s going to end up, though.
I don’t like to make promises about putting things in the YouTube description because quite often I forget. And I’m bad about remembering to go back and do it until someone leaves a comment and says, Hey, you said you were going to put this in the thing, and then you didn’t.
I’m like, yeah, sorry. Happens to me all the time. I’m forgetful. That’s why I blog a lot and record videos so I can remember. I don’t have to remember.
Anyway, I’m going to get back to my gin fizz and my Sunday because I am Sunday. And I will see you in another video another time. Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
USE StackOverflow2013;
EXEC dbo.DropIndexes;
SET NOCOUNT ON;
DBCC FREEPROCCACHE;
GO
CREATE INDEX whatever
ON dbo.Posts
(OwnerUserId)
INCLUDE
(Score);
GO
CREATE OR ALTER PROCEDURE dbo.TempTableTest (@Id INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
CREATE TABLE #t(i INT NOT NULL);
INSERT
#t ( i )
SELECT
p.Score
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @Id;
END;
GO
CREATE OR ALTER PROCEDURE dbo.TableVariableTest (@Id INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
DECLARE @t TABLE (i INT NOT NULL);
INSERT
@t ( i )
SELECT
p.Score
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @Id;
END;
GO
CREATE OR ALTER PROCEDURE dbo.TempTestWrapper (@TestProc sysname)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
DECLARE @rando int =
((ABS(CHECKSUM(NEWID())) % 21195018) + 1); /*this is the max id in posts for SO2013*/
IF @TestProc = N'TempTableTest'
BEGIN
EXEC dbo.TempTableTest @rando;
END;
IF @TestProc = N'TableVariableTest'
BEGIN
EXEC dbo.TableVariableTest @rando;
END;
END;
GO
/*Testing*/
EXEC dbo.TempTestWrapper @TestProc = 'TempTableTest';
EXEC dbo.TempTestWrapper @TestProc = 'TableVariableTest';
/*
ostress -SSQL2019S -d"StackOverflow2013" -Q"EXEC dbo.TempTestWrapper @TestProc = 'TempTableTest';" -U"ostress" -P"ostress" -q -n300 -r1000 -o"C:\temp\crap"
ostress -SSQL2019S -d"StackOverflow2013" -Q"EXEC dbo.TempTestWrapper @TestProc = 'TableVariableTest';" -U"ostress" -P"ostress" -q -n300 -r1000 -o"C:\temp\crap"
*/
Way back when, SQL Servers with lots of cores could hit some weird contention on CMEMTHREAD. To fix that and a lot of other issues associated with running on a large server, people would resort to all sorts of trace flags.
My dear friend L_____ (b|t) has a list. Maybe not the most up to date list, but there are a lot of good references in the post.
Something we’ve both observed working with large servers is that Trace Flag 8048 might still be necessary under some circumstances.
If you’re running a large server, this trace flag may still be of value even after SQL Server 2016.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
After my smash hit double diamond post about index tuning, I got a question questioning my assertion that compressed indexes are also compressed in the buffer pool.
Well, this should be quick. A quick question. Eighty hours later.
First, two indexes with no compression:
CREATE INDEX o
ON dbo.Posts
(OwnerUserId);
CREATE INDEX l
ON dbo.Posts
(LastEditorDisplayName);
Now let’s create a couple indexes with compression:
CREATE INDEX o
ON dbo.Posts
(OwnerUserId)
WITH(DATA_COMPRESSION = ROW);
CREATE INDEX l
ON dbo.Posts
(LastEditorDisplayName)
WITH(DATA_COMPRESSION = PAGE);
I’m choosing compression based on what I think would be sensible for the datatypes involved.
For the integer column, I’m using row compression, and for the string column I’m using page compression.
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the intricacies of how data caching affects query performance in SQL Server. Starting with a 37-gig table, I demonstrate how having all necessary data cached in memory can drastically reduce execution times from 14 seconds to just half a second, showcasing the power of efficient buffer pool utilization. As we move on to larger datasets, I explore scenarios where memory limitations and inappropriate indexing strategies can lead to prolonged page IO latch wait times, emphasizing the importance of optimizing both memory allocation and index design for optimal performance.
Full Transcript
Silly, shameful, slut of a charlatan. Silly, shame… Dang! My vocal coach is not going to be happy with me. Erik Darling here with Erik Darling Data, the Darlingest Data of them all, I hear. And on this rainy afternoon, I am enjoying a glass of champagne. If I were drinking a cocktail, I would go over the ingredients with you. Well, since I’m drinking a one-ingredient cocktail, it is a Paul Lenoir composition number three, and it is a 100% Chardonnay Grand Cru of some sort.
I did my best to memorize everything on the label, but I have probably failed you as miserably as I have failed my voice training coach. And anyway, getting to the point, today’s video… Actually, this is one of two videos I’m recording today. The other one is different, but I love when one demo spawns two things I can talk about.
So stay tuned for that useless information that has no bearing on your life whatsoever, or your Saturday afternoon. Unless you’re going to watch it, which… Thanks. Thanks for watching. But this video is going to continue on with a theme that I have been talking about in blog posts and other videos lately about how queries can steal space from the buffer pool, where you store your data pages that SQL Server gives out to queries, and how important it is to make sure that you have the appropriate hardware for your workload.
Because oftentimes when I’m working with clients, they go, well, how do I know if I have the right hardware? How do I know this out of the other thing? And I’m going to show you a little bit about that, and also a little bit about how to tell things are good, bad, or ugly for you. So, what we’re going to look at first, because I have to explain a little bit about this environment, is what it looks like.
So, this server, this VM, here we go, let’s go deep. This VM has 16 virtual processors. Mm-hmm. 16 of them.
This is a laptop. I’m very impressed. And this VM has 96 gigs of memory, because the right thing to do is to multiply 96 by 1024, and you get that number. Or else you’re a metric idiot.
So, we have that. And the laptop itself, my laptop, this 17-inch thing sitting next to me working very hard, is like so. So, we have this processor in there, with eight real cores, and let’s call it eight fake cores.
Actually, I don’t know which ones are real and fake. It could go this way, too. It could have real and fake in that direction.
I don’t really know. I don’t really care. Hyper-threaded like a loser anyway. It doesn’t matter which ones are fake. Half of them are fake. And so, I guess the 16 virtual cores that the VM have are like really virtual. I don’t know.
Extra virtual. Eight of them are fake. Eight of them are wrong. Eight of them are not things. And my laptop. Again, my laptop has 128 gigs of memory. If your production SQL Server has less than this, and you are concerned about performance, boy, howdy.
I will gladly open up my darling data’s cloud to you. Just watch out when I run demos. Things get a little hairy.
And you’ll notice that I have SSDs in this thing. I don’t know why disk D is zero. It’s a brave choice, but let’s move on. If we measure the disks that I have in here, and the only one I ran was the top line, because that’s the only one that I really care about for these purposes.
You can see that I can read, and this is gigabytes of, oh, I’m sorry. I forgot to hit a button. This is gigabytes a second.
All right. Gigabytes, not megabytes. I can read data at 3.2 gigs a second. I can write data right around 2.6 gigs a second. So that’s nice.
It’s pretty sweet, right? 3.2 gigs a second. I like that. I like the sound of that. That sounds good. And, you know, when I’m working with clients and talking about sort of like the correct hardware for SQL Server, at some point someone is always going to jump up on their desk and talk, we need faster disks, damn it.
Well, technically the fastest disk out there is memory. So let’s focus on that. But most people who I talk to are not using direct attached storage. Most people who I talk to are virtualized in some way and using a SAN in some way.
And so they are not going to get this. They’re not going to get all this goodness here. They are going to get much different speeds.
And it’s not going to be the fault of their disks. When they talk about getting faster and faster disks, that’s great. But the data still has to get to those disks somehow. And it’s usually the getting to those disks that doesn’t work well.
I mean, you can make it work well, but most people don’t. And the point of this all is to say that I have very fast disks. They are undeniably fast disks.
And I want to show you two different things here. So on that 2019 server, I have two copies of the Stack Overflow database. I have a full, let’s call it a full Stack Overflow, a full stack over here.
And the full stack database, I forget, I think the last date is at the end of 2019 or so. Maybe, yeah, the end of 2019. So it’s a pretty recent copy and it’s a pretty big copy.
And then I have this other copy of Stack Overflow that ends in 2013. So the last date in here is Christmas Eve of, or New Year’s Eve. Not Christmas Eve.
New Year’s Eve of 2013. And that’s technically when the world should have ended anyway, if God still listened to me. Which I don’t know why God stopped listening to me. I give such great advice about everything else.
Getting rid of the planet Earth was right in the plan since day one, end in 2013. I don’t know why mine’s got changed. Anyway.
What we have here is a few things. And these are helper views that I use in some of my demonstrations. And I will have links to the GitHub links to these up in the YouTube description and in the blog post, hopefully, if I, as long as I remember. And so what happens here?
What we’re going to do is we’re going to look at how big this index is. We’re going to clear out memory. We’re going to get the execution plan for this count query from the post table. We’re going to look at what’s in memory afterwards.
Right? We know that since we’re clearing out memory that nothing’s going to be in there. And then we’re going to run the query again afterwards, again getting the execution plan. And we’re going to do the same.
So this is in the 2013 database. That’s that context. And then we’re going to do that again in the big Stack Overflow database. And if you’ll notice, I have this hint on the, on this, this query, because apparently the nice people who make the dynamic management views in SQL Server are not terribly good at, at designing them and performance stinks unless you tinker with things a little bit.
So Microsoft, if you would like some consultation on how to make these things faster, I am available for you. I care about your health and wellness and happiness, especially that of Joe Sack. Everyone else?
I’m kidding. I’m kidding. You’re all fine people. So let’s look at what happens here. Let’s look. So at the very beginning, right, we have this. And this tells us how big the, the clustered index on the post table is.
Because I don’t have any nonclustered indexes right now. And by gosh, I’m a terrible DBA for that, huh? So this copy, right, 2013 is about 37 gigs.
And a number that I have a lot of physical agony trying to round between 47 and 40, between 4.7 gigs and 4.8 gigs. I just don’t know where to go with it. It’s just so in the middle.
But you can see there, you decide for yourself. It is 4741.96, yada, yada, yada, megabytes. You can go to gigabytes with that any way you want.
Any way you want, baby. So we run this query. We have the execution plan for it. We look at what’s in memory afterwards, which is basically the entire table, or at least all the pages that we needed to get a count. Which is great.
The count again to make sure we didn’t cheat. We didn’t mess around here. We didn’t count fewer rows. That is the number of rows in the table if you look at the nice matching row count there. Wow, that database sure is consistent.
And then let’s look at the query plans. So the first time this runs, we get a query that takes about 14 seconds. That’s reading from clean Bufferville.
We had to get everything from disk. And it took about 14 seconds for us to read about 37 gigs from disk up into memory. You can see all that time spent right in here.
And if you look at the properties over here, because we are on such a spankin’ new version of SQL Server. SQL Server 2019 probably patched up to the latest. Again, I’m a terrible DBA, so I don’t really know these things off the top of my head.
But if you look at the weight stats over here, so very important thing whenever you’re looking at query plans, especially actual execution plans, actual factual plans, is to be hitting the properties of different operators and looking at the stuff that comes up in this window.
Because all sorts of fun, interesting things show up there that just don’t show up in the tooltips. If you look at that tooltip, there’s hardly any information there. If you look at what’s in the properties pane, boy howdy.
Whoo! Whoo! If you’re data-driven, you could spend days driving around in there. So let’s look at the weight stats of this thing. And way up at the top, way, way, way up at the top, I think the rest of these, honestly, the rest of these weight stats in here are going to be completely useless, but way up at the top, we spend 10 seconds reading pages from disk into memory.
So for about 37-gig table, reading data at 3.2 gigs a second takes 10 seconds. Would you believe that? Would you believe that?
Would you believe that math? Would you believe that math to get 32 gigs of pages into memory? Well, I guess it’s a little bit worth a lot of pages.
I don’t know. Math works out. It’s there. It’s perfectly fine. Don’t worry about it. I’m sure there were other things involved. I’m sure there were other things involved. So that’s what happened there, right?
Cool. 14 seconds. And now let’s look at the execution plan for the second run when everything was already there. Quite a remarkable difference, isn’t it?
Hmm? Quite remarkable. About half a second to run that query. If you go look at the wait stats for this, we will no longer have 10 wait stats here. And on top of that, we will no longer have, well, I mean, crossing my fingers now that I’ve said it, we don’t have any weights on page IO.
I didn’t actually didn’t look at this before I ran it. Again, on top of being a terrible DBA, I’m also a terrible presenter. So just never watch anything I do.
You’ll be horrified. So we look at this and what do we have? No weights on reading pages from disk. We have some internal weights for SQL Server to do things that it has to do, but we don’t have any weights on disk anymore. Wonderful.
And that solved a 14 second problem for us by about 13 and a half seconds, having that data already in there. This gets worse when we have bigger data, bigger data, big, big data. What we’re going to see here is a slightly different scenario running through the exact same thing as in the other one.
We have slightly different information. This table, rather than being 37 gigs, is about 120 gigs with 22 gigs of lob data. I told you the world should have ended in 2013 and you didn’t listen to me.
And then after we read pages from disk into memory, notice now that we don’t have the entire table cached in memory anymore. We do not have that. We do not have enough space in the 96 gigs of data or 96 gigs of memory that we have assigned to this server.
We’re hobbled a little bit by the max server memory setting. If you go with the properties and we look at memory, you will see that I have about 88 gigs. Again, if you divide 90112 by 1024 because you’re a smart person, you will get back 88 gigs.
So we have 88 gigs of memory assigned to this. We read about 83 gigs of this table up into memory. I’m sure there’s other memory needs on here.
There’s other stuff going on that the TGL server needs some memory for. So we use about 83 gigs of space for the buffer pool and we have that hanging about in there. Wonderful.
Perfect. Glorious. Wonderful. Like the champagne. I’m told that I’m not allowed to drink on camera by my lawyer. I’m just going to smell it.
Mmm. That smells delicious. So let’s look at this when we have a bigger table. All right.
So we have to, when we read this one from disc, that takes 40, well, rounding this isn’t bad. This takes about 43 seconds. I’m willing to round there. I’m willing to go the extra mile for you. And if we, again, because we are very smart performance, we’re terrible DBAs, but we are very smart performance tuners.
If we go look at the properties of this. Now we look at the weight stats here. We have, oh, that’s a tough rounder. Oh, it’s so close. It’s right in the middle.
Oh, I can’t make these decisions. We now have about 32-ish seconds of page IO latch weight. So about 32-ish seconds of our lives were spent reading pages from disc up into memory. And if we go look at that second execution plan, this one’s going to be different.
Isn’t it? This one is going to have run for 34 seconds. Why?
Because we had to read stuff back into memory. We only had some of this stuff in memory. We didn’t have the right stuff in memory. If we go look at weight stats and we look at the top one, we will have spent a little bit less time reading pages from disc into memory, but we still had to read a whole bunch of pages from disc into memory, right?
So less, but still not great. If we crack that first one back open, so we go from, oh, wait, that’s the wrong one. I went back too far. Demo over.
Leave. Leave. All of you. If we go back to the source one, it’s 42 seconds versus 34 seconds. So that didn’t turn out too much better, did it? And again, this is reading data very fast. This is not slow data.
This is fast data. I like my data big. I like my data fast. I like the smell of that champagne. It’s a, it’s the, these are the few of my favorite things. And so the point here is that if you are looking at your server, if you’re looking at the weight stats on your server and you see that you are waiting a lot on page IO latch weight.
So again, we come back to these weight stats over here. If you find yourself waiting on page IO latch underscore S to the H, you most likely have a deficiency. Now, your deficiency could be in one of three areas.
You could have too little memory. That would be an obvious deficiency. You could have inappropriate indexes for your workload, either too many indexes, right? Cause too many things competing for space in the buffer pool that you have to keep reading up and flushing out and bring them back and come on again, off again and missing indexes, right?
So you could have a lack of opportune indexes for your queries. That’s another one. And you could also have queries that are battling your buffer pool for memory, for memory grants, right?
So things like sorts and hashes that require memory that will take memory away from your buffer pool. So those are three places where you could have some room to improve. Often when I look at servers, all three are true.
Often when I look at servers, they are laughably smaller than my production, my production laptop, which is again, this. Oh, wait, I should go back to the CPU graph so you can see all my fake CPUs again. So this and this, all right, that’s my laptop.
Cost me about four grand from Lenovo. It was a good sale, but you know. Put some money into your production SQL Server that runs your business.
I put some money into the production SQL Server that runs mine. It’s just what we do. Got to spend money to make money in here.
So anyway, what we talked about today, a little bit, sort of in a nutshell, is, God, I forget. Well, there was a champagne. That was good.
There was the size of my laptop, the size of the VM, the size of the two stack overflow databases, and the size, two different sizes of the post table. And how having more, and how having the data fully in memory when we needed to read it was very, very helpful. That query went from 14 or seconds down to about half a second.
But when we had a table that didn’t fully fit into memory, even reading from it again with some of the data in memory didn’t save us all that much time. We still ended up in a pretty tough spot. And we also talked about how if you see your servers waiting a lot on page IOLatch waits.
Now it could be more than .underscore sh. There are also page IOLatch underscore ex and underscore up and I think KP and KL and some other ones. But the ones that you’ll see the most often are page IOLatch sh and page IOLatch ex.
That is an exclusive page IOLatch and that is when modification queries need data. The SH is shared latches for select queries for the most part there. So that’s what you would look at.
And if you see a lot of those waits, if you like yours, if the amount of time that queries are waiting on those waits is significant, then you have some work to do. You have to look at how you have sized your server. You have to look at how you have designed your indexes.
And you have to look at how your queries are asking for memory. If you need help with that stuff, I guess that’s where someone like me comes in. But I don’t know, you’re watching this YouTube video for free.
Apparently you like free. So who knows? Anyway, thank you for watching. I hope you learned something. I’m going to take another sniff of my champagne and enjoy my Saturday. Bye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
You need to design indexes so that you can support your queries by making it easy for them to locate data. That’s your where clause, and guess what?
Your modification queries have where clauses, too.
How You Can Make Indexing Better
Make sure you’re reviewing your indexes regularly. Things that you need to keep an eye on:
Duplicative indexes
Under-utilized indexes
Even when indexes are defined on the same columns, they’re separate sets of pages within your data files.
If you have indexes that are on very similar sets of columns, or supersets/subsets of columns, it’s probably time to start merging them
If you have indexes that just aren’t being read, or aren’t being read anywhere near as much as they’re written to, you should think about ditching them
Cleaning up indexes like this gives you more breathing room to add in other indexes later.
It also gives you far fewer objects competing for space in memory.
That means the ones you have left stand a better chance of staying there, and your queries not having to go to disk for them.
How You Can Make Indexes Better
There are all sorts of things you can do to make indexes better, too. I don’t mean rebuilding them, either!
I mean getting smarter about what you’re indexing.
Things like filtered indexes and index compression can net you big wins when it comes to reducing the overall size of indexes.
My friend Andy Mallon has some Great Posts™ about compression over on his blog:
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.