What Else Happens When Queries Try To Compile In SQL Server: COMPILE LOCKS!
Thanks for watching!
Video Summary
In this video, I delve into the world of compile locks in SQL Server, a topic that has been on my mind ever since I noticed an unusual amount of non-traditional blocking in my databases. Specifically, I focus on how auto stats updates can lead to these compile locks, causing significant delays and potential bottlenecks. By walking through a detailed demo using the Stack Overflow database (yes, it’s all tables in SQL), I illustrate exactly what happens when a query has to wait for statistics to refresh during its compilation phase. This not only highlights the issue but also provides insights into how to mitigate these compile locks, ensuring smoother operation of your SQL Server environment.
Full Transcript
Erik Darling here with Darling Data. And boy, I’m just basking in the glow of my newfound TikTok celebrity status. Fame and fortune await. Stitch incoming is probably what I’d say if I knew what I was talking about. In this video, we’re going to talk about compile locks. And the reason why I want to talk about this is that I’m going to talk about compile locks. these is because I have seen so much of this happening. So like when I first wrote spHumanEventsBlockViewer, I was sort of surprised by the amount of non-traditional blocking that would happen in SQL Server. And one of the big ones that I think this one, like out of all the non-traditional blocking that can occur in SQL Server, this one is rampant. I see this so so much that I wanted to finally put together a demo that shows you what happens and talk about a few things you can do to alleviate compile locks in your SQL Server environment. So I’ve done a little bit of setup work on this already. There’s a table in the SQL in the Stack Overflow. There’s a table in the SQL. No kidding. It’s all tables in the SQL, isn’t it? There’s a table in the Stack Overflow database.
called Votes. And if you are of a similarly challenged mind, as I am, you might like looking at words and immediately scrambling them into every other possible word you can possibly make out of them. Perhaps that’s just a lifetime of doing weird word puzzles, because no one else would talk to you. Might be that. Who knows. But I’ve taken the votes table. And I’ve switched it around to be called the Vitos table. So it’s not votes, it’s Vitos. It has no practical meaning in this context. There’s nothing you can actually veto in Stack Overflow. Venture Capital makes all the decisions now, and boy, are they tired. So what I did was I took the votes table, and I stuck it into this Vitos table, but I doubled the number of rows, right? So it’s the votes table twice, union together, and stuck into the Vitos table.
And I put a primary key, because I’m a good SQL Server person, and I put a clustered primary key on my Vitos table, so that I know that all of my Vitos are unique. I’ve also run this query already to generate some autostats. Oh, boy. Must have been late when I was working on this one. Must have been another one of those 4 a.m. adventures that I like going on.
So, yeah. We’ve got a table called Vitos with some autostats created on the user ID and the bounty ID column. Now, I’m not saying that doing update stats with the sampled percent and persisting that sample percent is a bad idea. I’m not saying that at all. I am just doing that because in this particular demo, it exacerbates the particular problem that I want to show you.
There are times when updating stats with a particular sample percent, even full scan, is a pretty good idea, because you just might not get a very good picture of what data your indexes are holding with the default sampling percent. You could miss a lot of stuff. You could miss entire chunks of stuff. But on the plus side, they’re fast. And I’m using persist sample percent so that whenever there’s a stats update, SQL Server has to do the same thing.
All right. So, SQL Server has to maintain that percent sampling for those statistics. So, I’ve also got this stored procedure here. And this stored procedure is admittedly written in a rather silly way. I should have proofed this a little bit more before I started recording, but it’s okay.
Who doesn’t like listening to clacky keyboards in the background? It’s not like we haven’t had five years of that Zoom calls with, you know, everyone trying to out-developer each other by getting the loudest possible keyboard in it. It sounds like someone dragging cans down the street eventually.
So, these are just two queries that we’ll have to refresh statistics on the columns that we have auto stats on up here, right? So, pretty simple, right? And then just select from those tables, join together an ID. This thing returns like four rows. So, let’s walk through what happens when this thing has to compile, like, just on its own.
So, I’m going to run this update, which is just enough rows. I did the math. I did the square root math. And running this update is just enough rows to trigger auto stats updates when the query runs. So, what we’re going to do is, just to make life easy, I’m going to show wait stats, and then I’m going to show what happened, what, then I’m going to run the procedure, and then I’m going to show wait stats again, specifically for this wait on async stats refresh thing.
And I suppose I don’t need the order by here anymore. So, it’s only going to be one row, but it seemed like a good idea when I was trying to see the wait stats initially. So, we’re just going to order by that one row. It’s a free order by, right? It’s totally free.
So, let’s run these all together. And I got query plans turned on. And you might look at the initial result and think to yourself, I think Erik Darling has been working on this demo a bit in this window, and you would be correct.
So, we start off with about 36 seconds of wait on statistics refresh for this particular session in the database, right? And then this, then 13 seconds later, 13 miraculous seconds later, we have another result down here. And if we look, we have gone from 36 seconds of wait on sync stats refresh to 47 seconds.
So, about 11 seconds of waiting on sync stats refresh. I might have said async one time. I’m sorry about that. It’s sync. S-Y-N-C. Sync. Like the thing in your bathroom.
I’m going to stop that thought. So, if we look at the query plan, right? We can ignore the ones that go and look at the wait stats for this session.
We can zoom in on this jolly little fella in the middle. And this thing runs for 1.7 seconds. Okay. 1.7.
What did we spend 13 seconds waiting on? Well, you might expect to go into the query wait stats. And you might expect to see all sorts of good things being returned about what are query weighted on.
But you notice that even though in the results over here, we definitely spent about 11 seconds waiting on those synchronous stats refreshes. We don’t see that in the query plan when we look at the wait stats, right? There was no sign of that.
The top wait for this query is CX sync port, which you got 10 milliseconds of. Why? Why, I ask you, is wait on sync stats refresh not in the wait stats for this query? It’s because Microsoft hates you and wants you to suffer.
I’m kidding. Microsoft has added a lot of great things to query plans. The fact that we have stuff like operator times and wait stats at all is cool. But for the love of God, include the stuff that matters.
So, we can get some clue, right? Like if we zoom in way up over here and we look at compile time, compile time will just about perfectly match. Oh, that got away from me a little bit.
Compile time will just about perfectly match the number of seconds we spent waiting on wait on sync stats refresh. Like the thing you have in your kitchen. So, we can see that here.
But since we don’t have that wait stat in the plan XML, right? We don’t have that wait stat down there in the query plan. If you were looking at any other query in the world, you might be spent, you would be left with a lot of time guessing what happened. If this were like a more complicated query, you might be like, wow, was the optimizer spending all that time figuring out a query plan?
You’d have to like go get all sorts of crazy trace flags to look at like optimizer steps and memos and log ops and trees and timing and stuff. And you wouldn’t like that stuff is exhausting. It’s absolutely exhausting.
And it’s not fun. So, if wait on sync stats refresh were in the query plan XML, we could say, oh, we waited on that for 11 seconds. We didn’t spend 11 seconds with the optimizer just twiddling its thumbs. Like, ah, that left join go over here.
The left join go over here. Where do we put this left join? I have to convert all these right joins to left joins because no one in the right mind would actually use a right join. Like, there’s lots of stuff the optimizer would be thinking about with a bigger, more complicated plan.
So, that’s what happens when this thing runs on its own. Cool. Wonderful.
We are well on our way to better understanding nothing. I’m just showing you stuff. We understand nothing. No understanding going on here.
So, now I’m going to recompile this just out of an abundance of caution. Remember, remember abundance of, abundances of cautions. And then I’m going to re-update those same rows.
And that’ll take a few seconds. And, uh, it’s like, it’s like that part in event horizon where, uh, what’s his name’s eyes are all like scratched out. And he’s like, I have wonderful things to show you.
Uh, this is what, this is sort of like what that is. All right. So, uh, I have SP who is active open in a special DAC window over here. And if I run this, we get nothing back.
Because nothing is happening. Zero is happening. Uh, but if I come over here to SQL query stress, uh, this is already lined up to connect to the right server. Uh, and what I’m going to do is just for, I’m going to stick to, I mean, you don’t even need two iterations.
Cause we’re going to stop it way before one finishes, but I have 200 threads that are going to run this store procedure. All right. So, uh, if I kick this off and I run this, this is going to take a couple to run because we got a lot going on here.
It was kind of a mess, a bit of a traffic jam. Uh, and let’s give that one more run just so you get a sort of a sense of how grotesque things are. Like in Event Horizon.
And there we go. That’s what I wanted. So let’s cancel that. Cause we don’t need that happening anymore. But now it’s time to talk about these results. I’m going to scroll over a little bit here to frame things up a little bit more nicely for us.
And I want you to take a look at the blocked session count column over here. Right. Uh, the first session has 190.
Maybe this is easier. The first, I’m going to lose my finger right here. Uh, the first, first block session has, uh, first session has 199 blocked sessions under it. The second one has 198.
And for some reason the rest are just like, we don’t know what we’re doing. We have no idea what’s blocking us. Uh, this is all terrible. We don’t know what’s happening. Uh, terrifying. It’s horrifying.
We don’t know. Right. We’re just stuck. You can see in this column over here, they are all waiting on LCKMX. And they’ve been waiting for about nine and a half seconds, which is pretty close. Like I, like we actually kind of got lucky there.
If that was like another second or two longer, I might’ve missed it. Cause we have that like 11 second compile window where everything goes, goes to hell. But that first procedure where it’s saying, create procedure is compiling. And the rest that are saying exec, those are waiting to do anything.
So that first store procedure is trying to compile a query plan. Well, that’s happening. None of the other store procedures can do anything.
They are stuck waiting for this thing to produce a query plan. The reason why it’s stuck waiting to produce a query plan is because it is waiting on sync stats refresh. That’s no good.
Right. It’s not a good time. Why would SQL Server do this to us? Why would SQL Server make all these other queries wait on this one query to produce a plan? Well, what choice does it have?
You can’t have queries all compiling plans at once. It’d be anarchy, chaos, stuff. Stuff you wouldn’t like. Well, you can’t have multiple plans in the cache. It’d be awful.
What’s happening? So, what we got here is that one thing waiting a really long time. The reason I ran SP who is active the way that I did is to get back this additional info column right here. This lovely hunk of burning XML that we can click on.
And when we click on it, boy and howdy, do we get some interesting stuff. So, if we look at this query, what is this thing doing? What command is this?
Select StatMan. I’m going to end the pop culture references there. It’s not funny. I hate that song.
Every version of that song. From the original to the remake to the cover to the sample. All of them. So, this query is waiting on select StatMan. And every other query, if we look at the other rest of these, are going to be waiting on ZoomIt will cooperate.
This stuff. Object lock, Stack Overflow 2013, a SQL database or an SQL database with an object ID that matches the object ID of our SOAR procedure called CLock, C underscore lock, which is in the DBO schema.
Lovely. So, this is what queries will look like when they’re all trying to compile at once, but one of them is trying to figure out an execution plan. This can happen for all sorts of reasons, right?
It doesn’t have to be like the stats refresh thing. If you have a stored procedure that calls a lot of other stored procedures, it does a lot of other stuff, then like you might be stuck trying to compile a whole bunch of stuff across a whole bunch of plans, and that could be a real mess.
You can also get into a situation where you have like, you know, one super complicated query, and the optimizer is really just like, I got to figure all this out. I got to be really good at this.
And you can see query optimization take a long time in some cases. The reason why this stuff will happen would be like auto stats. Like most, I think most commonly would be like memory pressure clears out the plan cache. Like maybe a settings change or something would too.
You might see auto stats updates or an intentional stats update update stats and validate plans. All of a sudden a whole bunch of stuff has to recompile or compile really. I mean, I guess it’s sort of a recompile, but it’s like compiling again for the first time.
Something like that. And then you might also see like if you use temp tables a lot, changes to temp tables might cause recompilation stuff. So there’s all sorts of funny reasons why you might see a query plan decide to up and recompile out of nowhere.
But you might also see if you run that, if that a whole bunch of other things are also trying to run that store procedure, coming up with that query plan for the first time will block them. Will it always be for a long time?
Maybe not. Depends on a lot of stuff. Will it always happen? Well, I mean, assuming that all of the stuff that I’ve taught or all or any of the things that I’ve talked about happening and your store procedure needs to come up with the query plan and you have a million other people trying to run that store procedure too, then yes.
But like in general, it won’t always happen. I made this bad on purpose. I wanted to show you wonderfully.
So some stuff you can do that would help this particular situation would be to turn on auto update stats async. Right? So you don’t, you’re not like this query isn’t dependent on stats updating so it can generate a query plan.
It generates a query plan based on the current stats. And then something goes off in the background and it’s just like, yeah, I’ll figure out stats for this on my own, on my own time. I’ve also seen dynamic SQL be really successful with this stuff because it sort of like a different context.
And so like the store procedure is just kind of like, oh, cool. I’m just going to go find something somewhere else. Microsoft has a whole document about different reasons or like different things you can do to prevent this, but I’ve never been successful trying any of them.
It’s like, don’t prefix them with SP underscore, make sure that like the user who calls them like has a schema prefix like dbo dot whatever. Like this is all stuff that was true here. Right?
Like something about like encryption keys and opening them and closing them. I’ve just never seen any of those things cause the problem. Like I know that whoever wrote the document probably saw that stuff happen and probably might have fixed it that way. But for me, I almost never see that stuff. I always see this stuff.
So auto, auto, auto, auto updates, that’s async. Async. Right?
Not sync. Like it’s like the default async, like the non-default or sometimes running your, running, running the query that was waiting on stuff with dynamic SQL can be helpful for that too. So yeah. Those are two solutions that I’ve, that I’ve, I’ve used quite successfully to fix these issues.
Um, so I think that was it. I hope, I hope that you saw wonderful things. Sure.
I hope that you saw at least things. Saw sync. That’s cool. So, uh, thank you for watching. If you liked the video, I do appreciate it. I appreciate the thumb.
If you like me, I do appreciate the subscribe. Um, I hope you learned something. I hope you enjoyed yourselves. Uh, I hope you will also, uh, uh, be, uh, witnesses to my meteoric rise as a tick tock SQL Server influencer. I’m probably gonna have to get goofy hats and stuff to like, look like a, look like a gamer streamer, but, um, I don’t know.
Maybe I will. Maybe I won’t. We’ll see. Anyway, uh, I have another video to record today.
Uh, I’m not, I’m not gonna tell you what it’s about now. Uh, but it’s gonna be a, gonna be a banger too. So, but this was, this was a pretty good, this is a, I think this was kind of a banger.
Right? Cause this is a really interesting situation. This, this kind of stuff will show up in the block process report too. Um, uh, and the next one’s gonna be cool as well. Cause it’s gonna really challenge how you think about query tuning in SQL Server.
So, uh, let’s cut that off here. Again, thank you for watching and I will see you in a little bit once I’ve thoroughly prepared all my demo material for the next one. See you then.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.