Free SQL Server Performance Monitoring Memory Graphs

Free SQL Server Performance Monitoring Memory Graphs


Summary

In this video, I dive into the memory monitoring features of my free SQL Server performance monitoring tool, Darling Data. I cover how memory gets utilized in SQL Server, including an overview of memory grants and the plan cache, highlighting their importance and common issues. I also explain various memory clerks such as the buffer pool and lock manager, showing how they impact overall server performance. The video includes detailed explanations of the charts and graphs available for monitoring memory usage, helping you to better understand and manage SQL Server’s memory consumption. If you’re interested in checking out the tool or have any questions, head over to code.erikdarling.com where you can download it for free from my GitHub repository.

Chapters

Full Transcript

Erik Darling here with Darling Data, monitoring tool moguling my way to, well, fame and, well, I think I’ve made like 150 bucks so far, so that’s cool. If anyone out there is feeling generous and wants to sponsor open source projects, well, I’m wide open on that one, so I don’t know, maybe you’re listening, maybe you’re not, I don’t know, it’s hard to tell. Anyway, this video we’re going to talk a little bit about the memory monitoring stuff that’s available in my free SQL Server performance monitoring tool. We’ve got sort of an overview of how memory gets used generally, memory grants that your server has seen, how the plan cache is being used, which, you know, as much as I hate the plan cache, I do like pointing out how crappy the plan cache is.

So, seeing those numbers like rise and fall is always kind of like, ah, look at that, eh, that thing stinks, use query store. How various, so memory clerks, which is sort of various bibs and bobs inside of SQL Server that use memory. So, you know, you have your buffer pool, you have, you know, like lock manager, you have like different, like the plan cache, different things like that.

So, I like to surface those. Um, and also tell you if there are any memory pressure events going on, which is stuff that comes from ring buffer notifications. Uh, there are, of course, some MCP tools built in, so you can talk to all of this data with the LLM of your dreams.

Um, I find Claude the dreamiest, but you might have bad taste and like another one. Um, so all of these things sort of tie in together and I kind of group them all together because a lot of people, I think, overlook how important memory is and how it gets used in SQL Server is, um, often not, I would say not, not exactly straightforward to reason about. Um, you know, one of the biggest things that I see generally day to day is, I mean, like servers, like mostly don’t have enough memory in them.

Um, but I think what people kind of don’t understand is how vital having data cached in memory is and the tug of war that happens between your buffer pool and other memory consumers, particularly query memory grants. So having all this stuff sort of trended in, like obvious to you, makes it a little bit easier than like, you know, hitting F5 on a script, running it, seeing a snapshot of results and being like, oh, well, you know, you know, I mean, I guess it’s okay.

And, you know, like this all ties into, you know, like sort of like the weight stats graph as well, because if you’ve got a lot of page IO latch up there, then that’s going to be a sign that, you know, you’re constantly just reading pages from disk. And so there’s a lot to sort of get into and, um, there’s a lot that the graphing and the monitoring tool makes, uh, for a very easy and often compelling story to tell about what sort of problems a SQL Server is having. So, um, if you want to check it out, it’s all at code.erikdarling.com.

Um, that will take you to my GitHub repo. You can download, uh, the performance monitoring tool stuff there for free. Uh, it’s all going to be in the latest releases.

So if you go there, grab a zip file, uh, you can just open that zip file up and run the stuff you need and start monitoring your SQL Server. Very, very low friction stuff. But let’s talk a little bit about, uh, what’s going on with memory in SQL Server, which is, you know, a pretty smart thing to talk about.

So, uh, we have a few different things that we trend here. Um, looking for any sort of buffer pool pressure, which would be like, um, it’s a little hard to see, but if there were, there’s a little dotty line there. So if there were pressure, we would see some dots up on the screen.

Thankfully, I don’t have, uh, too much memory, too many, many, too many memory problems going on here. Um, I have total memory, uh, which is sort of like, you know, how much SQL Server has available to do stuff with. Um, the buffer pool, the plan cache, and available physical memory.

So, um, that tells you sort of like how much SQL Server has left to give. So if you see that start to like really drop off, then, you know, you might, you might get worried there. Uh, over in this tab, we have memory grants.

So this will tell you, um, sort of broken down by pool. Uh, so every SQL Server has at least, uh, an internal and a default pool. Um, some of them might have more pools if you set up resource governor.

Uh, but this will tell you, like, how much memory has been used from those various pools. Uh, down here, we just have memory grant activity over time. Um, so this will tell you sort of like, um, like when memory grants happened.

Up there, that’ll tell you sort of like, like, like how big they were when they happened. Down here just gives you like, you know, if you had any at all. And there are some like, you know, very moderate spikes in here.

There’s not, there’s like one there. There’s like two there. They weren’t very big memory grants. So the, the charts and stuff up here aren’t really like budging all that hard. That’s okay.

It’s just example data for you to get a feel for the tool with. Uh, memory clerks. Uh, so by default, um, I just grabbed the top five by usage. Um, this might not be like the most useful five that you will ever, ever see in your life, uh, in this example, but this is just what’s going on at my server at the time.

And if we look over here, we can see that this is the memory, memory clerk for the buffer pool, and it’s kind of going up and down a little bit as stuff gets read in and knocked out of memory and like the buffer pool shrunk down there and then it grew back there and then, uh, went up there and then it, uh, I guess it dropped off a bit there.

So sort of, uh, different ways to, um, like the little legend down here will line up with whatever is selected up there. You can, if there’s other stuff that you want to look at, you can, of course, click in, click boxes and get stuff and be happy.

And everyone, you know, knows more about SQL Server. Uh, then there’s the plan cache. Um, basically what I’m trending here is just single and multi-use plans. Um, you know, like, again, like my, my server isn’t crazy busy.

So, you know, um, you know, just like sort of like having this stuff in there, you can just kind of like, and this is in megabytes. Like you can just kind of see over time, like, you know, Oh, like what happened with my plan cache?

Why are things terrible? Why did, why did everything, why did everything get cleared out here? And now I needed to compile everything again. And that was, that was awful too. So, uh, just some useful things to help you sort of track down, uh, SQL Server issues. Lastly, over here, I have, uh, memory pressure events right now.

I don’t have anything. I don’t have any memory pressure on my server because I have a very well taken care of SQL server. I am so good at SQL Server, uh, running in a VM on my laptop that I don’t have any memory pressure at the moment.

But if I did, you would see some cool stuff in here. Um, anyway, uh, just a quick overview of. Sort of the, the memory, uh, graphs and what gets collected in the, uh, performance monitoring tools that I offer.

Uh, again, these are all free. They’re over on GitHub. You can go download them and use them. And, uh, if you have any problems or issues, uh, or questions or anything like that feedback, uh, just file a GitHub issue for it and I will get it taken care of just as soon as humanly possible.

All right. Thank you for watching.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

bit OBSCENE Episode 12: Interesting Inserts

bit OBSCENE Episode 12: Interesting Inserts


Summary

In this video, Joe Obbish and I delve into a fascinating discussion with Joe Obisch from Someplace, Somewhere, about an intriguing question that popped up on Stack Exchange. The question revolves around generating sequences in SQL Server for partitioned data types, and it highlights the challenges of handling high concurrency while maintaining performance. We explore various approaches to solving this problem, including using sequence objects, which are a relatively new feature introduced in SQL Server 2012. While we discuss the pros and cons of these methods, we also touch on the broader context of research and how to approach similar problems when faced with limited information or outdated systems. This conversation not only provides valuable insights for database administrators but also offers a glimpse into the thought processes behind crafting effective questions and answers in the tech community.

Chapters

Full Transcript

Erik Darling here with Darling Data, joined by Joe Obbish of Someplace, Somewhere. Am I still in line, Joe? Shouldn’t I be? Oh, yeah. No, no, no. I’ll change it. God. You know, you interrupt me at the worst times. Let me insert Joe now.
All right. Before I was so rudely interrupted. Welcome back to the Bit Obscene radio program brought to you by Pistachio Poppy toothpaste. It is very, very pistachio flavored and has a very, very pleasant green color on the toothpaste, on the toothbrush, rather.
This is not my actual toothbrush. This is a demo toothbrush that I use for cleaning shoes and other things that I don’t want to put in my mouth. Anyway, thank you, Pistachio Poppy. I don’t think that was a very good endorsement, but I’m sure they’ll forgive us, right?
Pistachio Poppy. It tastes a lot like pistachios. All right. Anyway, Joe had an interesting question pop up on his Stack Exchange. He got it. He got it sent directly to his email from Stack Exchange.
That’s true. And he answered it. And Joe would like to talk about how interesting this insert question is. Well, it’s not only about my answer.
I felt like the question itself and some of the answers provided a good learning opportunity. Really? And considering, like, nowadays, all this stuff is just used by AI agents that are trying to destroy the world.
But at least, like, you know, get some positive value out of our questions and answers while we can. Sure. And everybody’s looking for easy content.
That’s actually the hardest part of this, not recording them, but figuring out what to talk about. Which is weird because we normally have no idea what to figure out talking about. I think I’m viewing my browser.
You sure are. Yeah. See, this is where you kill me. So this whole, like, I was like, when you said you wanted to talk about this, I was like, and I saw the open browser tab in the last video. I was like.
That was a preview. It was. And I was like, ah, Joe’s going to go look at it. And while he’s looking at it, I’m going to reward the bounty and his score is going to go up 500 bucks. But you ruined it. And now it’s not going to happen.
That’s okay. Because if you scroll up, you can see my generosity. I mean, I’m not a stressed out person because I’m using Microsoft Azure products. Oh, okay.
I’m just all like, I’m just calm. I’m in a Zen state. Yep. I’m not worried about it. This really does smell like pistachios. It’s crazy. It’s like pistachio frosting.
I hope one day in the future, we have podcasts that, that present smells too. Me too. Only. Me too.
Everyone should smell this is a delightful aroma. So I assume that some of our audience are like running marathons or shoveling snow or whatever.
They’re not actually reading the question. So I feel that, well, first let me follow good, good practices and make it bigger. Right? Yeah. That’s a good idea.
Because everyone, everyone listening to a radio show thinks, ah, I can’t wait to read this screen. That’s right. So, so really, I think it’s a somewhat common problem. The question asker effectively needs a sequence, but it’s partitioned by another common table.
So his two types as an example are invoice and the receipt type. And the sequence is one, two, three for invoice and one, two for receipt. So, you know, think of it as like many sequence objects in the same table with one per type.
And he says that procedures execute very frequently over a hundred thousand times an hour. And he’s, he’s very, very focused on it being concurrency safe and it performing well, which, you know, fair enough.
He, he also presented his attempt at solving the problem, which is not, which is not handle concurrency at all, which he confesses. Um, and then once again, he talks about how, you know, he, he has really heavy concurrency and he’s a scale well, and so on and so forth.
Now there are a few things that stand out to me here. Um, well, first of all, in terms of like, is this a good question? I mean, considering the typical question is how is Babby formed?
This is like an excellent question, right? It’s not a perfect question, but it’s an excellent question. Um, he has a table definitions and example data, the, uh, code he tried.
He explains the problem. He tries to lay out his requirements. You know, he’s very concerned about high throughput and high volume and, and so on.
Now, in terms of making this question better, it would be good to define, like, if gaps are okay. Like, um, I also think that, I mean, just speaking for myself, I usually Google a problem or search for it on the website before asking a question.
Some, some people don’t, and more part of them, questions have to come from somewhere. But it would be good to know why he’s not using, uh, sequence objects. And I can guess the reason, like maybe has a very large number of distinct document types.
Like he has a hundred different document types. He doesn’t want to create a hundred sequences, or maybe the list of document types is, is dynamic or unknown. And he doesn’t know all the lists and he doesn’t want to dynamically create sequence types.
Now the great thing about stack exchange as opposed to real problems is you can like pretend what the question is. Like you can make the question be whatever you want. And sometimes I will find some ambiguity in a question and tilt it towards being more interesting.
And I’ll answer what I think is interesting, which is probably helpful to someone, but not necessarily the person himself. One thing I would like Eric’s opinion on is. Oh boy.
Oh, he, he frequently talks about his scalability and, and one of the comments. Yeah. He says that he has a hundred thousand logged in users at the same time. Now I am not as promiscuous as you when it comes to SQL servers.
Have you ever experienced such a workload? Okay. So not on a single SQL Server.
Um, I’d also like to maybe different, maybe like, I don’t know, there, there’s some terminology in there that leads me to believe that there is perhaps some, I don’t know. Hmm. It feels misleading to me.
So a hundred thousand logged in users, right? Um, okay. So maybe there are a hundred thousand users logged into a website, like me, probably not all actively hitting the database server. If you have that many users, you probably have enough developers to set up a reasonably good caching layer.
You probably have other databases that have, that do other things in there. And SQL Server just handles the transactional end of it. When there is a transaction, you know, like Amazon has completely different stuff for like product search than it does for like shopping cart and checkout, right?
It’s like different systems. So they might have a hundred thousand users logged into like whatever web, whatever website hitting other database services, but a hundred thousand active users on a single SQL Server. Do you have any idea what thread pool is out of your mind?
If like, no, that is bonkers to me. Yeah. Like if you just take a literally, like, I don’t know how big of a server you need to even have a hundred thousand active connections, even if they aren’t doing anything.
All right. It feels like it would have to be pretty big. Um, I would personally be very interested in seeing that server.
Modern hardware is a lot better than I thought in terms of getting very wide sockets. Like you can have a, what was it like, I think like 256 physical core, two socket server nowadays, but divide that by a hundred thousand users. It’s everyone gets a very, very small, uh, portion of one CPU.
The other thing that’s worried about it is if you’re actually living in this business environment, wouldn’t you be like super good at handling concurrency? Right. Right.
And you wouldn’t be like asking questions to stack exchange or. Okay. So let me ask you, let me, let me, let me throw this one at you. Um, if like you look at that code and does that look like the code of someone who runs a SQL Server with a hundred thousand logged in users?
Can you imagine? Like if, if that’s an example of like the starting code that they’re like, this is how it’s going to work. Um, man, I, I weep.
I weep. Yeah. I could be making a lot of money off these people. Yeah. I was just taking that. I could be, I should be by all rights, making a lot of money off whatever this situation is. But, uh, you’ve got to, uh, you’ve got to try this guy down.
Yeah. Yeah. What, what is there? What’s in his profile? Like, is there an email address? Cause life is a journey.
Yeah, sure is. Yeah. Um, yeah, I mean, it, it does feel suspicious, but like I said before, you know, if we can, make questions more interesting for ourselves, it’s, it’s a bit more fun.
Right. So I don’t have a problem with that. Um, I don’t believe it. I don’t think it’s relevant. I, I will, I will give them credit for actually specifying a target, even if it’s wrong or misleading.
Cause you know, as, as, as you’re well aware, like it needs to be fast. It needs to perform. Well, it needs to scale.
Well, we’re running it super frequently, right? Like you’re often to get that level of detail as opposed to a number. So I will give props for that. And maybe we should be more considerate to this poor suffering developer who has a hundred thousand active users at all times.
Cause it sounds like his professional life is pretty tough. I’m, I’m ready to get to the answers. If, if you are.
Oh yeah. Let’s do it. Let’s see where I want to go to first. So first one, use a sequence object or type. It’s a, I mean, it’s, it’s, it’s an answer.
Um, I think it works well enough. If, if there’s a relatively small set list of types, which could be true is probably true. Um, could have been good to talk about the cash for, for sequences.
Do you remember sequences cash by default? Uh, it’s like a thousand, I think. Okay.
Well, a thousand or 10,000 or something. If there is a 10,000 default cash, then. No, you know what? It’s not. There’s no way it’s 10,000. Let me see. From what I remember.
I think it’s a thousand. I, I, I think it doesn’t cash by default, but I, I’m willing to be proved wrong in real time. Uh, okay. So if the cash option is enabled without specifying a cash size, the database engine selects a size.
However, users shouldn’t rely upon the selection being consistent. Microsoft might change the method of calculating the cash size without notice. Oh yes.
But it defaults to cash. So, okay. So it defaults. Yeah. So it says it defaults to cash being on. Uh, but then if you don’t say what cash size you want, Microsoft’s like, roll that dice. Yeah.
It’s, it’s the way it lived dangerously. You know, this, this with this hundred thousand active users, he should probably specify that a cash. Cause I, uh, do doubt that Microsoft is, is picking their cash with a hundred thousand active users in mind.
Right. Yeah, absolutely. I mean, think of the licensing. Yeah. Um, yeah, but you know, overall, like if, if the guy doesn’t know about sequences, this could be a very bad lines for him because he, he does specify in some of the comments that he’s okay with gaps.
So maybe the solution is good. Um, I think I would like put this in a variable or a case statement or something. So we’re not duplicating all the code, but I’m just kind of quibbling at this point.
Anything to say about the very boring, but safe answer here? Uh, no, I mean, it’s fine. I’ve, I’ve not, I’m not shamefully, but, um, I’ve, I’ve done similar things to give people a, you know, unique thing per thing in a table.
Um, you know, it’s, it’s, it’s a lot easier than a lot of like before sequences, you know, like a lot of this stuff had to be done either with a cursor or with a window function in ways that were very unfun. But, you know, um, uh, so I, I, I, when the sequence objects came out, you know, like this, that, like that was the one use that I ever found for them, at least in client work. Yeah. So maybe that was the solution.
Of course. Yeah. What’s annoying. So like in your, in your answer. Well, which is it, which is a great answer, by the way. Well, thank you.
Um, you, you linked to the Paul white post on sequence tables. And every time I’ve seen a client using sequence tables, it’s always been really messed up and like a lot of locking and blocking around them. And I would show, I would show them two things.
Paul white’s article. And the fact that sequence objects exist and then never do anything about it ever. It was like, no, we can’t touch that part of the code. It’s too scary.
It’s too old. Every, like if that breaks, like we’ll go out of business. Like this is all screwed. Like, no. I don’t know. Like, I guess it’s on the one hand, it’s nice that this person is asking before they screw everything up. On the other hand, I looking at their code.
I think it’s going to all be screwed up anyway. This, this does go back to what I wanted. One of the things I wanted to talk about, which is like how to research and answer your question.
So no shade at this guy. Maybe he’s didn’t know about sequence objects and had no way of knowing. Maybe his work VPN doesn’t let him Google things.
I don’t know. But if you’re aware of sequences and they don’t work for your use case, in my mind, a logical thing is like, okay, well, this was a feature added in SQL Server 2012. What did people do before the sequence was added?
Because the problem didn’t, you know, come into existence in 2012. Right? Like it was.
No. All the code that I see from this is from like the early 2000s. Yeah. Then I thought when I was trying to answer the question, I was trying to make them more interesting. Well, like maybe he has a thousand different types or an unknown number of types or he, you know, he has a good reason that you sequence.
Well, what did the old, you know, wizards, you know, back in the day, like, how, like, how did they solve this? Because I’m sure there are many bad solutions that are posted and with a couple of good solutions. And yeah, you can’t really go wrong with, with Paul White.
And there is a lot more in here that’s needed for the question. But sure. I used it as a starting point because I was too lazy to do anything else.
Honestly, like, you know, there are a select group of people who, every time I need to do something, I am like, I always Google their name and that thing. Because I know that it will, like, if I find anything that they did with it, it will give me a much better starting point than all of the stuff that I would have had to, like, break my brain on along the way. So 100% no problem with that, like, yeah, no, I’d rather, I’d rather start from like, the 10th floor than start from the ground floor and screw up for 10 floors.
Or you could be starting from the basement, depending on what you’re looking at. Yeah, you know, I don’t know that. I think it’s an insult to basements, Joe.
I don’t know, I think. Anyway, we have an answer that that comprimes it as the correct way to guarantee. With guarantee in bold, which sounds like a very strong guarantee, right?
Oh, I mean, it’s guaranteed to deadlock like a son of a bitch. Yeah, it feels a bit too narrow, like answering the literal question. And the guy was very concerned about scalability.
That’s what he said like five times. And, you know, serializable isn’t very scalable. I would say it’s anti-scalable.
Especially serializable with an up-delock on that. That’s gonna sting. Well, and presumably this table is being used for other things too, right? Yeah.
So you’re just, I mean. Documents sounds like a well-trafficked table to me. Yeah. I’m not much of a locking guy, but this means that no other session can even select from the query. Right?
I mean, depending on, you know, indexing and, you know, where other queries are hitting in the index, there could be a significant amount of locking there. But, you know, when you look at the query itself, right? Like, you know, you have, like, your where clause is on document type and you’re getting a max serial number every time.
If you don’t have very good indexes in place to support that, that’s not gonna be fun for you at all. Yeah. And to be fair, the index is mentioned here.
No, I know. So, like, I’m just saying. Just to clarify for the audience and not totally me. The serializable hint for this query is applied.
It’s not applied to a table level lock, but it’s applied to the locks that are otherwise taken. Right. Is that correct?
Correct. So serializable would be maybe one row or, like, not many rows? But, like, I think you have a hard time guaranteeing that.
Also, like, with that index in place, you would probably be okay. Assuming that document type is unique, which I don’t know. Because, like, if someone’s, like, storing a table of documents and they store a document type, then it’s usually not unique.
And, but, like, the combination of document type and serial number might be. So, like, maybe, but, like, I still think you’d have a real tough time, like, figuring out if you were getting, like, row level locks or page level locks. I don’t think this would be a situation where you’d escalate to a table lock, but, like, like, either, like, row or page, you would end up, you could end up jamming things up pretty good.
Especially if there’s, like, just a whole buttload of these running at once. Right? Yeah.
Or if you had, like, just this random reporting queries on the table that are looking at who knows. Yeah. Scanning. Scanning, right? Yeah. But you can always put no lockings on those.
I’m sure that the 100,000 active user database is very well tuned. So you’re not going to, you know, you’re not going to have those. One can only assume.
Yeah, I mean, like, maybe it works in practice. I would be very wary of this. You know, I would say this is, again, a non-fun answer to the question, which, to be fair, can be often useful. For people.
Yeah. But, you know. Whenever I go to stack exchange, I’m generally looking at. Yeah. But I think, you know, what’s interesting here is, like, you know, a serializable up-delock that does give you, like, a boatload of pretty reasonable guarantees under the default recommitted locking isolation level. If you’re using a different isolation level, like a row versioning isolation level, like RCSI or snapshot, you kind of lose out on a little bit of that with other select queries.
Like, maybe not like other ones of these because of the hints that are there, but other queries might, you know, start seeing some weird stuff. And what was the other thing I was going to say? It was a very useful point.
Paul White’s post with the sequence tables. He doesn’t even use serializable or up-delock. He only goes to read committed lock just in case you’re using RCSI, which I think is a very, very interesting thing. Like, like, because, you know, I assume whatever Paul says is right.
So, like, because what am I going to argue with? Yeah. And it makes sense to me, my very narrow understanding of lock ink. The fact that you could use read committed for Paul’s code and you need serializable.
Yep. I’m the next answer from Steve, who says, design-wise, that is quite clearly idiocy. How does Steve have no upvotes?
Hang on. Hang on. Hold up. Steve. Maybe his upvotes were canceled by downvotes. You know? Who knows? Steve, I’m sorry I missed you.
Hang on. We’re going to watch Steve get upvoted in real time. Bam. Yes. This is, you know, this is part of why it’s important to be as clear as you can about, oh, can I have gaps?
Like, where are my exact rules? And so on. Because we have to, you know, write conditional answers based on this.
I think the basic point here is if it’s going to show without gaps, it’s going to require like a resource which is being locked in some way and that limits scalability, which is true. Yeah. But there isn’t any way to really to get from that to this because computers are faster than some people think.
Yeah. And if you have the right design, you can quite easily pump out way more than a hundred thousand locks an hour. Yeah.
No, that’s not a problem. Even if you were using like SP get app lock to manage concurrency for this, like, I can’t imagine, like, that’s not hard. Yeah. Yeah.
SP app lock does a lot of overhead. It does. Compared to Segal locks. Yeah. But I think that you could get to that level. I agree with you. Yeah.
I wouldn’t use SP get app lock. Right. Just saying. But like I’m saying, even if you were like, like, I got to use SP get app lock for something. Here goes nothing. I think you could do it.
Like, you’d have to be pretty bad at stuff. And before, um, I, I think it’s good to, to keep your options in mind in terms of like, well, I’m trying to store something. They’re going to start going to calculate instead.
The main issue had what this answer is. So, so in this answer, he’s using a row number to calculate the serial number for the people who are, who are shoveling snow or doing whatever else while listening. If a row gets deleted from the table, like now your serial numbers are changing, which doesn’t feel like a good property to have for a serial number.
Yeah. So I do feel like it’s kind of done the water for this use case. I mean, another, I think another thing that’s missing is that, um, for, okay.
So maybe, maybe not here, but in a lot of cases where I have used row number for this sort of generation, uh, you always have to go find the current max anyway, and then add that to whatever row number you’re generating. Cause like, that might not be necessary here based on some of this stuff, but like a lot of the times, like you are like, you have to go find like what’s the current high one anyway. And then add that to whatever row number you’re starting at so that it gets the next bunch.
Um, like I realized this is a little bit different cause I think it’s a unique group for a new document ID going in there or something, or like, like something along those lines. But like a lot of times I have to go, a lot of times when I’ve done this, it’s been like, well, like we like lift off here. It’s like, you’re almost like remaking an identity, but like kind of faster.
There might be a use for this. If you wanted to be really slick in that, say you don’t calculate it on insert, have a background job, which updates it. Yeah. And then, and then you use this to calculate it for the rows that haven’t been processed yet by the background job.
So it’d be a very small number of rows, you’d remove the scalability bottleneck. So such as it is. And since the life cycle of, you know, going from null to assigned is pretty short, you’re probably getting away with the, well, if I delete a row with what else happens.
If you’re not doing something like that, then this also just gets slow over time because you’re going to have more and more rows. You know, like he was saying higher thousand executions an hour, that’s like almost a billion a year. If it’s 24 seven, you know, it’s questionable if that’s true, but as, as, as all of us know, of course, this type of query where you’re basically, you know, you’re forced to scan.
All of the relevant rows in order, if you have an index, great, but that’s still going to be a, probably just a serial scan of. Yeah. That index in order.
And as you’re getting into billions and billions of rows, it’s not going to be particularly quick or, or maybe the more important thing is to say that it is going to get slower and slower time, which. Right. Isn’t performance will degrade.
Yeah. It’s something that I like for the solutions that I’m designing. Yeah. But like, and then, and then, you know, if you, if you do something like that, you know, the, the place most people are going to go with how to, how to keep that up as quickly as possible is most likely going to be a trigger. And then you’re dealing with someone who maybe doesn’t like have a good grasp of concurrency and edge cases and other fun things might write some rather unsafe trigger code and screw some stuff up along the way.
But, you know, perhaps, perhaps everyone is, you know, much better educated and making much more well-informed decisions than I’m used to seeing out in the world. Everybody knows you only ever insert a single row at a time. That’s true.
Makes the, it makes the trigger code simple. So let me ask you a question. If any people only ever insert one row at a time, why is it always in a loop? Like, why does the trigger always have to have a loop in it?
It’s insane. It’s people. Anyway, go on. Time to get to the answer. The answer.
The answer that I scrolled past, which, which is my answer. This is my answer. Um, and like, like I said, I was trying to have fun here. Uh, this is a quote from, I forget the guy’s name.
The quote was more memorable than the name. That’s always a good, good, good thing. Uh, Slava. Slava Ox?
Yes. Oh, right. That was in your columnstore presentation. Yeah. Yeah. That’s right. Um, it’s, uh, still serving me. Yeah.
Like main point here being, if you say I need sequences and I can never have gaps and I must have a truly strict ascending order. Um, that’s harder to do that limit scalability because you have to hold locks for a longer period of time, right?
Like for that type of no gap approach, you can’t use sequences. Yeah. Yeah. So you’re going to need some kind of table based, uh, sequence table and the transaction has to include both getting the sequence number and the insert.
Yeah. Which is less scalable than being able to break those, those transactions up. So the thing I’m advocating for is like, you, like you probably don’t actually need, like the way out, the way out phrases a requirement is unique integers that start small and generally trend up with the occasional gap being acceptable.
And the reason that I, I did that switch is I want transactions that are as short as possible. And, you know, like I, I, I want to make my sequence table to be as scalable as possible. The reason I went to a sequence table is there was already an answer for sequences.
Sequences are, are, are boring, but maybe has 10,000 document types. Who wants to create 10,000 sequences? That’s not a good time.
Not me. Yeah. Not you. Yeah. Do you remember the time, what were you doing? You were, you were trying to create like a whole bunch of things all at once. And it just like dragged the server down.
You were trying to create like a thousand tables or something. And like, every time, every time you added, every time you created a new table, it was like 10 seconds slower after some point.
Was it tape? Was it schemas or tables? It was some, it was one of those things. I remember, I only remember it a little bit, but it was, it was, it was one of the most hysterical things. I think the thing that I’m thinking of is, which might be what you’re thinking of.
I had a very busy server and it would take like a minute to create a table just because of the latches that were needed. And there was just so much signal wave.
No, I remember that. Yeah. That was a, that was a different one. That was a different one. Yeah. I don’t know if I remember that one. Okay.
I get a little bit more coming back to me. You were building a dynamic string that would have created like 10,000 either schemas or tables, but like the bigger the string got, the longer the whole thing took or something like that.
And at some point, just building the string past a certain point started taking a long time. Does any of that ring a bell? I have no idea why I would be doing that or. Maybe it was like partitions or something like that.
I don’t know. Anyway. It could have been partitions. I think it was, maybe it was partitioning related. I might’ve been experimenting with like having 10,000 partitions actually on a table just to, just to see what would happen.
Cause I do remember there being a design discussion around like, cause I don’t, as I think they increased the limit and then the max partitions at some point making it 15,000. Yeah.
But that was a long time ago. Yeah. Right. But like I was new to partitioning. I was trying to find guides in terms of like how many partitions is too many. I think it went from a thousand to 15,000.
If I’m. Something like maybe like a thousand 20. And it was, it was surprisingly hard to find like any real data or like anything written about it.
So that, that might’ve been it, but I honestly don’t, don’t remember. Um, so in the transcript, I’m sure. Somewhere.
Going back to this. Clearly you’re, you’re, you’re, you’re going to want a big answer. Our. Biggest ins you can get. Very scalable, highly concurrent workload with many rows and many executions. So the thing that I did was using my relaxed problem statement.
I don’t think there’s anything too fancy here. Store procedure to give you the serial number based on document type update the sequence number table and increment by one.
If the update doesn’t do anything, then that document type has never been seen before. Right. So let me, let me ask you a question.
How come you didn’t use, um, like row count big or something instead of the serial number? You, you, uh, could, um, I don’t really have a, have a reason for it. I just didn’t know if like you tried it and it sucked or something.
Oh no, no. Just like. Serial number. Summer was right there. And I thought. Why not? Use it, abuse it, overload it. Yeah.
Like, uh, to your point, the more common pattern for this type of thing would be to use for a row number. But since we had this convenient variable right here, I just used it. Um, then I’m inserting into the table with using UP lock, which I think you need to actually make this work, uh, to prevent duplicates.
Yeah. Uh, yeah. So, uh, I think, I think that, I think you might actually need, you might need serializable for that, but.
Oh, well. I can leave a comment. Um, and in, uh, this case, I’d be fine with using serializable just because, you know, like the insert workload, the insert creation runs so infrequently for this.
Right. You would hope so. Yeah. I mean, well, like if that isn’t true, then you basically have a bunch of types to have like one row.
Well, if it, if it isn’t true, then you just, if it isn’t true, then you just reverse the order. Yeah. Yeah. Sure. That is also a fair point. Um, but well, wait, no, I’m totally right.
Cause you have a hundred thousand executions an hour. You’re not going to have like, like a billion document types a year. Right. Hmm. I don’t know what kind of documents they get. Hmm.
You never know. I suppose. Never know. Yeah. Is it advisable? Maybe it doesn’t. Who knows? Um, only one way to find out, run it. Once again, I am not, well, I, I ran it and worked fine at my machine.
So there you go. Uh, I think normally you have a transaction here. I don’t have one because I don’t actually care which session wins.
I think that like, you know, this is a bit of a technical point, but in terms of defining, like what the right order is, it’s kind of a meaningless statement. Mm hmm.
And that’s like, you could have session a call this procedure, you know, one microsecond before session B, but then session a immediately gets off the scheduler. And then like session B executes this code first.
And like that session wins. Like it ends up like not really being, no, they’re a million meaningful thing in terms of like, like, like, like, I feel like there isn’t really a well-defined order here that it’s meaningful.
So, so that’s why I don’t have any transactions here because it’s, it just seems it does not practically matter in like any business sense. Yeah. So then after square is run, we know that some session inserted, hopefully one row and there’s only one row.
And now we have our one row. We can just blindly update it. Yay. That’s how we’re getting our serial number document type. And on my low end machine, I have eight physical cores.
I ran 16 threads and I was getting about 40 million executions an hour. How many hours did you let it run for? 0.016 hours.
Ah, okay. So after scaled up. Okay. So, um, we don’t actually know in the, in the longterm that this would continue to be scalable.
I think we do, but Hey, if you, if you, if you want to run overnight and feel free. You have Azure. You’re the one who can just run things.
Do I have Azure? Yeah. You said you did. You said that was making it calm. That’s true. Yeah. You can’t take that back now, Joe. It’s true.
All right. It’s not like we can edit the video or anything. We must do our, we must do our closing statements here because I have to pick a kid up in about 10 minutes. So, um, I do have to finish this. All right.
So closing statements. Eric rudely challenged me to make this faster than a sequence, which I think is very unfair because I made you a better person. Sequences are custom designed to do one simple thing and very optimized for it.
Because, you know, tables have all kinds of baggage and locks and latches and all kinds of transaction logs and so on. Now I gave it a try.
I failed. I did get really close though. And the way I, I, I gave it a try was, you know, like given our very relaxed, uh, definition for how we’re assigning the ideas, we can actually hash the table.
So take the session ID. Modulo 24. So now instead of one row, which is being locked by everyone for the document type, we have up to 24 rows and it is way more concurrence.
In fact, it’s so concurrent to the point where if, if, if I changed the workload and only ran my new procedure, which uses the, the, the version that that has the hash key added to the primary key, there effectively is no waiting on latches or locks.
It is solely right log, which I can figure out how to do. Um, I, I tried some in-memory LTP and it ended up bombing out due to right conflicts. I feel, I feel like in-memory LTP isn’t really great here.
Cause you kind of, you kind of like need, you kind of need some lock or that latch. Yeah. Like it’s not like you want like snapshot isolation.
It doesn’t make any sense for this. No, no. Right. And, and, and the table didn’t needs to be written eventually anyway. Like it can’t be schema only or I don’t know what it’s called. Like it can’t be schema only.
So I don’t like, like maybe there’s some way to make a memory LTP worked. Uh, one thing I noticed in my very brief attempt was if you want it natively compiled, you, you like, can’t do like serializable hints or tab lock hints or it’s just not compatible.
Yeah. So it probably couldn’t be natively compiled and you had to figure out the right conflicts. And I don’t know, I couldn’t get that to work.
Maybe someone could, but with my row based attempts, I did get within like, what is that? Like, not like 93% of the. Buddy, you want me to figure that out?
Of the rate for sequences. Um, so 166 million per hour for my eight physical core machine. So I do think that we have achieved the scalability needed for the question and answer.
Well done, Joe. Well done. And on that note, uh, I’m going to go get a child from the outside world.
Uh, thank you all for listening and or watching to the bit obscene radio program. Uh, if you’re, if you’re sitting, uh, at home or in the car, uh, you can safely do it. Give, give Joe a round of applause for nearly beating sequence objects at their own game.
Well done, Joe. Do you have any, any parting words for us? Maybe, um, someone else’s hardware.
My code is, is actually faster. That’s what I’m, uh, coping with, you know, cool. Maybe, maybe it’s faster on your hardware. Maybe.
Well, we will never know because you have still haven’t put the table definition that you used in there. Uh huh. So as soon as, as soon as we have that information, the, the darling data foundry will, will put that code to the test. All right.
Thank you for watching. Thank you, Joe. And, uh, we will see you all next time. Uh, as long as our pistachio poppy sponsor continues sending us toothpaste so that, um, I don’t, I don’t know what I’m going to do with 10,000 tubes of pistachio toothpaste, Joe, but I’ll figure something out. All right.
Goodbye.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

bit OBSCENE Episode 11: Scalar UDF Inlining

bit OBSCENE Episode 11: Scalar UDF Inlining


Summary

In this video, Joe Obbish and I delve into the topic of scalar UDF inlining, discussing its relevance and impact on SQL Server. After some contractual disputes that temporarily sidelined our Bit Obscene podcast, we revisited the subject with a fresh perspective. I share my initial impressions from five years ago, noting how the landscape has evolved significantly since then. The video explores the challenges and limitations of scalar UDF inlining, emphasizing its fragility and the complex rules developers must adhere to for successful inlining. We also touch on the broader implications for existing codebases versus new development, arguing that while it might offer a free performance win for legacy systems, it’s not as compelling for modern database development practices.

Chapters

  • *00:00:00* – Introduction
  • *00:01:33* – Sponsor Announcement
  • *00:02:46* – Scalar UDF and Compilation Issues
  • *00:07:12* – Reporting Privately vs Publicly
  • *00:10:54* – Query Optimization Rules
  • *00:18:04* – CTE and Inlineable Functions
  • *00:23:05* – Table Valued Functions
  • *00:26:14* – UDF Inlining Constraints
  • *00:29:32* – General Take on Scalar UDF and Inlining

Full Transcript

Hello, and welcome to another episode of the Bit Obscene podcast. There were some contractual disputes between myself and the temporary guest co-host that I stupidly let form a union, but we’re all settled, we’re all resolved now. The union bosses have driven off in their Cadillacs, and we are ready to record again. So with that out of the way, uh, welcome temporary guest co-host and union benefactor, Joe Obbish, uh, to the, to the show. I have to rename Joe in a moment. I just noticed that. And, uh, in this episode, we’re going to discuss Scalar UDF inlining. Uh, maybe, maybe, maybe not the hottest topic on the block, since it is about six years old at this point, but, you know, there’s always, always room to discuss. So here we are. And, uh, Joe, take it away. I’m sure there’s someone upgrading to SQL Server 2019, and it’s brand new technology to him. I did hear we got some complaints about the sponsor. I don’t know if you’ve been able to get the, uh, pistachio toothpaste as a sponsor yet.
Uh, oh, well, would you like to do your proud demonstration? Uh, no, you know what? I’ll, I’ll, I’ll save that. If you would warn me that you were going to make me like use props in this, I would have go gotten, gotten, gotten it. But here we are. So here you are sabotaging me when I’m not sabotaging you. We need to make sure our sponsors get their, get their money’s worth, right? So, okay. So we are brought to you by pistachio poppy toothpaste. Uh, it tastes like pistachios. Uh, I’ve heard that nine out of 10 DBAs say that it improves their, uh, query tuning.
Yeah. Uh, it also makes you a good kisser. So it depends on, depends on your, depends on your priorities there. I remember, uh, what do you remember, Joe? I remember my first impression was scalar UDS. And I think I remember you like being really excited about it and like reading like white papers, like the research papers that were, that really wasn’t for me. Um, uh, uh, uh, I don’t like white papers so much, but we had tried it on a preview version of SQL storage 2019.
Yeah. Yeah. I remember it used to be cool and get all sorts of preview stuff. Yeah. I used to be a big deal. Um, mighty have fallen. And the thing that we noticed was we had a function. It was about like 50 lines long. I think, uh, basically a series of if this return this, if this return this and so on.
Um, and it added 30 seconds of compile time with scalar UDF and lining being enabled. Yeah. So as, as, uh, good SQL Samaritans, we tracked down the, the, uh, scalar UDF guy.
Yeah. Who I don’t remember anymore. And we emailed them.
Power thing. And I, I, I think this was in like January of 2019. So about a year before release. And he said that they had much bigger issues to solve than that, which to be fair. Turned out to be true.
Yeah. Like, I mean, sure. You know, like that could definitely be true. Um, personally, I think 30 seconds of extra compile is pretty bad, but yeah, you know, it’s sure he had other things on his mind.
Um, I think we, I think we followed up six months later and didn’t get a response. And we were like, well, whatever we tried. Um, yeah.
And SQL Server 2019 has glorious debut. Yeah. It’s full release. And some of our customers were going to use it and they wanted our guidance on, you know, should this be turned on or not.
And we found that our 30 second compile issue had not been addressed because, you know, there are more important things to address. And my colleague at the time, who was also a good, who’s also a SQL good Samaritan or a good SQL Samaritan or whatever.
Um, he had like a, I think like a blog post about it, um, which, you know, like as a database professional, I feel that it is good for the community to let them know your learnings and, you know, to maybe issue some warnings like, Hey, this brand new feature might not be quite there yet because it wasn’t, you know, 30 seconds of compile time is a long time.
If Eric and I just stopped talking for 30 seconds to the demo of that, it would be really awkward and boring for everyone. Yeah. Um, so we’re not going to do that. Um, anyway, some overrated consultants took personal offense.
Ooh, at my colleague’s warning and accused him of SQL cloud chasing, whatever. Oh, right. Yeah.
I remember that. I don’t know what that means. I’ve, I, I, I, I, yeah, I don’t know. Well, there used to, there used to be this website called cloud that you could use to, uh, sort of get your social media credit score.
Perhaps that’s what they were referring to. Could be. Um, it was close with a K. Oh, well, surprisingly, he, he, he wasn’t the only one. And there was a bit of like a, like a mob with pitchforks.
Oh, I think. Really? Do they all have, do they all have MVP after their name? Cause if so, they can beat it. It’s, uh, it’s possible.
Um, yeah, I, I, I, I think the critique goes, Oh, well, you should have contacted Microsoft. Microsoft. Right. Yeah.
Let them know. And, you know, they totally would have fixed it. Like, why are you. Yeah. People act like, like that, that doesn’t happen. And it goes nowhere. Like a number of things I’ve like, okay. So like the number of things I’ve content contacted Microsoft about that, they’re like nothing.
Just like crickets on, um, you know, like, like, like when Joe Sack was there, it was different. When Joe Sack was there, it could be like, Hey Joe, this is weird and interesting. Check this out.
And like, it might not get like immediate traction and might not get like fixed in like the next CU. Uh, and to be fair, Joe did backlog me on a couple of things, but like, for the most part, when I was like, Hey, this is weird.
Joe would be like, like, it has an engineer. I’d be like, like that was, that was the, that was like Microsoft action. I don’t know who’s there now, but I don’t know. It’s all fabricating themselves.
Oh, I’m sure if you had an MVP by your name, they would take you more serious. Yeah. It’s possible. Not all of us do. Um, so that was a very weird thing.
I, I had to play peacemaker of all, like me, of all people. And I was like, Hey, um, Hey, Microsoft person who is in swept up. And outrage, we actually like did let, you know, let’s like, Oh, like calm down and be like good partners.
And I think it got fixed in a senior too, which, you know, if, if, if Microsoft adjacent people want to extol the virtues of reporting privately and not publicly, it’s not really a, it’s not really a good, uh, case study.
I also realized that I totally told that story in the wrong way. I could have had like, like the, the, the dramatic reveal, you know, if, if, if, if I had started with the 2019 release, but, uh, Oh, well, we’re, we’re, we’re, we’re not going to record this again.
It’s, you know, no, no, we’re, we’re, we’re, we’re too far into it now for sure. Plus if, if, if, if we, if you get overtime, the union is going to make me pay you double for it.
So I can’t, can’t afford that these days. AI is, AI is taking everyone’s money. I believe I am sharing my web browser. Do you, do you, I see, I see, I, I do think you are doing that now, you know, to be fair, I only gave a first impression, which is like five years ago.
Um, that list has grown significantly over the past five years. Yeah, it has. Cause like, I remember like doing this page a long time ago and I don’t remember it being very long and now it’s like.
Yeah. Much longer. And it’s not even a complete list. You have to go to some KB, which I’m surprised they haven’t like deleted yet. Yeah.
There are some things in this KB that aren’t in there and so on. Uh, so, so the thing that really, I know what you’re going for. Yeah.
The thing that really, uh, sticks out to me is I feel like general developers have like, uh, have like a small bucket that, that you can fill with database knowledge. Some of them have a very, very small bucket.
And you know, like if I can, if, if I can only teach a developer, like a few things about databases per year, one of the things that I don’t want to teach is, Hey, if you want to make UDF, which can be inlined, it’s simple.
You just have to like follow these like 30 rules and then, then, then, then the queries to call it have like 10 rules and these can change over time. So, you know, be sure to bookmark this page and, uh, it’ll be great.
Right. Like, like, like, I want them focusing on like big, impactful, wide ranging things and not, you know, correct.
This, um, is like, you know, even if someone does make a successfully inlineable UDF, which can also be successfully inline based on the query definition, which the optimizer also chooses to inline when it’s compiling the plan.
Like I hope, like I want developers thinking in, in terms of sets, right. Like that, you know, oh, I’m going to arrange my declares and sets and my ifs and my else’s correctly to get inline.
Right. Yeah. But inline tilde-like functions, if I can get a developer to create an inline tilde-like function and not a scalar UDF. You have won a battle.
Yes. Yeah. Yeah. The thing is that like, it’s real hard to get developers out of the, like, I need a loop for this mindset or like, I need to return one value. Like I’m just going to write, like, like it’s real hard to break them out of that.
Um, and like, I, I, like a lot of the UDFs that I see out there are not complicated. They’re mostly created out, like out of some weird convenience. So like, you know, uh, like, like the most common one that I see is like, like date formatting.
And I’m like, why, why, but you, but it doesn’t have to be a scalar UDF. And like, like, like, is that really so hard that you need to like encapsulate it? It’s very, very weird to me.
And even if someone does successfully make one of these inlineable scalar UDFs, you know, someone else might come along and make what they think is a small and consequential change. And they could be violating one of these rules or Microsoft could change one of the rules or one of my, my favorite rule on the list is the one about CTE.
Oh yeah. Cause it’s, I was just about to get there. Or, you know, like the UDF isn’t useful unless it’s being used by, by a query and their rules for the query.
So someone could change the query, like, I don’t know, adding a CTEand suddenly none of your scalar UDFs are getting inlined and performance tanks or this set of rules could change. And remember, like, this actually isn’t the full set of rules.
There’s a KB article too. There’s a really, so like, there’s a funny thing about all this too, where it like go up a little bit.
So like where it’s talking about, like, you can’t use string ag and you can’t build strings like, like with the assignment thing. So like the, but so like there’s, there’s actually misleading in here too, because it says that the UDF doesn’t reference XML methods.
So XML methods are things like value and exists and nodes, right? But if you just use, and so like, if you build a string, like a comma separated string in a UDF with just for XML path without like values or anything else in it, it’s in lineable.
But if you use string ag, or if you use the string builder method to do it, it’s not right. It’s like, and stupid. It’s like, just like inconsequentially stupid.
I can’t, I cannot believe some of the weird stuff that they were like, no, you can’t do that, but this is fine. Like, it’s just weird to me. Yeah.
I feel like some of these are reasonable or you get over it. Like, yeah, some of them just like, okay, you get, you’re getting what you deserve if you’re doing it. And I haven’t done any like research here, but based on how this list is arranged, it feels like the list only gets longer over time.
And there’s never an update where they’re like, hey, good news. You can now use XML methods in your UDFs, right? The UDF doesn’t contain a select with order by without a top one class.
Like, um, it’s all funny. So in terms of new developments. Oh, nothing’s getting it.
It would be nice to, well, like in, in terms of new development by database developers or people writing TC code. I certainly feel like I, I, I can’t get to a place where I would say, yeah, for, for, for, for new development, if you want to use a scalar UDF, go for it.
Cause it’s inlineable and it’s like super easy and convenient and it works the way you want to write your code. Like, no, it’s, this seems so fragile. Yeah.
I think a fair word for it. Yeah. No, it’s, it’s, it’s fragile. And, um, you know, uh, excuse me. Like when I first heard about the feature, this list didn’t exist. So like, I was a lot more excited about it because I, cause like, I mean, like really like, like investment into a problem that big, like it’s a big problem space for a lot of, a lot of people who have like, you know, at SQL Server, like code bases get developed over this point, like 20, 25, 30 years back when no one knew how bad this stuff was or like, there weren’t enough rows to see how bad this stuff was.
And so when I first heard about it, I was like, I mean, like a, like Microsoft invested like serious time in this and it’s like, that’s a crazy problem to solve. But then like, you know, like, like everything else, once, once the, once the fine print starts making it out into the, into the, the documentation, you’re like, oh, well, I can’t, I’m not, that’s not going to help any, like even when I see it kick in, like there’s a good chunk of the time where like, and this isn’t scalar UDF end linings fault, but there’s a good chunk of the time where like you had a UDF, it was slow, but like when inlining kicked in, it got slower because like all of a sudden your plan was full of eager index spools off like giant tables.
And it’s like, like, okay, well, like I don’t know if the optimizer in UDF inlining, like didn’t shake, couldn’t quite shake hands on like, Hey, can you make me look good? No, I’m just going to keep doing the eager index pool thing.
I think the point you’re making is where I wanted to get to next, which is, when we think about the, the tens of billions of lines of old TC code, which are never getting updated. Yeah.
Maybe when you upgrade the SQL service 2019, you know, if you have a hundred UDFs, half of them get inlined, your overall CPU goes down, a lot of queries run faster, a few run slower, like maybe it just feels like a free win in that sense, if no one’s touching the code anyway.
And that does feel like the sweet spot for this feature, which, I mean, you know, don’t get me wrong, making old code better for free. Yeah.
Sort of magically. This is certainly good. But if you’re thinking about all, all the new development you plan to do, this, this really doesn’t seem to be a good fit. Uh, cause looking at the rules for the queries, like, you know, I mean like, like some of these, some of these seem pretty weird or easy to avoid.
Like I’ve like, personally, I know that group grouping sets exist. I haven’t used it in like 10 years. I can go another 10 years without using it.
I think. Yeah. Didn’t you have a question about grouping sets on stack exchange? Isn’t that an early question? Like, yeah. Like ancient history. Yeah.
Yeah. That was very old. Um, or like, yeah, like don’t order by UDF. Okay. Yeah. Fair enough. But I mean, like this one, I don’t know what this means exactly. Cause I did try nesting a few UDFs and it worked.
Like, yeah, I don’t know if this is a return for a start procedure or if there’s a long time ago, I wasn’t sure about that. And I’m pretty sure Paul gave an example of it.
Um, but I, I, I, I, I didn’t follow it. Like, like the code was just so strange. I was like, all right, you win. That that’s, that just is what it is.
Um, but one of these things is not like the others. Yep. Cause what this actually means is if there’s, if there’s a CT in the query, even, even if it’s like doing nothing, you know, select one from nothing, the CT isn’t even used in the query, it could be, it could be simplified away by the query optimizer and it is simplified away that will block UDF inlining a totally useless simplified out CT is enough to block inline.
I am pretty sure that a lot of the UDF inlining rules are almost just like, they’re not implemented. Like, I’m pretty sure this is like T-SQL linting for some of them. Like, I don’t, I don’t think they actually like, look at what the resulting plan would be.
Yeah. Like if, if, if I, I tried to get more information about this and couldn’t, if I had to make a guess, someone had some weird recursive query with a UDF and it didn’t work and, you know, Microsoft wanted to fix it in a CU and the simple fix was okay.
If it has a CT, then no inlining and that, that, that, that was done in CU 11. So it certainly feels like we got away with using CTE for a while until someone ran the wrong bizarre query and now no one can ever use CTE with inlineable scalar UDFs ever again.
Like, you know, we were talking about things like you can add restrictions, but can you please eventually get rid of some of them? Like, well, I mean, the CTEthing. Could this be like, maybe it doesn’t work if you’re doing the recursive CTE or if the CTE don’t reference the UDFs, it’s fine.
This is just like, so broad. No, that’s a, that’s a giant one, but, but I can, I can actually sympathize a little bit because people abuse CTE to the nth degree in like, in like, you know, the, the regular work that I do, like they are just beat up.
So I can imagine cases where like either someone, um, so I imagine like three, three overarching cases that would make it reasonable to do this. One would be sort of your example with the recursive CTE, especially if other UDFs are called in the sort of like the body of one of the recursive parts of the CTE.
The other is just like a bunch of like highly stacked CTE with a rather like, um, maybe it’s like complicated set of joins at the end. And the third is someone like with one CTE that they referenced like 50 million times and all of that stuff would do kind of like the issue you ran into with the compilation time, just blowing up the plan.
So I can absolutely imagine situations where inlining like arbitrary CTE blocks in a UDF would be nuts. Cause remember like UD, UDFs aren’t, aren’t just like, you know, like, like, like inline table valued functions where you have like one simple select that just returns a thing.
People probably had all sorts of crazy procedural code and like, you know, like, like, like maybe could like imagine someone with like 10, like if blocks of code in a scalar UDF and they all have like crazy CTE in them to figure out some values, uh, life would be miserable.
So yeah, I would say in response that there is some kind of compile based check and like, I would like to see that type of thing. Oh, I have a UDF getting used a hundred times in the plan.
It has five joins. QueryAthema isn’t going to add 500 joins to the query. So this UDF isn’t getting in line for, for this query plan. Like that makes perfect sense to me.
Um, I, I, I think you’re being too generous here. Uh, I don’t know, man. It’s like, like, like, like, you know, it’s, it’s one thing to look at this from the perspective of like code we would write.
It’s another thing to look at this from the perspective of like the corpus of SQL Server installs that Microsoft has to deal with as support cases. Like I, like, cause I see some nutty stuff out there.
It’s not anything I would ever write, but I see some real nutty stuff out there. And I can only imagine what Microsoft support gets hit with on some of these where you’re like, Oh girl, like, I’m lucky.
Cause at some point I can say like, like, like that is way too much to deal with. Um, but, uh, Microsoft, you know, that’s a, it’s a paid support ticket there. They, they got to do something with it.
Let’s show speech. It’s true. You almost had 30 seconds. I was, uh, compiling my response.
Um, it’s, it’s somewhat similar, right? Cause I think this is really broad. I think the examples you’re given could be pretty easily included in whatever heuristic they would want to use to make this less broad.
Sure. But instead what we get is see you live in which one even was that like 2020, 2021? I don’t know.
Click on the link. There is no link. Oh, well, I don’t know. Yeah. They don’t, they don’t really believe in links. Um, who writes this documentation? Linkless.
There’s all, I think that’s also a typo in there. All of the following requirement of the execution context must be true. It’s true. You know, if you were a good citizen, you were a good citizen, you would open up a GitHub issue and add an S there.
Not a good citizen. I wouldn’t want to be accused of chasing clout. Ah, well, yeah. Your choice.
This is why you’ll never be an MVP, Joe. Yeah. That’s, that’s why. Speaking of arrogance, I feel like Microsoft did this totally wrong and I have a better design and they should have done it the way I’m picturing.
All right. They should have done is whenever you create or alter the function at like creation time, the engine tries to compile an inline televalued function version and then like saves that internally.
Because if, if you look at how, how narrow this is, it certainly does seem to, to fit within an inline televalued function. Like you only have one return.
If, if, if, if, if you put it, if you put a while loop in your UDF, it’s not inlineable anymore. Like, right. That just doesn’t work. So, and then when you have a query that’s using the UDF, replacing it with a inline function is more of a parsing thing.
Like, is it allowed to actually do the replacement? Cause in, in some like edge cases, you can have a UDF, but you can’t have a sub career for a different reason.
And if, if, if, if, if it’s allowed, put it in there and see what happens. Cause there’s just so many, it’s just like so many weird, like, like, like, like, how are these even like, like, like, like, how do you get to where, oh man, string ag.
Like, well, you can’t use that. I always wonder about that one too. Like, oh man. Like the, the, the, the, the, the technological issues of making get data inlineable.
We, we just don’t have the, the, the technology. Like, I don’t know, maybe this is something like if you’re executing a scalar UDF has get date will return different values as the, as the function that executes, like, is it trying to maintain that behavior?
Do you think? I am, I am like new sequential ID. I understand a little bit, but get date is just a very strange one. That’s the only thing I can think of. Cause if you have a non inline UDF, then that’s, it’s a different context when it executes.
So when you get like different get date values is the, um, so yes, so yes. And I think, um, I, I, I wrote a demo recently about, um, RCSI and UDFs where, um, like things just get extra weird with that.
I guess not, not a good time. Um, but I can, so I can imagine the separate context of get date presenting issues in, in some circumstances, I guess.
Um, if there’s anyone out there who has a scalar UDF and they’re relying on their get date function within that scalar UDF, returning different time values as the query executes, please leave a comment.
Cause I w I would like to scold you, but I’m also interested in, uh, whatever weird thing you’re doing. Yeah. But I don’t know. It just, well, oh no, that would be the opposite of fixing it.
Nevermind. I get, I said anything. It’s forgotten. All right. We can move on. Let’s move on. Uh, well, I don’t, I don’t know where you can move to. I’ve said just about everything I want to say on this.
All right. So, um, uh, yeah, I mean, you know, my, my general take on scalar UDF and lining is I think it was a very good investment in a very hard problem that the, the, the realities of things, um, kind of just killed it a little bit.
Uh, like just to, just to beef up the content a little bit, uh, I’m going to go inside baseball on scalar UDF and lining. So your friend who didn’t respond to you for months about the 32nd compilation stuff, uh, this guy Karthik, I mean, it was like very smart, very nice guy.
No, no beef with him, but the whole scalar UDF and lining thing was, um, was basically like written, uh, down the street from you at the university of Wisconsin and, uh, they, they did all the like Microsoft, like, cause like Karthik’s like in with Microsoft, they have like research stuff or whatever, but, um, like, uh, so Karthik and his team at, at, at, at Wisconsin, like basically built this whole thing with Microsoft source code sitting there and then, uh, like basically stapled it into the engine.
Um, and then, um, when Microsoft got it back, it was up to them to sort of do their millions and billions of unit tests on it. And I think just slowly over time is, you know, like either support cases or those millions and billions of unit tests finished, uh, a lot of stuff ended up getting like, sort of just like cutting this and like, especially a lot of security stuff, um, I would imagine posed a problem, but like, like really like the number of limitations on it, um, you know, really, really messes up the, the scope of usefulness for the feature.
But I do think, oh, like at the end of the day, you know, it was a very good investment in a very hard problem and it does solve a pretty good set of problems for people, but you know, the number of UDFs I see that are still not in lineable or, you know, even if someone’s on 2019 or better, um, cause this isn’t standard edition, even if they’re on 2019 or better, they’re not in a compatibility level where UDF and lighting kicks in.
And so there’s just a lot of stuff that like, is sort of unfortunate, um, that it got gated behind, but anyway, that’s about it there. Do you have anything else?
I do remember hearing about the university thing and I vaguely feel like there might’ve been an opportunity to say hello during the development process, which we didn’t do. And man, if only, if only, if only, if only I had gotten in there, I could have whipped them in the shape, you know, but missed opportunity.
I did try the UDF with get date and query and the value does change throughout the query execution as we thought. So I suppose that get date thing maybe isn’t a technical limitation, but, uh, you know, there’s some dumb ass out there who’s, who’s relying on the old behavior where, you know, I’m getting a new get date every time I execute that, that function and the rest of us suffer.
Yeah. Cause the thing with RCSI was that every time the function executed, it read a different set of row versions.
So it is really like a completely separate context. Didn’t like every, like every iteration of it just starts afresh. It’s wild in there. All right. Uh, do you want to, do you want to call this one and then talk about the next thing?
Final comment. Don’t forget. Uh, nine out of 10 DBAs get inlining more frequently than they use. Disashio.
Based text poppy to a four or five, four, three, four. I don’t know what that does, but it might be funny. All right.
We good. All right. Thank you for watching this episode of the bit obscene radio program. Uh, we will see you in the next episode in which Joe will talk about and interesting insertion problem.
So, uh, we will see you then. All right. Thank you for watching. Well, I want this.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Resource Usage Overview

Free SQL Server Performance Monitoring: Resource Usage Overview


Summary

In this video, I delve into the resource metrics tabs of the full dashboard from my free SQL Server performance monitoring tool. I explain how these tabs provide more detailed information compared to the lighter version, including CPU utilization, TempDB usage, memory usage, and a selection of perfmon counters. The goal is to help you quickly identify when your server might be experiencing issues by showing real-time data and historical trends that can pinpoint specific performance bottlenecks. I also highlight how the weight stats section helps in understanding wait statistics, making it easier to diagnose problems related to resource contention and query compilation.

Chapters

Full Transcript

Erik Darling here with Darling Data and returning to talk more about free, completely free SQL Server performance monitoring. We’ve got, today we’re going to talk about the resource metrics tabs from the full dashboard. These do have a bit more than the light dashboard has in them because there are some extra things that are a little bit easier to collect and grab the way that this is set up than it would be with the light dashboard. Again, the light dashboard serves most people’s needs most of the time, but if you need a little bit extra, the full dashboard has a little bit extra in it.

So, we get some server trends up here. We collect waitstats, file.io, tempdb stuff, perfmon stuff, and these DMVs, they’re all cumulative over time, and one of the most painful things for me about being a consultant is coming into some server that has either just been restarted or has been up for like a thousand hours, and you’re like, well, you know, like either the server just restarted, it’s like there’s nothing useful in there, or like, you know, the server’s been up forever, and you’re like, well, I don’t know if these waits happened yesterday, today, you know, 900 hours ago, like who can tell, right? What in here is meaningful? What in here is consistently happening?

And so, it becomes a little bit harder to, you know, give people good answers on the spot about what their current pain points are, because the cumulative metrics can not only lie to you about, or either disappear if the server restart, lie to you about when they happen, right, their significance, or worse, some of them just sort of smooth out over time.

You consider like a bursty workload where, you know, you have like big spikes of activity and then nothing for a long time. It’s like nothing tends to stick out terribly, so you have to go looking in different places, right? But all of the stuff in here that we collect is pretty nice and useful, and I don’t know, I like it.

So, if you want to check out the monitoring tool, it’s at code.erikdarling.com. It is completely free, it is completely open source, you can see everything that it does and is doing, and it is a step up above a lot of the paid stuff that you will see out there in many ways.

Let’s go look over at the resource metrics tabs now, though. So, what we collect up here is CPU utilization, TempDB usage, memory usage, and some limited perfmon counters over here. This is a pretty good way to just get an idea of like when stuff might have gotten a little wacky with the server.

You’ll notice that there’s a spike in some of the perfmon stats lined up with some of the TempDB stats up there, so something neat happened, right? Memory’s been generally okay, the buffer pool got a little hot data injection down here, and again, if you hover over any of these things, you can see the percentages and numbers and depending on what they are, and all of this stuff just kind of serves to make your life easier evaluating performance on a SQL Server at whatever time you’re looking at it.

The weight stats section over here, what I’m trying to do that, I think I’ve mostly got right now, is the, like, I wanted to call out all of the poison weights, so like resource semaphore, resource semaphore, query compiled, thread pool, and then sort of like the known, like, group of weights that most people commonly see, like SOS scheduler yield, the CX weights, stuff like that.

And then also, like, the top 10 weights other than those. So this starts off with a whole bunch of weight stats checked and lets you see sort of, like, what’s going on. A lot of people just, like, are, like, it’s, like, heartened to see when, like, you know, like thread pool and all the other stuff are zero, because if you see those ones go up, your server’s not having a good time.

But again, if you hover over any of these, you will see, like, actual values for them. So you can see which weight was weighting and for how long. You can pick different weights to put into here.

So, you know, that’s, that’s all hunky-dory. And over here is tempdb stats. tempdb not terribly busy on my server, but apparently there was a fun little spike up here.

This is tempdb latency and this is tempdb space used. So, you know, like, just kind of, like, like, I have, like, unallocated up here and then actual usage down here. The legends will tell you exactly what each line color means.

So if one of them sticks up above the rest, it should be pretty easy to figure out. Down here is just read and write latency. For file I.O. latency, you’ll have reads and writes, and you will have this per database, right? So, like, all the legends in here, all the databases I have on my server.

And we’ll see that tempdb had a tough time at some point yesterday, right? And stackoverflow had a tough time at some point yesterday. The perfmoncounters one is one that I actually just made an improvement to.

And one that I’m pretty happy, what I’m pretty psyched about is, like, I always forget sort of, like, which, which groups of perfmoncounters I want to look at for specific problems. So this one is locking and blocking, right? So, like, if I want to, if I select a different counter up here, like memory pressure, this will change and select all the counters that have to do with memory pressure.

So you can, you can start figuring out, like, oh, crap, when did, like, when did something terrible happen, right? So stolen server memory is spiking up at various points. The next tab over is session stats.

Now, what this is essentially collecting is aggregated session metrics. A lot of people who I work with have the big problem with sleeping sessions doing terrible things on their servers. So what this, these lines show you are the total number of sessions, how many of them were actually running, how many of them were sleeping, how many of them were background processes, and how many of them were just idle for more than 30 minutes.

So, like, like this, I mean, this is pretty crazy on my servers, like 45 of them, like, I guess that’s, I guess that’s mostly SQL Server doing stuff and me doing other things. But, like, you know, a lot of wacky stuff going on on here. And then we have latch stats.

This is just, like, the regular, like, DM latches stuff. So, like, you know, stuff that you would expect to see if you were doing, like, a deeper analysis on a server that’s having problems and you wanted to analyze latches. We got all those collected here.

And I know, I know, it’s never spin locks, never be spin locks, right, but it’s always nice to be able to prove that it wasn’t spin locks. So I have a graph that will let you very, very quickly absolve spin locks of any potential involvement in a query performance problem. That one time every 10 years when it is spin locks, you can also do that.

But the 999,999 times it is not spin locks, you can prove that, too. So this is just a quick overview of the stuff that you’ll find under the resource metrics tab in the full dashboard. Again, the Lite dashboard does not quite have all of that stuff.

You know, it does capture CPU, memory, file I.O., tempdb, but, like, you know, it just doesn’t capture the full spate of stuff. This is meant to be lighter weight and not have as much getting collected, but still give you all of the most important stuff that you would look at, right? There’s, like, not some, like, the deeper dive stuff, but all the really important stuff that you would care about is in here.

So, anyway, I hope you enjoyed yourselves. I hope you learned something. I hope that you will try out this amazing free, completely free monitoring tool that does way more than most paid monitoring tools ever did. It’s at code.erikdarling.com.

Again, go to the releases section. That’s where all the zips are. That’s where you can grab it. That’s where you can, you know, run everything from. And, again, if you open up a zip and you don’t see the executables in there, it might be some security software blocking them. You’re just going to have to check to unblock them.

Anyway, thank you for watching, and I’ll see you, well, I guess this might be Friday’s video, so I’ll see you Monday for Office Hours. Monday, Monday. All right.

Have a good weekend.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Query Performance Tabs, Including Query Store

Free SQL Server Performance Monitoring: Query Performance Tabs, Including Query Store


Summary

In this video, I dive into the query performance tabs within my full dashboard, a free and open-source SQL Server monitoring tool. I highlight key features such as the QueryStore subtab, which provides detailed insights into query execution plans and regressions over time, helping you identify and address performance issues more effectively. Additionally, I explain how the custom trace patterns tab offers valuable data for safeguarding against overlooked performance anomalies, ensuring a comprehensive view of your SQL Server’s health.

Chapters

  • *00:00:00* – Introduction to Query Monitoring Tool
  • *00:07:17* – Detailed Execution Plans and Metrics
  • *00:10:29* – Conclusion and Future Updates

Full Transcript

Erik Darling here with Darling Data, bringing that big monitoring tool mogul energy that the SQL Server world was so clearly missing. I want to talk today about some of the query performance tabs that are in play, at least in the full dashboard. The full dashboard actually has one extra tab that the light dashboard does not, and that’s the query trace patterns. So that’s, like, sort of a custom trace that I spin up in the background. It’s, of course, completely optional. You can turn it off, you can get rid of it if you don’t want it.

But something that I use to sort of, like, as a sort of, like, safeguard to, like, make sure that I’m actually getting some stuff that other things might be missing. But it collects from all the normal sort of data sources that you would look at when you’re doing a performance analysis. So, like, you know, like the query stats, procedure stats, trigger stats, function stats.

It goes to QueryStore. Now, a lot of paid monitoring tools out there do not touch QueryStore. And quite frankly, the vendor should be embarrassed, right?

It’s 10 years old and you haven’t put it in your product yet. Just close down, right? Just stop, right? Like, you don’t care, right? You’re coasting, right? It sucks and people pay you for this, right? So they, like, made me angry enough to do all these things.

So the tabs that you’ll see in the full dashboard are these. You’ll see all these except the query trace patterns one in the light dashboard. I mean, slight differentiation.

This is just something that’s a little bit easier to manage with the store procedures and stuff that I have in the full dashboard that run everything than it would be to do with the light version. So just a small difference there.

But like I said, like, all the query grids that you see will have a pop-up, like a drill down menu. So if you double click, you can get more information from them. There’s also all sorts of stuff in there that you can do that will give you sort of an experience, like looking at the QueryStore GUI, where you can, like, change metrics, except I don’t actively hate you and I want you to enjoy looking at things.

So, like, you know, I try to be a little bit kinder. Anyway, the QueryStore subtab, we also have a QueryStore regressions tab, right? So there’s one that just looks at general QueryStore stuff and one that points out queries that have, like, regressed in some way over, like, a period of time.

So you do get some of that, like, like, oh, cool. Like, like, yeah, that thing did get worse. Like, I should go look at that.

And then you can, like, see all the plans for stuff that got worse. But a lot of the DMVs I discovered are just sort of cumulative over time. So if you just keep logging data from them, like, the numbers just kind of keep going up.

And if you, like, it sucks, right? Because you just keep, like, oh, yeah, it’s like, oh, my God, it was 60. And now it’s 60.

And now it’s seven. Ah, crap. So I calculate deltas for some of the DMVs that are cumulative. Like, QueryStore doesn’t really have this problem, but some of them do. And you have to do some math.

And I got some help with the math because I am most assuredly not a math person. Where I do some sort of per second rate normalization, which means that I basically divide the change between one collection and another for a query or a procedure or something. And sort of, like, they divide it by the elapsed seconds between collections, which gives me sort of a consistent line, even if collection intervals vary a little bit.

But it makes comparisons a little bit more meaningful when you’re looking at time ranges. So, again, if you want to check all this out, it’s at code.erikdarling.com. It’s in the performance monitor repo.

If you want to download it, it’s all going to be in the releases section. That’s where all the zips are that have the EXEs in them. So, you can go in there.

You can get it. You can try it out. And, again, if you go to unzip it and you can’t see an EXE, sometimes that’s Microsoft security stuff. You might have to, like, unblock it to get it to show. But let’s go look at the actual dashboard because I’ve been talking about looking at PowerPoint too long.

My eyes hurt. So, when you first go in to look at queries, you’ll have the performance trends graphs. Apparently, I really am turning into a developer because I have not executed a stored procedure in 24 hours. I don’t know what’s wrong with me.

My head’s spinning. But, sort of behind me here under this armpit, this is where you just see sort of query executions a second. So, like, spikes and, like, when, like, activity really ticked up. That’s this graph over here.

Apparently, we had a real high point at 4.5 a second. Whew! Yeah, I’m thrilled with that. And then we have query durations from, like, the plan cache over here and from query store down here. You can see that there is some stuff that the plan cache got that, or rather, the query store got that the plan cache didn’t, which is fascinating to know about, right?

Plan cache is unreliable. That’s why I prefer query store. This is why monitoring tools that don’t touch query store should go just, like, head in the oven goodbye.

Over here is the active queries tab. This basically just runs spwhoisactive and logs it to a table. You know, like, all the mechanics of that are boring.

Ah, crap. Sorry, add a mechanic is not boring. The mechanics of how that works is boring. So, if you want to look at what’s currently running on a server, you have a current active queries tab. So, if you push the refresh button, this will actually run one of the queries that I use in sppressure detector to just, like, execute against the server, grab whatever’s running now, and pull it back.

This is just a little bit easier, because, like, if I wanted to do that for spwhoisactive, it’s, like, you start thinking about, like, exposing all the different parameters, and then you have, like, a dynamic column list, and, like, hey, it gets to be too much. So, like, I just wanted to give people, like, a fairly simple, like, what’s running right now query to sort of back up what shows up in spwhoisactive. Again, there’s nothing running on my server right now, but, hey, I pushed the button and it didn’t crash, so we’re in good shape, right?

We have the query stats over here. This is what’s pulling from the plan cache. You know, you can sort this any way you want, like, up and down by any of these columns.

If your, you know, code came from a stored procedure, which clearly mine in the last 24 hours has not. I am fully ad hoc developer. You know, the object will show up in here.

But, you know, you get all the sort of information that you would want from all this stuff, right? Like, you know, like, CPU, reads, memory, DOP, like, everything that you would want. The query text.

And then if you go and you right click on something in here, right? You know, I showed this yesterday, but I’m so excited about this, I’m going to show it again. If you go in here and you say, like, view plan, you can see the estimated plan for something, right?

And, like, again, like Microsoft making that MS SQL plugin for Visual Studio Code really made all this very easy. Because I had all the assets I needed for the icons and all the XML information I needed. And so, like, I could get, I could really just plug this in basically.

If you come over here and you click get actual, I’m going to do this for a slightly less annoying one. Say get actual plan. Then this thing will execute the query against the server.

It will throw away the results and it will produce an actual execution plan, right? And so, here it is. And, again, you get all of the execution numbers against it. Now, something that I’ve, you know, I’ve said Microsoft should do for a long time.

Is when you get an actual execution plan and you have, like, all they show you are the durations. I also expose the CPU time, right? Because you get CPU counters per operator as well as duration per operator.

So, I expose CPU time in there as well. Just to zoom in a little bit closer so you can see, right? Like, the top number here is the normal duration number that you would always see.

But underneath you have the CPU time, right? So, like, this, this operator, you know, let’s just pretend it ran for one and a half seconds. The CPU time is 1.13 seconds because we had some time spilling, right?

So, that’s what that indicator up there shows you, right? That we, like, you know, something happened. So, we have that and then coming back to queries, though.

Procedure stats, again, I haven’t run any procedures lately. So, that’s empty. Query store shows you all the query store data. Now, if you click on a query row in here, like, we can saw that one had, like, 60 executions for it.

This right here is what you get when you click in and drill down. So, you can see, like, all of the, like, per execution things in there, right? So, every time this query executed, like, how long it, like, all the different execution metrics.

This query apparently hasn’t changed really since at all since it did anything. So, that’s nice. But up here in the top corner, this is where you can change sort of how the graph shows things, right?

So, you can change it to any one of these and you’ll get a slightly different view of it. There’s no difference between CPU and duration. So, not a lot of fun to look at, to be honest.

But, the drill downs are the important thing here. When you double click on them, you get all of the query executions, right? So, that’s important, right?

You might want to see more than one thing there, right? But then, coming a little bit further, this is query store regression. So, you can see, like, some query that did worse over time, right? You can, again, you can drill down into any of those.

And then, query trace patterns is just stuff that I capture with my sort of special, like, performance trace of, you know, bad queries. Anyway, that’s sort of an overview of all the different query tabs in there. Again, if you want to check out this completely free open source SQL Server monitoring tool, it’s at code.erikdarling.com.

Again, go to the releases section, download a zip. There are EXEs in it, I promise. If you don’t see them, your security software is messing with you.

So, you should do all that. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where I will continue talking grandiosely about this wonderful monitoring tool of mine. It is completely free and should hopefully put every other monitoring tool company that makes you pay for their garbage out of business.

Alright, thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Dashboard Overview: Server Health, Query Plans, and More!

Free SQL Server Performance Monitoring: Dashboard Overview: Server Health, Query Plans, and More!


Summary

In this video, I delve into the exciting world of free SQL Server monitoring tools, specifically focusing on a dashboard I’ve developed called Performance Monitor. This tool is designed to make life easier for DBAs and developers by providing real-time insights into server performance, resource usage, blocking issues, and more. With a user-friendly interface that includes various tabs like an overview, resource metrics, and query analysis, the dashboard aims to simplify complex monitoring tasks. I also highlight some of its unique features such as the plan viewer, which leverages Microsoft’s SQL Server extension for Visual Studio Code to provide detailed execution plans with actionable insights. Whether you’re a seasoned DBA or just starting out, this tool can help you keep your SQL Server environment running smoothly and efficiently.

Chapters

  • *00:00:00* – Introduction to SQL Server Performance Monitoring Tool
  • *00:07:39* – Detailed Walkthrough of Query Tabs
  • *00:15:00* – Downloading and Installing the Tool
  • *00:16:55* – Exploring the User Interface <– Thanks, AI
  • *00:21:48* – Advanced Features and Future Developments <– Thanks, AI
  • *00:25:37* – Conclusion and Thank You <– Thanks, AI

Full Transcript

Erik Darling here with Darling Data, here to talk more free SQL Server monitoring, make all the people of the world happy, or at least the SQL Server people of the world happy, as happy as they can be, being SQL Server people. So, I want to give a little bit of an overview of the dashboard. I’m also going to, at least at current, you know, when this video publishes, these features might still be in dev, but if you’re watching this well after the fact, they’ll, you know, just look at the date on the video, you can see some cool new stuff that I’m going to be releasing. So, the full dashboard has a bunch of different tabs in it, because people love tabs, graphs, and all that stuff. I’m not saying that, like, you know, it’s for everybody, but, you know, a lot of people who like monitoring tools like this sort of thing, or need monitoring tools, they’re like, oh, look at that pretty line that shows me where everything went wrong. So, I have, like, an overview, a bunch of stuff in it, of course, a lot of stuff about performance, resource usage, how memory is getting used, a bunch of stuff about blocking and deadlocks. And then, my store procedure, SP Health Parser, which goes into the system health extended event, populates a lot of system events, but I’m going to tell you something, if these graphs are empty, it’s a good thing, right? Because you don’t want, like, ideally, like, you want the system health extended event to be like, I got nothing, we’re healthy, we’re good. No news is good news.

Also, you know, kind of, you know, some standard stuff, like a Nox style landing page, sidebar with all your servers in it, and you can pick favorites if you want. Everyone has favorites. And, of course, a whole bunch of ways to alert you. There’s email, there’s, you know, like, little pop-up alerts, stuff like that. There’s also, like, little alert badges on different charts and graphs when we find bad things. So, like, the Nox style health cards, they basically just tell you, like, kind of like a short story overview of what’s going on with the server. Is it up? Is it online? Like, CPU usage? Is there blocking? Stuff like that. And then all of the, like, the overview tabs that you see, like, all the other tabs that you’ll see going through the dashboard will break out other resources, usage, queries, stuff like that.

There are a bunch of UI controls that you can use to look at different time periods. I tried to put some simple ones in there that help you sort of figure out, you know, just, like, last one hour, 12 hours, 24 hours, week, 30 days, stuff like that. Because, like, no one likes fiddling with, like, you know, like, calendar pickers for everything.

There are calendar pickers in there in case you need to fiddle with those things. But in general, people kind of want, like, like, show me the last day or something. Like, show me, you know, the last, like, since, what happened since this morning.

And there’s also a bunch of stuff that you can do to get data out of there and, you know, things like that. So, like, when you right click, you can, like, export stuff to CSVs, you can save graphs as pictures, you can copy cells and rows. Again, you get, like, system tray and toast notifications when things go wrong, like, in the middle of whatever.

And there’s also a way via the dashboard to edit schedules so you can manage how things, like, what cadence things get collected at. I put some pretty aggressive defaults in there because I wanted, like, it’s like, like, let’s start, create, let’s just get stuff. And then you can tone that down based on your environment if you want.

Like, I have some stuff that I would be like, look, if we’re going to figure some stuff, if we’re going to figure out some performance issues on your server, we’re going to have to collect some data, right? Because you can’t see a problem, you can’t solve a problem. But, so, I start pretty aggressive there.

But if you want to tone that down, you are absolutely welcome to. If you want to check out any of this, go to code.erikdarling.com. Everything is in a repo called Performance Monitor. The way to download it, because I’ve gotten a lot of questions about it, the way to download it is you go to the Releases page.

It’s a pretty easy to find link at the repo called Releases. There are zips for everything. There’s the Lite dashboard bundled separately, the installers, and, of course, the full dashboard, which is what we’re going to be looking at today.

So, if we come over here and look at stuff, this is basically what you’re going to be greeted with when you first open it. It looks like this. So, you have these server cards that show you sort of everything that you kind of want to know just at a glance about a server.

Like, what’s CPU? Like, how many threads are in use? Like, what’s memory allocation look like? Ah, I got a little thing there.

That’s hanging off the edge a little bit. I’ll fix that. But this tells you sort of the difference between, like, the buffer pool and query memory grant. So, if you have queries using a bunch of memory on there, it’ll let you know.

Also, like, when the last time you had blocking or deadlocking was, and if your collection health is okay. So, that’s all that stuff. There’s also an alert history tab.

So, if, like, I haven’t had any alerts fire off in the last 30 hours, but if I change this to seven days, we’ll see, like, a history of alerts that went out. You know, you can dismiss these and, you know, do what you want with them. But, you know, I just think it’s nice when people say, hey, like, hey, like, maybe I was away from my computer for a little while.

Maybe I have an email, like, let’s set up email alerts. Maybe I just look at what fired off that I might have missed. But once you dig into a server, once you open up one of the server tabs, this is what life generally looks like.

You know, like I said, there’s a resource overview there. So, you get, like, CPU. And if you hover over this stuff, you can see all the little numbers and, like, you know, what was going on and what your weights were.

So, like, just sort of a general overview of things. There are other tabs. They’re not going to have anything interesting for me in there right now. But, like, you know, like, this will give you a summary, just a one-line summary of, like, if your server is in normal health, what your total weight was and how much wait time was spent on it.

How many, like, you know, like, long-running queries and deadlocks and blocking and stuff we found for that. And then, you know, like, I’m going to be expanding on this. But this gives you sort of a breakdown of, like, critical issues on the server.

It basically runs the code from my perf check procedure and looks for terrible things happening on your server and reports about them here. I have some stuff from the default trace in here as well. So, like, some of the, like, high-level stuff that comes out of there.

Also, a tab that tells you about your current server configuration. That’s at the server level, database scope configuration stuff, and any trace flags that someone might have enabled. This will log configuration changes.

So, at the server database and trace flag level, if anyone has changed anything about the server, that’ll show up in there because we’re just checking to see, like, every day or so if things change. Collector health tells you how things are going, like, if anything’s failing, getting messed up. And then the running jobs tab.

This tells you about active agent jobs and stuff like that. So, like, if you’re, like, come in and you’re, like, why is CPU crazy? You can come in here and look. You can say, oh, well, this, you know, this stupid index maintenance job that the dummy DBAs won’t get rid of is screwing things up. So, I’m going to start backwards a little bit here.

System events. Again, this is all the system health stuff. And, again, you don’t want to see things in here. Like, I have some stuff in the severe errors tab because, like, you know, it’s killing queries at some point. But all the other stuff on this server, at least, is pretty clean because I’m not, like, I just not collect, like, there’s just nothing to collect in here.

The locking and blocking tabs. This will talk about, I mean, I don’t have anything in the last 24 hours, but I do have some lock waits. I just didn’t have any blocking.

Right? So, like, I don’t, like, there’s nothing showing up in here, which is fine. That’s okay for me. But down in here, I’ll get these populated when I’m going in closer on those. But then, like, this will be the deadlock XML report.

This will be the block process report. And this is just sort of blocking trends where there’s, like, you know, how many blocking and deadlocking events we had and then, like, the duration of them. Under memory, there’s all sorts of stuff about memory grants, memory clerics, the plan cache.

If there was any memory pressure going on in the server, it would get noted here. And then, under resource metrics, there’s some neat things, right? So, this is, you know, like, CPU, 10 dB, and, you know, like, some perfmon counters in here.

Just to kind of get an idea of, like, server busyness, like, you know, anything weird going on. I got wait stats over here so we can, like, you know, get a good idea of, like, you know, sort of, like, how, like, did anything spike up? Is anything bad?

Like, what should I look at? What, you know, what time frame is, like, would matter to me? You know, like, usual 10 dB stuff, 10 dB usage by different consumers, 10 dB latencies, like, all the file I.O. latency that you’d be used to seeing. You can see, like, the different database files down here and the different log files, different database files here and here.

And, you know, this is just read and write. Under perfmon counters, something that I just added, which I’m very, very proud of, because I am always forgetting exactly which perfmon counters I should be using for different things.

I’ve added these sort of, like, perfmon packs or, like, perfmon templates. So, depending on what you’re looking at, right, you might care about, you know, like, memory pressure, CPU, I.O. pressure, 10 dB, locking and blocking, stuff like that. So, if you want to explore a specific scenario in here, you can change these and it’ll select all of the perfmon counters that, you know, you would normally look at for this without you having to go through a whole list and click all of them.

Because that’s annoying and you’re going to forget and you’re going to miss one or, you know, I don’t know, maybe you do this when you’re drunk and forget stuff. But, you know, there’s just all sorts of things in here that make sense to look at and, you know, just to dig in from a perfmon level. This is on the dev branch now.

This will be out sometime this week. No promises because I’m still kicking stuff around. But the thing that I added in here that I’m real, real proud of is the plan viewer. So, Microsoft released the MS SQL extension for Visual Studio Code, which is MIT licensed.

And it just happened to have, crazily enough, a whole bunch of assets that I could use for graphics and also for, like, you know, making sure the XML is all in there correctly. Right? Like, the XML is all documented.

I can make sure that what I get from the XML is exactly what Management Studio would get from the XML. And I can add my own stuff to it. I am a hog in heaven because all the, like, you know, plan analysis and indicators and stuff that I would want to see when I’m looking at a query plan, I can put into these query plans. Right?

So, check this out. Right? So, like, if you right click on any one of these rows and you say get actual plan, you’ll get a little warning. It says you’re going to execute this query. Now, this does something that I always loved a lot from SQL Sentry, which was you can execute a plan.

It’ll throw away the results, but it’ll give you the actual execution plan. If we zoom in a bit, you’re going to see that it has all the execution numbers. I break things out a little bit further.

So, you get the duration number that you’re used to in SSMS, but you also get the CPU numbers, right? So, if we look over here a little bit, we can see, like, this was 633 milliseconds of wall clock time, but 3.366 seconds of CPU time. Now, if you’re looking at the little yellow bubbles up there, those are parallelism indicators.

I chose not to put those over the icons for one reason. If we have multiple things on a plan that I want to show you, like, down here is a good example, right? We have, sorry, the zoom is a little slow on me.

We have a warning, and we have the parallelism indicators. You can see them both, right? So, it’s not like, oh, was that parallel or not? Is it just a warning? What happened?

Right? And if, like, but, you know, this plan is kind of boring, right? Now, we have, like, a, you know, missing index suggestions up here. We can actually see both of them, right? Isn’t that crazy, right? You can see all the missing index suggestions.

It’s wild. But the real thing that I wanted to show you is in this plan, right? You know, this thing runs for about 20 seconds. So, I didn’t want to sit there and make you wait for 20 seconds. But this has two things in it that I have been dying for SSMS to have that Microsoft just wouldn’t give you, right?

And that is, if we zoom in a little bit, right? Let’s get this over here so it’s sort of a little bit more out in the open. We have an eager index spool.

And if you hover over the eager index spool, it has the index that you should create to get rid of the eager index spool, right? So, like, there’s a warning here that’s like, yo, SQL Server is creating an index for you. Here’s the index it’s creating.

If you go create that, this thing goes away, right? And then over here on the clustered index scan, because something I always talk about when I’m talking about eager index spools and parallel execution plans, is that they make, is that like when the spool is getting built from the clustered index scan, SQL Server only uses one thread to do it.

So, when you get an actual execution plan and you have skewed rows on parallel threads, I can warn you about that, right? Thread 5, process 100% of the rows, right? This is fantastic.

The amount of stuff that I can put into this to make your lives better and easier is going to blow your mind. I have way more stuff lined up. So, you know, keep an eye out and all that. But otherwise, these query tabs, you know, they kind of give you the normal set of information that you’d want to get.

You know, like all this, like, you know, sort of like resource uses for the queries, the query text. You know, if you want to just say view an estimated plan for something, you don’t have to get the actual plan. You just click on view plan.

You get all that. You can save the plan if you want, right? Like all this stuff that you can, like, you know, normally do with SSMS. And this should be a pretty SSMS-like experience, right? At least from viewing it, like the perspective of viewing an execution plan.

Another thing that I like that I could put in here was warnings on filters, right? And, you know, stuff like, you know, like filter up, like just warning you, like, hey, if there’s a filter, you’re sometimes leaving this, like, you know, like, granted, this is like a query store query. So what am I going to do?

But like I can warn you, like, hey, you know, that filter operator that discards rows, I have to like way later than you maybe want to. So I can like bring, I can point that stuff out. So this is just a small, like a, like intro to some of the stuff that I’m building into this. Again, it’s totally free, totally open source.

I have received one single $25 contribution from some nice person who was like, wow, you’re doing a lot of hard work on this on GitHub. So thank you, JiHuan. I apologize if I said your name wrong.

If you’re out there listening, you can give me a phonetic typing of it somewhere in a comment, maybe. But anyway, I’m getting pretty happy with this, where it’s going. Again, I’ve got a lot of other stuff lined up that I’m not quite ready to talk about yet, but soon enough, soon enough, soon enough.

Anyway, again, if you want to check this out, it’s at code.erikdarling.com. You can download it from the releases section. It’s where all the zips are.

If you download a zip and there’s no EXE in it, sometimes that’s Windows blocking stuff. You might have to unblock that, but it should all be pretty easy and straightforward once you go through the readme, which I have carefully crafted to make your life much, much easier.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video, where we will talk some more about this wonderful, free SQL Server performance monitoring tool. All right.

Thank you for watching.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Full Dashboard Installation

Free SQL Server Performance Monitoring: Full Dashboard Installation


Summary

In this video, I delve into the installation process of my free SQL Server performance monitoring tool, addressing a common issue where users have questions about setup despite having access to detailed documentation. I explain that there are two main installers: one for command line use and another with a graphical user interface (GUI). The command line installer offers various options such as reinstalling the database, choosing encryption levels, managing server certificates, and handling password input in different ways. Additionally, I cover how the tool runs through 52 installation scripts, including community dependencies like SP WhoIsActive and my own scripts for data collection and analysis.

Chapters

Full Transcript

Erik Monitoring Tool Mogul Darling here, back to talk to you about, well, my free tool that monitors your SQL Server’s performance. And we’re going to talk a little bit today about installation because I’ve actually moved this up in the lineup because I have gotten a bevy of questions. I have a readme file that documents all this stuff very nicely, but I keep getting questions about installation. things. People are, like, impervious to reading, right? Not even comprehension just won’t do it, right? So here’s a video where we’re going to talk about installation options. So there are two installers. There’s a command line installer and there’s a GUI installer. That’s a graphical user interface. The command line installer has options where you can choose what you want to do. Like you can reinstall, which will drop your old database and make a new one. You can choose your encryption level, right? Which that’s optional, mandatory, strict. You can choose to trust or not trust the server certificate. And you can choose how to hang, how to put in your password. There is an environment variable that you can use if you’re uncomfortable sending your password across a connection like this. You can also use Windows authentication. There are many wonderful things that you can do with this.

I promise you. I promise you. I promise you. I promise you. It’s there. Wait. Those words just waiting to be read. But the performance monitor database runs through, well, 52 install scripts and then some extras, right? So I’ll show you what that means in a minute. It will also go and grab some community dependencies, right? So, you know, like I wanted to make sure that this was a very community oriented tool and that people would get some behaviors and some results back that look like what they might find in community tools with a whole lot of pretty like graphs and charts on top of them. All of these collectors run off agent jobs. The collection job runs every minute. Not every collector runs every minute, but the like some of them do, some of them don’t. But the collector agent job runs every minute, goes through and figures out which like what schedule each collector is on. And then if that schedule meets the criteria, then it will run them.

There’s a retention job that will run at 2 a.m. And look for old data to get rid of. And then there is a third agent job that looks for the monitoring tools agent job being stuck. And if it’s stuck, it’ll kill it and say you did a bad job. You have to start over again, right? So that’s how that works. Behind the scenes, we create a performance monitor database. That’s where all the tables and views live. This is again for the full dashboard. The light dashboard is very different. We talked about that in the previous video.

And then all the collector store procedures that go through and look at your metrics and job monitoring, those will all get installed. There’s a there’s a neat store procedure that calculates delta frameworks, right? So we need to like, like, like, like charting raw data is kind of weird. So we do all this. We I do all this in my head. It’s like, what’s the delta?

There’s a store procedure that does all the delta calculations for things that we want to graph so that we can we get an accurate view of what change so that the spikes in the graph makes sense. Again, the community dependencies we SP who is active goes in there all the the SP blitzes all get installed for you and all of my scripts. And this is just in case you want to use them. The performance monitor does use some of them for some stuff like SP who is active goes for the active queries.

Of course, SP blitz lock, which I wrote most of at this point that shreds the the deadlock XML and then my store procedures, human events block viewer and health parser also do some work in there. Again, we have the three agent jobs responsible for collection, data retention and looking to make sure that those jobs don’t get stuck. There’s a bunch of like metadata tables that keep track of like installation history and stuff in case you’re on the upgrade path and then but 55 scripts in total get executed.

But there are three of them that are sort of not involved with that. But let’s take a quick look at what the installer looks like. So this was this is this is an installer run that I did where it just completed.

It’s not fun to watch like, you know, a log fill up live, but I can I can hit install again and we can we can watch it go. But this was SQL 2016. I keep instances going back for all the supported versions so that I can test stuff and make sure it works.

And on this main screen is where you put in the server you want to connect to you want you choose between Windows and SQL off. And if you, you know, I use SQL Server authentication because I’m lazy and I have VMs. And this is also where you would use the SSMS like settings about encryption and trusting the server certificate.

So this just like the monitoring dashboards uses Windows Credential Manager. So you are not getting anything like like this is the same connection methodology as SQL Server Management Studio. If you don’t trust SQL Server Management Studio, fine.

But if you do, if you like, OK, SQL Server Management Studio, you can connect to my SQL Server. This uses the same thing behind it. So there’s nothing weird.

There’s a big warning down here, right? The big warning says if you perform a clean install, we are going to drop the performance monitor database and start fresh. There are a variety of reasons you may want to do this.

If you’re on the upgrade path, you probably don’t want to do this. But there might be a reason for you to, right? There’s all sorts of things that you might don’t want to say, let’s just start this over with. So there’s that.

There’s a somewhat newer option to reset the collection schedule to recommended defaults. The reason why this is here is because as I’m working with the monitoring tools and I’m kind of tuning and tweaking things, I might change the collection schedule now and then when I find that certain things work better at certain cadences.

And so if you want to use the cadence that I recommend, you can hit that button and you’ll get the schedule that I recommend. You are free to not check that button and keep your schedule. I am totally cool with whatever you do with it.

And then there’s another checkbox here to run some validation tests. The validation tests basically just run all of the stored procedures in there a first time to make sure that everything runs without error. Because if you run into something there, then you’re going to run into that probably every single time the collector runs and you don’t want to do that.

So you want to make, if there are any errors, report them on GitHub. Let me know. I will do my best to fix them. But down in here is where we sort of walk through everything getting created and all these things happening.

And at the very bottom, like this is like, again, just, you know, installing up, like installing all the stored procedures. Down here is where we start installing the community dependencies. Right. So everything that, you know, who is active, my darling data, and then the first responder kit scripts.

And then this is where we execute the validation thing. And if anything weird happens in here, well, let me know. I will do my best to fix it.

And then, of course, you get, you know, all of this sort of, you know, stuff in here. Right. So if anything happens, it tells you where the log is. So if you get an error and you want to report it to me on GitHub, you can include these helpful details.

I would love it. Like, you know, more details are better. So put everything you can in there.

You can maybe leave out like your, you know, computer name, like whatever, whatever this is. But, you know, everything else in there is good. And then you will get sort of like an installation summary over in the corner that I’m helpfully standing in front of.

There are other buttons. There’s the, oh, I got to move this way. There’s the install button.

There’s the troubleshoot button. And there’s another button that says view report, which will bring up a text file with all of your log information in it so that you can share it with someone who might care to troubleshoot it like me. There’s also a close button over under this armpit.

Hold on. This armpit where it says I’m never going to get this mirror camera thing right where you can close things. And that’s how you exit the installer. So that’s pretty much it there.

Again, all of these options are available in the command line as well. And I highly recommend that you read the documentation about these things before emailing me the same question over and over and over. All right.

Thank you for watching. And I’ll see you in tomorrow’s video where we’re going to do a bit of a deeper dive into some of the charts and graphs in our wonderful free open source SQL Server monitoring tool. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 54

SQL Server Performance Office Hours Episode 54



 

To ask your questions, head over here.

Summary

In this video, I dive into some of the most pressing questions submitted during my office hours session, covering topics like `sp_whoisactive` and its parameters, memory grants and their management in SQL Server, query performance variability, and when to step in with manual interventions. I also discuss the nuances of parameter sensitivity and how it can affect query performance. Whether you’re a seasoned DBA or just starting out, these insights should help you navigate common challenges in SQL Server performance tuning. If you found this content valuable and want more, consider supporting my channel by subscribing, liking the video, and sharing with your colleagues. And if you need personalized assistance, I’m available for consulting services—no monitoring tool mogul lifestyle here!

Chapters

Full Transcript

Erik Darling here with Darling Data, and we’re going to take some valuable time away from my new career as a monitoring tool mogul to answer some of your very important office hours questions, because, I don’t know, why not? I still like you. You know, I have not went and joined the monitoring tool glitterati. We’re partying with starlets in limousines, with champagne, and I don’t know, whatever else people do in limousines that probably just make the toes curl. Anyway, down in the video description, you’ll find all sorts of helpful links, and you’ll also see useful things like, I don’t know, what do you call it there, summaries and chapters, which the robot sometimes gets right and doesn’t hallucinate the videos being two and a half hours long, and other times, other times, it’s not so good. Anyway, if you would like to hire me as a consultant, I still, you know, again, my monitoring tool mogul career has not quite eclipsed my consulting career, so if you would like to hire me to help you with your SQL Server performance problems, I am available in all sorts of ways, and you can see the ways that I’m available down in the links.

You can also find ways to purchase my training, become a supporting member of the channel. If you like this content so deeply and feel strongly motivated to spend $4 a month, you can do that. You can also find links to ask me the office hours questions, and of course, as always, if you like this content, but, I don’t know, perhaps you’re not financially motivated to give me a little hug and kiss, you can always like, subscribe, and tell a friend, so some channel numbers go up. And I can feel some sense of joy in my day-to-day life, aside from watching the downloads and GitHub of the monitoring tool, which, speaking of which, if you want to check that out, it is completely free, it is completely open source. There’s no email requirement, there’s no application phoning home to tell me all about your dirty server secrets.

It’s just a whole bunch of T-SQL collectors running on a schedule, collecting all sorts of important performance data, supplying you with a Nox-style dashboard and all sorts of pretty charts and graphs, and there’s even, if you’re, you know, becoming one with the robots, there is even an opt-in, optional, built-in MCP server, so you can ask questions about your performance data and see what the robots think about your server crisis. And it’s a wonderful little thing, so maybe the finest thing that I’ve contributed to the SQL Server community, aside from my handsome face, which will get better looking when I officially become a monitoring tool titan and get a whole bunch of crazy plastic surgery. If you’d like to see this handsome face out in public in the world, I will be at a variety of events coming up over the next portions of time.

Data Tune, down in Nashville, March 6th and 7th. Data Saturday, Chicago, March 13th and 14th. SQL Day, Poland, May 11th to 13th.

And, man, jeez, that one’s a dash, the other ones are ands. Whoa, why are these? I gotta beat up my PowerPoint guy. And, of course, Data Saturday, Croatia, June 12th and 13th.

I will be doing pre-cons at all of these events about advanced T-SQL. I don’t know, maybe I should do advanced SQL monitoring at this point. Might be fun.

But you should go to all of those. Each and every one of them does travel around the world catching me. For now, we are still trapped in the miserable doldrums of February, waiting for March to arrive and springle our springs. But, man, I hate it outside right now.

I’m miserable everywhere I go. So, let’s stay inside and answer some questions. What do you think about that, huh? All right. So, let’s see.

SP, who is active? Do you like at getplans equals 1 or at getplans equals 2 and why? I threw in an and there. I apologize for my editorial interjection into your question.

But I usually just run with getplans equals 1. Getplans equals 2 is a little known thing. Getplans is not a bit.

It is, I believe, a tiny int. But you can choose getplans equals 1 or getplans equals 2. And I’m going to get maybe a little deeper on this than most people will care about. If you are running, I would say, an older version of SQL Server, getplans equals 2 would return all of the plans associated with a batch.

So, kind of like if you look in the plan cache and you look for a stored procedure and you open it up and it’s all of the query plan for the stored procedure. It’s like a big, long thing.

getplans equals 2 would give you that. getplans equals 1 would just get you the query plan for the query that’s currently running on the batch. However, if you’re on a newer version of SQL Server and SP who is active prefers the new query plan DMV that gives you the sort of in-flight query plans or like the live query plan, it just returns the statement level 1 anyway.

So, me, I just go with getplans equals 1 because usually the only plan I care about is the plan that is currently up for the query running. You might find different uses for getplans than I do.

That’s totally okay, but that’s my thing. All right, question number two. How do I know when it is time to reduce the memory grant limit on the default pool below 25%?

Is it determined based on how much RAM I have or is it some other factor like memory weights? Well, I will typically start messing with this as soon as RAM becomes an appreciable number, somewhere around the 128 gig mark.

And the reason for that is because SQL Server by default and what this question is really asking about is using Resource Governor to reduce the default max memory grant for queries.

And you can do that with Resource Governor. And you can even do that with standard edition of SQL Server 2025, which now has Resource Governor, which is so lovely.

But the reason why is because by default, 25% can be a pretty big memory grant. That’s 25% of your max server memory setting. And boy howdy, the more memory you give SQL Server, the bigger that 25% gets.

Like 25% of 128, 25% to 256, 25% to 384, 25% to 512. Like, oh, those numbers get up there. So really the more memory you have, the smaller you want to make this.

SQL Server will ask for memory grants pretty heavily. Getting this to the right number is a somewhat different exercise because it, well, it sounds good to just drop this number down.

You know, you do have to start keeping an eye on queries to make sure that like, you don’t have important ones that start spilling off the disk. So like there is some caution that needs to be exercised here.

But one, like, so like, well, it is a good sign to look at memory weights like resource semaphore to figure out if this, you know, does this apply to me? The other thing to look at is that even if maybe, even if you’re not hitting resource semaphore weights, you might still have queries asking for pretty big memory grants.

Like even if you don’t have multiple queries fighting over memory grant space, because by default, SQL Server will give about 75% of your max server memory setting out to a bunch of queries running.

Even if you don’t have that happening, you might have queries asking for those 25% memory grants and knocking a whole bunch of important data out of your buffer pool, which is not a good time either.

So, you know, well, memory weights like resource semaphore are certainly fantastic indicators of queries having memory grant clashes. You should also keep an eye on those page IOLATCH SH weights because you may have queries that are asking for memory grants, knocking a whole bunch of data out of the buffer pool, and then other queries have to go read that data back in.

Remember that SQL Server does not work with pages on disk except to read them into memory. All right. Let’s see here. Oh, another memory question.

Jeez Louise. People love memory, huh? If SQL Server shows plenty of free memory, how can queries still suffer from memory pressure? Well, I don’t know.

What’s that free memory doing? Is it really free? I don’t know. Did you download that memory? Is that why it’s free? I don’t know. Well, that’s a very vague question. The mind wanders a little bit on this one.

I would say that if SQL Server shows plenty of free memory, in other words, memory that is not dedicated to the buffer pool, then, or like other memory clerks and consumers within SQL Server, I would say that perhaps you have had queries ask for a lot of memory, clear a lot of that space out, and SQL Server may not have recovered from that just yet.

That’s probably what’s happening there. Where I would look is at what we were kind of just talking about in the above, with the above question. I would probably look to weight stats and I would see if you have resource semaphore weights going on and I would also look at weight stats to see if you have a lot of page.io latch underscore whatever weights going on because that might inform you as to whether SQL Server has plenty of free memory because a lot of it is not being dedicated where it should because it was dedicated somewhere else temporarily and SQL Server has not filled itself back up yet, primarily in the form of the buffer pool.

That’s the best I can do with that one. All right. Next up. When should I stop trusting the optimizer and intervene manually? Probably right about the time you start asking that question.

Honestly, that’s about when I would do it. No. So, I mean, what are you trusting the optimizer with? To make every query perfect? You know, it ain’t gonna happen.

You know, it’s, you know, the SQL Server has a fantastic query optimizer but, you know, nothing’s perfect and not all modeling is perfect. You know, models are models, right?

They’re not, they are not like exactitudes. They are not, they are not certainties. They are just models that we try to make predictions about and SQL Server uses its various cost modeling and cardinality estimation to try to, try to forecast what the best query plan for your query will be.

But, you know, it, like, maybe not everything makes it, is there for it to make a correct decision. You know? So, when should you intervene manually?

Like, via, like, query hints or index hints or other things like that? Well, that’s, that’s probably last along the way. Before, well before that, there are all sorts of things you can do to make your queries a bit happier.

You know, materialization with a temp table, that’s, that’s a good one. Making sure that your indexes are aligned to the goals of your query, that’s another one. Making sure that you have adequate constraints, whether they be uniqueness constraints or foreign keys or domain constraints to let SQL Server know values that can exist in a place.

You know, there’s all sorts of things that you can and should do. You know, and, that’s just from like the, the database side, from the query side, you know, writing sorgable queries, you know, again, breaking up complex queries.

There’s all sorts of things you can do, but there would also potentially be a manual intervention that, that are not just, you know, dirty, dirty tricks with hints of, of various states of brittleness.

So, you know, when should you intervene? When, when, when things start getting bad. If, if, if you need help figuring out interventions, I, my rates are reasonable, as always.

Don’t forget that. All right, the last question. Of course, this is, why does the same query sometimes run in milliseconds and sometimes in minutes?

My friend, you have asked the most biblical of questions. I could, I could go on and on for days and hours on this one.

Most, most commonly, you, you are dealing with an issue of parameter sensitivity. That, that’s a very, that’s a, probably the most common reason. There are, there are other reasons though that might lead to variable performance within queries in your workload.

Locking would be one of them. Remember SQL servers, default, up isolation level for databases is read committed in which read queries will wait for locks from write queries and get blocked and all sorts of, might, might, might take minutes for all those locks to release.

Another thing might be a buffer, buffer pool availability of your data. I mean, if you have to read lots and lots of data into the buffer, into the buffer pool from disk, well, yeah, that can certainly introduce a lot of query variability.

All right? well, but, but most, most likely it is an issue of parameter sensitivity and once again, if you need help figuring that out, I’m, I’m here.

I’m like, like the Maytag man just hanging out with my glass of milk, my pencil, my, my, my fancy hat waiting to help people. Anyway, that’s five questions I think, right?

I’m gonna make sure one, two, three, four, five. It’s probably about good for today. So, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I’ll see you in tomorrow’s video where we are gonna talk more about my, my, my fantastic tool that monitors your SQL Server’s performance.

So, I will see you there. All right. Goodbye.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Lite Dashboard Overview

Free SQL Server Performance Monitoring: Lite Dashboard Overview


Summary

In this video, I delve into the exciting world of SQL Server monitoring tools, focusing on the Lite version of my free open-source tool, which is a lightweight and secure solution for monitoring performance across multiple servers. I walk through how to download and set up the Lite edition, highlighting its unique features such as no server-side installations or databases, making it an ideal choice for environments like Azure SQL Database where agent jobs are used. The video covers the tool’s 20 collectors that run with minimal permissions, ensuring data collection is both efficient and secure. I also showcase the various tabs within the dashboard, including weight stats, query trends, CPU and memory usage graphs, blocking information, and performance monitor counters, all designed to provide a comprehensive view of SQL Server health without any heavy lifting on the server side.

Chapters

Full Transcript

Erik Darling here with Darling Data. And boy, I just keep getting more and more excited about this here monitoring tool journey that we’re on together. In case you have missed all of the other stuff around this, I released a free open source SQL Server monitoring tool that makes it very easy for you to get up started monitoring performance on your SQL servers. The Lite version, which is what we’re going to talk about, today is like my second child, you know, the full dashboard that I like originally started working on, had some limitations to it and also presented some things about it that would probably cause some friction in people using it. So the main limitation is not being able to support Azure SQL database because it creates a database and logs data to it using agent jobs. If that’s too much for you, that’s fine. This is what the Lite version is for. So that was like sort of the friction point in and the missing little bit of supportability in there. But the Lite version, if you head to code.erikdarling.com, I’ll show you that URL in a moment. All you have to do is download the zip for it, extract it to single executable with an embedded DuckDB database. I am an unabashed DuckDB fanboy. I think it’s about one of the coolest things to happen in databases in a long time. And so it just starts running a bunch of data collection, pulling into the DuckDB database and then showing it to you in the graphs. So you just extract it, point it to a server, hit connect. And if you trust SQL Server Management Studio or Azure SQL DevOps Operations Manager, whatever, that’s getting retired this month to connect to SQL Server, then you can trust my tools to connect to SQL Server, because they all use Windows Credential Manager. They do not store passwords and usernames and files or anywhere stupid. It’s all stored in Windows. So there’s nothing dumb happening, right? This is a very, very big deal to me. This is not some like half butted effort anyway. But then, you know, so try not to mess anything up from a security point of view. The Lite dashboard requires very minimal permissions. It just really needs like view server performance state, or if that’s not available, view server state. A max of seven servers will collect at one time. So if you have more than seven servers that you’re monitoring with it, seven will go get their stuff and the next seven will go get their stuff.

And they all have a timeout on them. Now, like I just redid the data collection for this. So that if you have servers that are timing out, they will not hinder collectors that are able to run. Before I had done something kind of stupid, and they would keep trying the servers, even though like some of the is this server online queries were failing. So that’s fixed now. So got less stupid. But the cool thing about the Lite edition, which is also sort of one of the drawbacks of it is there’s nothing installed on the server, there’s no database, no agent jobs. But that means if the app is not open and running, it’s not actively collecting data, right? So if you close the app, and you go away for two weeks, and you come back, and you’re like, what’s wrong with SQL Server, there’s not going to be anything in there for you do have to have it running somewhere to collect the stuff. But we’re going to talk through what the what it looks like, kind of how it works, the different things that it shows. And these are the 20 collectors that it runs. So it’s a lot of stuff. And we run the run these as much as possible. So that I make sure that you have the most up to date data. This is all very lightweight stuff. I’ll show you that I’ll show you that in a moment. And I’ll show you a couple areas where I’m working on improvements to. So again, if you want to check this out, I’ll show you a little bit more now. So this thing is currently running and collecting data. I started off around about an hour or so ago, and I spun up my hammer DB workload to make sure that we had some stuff to look at, and it wasn’t all boring. So this is the weight stats, and the weight stats will show you the top 20 weights. Really, it’s the top 10 weights, along with like the sort of the usual suspect weights, SOS scheduler yield, the CX weights, page latch weights, stuff like that, along with the poison weights to you see that over in this this part here, like I have thread pool, and I have like some other stuff that you would like normally see when things are like bad on a server. But like, like, I want to make sure that we represent like what’s actually happening on the server and also call out some of the scarier stuff if that’s, if that’s, if that’s happening to sometimes it’s good to see when that’s not happening. Over in the queries tab, we have some performance trends, right? So these are graphs that tell you like what was going on in a server, the query store one looks a little flat, because I didn’t turn on query store in the TPC databases that the workloads running in. So that one looks a little dull. But all the rest of these are very interesting. This is query executions, that’s procedure duration, that’s query duration. And over here, you can see queries that are actively running over here. And one thing that’s like you can, you can sort this data like any way you want, right? So like if you wanted to sort by like, I don’t know, like CPU, right, you can change the way that the data is sorted and get stuff that has the highest CPU.

And you can also filter. So if you wanted to get rid of stuff that like is greater than zero CPU, you can do that too. You have queries by duration here, that’s the default view. But you know, you can switch all these to order by whatever other thing you want. You can filter all these, these are store procedures. And this is all the stuff from query store. Now what’s neat about all of the query grids is something that you can do if you want to troubleshoot a slow query. You can right click on it. And you can hit copy repro script. And then you can go over to SSMS and paste it in. Now, this isn’t the greatest example, because there are some local variables in here that like we don’t have the data types and stuff of, but we do have the execution parameter that got used in here for the where clause, right? So like the DW, whatever ID, we would just have to figure out like, okay, what’s the data type is just run this query, right? So it does warn you about that up at the top too, right? So like, but other than that, it’ll give you everything you need to reproduce a query just copying and pasting it.

Now, coming back to the dashboard. Oh, wait, is that the right one? Yeah, it’s the right one. Okay, good. Not losing my mind. We also have some individual graphs for things like CPU memory usage. There are lines here. There’s also some information up at the top about like plan cache buffer pool, other things like that. Um, you know, like how much physical memory the SQL Server has, how much is currently using, um, you know, this, you know, like some nice stuff to sort of just see visually and not just have to like, oh, what is this thing? Um, one thing I actually forgot to mention is you can hover over any graph line you see and get how like the, the, uh, data, the data point from down to the legend that it represents and like some other stuff about it. Right. So, uh, we got information about 10 dB. This is like a current, like space usage, like what’s using a lot of space in there. So you can see if it’s user objects, internal objects or version store. And then you can, uh, down here, we have 10 dB, uh, file latency. So you can see exactly which files are spiking up when, right? And every once in a while you get lucky and there’s a spike and like what was using stuff and you know, what was, what was going on in there. Um, over in the blocking tab, uh, there’s not nothing going on for blocking right now, but we do.

I do have deadlocked. In fact, the deadlock alert just popped up and made my green screen angry. Uh, but we have some trends over here, which will tell us like, like about deadlocks that are happening and, uh, blocking happening. And then, you know, you, you get like the, the fully parsed out deadlock graph over here. So I can just make, put all much information in front of you at one point as you can, um, over here, we have perfmon counters. So there’s by default, I show batch requests, optimizations, compilations, and recompilations.

So you like, that’s usually the stuff that I eyeball first on a server. So, but you can add in, you can click in any one of these and add a new line to it. Like we just added in lock, wait time. And because there’s a bunch of deadlocking and stuff on here, uh, we get a whole new spiky thing going on in the graph. Um, we also look at running agent jobs. Uh, this one is actually seeing the performance, the full performance monitor database agent job running and doing stuff.

So we’ve got that going on, uh, in the configuration tab, you can see if anyone has changed trace flags, um, you get, uh, uh, database scope configurations, um, like, uh, database configuration stuff from sys.databases. This needs some more columns in it. I’m going to make a note about that. And then the general server configuration as well. Um, and then over here we see collection health where we can see the, like a summary of things that are running, how long they take on average, uh, a full log of everything.

So if you wanted to like, if you saw something like had, um, you saw that something had some number of errors in this column, you could come over here and see when it, when, when those errors happened. Uh, you could just filter down to the error message. And then over here are duration trends. So, um, you know, because I care about performance, um, like I am working on some of these, uh, sometimes they take a little bit longer, especially like the XML parsing ones. You have a lot of deadlocks or a lot of blocking or like, um, anything else that deals with XML and SQL Server. Sometimes it takes a little bit longer, especially when the server is under a lot of load from a hammer DB test.

So, uh, I clearly have some work to do, which means I’m going to end this video here and go figure out which one of these jobs I need to go have a talk to. Anyway, uh, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Uh, again, this is all available for free at code.erikdarling.com. Uh, you should go try it out, start monitoring the performance of your SQL servers and have a good time. All right, cool. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: How To Talk To Your Performance Data With The MCP Server

Free SQL Server Performance Monitoring: How To Talk To Your Performance Data With The MCP Server


Summary

In this video, I delve into the built-in MCP server feature of my FreeSQL server monitoring tool, Darling Data. This feature allows users to interact with their collected monitoring data in a conversational manner using language models like Claude or other compatible tools. The MCP server is designed for local use only and does not expose any data beyond what’s available within the performance monitor database, ensuring that no sensitive information is at risk. By enabling this feature, you can ask questions about your server’s performance directly to an LLM, receiving quick insights into issues such as deadlocks, query performance, and workload spikes. This approach simplifies the process of diagnosing problems without requiring extensive manual analysis or script writing.

Chapters

Full Transcript

Erik Darling here with Darling Data. And in this video, I want to go over a little bit about using the built-in MCP server. That is a model context protocol server available in my FreeSQL server monitoring tool that allows you to have a little bit of a chat with your collected monitoring data. It’s not allowed to go out and start running random ad hoc DMV queries. It’s not allowed to, you know, you know, drop, delete, update, mess up, merge, merge anything. It’s not, it’s very tightly bound to the tools that it has defined and available to it and nothing else. Right? So, uh, I wanted to have this in there for people who are maybe not fully comfortable looking through monitoring tools, uh, maybe not fully comfortable, um, you know, like digging into things, figuring out what was actually wrong and when, uh, and, uh, you know, just maybe people who are maybe a little bit more geared towards, you know, like, or gotten used to over the last couple few years, like having chats with LLMs about things and figuring things out with them. Right? Cause I mean, they’re not always right the first time. Sometimes you get a little poking and prodding and you got to, you know, nudge them a little bit, but you can usually get there. Right? So this feature is opt-in. Um, it’s only running locally. It is not like some big broadcast web server out there in the world. Uh, it’s just running as a local host on your machine. You have to turn it on. Right? Again, I don’t want to alienate anyone who hates the robots. Uh, but if you, uh, find yourself using robots more and more in your day to day work, you can use this here. Um, and then you can start talking to your performance data with the tools that I’ve made available to, um, to, uh, basically expose the tables and views. Uh, if you’re on the full version in the performance monitor database, so only that database. And if you’re on the light version, it’s just talking to the local DuckDB database where all the data got collected. So you have nice sort of like, like pre arranged time series data, the, the model knows the schema and everything, and it can just talk directly to that. And it can look at everything in there and it can tell you like, you know, some pretty neat things about it. There are a whole bunch of tools available for, uh, different things like discovery, weight, CPU, queries, memory blocking. There’s a whole bunch of stuff. There’s like 30 or something tools available to it across all the different data collectors. So you have a lot that you can look at and dig into. And what you have to do, and I’ll, I’ll show you what this looks like is you have to, um, enable the MCP server, right? So again, not the DOM by default, you turn it on. Uh, and then depending on which version you’re on, the port might be different, but it’s unimportant, unimportant ports. Uh, and then you have to go into the LLM of your choice. Um, you know, we are a strong Claude household. We’re a Diet Coke, Claude, um, Lagavulin, like we have strong allegiances. Uh, so this is what we’re doing. If you use something else, as long as the LLM that you use is, um, MCP compatible, it’s fine. Right? And then you just start asking questions about, what’s been going on with your servers. So you, and you can ask it a lot of just normal conversational questions. Like if you like, instead of having to like, you know, like someone’s like, Oh, the server was slow yesterday. And you’re like, ah, you’re going to go look at yesterday. What the, come on, give me a break yesterday. You’re telling me today, like, if you were to go like do this on your own, it might take you five or more minutes to, you know, like, you know, either like, like, let’s say you’re lucky enough to have a free open source monitoring tool. It might take you five or more minutes to like go through all the charts and graphs and dig into stuff and take screenshots and make notes and pull things out and go like, you know, the normal sort of like outlining of things. But with Claude, you can ask it pretty quickly and get some pretty fast answers. It’s maybe not like, you know, as fast as you would like it, but it’s still reasonably quick. Uh, if you want to check this out, go to code.erikdarling.com.

It’s the performance monitor repo. Uh, and again, totally free, totally open source. If your company does want to, um, get support or some contract guarantees with that, I have things available at training.eric.com to, um, to do that. So let’s go look, uh, real quick at, um, first let’s look at the, uh, the settings that I’m talking about. So, um, if you want to look at that, uh, like I already have mine turned on. So this is the MCP server. Um, and if you go in here and you’re using Claude, you can copy the setup command right from here. It tells you like, just like, just the, gives you the little one line command to add the MCP tool. If you use a different LLM and you want to like include like the, the, the correct command to do it with whatever you use, you can submit it to the documentation. I just don’t use anything else. So that’s how you set that up there. And I’ve pre-baked a little bit of a, um, conversation with our dear friend Claude, uh, via the MCP server because, um, you know, like it’s not really a lot of fun sitting around like, like watching an LLM, like spin its gears with stuff. You get bored. You’ll look at something else for a little while, right? Oh, I don’t know. Gamble on, gamble online or something. So no, I just started off like, Hey Claude, can you see my MCP server? It was like, yeah, I see that. Look at that. It’s connected and working. You have seven monitor servers, but only SQL 2022 is available. That’s fine. All right. And then I asked Todd, Claude, not Todd. I don’t know who Todd is. Todd is not cool. Can you list all the tools? Right. Let’s tell me what you have. And then it, so we have a list here of all of the available tools that it has to look again, just at your performance data, not at anything else, right. Just local to that. Um, so don’t, don’t think that I’m sending queries off to like other databases and collecting things. And I’m going to, you know, I don’t know, Ashley Madison, you or something, but these are all the tools that we have out there. And, uh, you know, like I, this is just me, like mostly hitting tab, uh, because there’s like autocomplete stuff where it’s like, well, what, what, what, what’s it, what should I do here? And so I just said, yeah, run a health check on SQL 22, 2022. And it went, cause I’d run a hammer DB TPCH workload on, on SQL 22. You might see some fun spikes and stuff in there, uh, where the queries are running. And it went through a whole bunch of the tools and it said, well, you know, right now things are normal. Um, I ran through, um, all sorts of things and, you know, your server was idle most of the day, but you did have a big spike in there for 33 minutes. 33 minutes is exactly how long my TPCH workloads for, right. And it even says, it looks like a hammer DB, right? It’s like, it knows. It’s like, I got you, buddy.

All right. Uh, it’ll tell you about weight stats. Uh, you know, it says, wow, yeah, 2009 deadlocks today. That’s, that’s a lot. You might want to think about that. All right. It’ll give you a breakdown of how, like what memory looks like file IO running jobs. Um, so, you know, we’re all good there. And then, um, it’s like, you know, there’s some things that you might want to think about, right? Like 2009 deadlocks, 32, 33 minute spike, but you know, whatever. And I was like, Hey, yeah, you know, Claude, um, well, where was it? Uh, there was, uh, there was another prompt.

What was the longest running query in the hammer DB workload? And it comes back with some information in here, right? It’s like, it was a TPCH, TPCH workload, right? Tells you this was the longest running query. It gives you information about all the stuff that we have collected about the query, right? Like all this stuff that make you, you know, you would have to go and look at and like, wait for a tool tip. It’s all, it’s all a nightmare. Right. Uh, and then like, you know, say, okay, well, you know, that’s cool. I can, I can go, I’ll, I’ll make a note of that. I’ll go work on, uh, query 21 later. And then we can say, well, you know, this is the next one. Dig into those 2000 deadlocks. And it did, um, I had a little boo-boo there with some JSON, but that’s okay. And then, um, you know, it tells you what happened, right? And so all 2009 deadlocks are from the TPCC workload, right? And it tells you all the stuff that was going on there and the two deadlocks that were, uh, that were the two queries that were involved in the deadlock, right? So new order and, uh, execute, yeah, new, well, new order and delivery. Huh? See, we like new order too. It’s a good band, right? Uh, so they take some X locks on these and delivery takes some X locks on these and they acquire page level locks in the opposite order, right? And so we just get all this information.

We can say, cool, I can take that information and I can do something actionable with it, right? So, uh, this is just a sort of basic example of, um, you know, what you can do with the MCP servers that are built into my monitoring tools so that you can do sort of performance investigations without breaking a sweat, running a bunch of scripts, like gathering a bunch of query plans, query texts, looking at like 50 million different metrics, you know, trying to corroborate a whole bunch of stuff together. Uh, and, uh, you know, again, if you’re uncomfortable, you know, doing that sort of thing, or even if you’re uncomfortable, just looking at charts and graphs and figuring out when things were crappy, this is a very, very easy way to just talk to your data and get some answers about it. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I hope that you will try out my wonderful, uh, free open source SQL Server monitoring tool.

Again, that is at code.erikdarling.com. Uh, so I would encourage you to head over there and check that out. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.