Free SQL Server Performance Monitoring: Resource Usage Overview

Free SQL Server Performance Monitoring: Resource Usage Overview


Summary

In this video, I delve into the resource metrics tabs of the full dashboard from my free SQL Server performance monitoring tool. I explain how these tabs provide more detailed information compared to the lighter version, including CPU utilization, TempDB usage, memory usage, and a selection of perfmon counters. The goal is to help you quickly identify when your server might be experiencing issues by showing real-time data and historical trends that can pinpoint specific performance bottlenecks. I also highlight how the weight stats section helps in understanding wait statistics, making it easier to diagnose problems related to resource contention and query compilation.

Chapters

Full Transcript

Erik Darling here with Darling Data and returning to talk more about free, completely free SQL Server performance monitoring. We’ve got, today we’re going to talk about the resource metrics tabs from the full dashboard. These do have a bit more than the light dashboard has in them because there are some extra things that are a little bit easier to collect and grab the way that this is set up than it would be with the light dashboard. Again, the light dashboard serves most people’s needs most of the time, but if you need a little bit extra, the full dashboard has a little bit extra in it.

So, we get some server trends up here. We collect waitstats, file.io, tempdb stuff, perfmon stuff, and these DMVs, they’re all cumulative over time, and one of the most painful things for me about being a consultant is coming into some server that has either just been restarted or has been up for like a thousand hours, and you’re like, well, you know, like either the server just restarted, it’s like there’s nothing useful in there, or like, you know, the server’s been up forever, and you’re like, well, I don’t know if these waits happened yesterday, today, you know, 900 hours ago, like who can tell, right? What in here is meaningful? What in here is consistently happening?

And so, it becomes a little bit harder to, you know, give people good answers on the spot about what their current pain points are, because the cumulative metrics can not only lie to you about, or either disappear if the server restart, lie to you about when they happen, right, their significance, or worse, some of them just sort of smooth out over time.

You consider like a bursty workload where, you know, you have like big spikes of activity and then nothing for a long time. It’s like nothing tends to stick out terribly, so you have to go looking in different places, right? But all of the stuff in here that we collect is pretty nice and useful, and I don’t know, I like it.

So, if you want to check out the monitoring tool, it’s at code.erikdarling.com. It is completely free, it is completely open source, you can see everything that it does and is doing, and it is a step up above a lot of the paid stuff that you will see out there in many ways.

Let’s go look over at the resource metrics tabs now, though. So, what we collect up here is CPU utilization, TempDB usage, memory usage, and some limited perfmon counters over here. This is a pretty good way to just get an idea of like when stuff might have gotten a little wacky with the server.

You’ll notice that there’s a spike in some of the perfmon stats lined up with some of the TempDB stats up there, so something neat happened, right? Memory’s been generally okay, the buffer pool got a little hot data injection down here, and again, if you hover over any of these things, you can see the percentages and numbers and depending on what they are, and all of this stuff just kind of serves to make your life easier evaluating performance on a SQL Server at whatever time you’re looking at it.

The weight stats section over here, what I’m trying to do that, I think I’ve mostly got right now, is the, like, I wanted to call out all of the poison weights, so like resource semaphore, resource semaphore, query compiled, thread pool, and then sort of like the known, like, group of weights that most people commonly see, like SOS scheduler yield, the CX weights, stuff like that.

And then also, like, the top 10 weights other than those. So this starts off with a whole bunch of weight stats checked and lets you see sort of, like, what’s going on. A lot of people just, like, are, like, it’s, like, heartened to see when, like, you know, like thread pool and all the other stuff are zero, because if you see those ones go up, your server’s not having a good time.

But again, if you hover over any of these, you will see, like, actual values for them. So you can see which weight was weighting and for how long. You can pick different weights to put into here.

So, you know, that’s, that’s all hunky-dory. And over here is tempdb stats. tempdb not terribly busy on my server, but apparently there was a fun little spike up here.

This is tempdb latency and this is tempdb space used. So, you know, like, just kind of, like, like, I have, like, unallocated up here and then actual usage down here. The legends will tell you exactly what each line color means.

So if one of them sticks up above the rest, it should be pretty easy to figure out. Down here is just read and write latency. For file I.O. latency, you’ll have reads and writes, and you will have this per database, right? So, like, all the legends in here, all the databases I have on my server.

And we’ll see that tempdb had a tough time at some point yesterday, right? And stackoverflow had a tough time at some point yesterday. The perfmoncounters one is one that I actually just made an improvement to.

And one that I’m pretty happy, what I’m pretty psyched about is, like, I always forget sort of, like, which, which groups of perfmoncounters I want to look at for specific problems. So this one is locking and blocking, right? So, like, if I want to, if I select a different counter up here, like memory pressure, this will change and select all the counters that have to do with memory pressure.

So you can, you can start figuring out, like, oh, crap, when did, like, when did something terrible happen, right? So stolen server memory is spiking up at various points. The next tab over is session stats.

Now, what this is essentially collecting is aggregated session metrics. A lot of people who I work with have the big problem with sleeping sessions doing terrible things on their servers. So what this, these lines show you are the total number of sessions, how many of them were actually running, how many of them were sleeping, how many of them were background processes, and how many of them were just idle for more than 30 minutes.

So, like, like this, I mean, this is pretty crazy on my servers, like 45 of them, like, I guess that’s, I guess that’s mostly SQL Server doing stuff and me doing other things. But, like, you know, a lot of wacky stuff going on on here. And then we have latch stats.

This is just, like, the regular, like, DM latches stuff. So, like, you know, stuff that you would expect to see if you were doing, like, a deeper analysis on a server that’s having problems and you wanted to analyze latches. We got all those collected here.

And I know, I know, it’s never spin locks, never be spin locks, right, but it’s always nice to be able to prove that it wasn’t spin locks. So I have a graph that will let you very, very quickly absolve spin locks of any potential involvement in a query performance problem. That one time every 10 years when it is spin locks, you can also do that.

But the 999,999 times it is not spin locks, you can prove that, too. So this is just a quick overview of the stuff that you’ll find under the resource metrics tab in the full dashboard. Again, the Lite dashboard does not quite have all of that stuff.

You know, it does capture CPU, memory, file I.O., tempdb, but, like, you know, it just doesn’t capture the full spate of stuff. This is meant to be lighter weight and not have as much getting collected, but still give you all of the most important stuff that you would look at, right? There’s, like, not some, like, the deeper dive stuff, but all the really important stuff that you would care about is in here.

So, anyway, I hope you enjoyed yourselves. I hope you learned something. I hope that you will try out this amazing free, completely free monitoring tool that does way more than most paid monitoring tools ever did. It’s at code.erikdarling.com.

Again, go to the releases section. That’s where all the zips are. That’s where you can grab it. That’s where you can, you know, run everything from. And, again, if you open up a zip and you don’t see the executables in there, it might be some security software blocking them. You’re just going to have to check to unblock them.

Anyway, thank you for watching, and I’ll see you, well, I guess this might be Friday’s video, so I’ll see you Monday for Office Hours. Monday, Monday. All right.

Have a good weekend.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.