sp_QuickieStore
The Darling Data machine has been hard at work making the bits and bins and whatnots work better.
You can download all of the main SQL Server troubleshooting procedures I use in one convenient file.
Okay, so like, maybe not all that hard at work, and we I had some help with a funny bug, stemming from a recent Pull Request. The Read Me file has been updated to reflect the newest parameters.
Let’s start with giving some credit, here, since I hate an overdue bill.
ReeceGoding reported and very capably fixed an issue with sp_QuickieStore. The problem only occurred when you ran the procedure with @get_all_databases
and the new @escape_brackets
parameter set to true. Each loop would add an escape character to the search string, which could end up looking like this: %\\\\\[AnyStringStartingAndEndingWithSquareBrackets]\\\\\]%'
The good news is that it’s fixed now. A big round of applesauce for that. It’s the only substantial change to sp_QuickieStore.
Now, on to the other stuff!
sp_HealthParser
I had to make a bunch of adjustments in here. I was testing out some stuff and hoping that I could make it work, but the math just wasn’t there. Or the maths just weren’t there, if you’re into multiple maths.
For the wait stats sections, I’ve removed the “total wait time” column. It just didn’t make any sense, because the only information that I had was the total number of waits, the average number of waits, and the maximum wait time. I was hoping I could get sensible numbers by multiplying the total and average, and adding the max wait time to that. The problem is that the numbers were huge. Enormous. Senseless.
The other thing I had to do some work on was in the “waits by duration” section. It seemed like the results would just re-report on every collection cycle, even if no new waits had occurred. It would make things look like you had the same waits in every grouped block of time. Via the magic of windowing functions, I weed out absolute duplicates, and only show where the wait stats report something new. As part of this, I also filter out waits with a low average duration (less than 500ms by default). You can change this behavior by using the @wait_duration_ms
parameter.
Another thing that I set a threshold on was in the “cpu task details” section. In there, it would show sections with a warning logged to the system health extended event. The problem was mainly that a warning would be logged whenever there was even a single pending task (query waiting to get on a CPU). Having just one of those is not a very interesting sign of CPU pressure, so there’s a parameter called @pending_task_threshold
that defaults to 10 which you can use to decide how many pending tasks matter to you.
sp_PressureDetector
This is perhaps the biggest update! It hadn’t changed much in a while, and I wanted to get a little more out of what SQL Server has to offer.
The big changes are:
- Added in some useful PerfMon counters to the results
- Added the ability to sample server activity
When you run sp_PressureDetector the “normal” way, it’ll show you perfmon counter activity since startup. I only collect specific ones though, and only specific counter types. See, there are a variety of types of counters.
The math to figure out how much of something happened during a period of time is about the same for all of them, but I found looking through the various categories and specific counters, that only those that come from types 272696576 and 65792, and only certain ones within those.
The new section will look something like this:
If you decide that you don’t care about perfmon counters, you can skip that section by using the @skip_perfmon
parameter.
The sampling code was inspired by running into some client issues where hitting F5 was okay, but remembering what all the numbers were before to compare them to after. You can run the sampling code by doing this:
EXEC sp_PressureDetector @sample_seconds = 5;
Not every section supports this yet. I started with the ones that I thought would benefit the most:
- Wait stats
- File stats
- Perfmon counters
I may add it to more later, but these are good enough for now. The easy way to think about it is that only sections with a “sample_seconds” column in them are sampled. The rest are current.
Look for a video walkthrough of the latest round of changes.
Again, you can download all of the main SQL Server troubleshooting procedures I use in one convenient file.
Issues and Contributions
If you’d like to report an issue, request or contribute a feature, or ask a question about any of these procedures, please use my GitHub repo. Specifically, check out the contributing guide.
As happy as I am to get emails about things, it makes support difficult for the one-man party that I am. Personal support costs money. Public support is free. Take your pick, there!
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.
Now I want to create problems just to try it out.
You seem capable 😃
Hi Erik,
Thank you for the update, I am using these procedures every day at work and it has been saved lots of problem to us. One question
My server is up time shows 12 hour (correct ) but in columns hours_cpu_time and hours_wait_time I see
35.49 and 53.6 , does it means 35 hours?
Can you clarify please?
Sorry, I am talking about EXEC sp_PressureDetector
@what_to_check = ‘memory’,@skip_perfmon=1;
Hi Uri – like the post kindly asks, please use GitHub for questions like this. Thanks.
Thanks, I will