Introducing sp_HumanEvents: Making SQL Server Extended Events Easy For Everyone

Put Up Or Shut Up


Video Summary

In this video, I introduce a new stored procedure called SP_human_events that aims to make extended events more accessible and user-friendly for SQL Server administrators and developers. Extended events are powerful tools but often come with a steep learning curve due to their complex setup and the need to work with XML data. This stored procedure simplifies the process by allowing you to capture common performance tuning issues like blocking, query performance, query compilations, query recompiles, and wait stats in a more user-friendly manner. With SP_human_events, you can sample and collect relevant information for just 5 to 30 seconds, providing detailed insights into what was happening on your server during that period without overwhelming the system or requiring extensive setup.

Full Transcript

It’s another exciting, thrilling, scintillating Saturday. And here I am recording things that hopefully will meet your high standards of YouTube video watching, whatever that is. Some real crap on YouTube. I’m just kidding, YouTube. You’re great. You’re wonderful. You’re full of really wonderful people, really high quality. gear. Alright, so you’re probably wondering what the point of this video is. And this is a very exciting video for me. Because I am going to introduce a stored procedure that I’ve been working on lately called SP underscore human events. And I know that that sounds like a sort of ridiculous stored procedure to work with SQL Server for, but it is my attempt to make a extended events a little bit more accessible to you human beings out there. Now, with extended events as an end user, there are a lot of barriers to entry. You would have to, like, especially if you were trying to figure something out on the spot, you were probably, like, tanked. There was no way you were going to figure everything out fast enough to get it set up, running, getting the right stuff, all that. If you had, like, a lot of things that you could use, you could use it to get it to be.

If you had a lot of time ahead of you, or a lot of time beforehand, English-ing. If you had a lot of time beforehand to, like, research and test things and work with stuff, then sure, you could have an event session set up to get what you need to get. But if you were just, like, on the spot, like, ooh, I really need to go figure out this problem, you, it would be very difficult. You would have to know which events to collect, which information to collect. And if you wanted deeper information, you would have to query XML data. Yes, XML data. You would have to write XPath. And the year of, oh, I don’t know, how long, how old are extended events? 12 years old and haven’t figured out something better than XML for them? Yeah, I don’t know. You’re not getting past that one. If you want better, you need to mess with XML. And Microsoft never made your experience with extended events pleasant. They, sure, they put a lot of technical time into it, but there was no user experience time put into it. Everything about it just smelled like, yeah, just get something to work.

The extended events GUI is just pure misery for everyone to use. I’ve worked with clients who are just like, I can’t, you can’t do this. I don’t want to do this. This is, this is awful. No one likes it. It’s got really confusing Excel style aggregations. You have to like choose 40 different things to get it to group stuff together. And it’s really, really slow when you need to sort or filter large data sets. There have been times when like, you know, I’ve, I’ve run a session for a little while and, you know, because of my own, my own mistake, I captured a lot of data and I was like, oh, well, I just need to filter this by what used the most CPU.

It would sit there for two minutes and then SSMS would crash after trying to sort the data. So it was just not fun. And of course there was also the sort of crapshoot of choosing which columns to display because it would always be like, oh, get all of them. These are mostly no. Why are they all no? Why, why is there no information? He’s okay. Get rid of those. Okay. Wait, wait, wait, but what if there’s, what if we get information in those? You should keep those. And then this is like, you ended up with this like giant sort of clunky, hard to correlate data. And I mean, again, like Excel, Excel file. And that was just no fun either.

And if you wanted to query the DMVs, you had to be as smart as Jonathan Cahias. I mean, that’s tough to do. He wrote queries and there were and signs in them. And I was all freaked out, like capabilities and one equals something. And it’s hard. It’s hard work. I get, I’m like shell shocked from thinking about simple things that I’ve, I’ve tried to do querying the extended event data to try and like figure out what I need to go and look at and search on it, what’s valid and what’s not. It’s, it’s, it’s unpleasant experience. I mean, it’s contributed heavily to my drinking problem.

And a lot of the stuff in extended events was never terribly well documented. And the documentation always seemed to be a little bit lacking. Measurements could be in microseconds or milliseconds. Events sometimes had descriptions that were just the event name. There was no like further detail on them. And a lot of them just like didn’t return very helpful information.

You’d see these like very promising event names and they would return like, uh, op code. It’s like, okay, well, that doesn’t help me much. And there was always times when basic information was missing when you went to set up an event. You’re like, okay, well, I really want all this stuff. And what does this stuff mean? Cool, cool, cool, cool. Blank line.

So, I don’t like that. I am not a fan of that. I am not fan. And I’m not a fan of making technology harder to use. And that seemed to be the goal with a lot of this stuff.

And the way that I directed my frustration with SQL Server being harder to use was by contributing to the SP Blitz scripts, the open source first responder kit. And I think the one problem that I always faced when trying to troubleshoot performance issues on a server was, you know, we had the ability to analyze what’s there, right? We could very easily look at existing data.

We could look at what indexes are there and we could look at their usage metrics, all that good stuff. We could look in the plan cache and we could analyze that XML to death. We could look at wait stats aggregated since the last restart or since some nudnik cleared out wait stats on the server.

And we could also do a good job of seeing what’s currently happening, right? Like you could hit F5 and you could see what was executing. And there were a lot of times when that would be enough to catch issues.

We could also, with SP Blitz first, sort of do a sampling of performance counters. So, like, look at wait stats for a duration of time or look at perfmon counters for a duration of time. But the problem that I would always run into with that stuff is that we would get the before and after, but not really the during, right?

Like all the stuff, like you would get, like, let’s say you ran SP Blitz first and you did it for 10 seconds. You would start up, it would take a sampling of a whole bunch of performance counters. It would wait for 10 seconds and then it would take another sampling and kind of give you the delta.

That’s great. That’s really, really helpful. But you never saw what caused all this stuff in between, right? There was, like, that 10-second window of something happening.

But we didn’t know what. Like, we could see that, like, it caused certain wait stats. We could see that they caused certain perfmon counters to tick up. We could see a lot of things, but we could never see exactly what caused them.

Unless we got very lucky and we saw, like, the queries, like, they would take a snapshot with SP Blitz who at the beginning and end. We would see, okay, like, you know, are we, like, what was happening here, what was happening here, but not all the things that happened from one into the other. And clients, a lot of times, would have problems that were faster than hitting the, than you could hit the F5 button.

You would have to try to catch sub-second queries, and that’s very difficult. You could sit there hitting F5, but then if you hit F5 and you catch something and you’re, like, on a roll hitting F5, and then you hit F5 again when you get, and it goes away. Okay, and that’s, that’s tough too.

And, you know, even fancy monitoring tools have thresholds for, like, what they’re going to gather. And I don’t blame them. I wouldn’t want to see a monitoring tool that captured every single thing.

But sometimes you have to capture that really tough stuff in order to get the information that you need out of SQL Server to solve problems. Now, what I, what I wanted to do with the initial offering of SP underscore human events was to give you a sort of easy way to capture some common performance tuning issues. So, blocking, query performance, query compilations, query recompiles, and wait stats.

And what I do, it’s a little bit different from what the Blitz scripts do here, is not only do I let you set off and sample stuff for a period of time, but I capture all the stuff that you’re interested in for that period of time. Now, this is targeted.

You can only choose one thing at a time to collect information on right now, because I don’t want to overwhelm people with stuff. And, you know, I also don’t want you to overwhelm your server by collecting stuff. You know, any, any tool that you use to monitor performance is going to have what’s called observer overhead.

You’re going to introduce, you know, some, you know, some amount of overhead collecting the data that you need to find a problem. So, this tool is best run, you know, again, sort of like SP Blitz first, for around 5 to 30 seconds to capture very specific issues. It will help give you a much better idea of what’s happening on your server for that period of time, because it’s not just a before and after.

It’s all the stuff that we need to figure out what was going on while the thing ran. It has a lot of configurable options, but not so many that I think they are overwhelming. And it also works pretty well with the defaults, too.

I wanted to sort of make it like SP who is active, so you could just pop it out, hit F5, and collect some information without having to think too much up front about what you want to collect. And sometimes, you know, you would want to see the results before you decided what other stuff to collect or what different stuff to collect. So, there is that, too.

Anyway, that’s enough of this boring-ass PowerPoint. And I’m going to record some more videos where I am going to talk about, or we’re going to show you exactly how it works. So, I’ll see you there.

So, I’ll see you there.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.