sp_HumanEvents: Tracking Queries and Waits

Is It Really So Strange?

sp_HumanEvents

Video Summary

In this video, I dive into the fascinating world of SQL Server’s `sp_whoisactive` and its cousin, `SP_HumanEvents`, to explore how they can help us track down blocking, compiles, and recompiles. I also delve into their utility for analyzing queries and weights, showcasing examples where running similar statements in separate batches versus within a stored procedure yields different insights. By using `SP_HumanEvents` to collect data over 30 seconds, I demonstrate how it can provide detailed execution plans, memory grants, and query metrics, even when run from a stored procedure like `sp_BlitzCache`. Additionally, I highlight the unique value of weight stats in identifying performance bottlenecks through page IO latch weights and SOS scheduler yield weights, emphasizing the importance of setting appropriate thresholds to avoid overwhelming data.

Full Transcript

Mmmhmm. My wife tells me that I get to have champagne after I finish this video, so I’m going to make sure this one’s real good. Now, we talked about how SP underscore human events can help you track down blocking, compiles, and recompiles. Next, we’re going to talk about how it can help you with queries and weights. Now, there are some second samples in here that are 30, and when we hit those, I’m going to hit pause while this runs, because I don’t want to sit there going like this with you as much as I love you. I don’t want to sit there staring at you while this thing runs for 30 seconds to collect data. So, without further ado, let’s talk a little bit about what happens when statements run in different ways. So, the first thing I have, you see there’s two batches here. There’s no goes in this one. This one’s all goes. And this is the same five or six queries that repeat one after the other. You can see there is, let’s see, greater than 10. We have this batch of queries that run, and then this batch of queries that run. They’re all technically the same thing, except this one which has a different at ID, because I got errors when I tried to declare the same ID. Well, when I tried to do it down here. So, I just used the second one.

You can cut the irony with a knife there. Anyway, so what I’m going to do is kick SP underscore human events off to run for 13 seconds and collect information about queries that run in the Stack Overflow database. And then I’m going to come over here and run this bunch of separate statements. Now, human events isn’t going to collect information about all of them. There are certain default values for parameters that not all of these queries will meet the barrier for entry on. So, we won’t collect information about every single thing that runs here.

We will collect some about a couple of queries that ran, or a few queries that ran here, that did meet our criteria. So, we have this one here where we selected something. Well, let’s just not talk about what the query did. It doesn’t matter what the query did. But in this case, when they’re all separated by goes, this batch of statement text column will be the same thing as this column.

And I haven’t figured out, like, long term, if I want to figure out a way to, like, say, hey, if this text column equals that text column, then just show one of them. Because quite frankly, we’re going to end up getting some big text back. I don’t know if that’s how I want to spend my CPU cycles in here. But we get back, again, some commensurate information with other stuff. We get the database name.

This is obviously not dynamic SQL. This is obviously regular SQL. But it’s not coming from the stored procedure, so we don’t have a legit object name there. We have the text of the queries that come back. We get their execution plans.

We see how many times they were executed. I guess we used a cached plan for that other ZZ thing. And we get to see some blitz cache-ish information about the queries that ran. So we’ll get total and average CPU, total and average logical reads, physical reads, duration, writes, bills, if available, memory, a whole bunch of stuff about memory grants.

None of these got memory grants, apparently. We’ll see how many rows they returned. We’ll get the DOP. And then I’ll give you some, you know, sort of forensic information in case you want to go digging about looking for more stuff on these. So I’m going to go to the query plan hash, the query hash, and the plan handle.

So you can go off and try to find more stuff there. Since this is v1, I just wanted to, like, like, yeah, I could go to the plan cache to try and find more stuff. But I don’t really see the point since everything I would go out to the plan cache for I get pretty well here.

I don’t really know. Like, I might try to add more stuff in later. But I think for now this is a kind of a good enough report. Now, where this is, why I’m showing this to you in two separate runs is because if we run this with all the GOs, these each register as a separate statement.

If I run all of this as one big monolithic chunk, so I’m going to come back over here, I’m going to run the same thing. And now I’m going to run this all as, again, one big monolithic chunk. The results are going to be a little bit different.

And I think it’ll help explain why I did things the way that I did things. Maybe, maybe not. We’ll see. So this time we get back four queries. Yeehaw. Lucky us. And now we’re going to see that things are slightly different here.

So this did just give us the individual query text, which is what we want. This is good. This is like good information to have. But this gives us back the entire text of what ran.

So this is all of those selects at once. This is different because with all the GOs in there, these were, this matched here. But without this, this, we got blah, blah, blah.

You understand what I’m saying. Anyway. We also get back the sort of the same information. We get back the XML. We get back the query metrics. And going on over here, the same stuff.

Cool. You understand that. That’s great. I’m happy for us. I’m happy for us all to be able to hang out together. Now, this is going to run for 30 seconds. So I am going to hit pause now, run this.

And what I’m going to look at now is how things look when they come from a store procedure. I’m just using Blitzcache here because I had it available and I know it does some crazy XML stuff. There are a lot of different statements in there.

And I think that’s a little bit more of a challenge. So here we go hitting pause. And I’ll be back in 30 seconds with some results for you. I missed you so much.

I missed you so much. So it is 30 seconds later. I have run SP underscore human events for 30 seconds to collect information about queries that are running inside of SP Blitzcache.

Now, hmm, hmm, hmm, hmm, hmm. What I see is something helpful and then, well, I mean, it’s all generally helpful. So we get the database.

We get the object name back this time. There’s actually something in there. And now this is where things get a little bit more interesting for store procedures because now we get back the statement text of the part of the store procedure that ran. So we get the exact statement that was running that generated those metrics.

And now SQL text shows us the command that we ran when this thing happened. So this is where things can get really helpful because a lot of times when I’m trying to troubleshoot a problem with a client, what will happen is we’ll be like looking at something and be like, okay, cool, store procedure.

But now we need to figure out how it ran and you can’t figure out what happened. I don’t know. This helps you figure it out.

So then we get the execution plan for what happened. And we also get the metrics here. So we get the total and averages again, like you’ve seen before. We get how many rows, estimated rows.

And I’m going to do a little work to put like some, maybe like a warnings column in here to help deal with, you know, if row, if actual row counts are way different from estimates or if, where is it? Here we go.

Used memory or used memory and granted memory are off in some weird way. Blah, blah, blah, blah, blah. So you understand that. Not a whole lot new to here to show you with that.

Cool. Let’s move on and let’s look at queries. I mean, let’s look at weights. Because I did some, I think I did, I think I did some kind of cool stuff in here.

So again, this is going to run for 30 seconds. And while that runs, I’m going to run this query. But first I’m going to drop clean buffers so that I have to read a bunch of stuff from disk so that I generate some page IO latch weights.

Because that just makes this thing a little bit more interesting. So I’m going to run this and run all of this and I will see you back in 30 seconds. So this is the part that I’m like really excited about with this store procedure.

So again, I ran and I collected weight stats for 30 seconds. And I collected weight stats, well I mean not specifically for this query, but you know, this was the only thing really running at the time. So that’s the only thing that we’re going to get any information back.

Now this returns three result sets. This was total weights, total weights by database, and then weights by query. And then we’re going to run the total weights, which is fantastic.

Now you can see from the total weights that we generated a whole lot of page IO latch weights. And so this looked insane to me when I first saw it. I said, I am screwing something up.

I did something royally awful. I mangled this calculation somehow. But then I opened up the event XML and it turns out I was 100% right. CX packet logged one weight for almost 21 seconds.

It’s not me messing anything up. You can go look at the XML if you want. You can do that by using the debug parameter. It’ll show you the XML as it lives and breathes.

So there’s that. And then next section down, you know, not terribly interesting pattern here. Database name, the weight type, and then some more information about the total weights and all the average milliseconds per weight, which is one of my absolute favorite things to measure with weight stats to see if there’s any, like, like, you know, there’s like the two different sort of issues you can hit with weights.

There’s like the lots of little weights that add latency. And then there’s like the really long weights that add weird latency. But here is my absolute tutti frutti favorite part of this whole thing.

Getting weights per query. So what we’ll have here is not only because this is one of those things that like, you know, you get, you sample SQL Server, you look at weight stats, you’re like, cool, but what query caused them?

This can correlate that a bit sometimes. So we’ll get the, again, the same weights as before, but now we will also get the query that caused them with the query plan. So we get the text and the query plan.

Now this is a bit repetitive. In the future, I might like do some string ag work on this to sort of make this all one line rather than have a whole bunch of things here. Because, you know, I mean, showing you the same execution plan four times is maybe not the greatest look in the world, but it’s probably not the worst thing either.

Now, where weight stats get kind of complicated, at least with extended events, is with SOS scheduler yield weights. Now, I have this other, I have this sampling down here. And what I’m going to do is I’m going to come back to the queries.

And I’ve set up some things here to collect information that I normally wouldn’t. So the weight duration here is 0 milliseconds, and I have to use the gimmeDanger parameter to grab this stuff. And so I’m going to run this, and I’m also going to run all of these.

So I will be back in 30 seconds to show you the insanity of this. All right. We are back 30 seconds into the future. And so what I did was, again, just to recap, I ran sp underscore human events for 30 seconds.

And I set my weight threshold to 0 milliseconds, and I had to use gimmeDanger to get that information back. And I ran all of these queries. I ran them, like, twice, actually, to fill up the sample time.

But that’s not really the important thing. The important thing that I want to show you, excuse you, is SOS scheduler yield and some limitations of the ring buffer. Now, every time I run this, I get 1,000 lines back from this, meaning that I filled up the ring buffer got full after 1,000 lines of this nonsense.

And, you know, I’m not sure if that’s, I’m not fully sure if that’s because the ring buffer has some limitation of 1,000 lines or if it’s because of the size that I chose for the ring buffer and the size of the data that I’m putting in there. I’m not fully sure on that, but kind of second to the point that it’s not always helpful to get more and more information back.

Now, you can see that the weight stats that I got back, I got back 1,000 weights on SOS scheduler yield, but they didn’t have any real duration to them. And that shows per database, and that also shows at the query level where we have a bunch of SOS scheduler yield weights, but all the durations on here are zero. So, in some cases, it is very unhelpful to set the milliseconds down this low.

You just don’t get really good information back. You can see that there are some different queries here that all generated SOS scheduler yield weights, but nothing really beyond that that’s all that helpful. So, I guess if there is a moral to this story, it’s maybe the really fast weights are not where you should be focusing 100% of your time on here.

Like, if it’s zero, maybe leave it alone. We don’t really need zero here. So, be careful with that.

I’m not saying never, just, you know, be nice to your server. Be nice to your ring buffer. Anyway, that is all I had for now to show you about SP underscore human events. After this, I’m going to go drink my celebratory champagne and begin my weekend of turning into a useless puddle.

So, I hope that you enjoy this door procedure. I hope that you learn some stuff. I hope that you can use it to fix some problems or find some problems or maybe find some problems and call me to fix some problems.

Because you’re like, ooh, this is hard. We should call Eric. He seems nice. That nice fellow from YouTube. Anyway, that’s it for now.

I will see you in some other video some other time. Thanks for watching. And that’s it for me.

I will see you in some other video some other time. Thanks for watchin’.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Is It Really So Strange?

Video Summary

Full Transcript

Going Further

Related Posts