Capturing and Analyzing Blocking with sp_HumanEvents
Video Summary
In this video, I delve into how to log and analyze blocking using a couple of stored procedures that I developed from scratch—sp_human_events and sp_human_events_block_viewer. These tools are designed to make it easier for SQL Server users to set up important human events for different performance issues such as query performance, weight stats, blocking, compiles, and recompiles. The focus is on capturing these events over a specified duration or using the keep alive parameter to continuously monitor potential issues without causing unnecessary overhead. I walk you through setting up the blocked process report and demonstrate how to use sp_human_events_block_viewer to view detailed results from extended events, providing insights into blocking queries, wait times, isolation levels, and more. The full code is available on my GitHub gist, along with necessary scripts, and I encourage viewers to explore and contribute to improve these tools for everyone’s benefit.
Full Transcript
Erik Darling here with Darling Data Enterprise Edition. And today I’m going to show you how I log and analyze blocking using a couple of stored procedures that I wrote all by myself from scratch. The first one is called sp underscore human events. I wrote that one to make it easier for SQL Server users to set up important human events for different sort of performance issues. It does like a query stuff, weight stats, blocking, compiles, and recompiles. Those are the sort of important ones. The most common ones that I end up troubleshooting, so that’s what I geared it towards. I don’t really have any plans to add more because I don’t really end up troubleshooting more than that. I don’t care about your AG failover. personally. I don’t care about AG’s personally. So I’m going to show you how I do that. And all of the code from this section here, which is all you need, will be available at this GitHub gist. I don’t expect you to copy this URL from the screen. I’m going to include it in the video description. And I’m also going to link to the the necessary scripts in my GitHub repo. Of course, if you find any issues, you need support, you hit errors, you have problems, you have questions, that GitHub repo is also the best place to go and ask there. Alright, so let’s get down to it. I have this, these couple commands right here that run sp configure so that we can turn on the blocked process report. The block process report is what drives information being fed into the extended event that we’re talking about. So I would, I already set this up a little bit, so I don’t have to do a bunch of goofy crap on camera. But if I run this, and this sets up the blocking event type, and it also uses a parameter that I like called keep alive. So sp human events can do one of two things. We can either capture stuff from the stuff that I mentioned before, query performance, weight stats, blocking compiles and recompiles.
For a specified duration of time. Or you can use this keep alive parameter to set up a session that just keeps on running. So if you’re not sure when to expect something happening, that can be a good idea. I wouldn’t suggest leaving a lot of these on for a long time. Because extended events, you know, they can, they can cause some observer overhead. I know that Microsoft has taken some steps to alleviate that. But especially query performance ones, I’m capturing actual execution. plans. Because a lot of the times, because a lot of the times, that’s what I need to do. So I wouldn’t suggest leaving those running forever and ever. Could be, could be detrimental. But the blocking one is pretty straightforward and easy. So I already have this set up. If you come over here, and you look at all this stupid stuff I have on my server, and we go into management, and we go into extended events, and hopefully I don’t have anything nefarious, or weird, and these, in my extended event names. But if we come on, oh boy, zoom it just like, got weird on me. Alright, there. So I have one to capture deadlocks. I actually have a duplicative one up here. I guess I could have used that instead. I didn’t look first. But I have this keeper underscore human events underscore blocking session. That’s the default name that SP human events will set up for a blocking session.
that is being kept alive. And I’ve got a couple queries over in these windows. I just did a simple update for one row to modify the age column. And I can get rid of that because I’m done with that. And I had one query over here that was just doing a select star from users looking for that same user ID, so that it would get blocked under the default isolation level for every version of SQL SQL Server, other than Azure SQL DB, I guess. You have to be good. You have to be able to say something good about the cloud in these videos, right? Well, anyway, so SP human events does use the ring buffer as a target. It does not use file targets. I decided to use that because it seemed a little bit easier to manage in code. I know that Jonathan I think the highest, the man who always has great weekends, that is not the biggest fan of the ring buffer, nor am I, but it’s just a design trade off that I decided to make.
So anyway, I let that blocking situation proceed for about five seconds. And then I ran SP human events block viewer. And I got these results. Now, some of this, some of the details that come back are sort of at the mercy of what the block process report captures. For example, we get the event time, we get the database name, and a lot of the times this contentious object will resolve to the table or whatever that the blocking was taking place on. In this case, we got a really weird object ID back. It just doesn’t make a whole lot of sense. But we see in these lines here, who did the blocking, who got blocked, the SPIDs. I know that a lot of this stuff doesn’t exactly help you troubleshoot the blocking problem. I just wanted to get as much of the information out of the block process report as I could.
Like ECID, I don’t think I’ve ever actually used this to troubleshoot a blocking problem, but it’s in there, so I get it out for you. Just in case you might need it at some point. We get the query text of the blocking query in the blockhead query. You can see those there. Those are the queries that I just showed you in the other window. We get the total wait time, the status, the isolation level, the lock mode. Note that the query that does the blocking, I don’t think I’ve ever seen lock mode get populated there.
I get out the SQL handles because I want you to be able, if you need to, to track down the query plans for the queries that were involved in the blocking. So you’ll have both the, these lines are, I mean, they’re a little bit duplicative, but we have the SQL handle for both the blocking. You see the comma there? That’s both SQL handles for the blocked and blocking query.
If there’s a procedure name in there, I do try to resolve that as well. So transaction count, the transaction name. One thing that shows up in here a lot that can be really useful is implicit transactions, where, you know, you’ll like, you know, certain tools like the JDBC driver, the Python driver, use implicit transactions by default. You have to explicitly turn those off and like, not in the connection string, but like in the connection code kind of.
We get when the transaction started, completed. We have all the client options in here. Sometimes client options can lend some insight into, you know, just like, like a, like a weird setting, but these are both coming from SQL Server Management Studio. So there’s not really any weird settings.
We get the wait resource. If there’s a priority set, if there’s log used, stuff like that. We get the client app, host name, login name, transaction ID, which again, isn’t terribly useful, but it’s in there. And then over here, I give you the full block process report.
So we look through here, like this object ID is obviously nonsense. So, you know, whatever. This index ID of 256, I’m not sure.
That’s entirely accurate, but there we have it. And then coming down through some of the other stuff. So SQL Server 2022 added a lot of these stacks in here.
These are probably not terribly useful for query blocking, at least that I’ve ever seen. They’re probably more useful if you have like, you know, like background process blocking where you could really trace some of these call stacks to something a little bit more useful. And then we have the query that got blocked.
We have the SQL handles that I was talking about and the query that was doing the blocking down here. All right. So we got all that stuff.
And that can be really useful for figuring out what was blocking, you know, what your problem, what the problem was, all that other good stuff. And the bottom section kind of gives you a breakdown of all the like all the blocking that has happened in the database that was captured that is currently in the ring buffer extended event. So a couple in here that I think are particularly useful for the scenario that we’re looking at are lines four and five right here.
And these will tell you if you have blocking involving selects. So under the default isolation level read committed, write queries can block modification queries and modification queries can block select queries. And unless you, I mean, let’s litter all your all your select queries with no lockends and whatnot, which, you know, is not a good idea.
Please don’t do it. I would much prefer that you need use a good old fashioned optimistic isolation level. So multi version concurrency control like using read committed snapshot isolation.
And the second one is that there is a sleeping query doing the blocking. If you have queries that are regularly sleeping and still blocking, there is often signals some sort of application error where you are not closing out connections when you should. This will also warn you about implicit transactions.
A lot of other stuff. There’s a ton of checks in there for various like different things that are weird. Like if you have oddball isolation levels like repeatable read or serializable getting involved, you know, it can be a good idea to warn about those because at least I want to say with entity framework, when you use the transaction scope, I don’t know, is it method or whatever? That sets a pretty strict isolation level.
I can’t remember if it’s repeatable read or serializable by default, but it’s one of those. So it’ll give you all that information. It’ll show you all that good stuff.
And this, I don’t know, I use this quite regularly with client work to try to help me, you know, be able to troubleshoot blocking issues and all that. And I don’t know, that’s about it there. You know, it’ll tell you how much blocking there is and, you know, for the entire database and then per object and all that other good stuff.
So, you know, pretty good. All right. Pretty good.
All right. Cool. So I’m going to keep recording some more of these videos. This is going to be pretty short. I’ll probably be sort of repetitive because I don’t know who’s going to watch what when. You might find one video and watch that. You might find, you might watch a whole series of videos.
I don’t know how it’s going to work. I don’t know. I don’t know what you’re going to do with your life or your day or your great weekend. But anyway, I will catch you in the next video. Again, please use this stuff.
I spent a lot of time writing it, not just for me to benefit, but for everyone to benefit. It’s open source hippie stuff. So, yeah. Enjoy. Use it.
Report errors to me on GitHub. Good stuff. All right. Thank you for watching. Goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
It’s freaking amazing and I love it.
I watched this and put it in a test system yesterday. I just wasn’t sure how this works when you’re reading form an *.xel file from the operating system in terms of how to specify the parameters. I’ll work it out but just in case someone else complains…I have a sample file from another procedure but I’m not sure if the format is the same. (It’s a Frankenstein merging of Kendra, Jeremiah and Ken Fisher’s work)
Hi Kevin — it may still work, but I’m not sure what the script(s) you’ve assembled look like.
Thanks!