An Advanced SQL Server Query Profiling Technique

An Advanced SQL Server Query Profiling Technique



Thanks for watching!

Video Summary

In this video, I delve into advanced query profiling techniques using SQL Server’s built-in dynamic management views and actual execution plans to gain insights into query performance in real-time. By examining two different versions of a query—one serial and one parallel—I demonstrate how to use these tools to identify potential issues such as skewed parallelism and inefficient resource usage. This approach allows you to pinpoint where queries might be getting stuck or experiencing delays, providing a deeper understanding of query execution that can lead to more effective troubleshooting and optimization strategies.

Full Transcript

Erik Darling here. Yeah, we are going to do a Darling Data today, buddy, you and me. So in today’s video, we’re going to talk about advanced query profiling. Now, I realize that this might bring up painful memories for people of a profiler or trying to get words, extended events to work, or maybe even just running SP Who is active. All very valid ways to profile queries. But this one’s a little bit different. So with newer versions of SQL Server, have these built in views or dynamic management thingabobs that actually like when you look at query like actual execution plans. Now, you see all the per operator stuff like times and threads and all this other cool stuff. And there are dynamic management views in the background that back that stuff up. So you can actually look at a query while it’s running. This is part of why in my other video about what to do if a query executes for too long to get a query plan. Part of what I said was, if you like click to turn on actual execution plans, and you run SP Who is active with get plans equals one, you can actually see like the in flight actual execution plan is like rows and time accumulating that you can start to figure out where things get stuck. We’re going to use some of the dynamic management views that fuel that stuff to look at a couple different queries while they execute. So with that, out of the way, let’s talk about money, like four bucks a month worth, if you want to become a member, there’s a link in the video description that says join. And if you do that, you can give me $4 a month for producing all of the SQL Server content. You can also give me more if you’re feeling like Daddy Warbucks over there. If money is no object, well, you can always like comment or subscribe. And then money definitely won’t be an object. If you need consulting help with SQL Server, I am very good at all of these things. And my rates are very reasonable. You can hire me. If you would like some training on SQL Server, I offer a lot of it for a very low price, about $150 USD with the 75% off coupon code, you could spend your money on far worse things like like other people’s Black Friday sales. If you would like to see me live and in person, I will be well, you can see like all of me and some of Kendra Little to or half me half Kendra, we will be at past data summit in Seattle for two days doing pre cons about TQL Server performance tuning, November 4th and 5th. I think I already said in Seattle, but there you go. Anyway, let’s talk about this advanced query profiling stuff. Woohoo, it’s fun, right? That’s what everyone everyone always wished for. So I’ve got a couple of queries here. And let’s just make sure we are everything free in the of the database. I recorded a video where I had to do some other stuff. So you know, things might get weird. But anyway, let’s get an estimated plan for this. And this is a demo query that I’ve shown in other videos. And this video, or rather this, this query in this video is going to run just about as crappily as it did in the other video. But we’re going to look at two different versions of it. One of them is a serial version. And the other version is a parallel version or a non serial version. So we have a non parallel and a non serial version. Or a parallel and a serial version. However you want to however you want to phrase that however you feel comfortable with that.

I’m cool with that. But this is the way the the parallel plan the non serial plan looks. So what I’m going to do is talk about this query a little bit. So this query hits some different DMVs, like sys.dm os tasks and workers and worker threads. And then the one that that shows us the sort of in flight query progress stuff is this sys.dm exec query profiles thing. This is the one where we’re going to see the different operators and like progress and stuff. And then we’re going to look at sys.dm os waiting tasks. And we are also going to look at sys.dm exec session wait stats to get the top weights for the query while it runs. This is going to be if there were any weights going on here, this is going to be if the top weights for the session. So if I run this, there’s nothing going on. And if I come back up, I should have highlighted this query before I did anything else. And we start running this. This will start returning results. Now since this is a serial plan, we only get one result back for every operator. Remember, if we think about the plan that we just saw, all these operators were the ones that we saw in the plan.

What we’re going to see is this this thing is sort of finished the rows on this don’t change. But the rows on this this clustered index scan, those are those keep going up. And these are going to go up to about 17 million. And so right now things actually look a little weird in this column because I’m reusing this window. So maybe this this demo, you know, could you could use a little bit of a fresh window start in here. But that’s really the sort of unimportant thing. Just being able to get this stuff while a query is running is pretty awesome. So that query finished and we got to see the clustered index scan make progress here. So now let’s take a look at this query while it runs because this is where the cool CPU query thing gets very interesting. So now we have a whole bunch of we have a whole bunch of results for each query operator. If we scroll over here and look, we’re going to have this parallel resource description thing. And we’re going to have a whole bunch of things for each operator, right? We have a whole bunch of entries for each operator because we have parallel threads working on them.

Now, I’m going to scroll down here and look a little bit because it’s going to what I want to figure out is when this thing gets close to the end, which is about 17 million. So I’m going to give this one more run. And that should get us close to 16 million, right? So that query is about to finish. But what I want to show you here is where something like this can be useful. Now, I’ve talked before about parallel thread skew and how when you build an eager index spool, even in a parallel plan, only one CPU thread gets used to build that spool. That sort of situation with skewed parallelism can happen just about anywhere.

Now, if we look up, I should probably frame this a little bit better. If we look over here, we’re going to have this clustered index scan. This clustered index scan where we have, we can see the different durations and stuff. But look at the row count here, right? If we come over and we look at this query plan, we will see from SQL Server that we spent 59 milliseconds scanning the clustered index of the user’s table, right?

And that happened pretty evenly, right? About 300,000 or so rows ended up across all of the threads. If you look at the thread ID and the node ID, right? So for node 4, that’s that clustered index scan. We had our parallel threads, 0, 1, 2, 3, 4, 5, well, 6, 7, 8, sort of a little bit backwards in there.

But that’s okay. But we had all eight threads accounted for with rows on them. Now we’re going to scroll down to the other clustered index scan. And what we’re going to look at in here is the same thing, right?

If we scroll, like this is all the threads for that clustered index scan. But look what happened in here. This is way different, right? Up here, for this clustered index scan, when we look at the thread and, oh, sorry, that was up there.

We look at the thread and row count stuff. These were all spread evenly. Down here, right, if we look at the thread ID and then the node ID for node ID 7, right, that’s the clustered index scan over here. And we look at this. All of the rows are ending up on a single thread.

And that’s exactly what I was talking about with the eager index pool being built single threaded. So, like, really one thread was responsible for all of that. Now, if we go over and look at the final, like, actual execution plan, we can verify that when we look at the actual number of rows.

Where are you? There you are. We found you. So all 17 million rows ended up on thread 6 right there. All right. And if we come back and we look at the DMV query, that’s thread 6 for node ID 7.

And that’s where all of the threads were. So sometimes when I’m really troubleshooting a very difficult query, you know, I’ve talked about other ways where I might, you know, get the in-flight execution plan and see where stuff goes. If I just want to do something, like, quick and dirty to kind of get a sense of where things are stuck, what we’re waiting on in different places, this is a very good way to do it.

Now, the weight types and stuff for this query can get a little weird because multiple operators will say they’re waiting on something when really we’re only waiting on doing one thing. For example, in this query, waiting on building the eager index pool, that’s the exec sync weight. In a parallel, like, in a parallel execution plan, building an eager index pool like this will result in that exec sync weight.

That’ll really pile up. We ended up with almost 40 seconds of it. 30, 30, oh, I should zoom in on that a little bit. Almost 38 seconds, well, 38 seconds minus 5 seconds for the clustered index scan.

But you can kind of start to get a sense of where things are stuck, what you’re waiting on. You can see if, you know, you’re having problems with lopsided parallelism. You can see if threads are doing more work than others.

You know, you can see kind of the spread of work in there. And it’s just sort of a different way of looking at a query that’s actively executing to figure out where things are stuck, where things are, you know, how long you’ve been waiting. You know, you have the full wait duration for all of this stuff in here.

And so you can just kind of start to get a good sense of, like, where a query might be stuck at various points in the plan. So I will, of course, put this script on GitHub because I don’t expect you to remember it or to, you know, follow me scrolling around in the video and copy word for word and everything because that can get dangerous. So I’ll put this one on GitHub.

But anyway, thank you for watching. I hope you learned something. I hope you enjoyed yourselves. And I hope you got a tiny little insight into just how awful it is to sometimes be troubleshooting SQL Server problems. So, hmm.

And apparently I have a BIOS update from Lenovo. Hopefully this fixes some of my Intel issues. We’ll find out, won’t we? Maybe this will be my last video because my laptop will brick after this.

I don’t know. Knock on wood. It doesn’t. All right. Cool. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.