Rewriting T-SQL Scalar UDFs So They’re Eligible For Automatic Inlining In SQL Server




Thanks for watching!

Video Summary

In this video, I dive into the intricacies of scalar UDF inlining and how it can impact query performance. You’ll learn about the limitations that prevent a UDF from being inlined, such as invoking time-dependent functions like `GETDATE()`, and why sometimes rewriting a scalar UDF to accept parameters instead is the best workaround. By sharing practical examples and detailed walkthroughs using the Stack Overflow database, I illustrate how making small changes can significantly improve query performance. Whether you’re looking for tips on optimizing your SQL Server queries or just curious about the nuances of UDF inlining, this video offers valuable insights. So, if you found it helpful, don’t forget to like, comment, subscribe, and maybe even join my channel as a member for more exclusive content!

Full Transcript

Erik Darling here with Darling Data, voted by BeerGut Magazine to be the SQL Server consultancy most likely to be run by your best friend. So, isn’t that wonderful? Don’t you love being best friends? I’m having a nice time with this relationship. In today’s video, we are going to talk about function rewrites for UDF inlining. Why are we going to talk about this? Well, I do have to go to a website after I get out of the slides, but we’re going to talk about this because scalar UDF inlining, for all of the good that it tries to do with your awful UDFs, has some kind of silly limitations that have some kind of silly workarounds, and sometimes you have to employ those in order to get UDF inlining to work correctly. One actually may wonder, out loud, in front of the world, in front of 4,300 data darlings: if you’re going to rewrite the scalar UDF so that it can use UDF inlining, why not just rewrite it as the inline version of a UDF? The answer is, sometimes that’s hard. Sometimes a UDF is perfectly inlineable after just a small change, rather than having to make all of the changes required for writing an actual inlineable function.

And boy howdy, sometimes the easy way out is the best way out. Maybe that’s a little grim, but I don’t know. Anyway, before we talk about that, let’s do the housekeeping. If you like this channel, if you like my content, if you’ve already liked and commented and subscribed, but you want to do more, you just have this deep yearning to say, Erik Darling, thank you for the hours and hours of content that you provide for free, you can sign up for a membership. The link to do that is in the description of this video. It is the name of this channel with slash join at the end. I didn’t make that up. That is not a database thing. That’s not like a database joke that I customized for this. That’s just how Google chose to do it with YouTube. I don’t know. It works for me. I’m fine with it. If you need help with SQL Server that is beyond what you can find for free on YouTube or blogs, and you would like a young, good-looking consultant to come in and appraise your SQL Server for what it’s worth, I can do all of these things and more. And as always, my rates are reasonable.

If you would like some low-cost, high-quality content, there are 24 hours of it available at my website, training.erikdarling.com. There is beginner, there is intermediate, there is expert slash advanced, whatever you want to call it. I’m not going to say that I’m going to melt your brain because I’ve had my brain melted in the past standing in front of a microwave in the 80s, and it was not pleasant. So there will be no melted brains on my watch. There will be plenty of smart, learned people using SQL Server, though.

You can use the discount code SPRINGCLEANING to get 75% off. There is also a link to that in the video description. Would you believe it? Marketing wizard that I am. If you would like to see me live and in person, either in Dallas or Seattle, I realize maybe not the two most convenient locations for a lot of you watching, but that’s what I got so far. If you would like me to come to a more local venue, tell me when your local venue is having a thing where I could do a thing, and I’ll show up to the thing.

There’s always that. I like going to things. Sometimes. I mean, it’s nice to just, you know, drink alone for a little bit. But Friday, September 6th, I will be at Data Saturday Dallas, and I will be bringing both pokey bribes and sticky bribes, sparkly sticky bribes.

Ooh, you are feeling sleepy because it’s lunch, and you just ate a sandwich and a cookie out of a cardboard box. Now you’re not going to pay attention for the rest of the day. You can come catch me at either one of these events, PASS Data Summit or Data Saturday Dallas.

I’ll be there. PASS Data Summit, I’m especially looking forward to. One, not because it’s an election. It’s my birthday, and I’m co-hosting two days of performance tuning content with Kendra Little, and that’s going to be a great time.

I don’t think that the election is going to be a great time, especially in Seattle. It can’t go well either way. So with that out of the way, let’s begin the pastry.

I mean, the party. All right. I think I used that joke before. I can’t remember now. Anyway, we’re off. We’re off to the races. Oh, that hat fits much better when it’s not zoomed in than when it is zoomed in.

I should probably shrink that hat down a little bit. But objects in hat may be larger than they appear. Anyway, so I told you we had to look at a website, and so that’s what we’re going to do first.

And one of the limitations on UDF inlining is here: the UDF does not invoke any intrinsic function that is either time-dependent, such as GETDATE, or has side effects, such as NEWSEQUENTIALID.
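As a quick sketch of what that limitation looks like in practice (the function name and body here are mine for illustration, not the demo code), here’s a scalar UDF that trips it, along with a way to check the verdict in sys.sql_modules on SQL Server 2019 and up:

```sql
/* A scalar UDF that calls GETDATE() internally; the time-dependent
   intrinsic disqualifies it from automatic inlining */
CREATE OR ALTER FUNCTION dbo.days_since
(
    @start_date datetime
)
RETURNS integer
AS
BEGIN
    RETURN DATEDIFF(DAY, @start_date, GETDATE());
END;
GO

/* On SQL Server 2019+, sys.sql_modules reports inlining eligibility */
SELECT
    sm.is_inlineable,
    sm.inline_type
FROM sys.sql_modules AS sm
WHERE sm.object_id = OBJECT_ID(N'dbo.days_since');
```

If is_inlineable comes back 0 for a function you expected to inline, one of the documented limitations is in play.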

Either of these arrangements will prevent your UDF from being inlined. It sounds terrible, doesn’t it? Well, why do we want our UDF to be inlined?

That’s a good question. A very good question, even. So scalar UDFs in SQL Server have two main performance drawbacks. One is they force the query that calls them to run single-threaded.

They are ineligible for parallel execution plans, and that can make a very big mess for big reporting-type queries when you call them. The other downside is that Scalar UDFs are not invoked once per query. They are invoked once per row that has to get processed through the function.

So if your function has to run over a query that needs to do something with a lot of rows, your function could run a lot of times. And if your function is eligible and is inlined, just like if you wrote an inline table-valued function, your query will suffer from neither of those fates. It may suffer other performance maladies and whatnot, but those two won’t be it.

Unless you do something else that causes that to happen. Then I can’t help you. Well, I mean, I can help you.

My rates are reasonable. But I’ve seen your code. You seem to practice doing things bad. It’s a thing.

It’s a thing that I’ve noticed. All right. So let’s get into our Stack Overflow database, and let’s set the compat level to 160. 160 is not explicitly necessary for this. We could also be in compat level 150.

Isn’t that just a dream and a joy? We could be in either one of those. As long as you’re on SQL Server 2019 or 2022 or in Azure somewhere. If you’re on SQL Server 2017 or lower, none of this is available to you.

So sorry about that. You should maybe think about upgrading rather than watching YouTube videos. Go download and install it.

I don’t know. So here’s our function. Let’s create or alter it just to make sure that it is properly created or altered. And here’s the sticky part of our function. We have this fallback where, if end date is null, we pass in the GETDATE built-in thing.
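The demo code isn’t visible in the transcript, but the shape being described is something like this (names are mine for illustration):

```sql
/* When @end_date is NULL, fall back to GETDATE() inside the body;
   that internal GETDATE() call is what blocks automatic inlining */
CREATE OR ALTER FUNCTION dbo.total_days
(
    @start_date datetime,
    @end_date datetime
)
RETURNS integer
AS
BEGIN
    RETURN
        CASE
            WHEN @end_date IS NULL
            THEN DATEDIFF(DAY, @start_date, GETDATE())
            ELSE DATEDIFF(DAY, @start_date, @end_date)
        END;
END;
```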

It’s pink, so I think it’s some kind of function, right? All the functions that are built-in in SQL Server are pink as far as I know. I don’t have anything against that.

I rather like the contrast. If you look up here, it’s like CASE WHEN blue, DATEDIFF pink. It’s kind of nice. Mentally, if DATEDIFF were also blue, it would all blend.

You wouldn’t know what you were doing. It would be impossible to tell anything apart. So I guess that’s also true when you nest a lot of functions. That’s why I have to separate things out like this so that all this pink does not run into one just Dalí-esque mess of drippy things.

So let’s run this query. And I’m going to be honest with you. This query is going to drag on for a little bit. Oops.

Query is going to go for about 16, 17 seconds. It’s not going to be a good time for anyone. All 16, 17 seconds of it. Because, I mean, let’s be honest.

Waiting 16 seconds for rows to return is not anyone’s idea of a good time. I don’t think my microphone picked up on it, but the fans on my laptop, they’re all flustered right now.

Spitting like crazy. If we look at the query plan, we are going to pay attention to two things. One is if we look at the properties here, we will have, as is usual, when we have a scalar UDF in our query plan, we have this non-parallel plan reason.

Since I’m on SQL Server 2022, I actually have a reason here. If you’re on an earlier version of SQL Server, it will probably just say NonParallelPlanReason: CouldNotGenerateValidParallelPlan.

But since I’m on SQL Server 2022, this actually shows up. And you can figure out exactly why your query could not generate a parallel execution plan. If we peruse some of the plan operators, we will see that, well, we do spend some time in other places.

Oh, geez, that didn’t go well. We spend about four seconds in here. We spend about two seconds in here. There are a few hundred milliseconds in here. But really where things get boggy is in this compute scalar. This compute scalar is what is responsible for the execution of our scalar UDF.

We can see the number 1,166,178 in here. All right. That’s an important number because when we go…

Oh, man. I just… Hang on. I got to fix my zoom-in. It changed to a red cursor. And I do not want a red cursor. I want a pink cursor. When we go look at this query, it’ll either be exactly that number or it’ll be twice that number.

I’m dying slowly because I practiced this demo and there might have been more stuff. Anyway, here we are. We can see the execution count of our sneaky function matches the number of rows that were returned by the previous query.

And what’s important here is that the total worker time for the function was not the problem. The main problem was the sheer number of executions that the function had to be invoked for and that it forced the rest of the query to run on a single thread.

All right. So this function didn’t actually take up a whole lot of CPU time, but it just dragged on and on and on because it had to execute so much, which is obviously not what we want. This is not an acceptable situation for us.

If you’re on SQL Server 2016 or better, you will have access to this wonderful dynamic management view called sys.dm_exec_function_stats. sys.dm_exec_function_stats will tell you information specifically about your scalar UDFs.

It does not track inline table-valued functions or multi-statement table-valued functions. This is purely for scalar UDFs. And I have a feeling that this thing coming around was maybe part of what Microsoft wanted for doing this scalar UDF inlining thing.
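A simple way to poke at the view looks like this; the columns are ones the DMV actually exposes, though the ordering choice is just mine:

```sql
/* Execution counts and timings for scalar UDFs, cached since the
   last service restart (or since their plans were evicted) */
SELECT
    function_name =
        OBJECT_NAME(defs.object_id, defs.database_id),
    defs.execution_count,
    defs.total_worker_time,
    defs.total_elapsed_time
FROM sys.dm_exec_function_stats AS defs
ORDER BY
    defs.execution_count DESC;
```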

I have a feeling. Maybe. You know, it’s just a guess. Just a little guess. Yes. So like I said, this is a very silly reason to need to rewrite a function in this way, right?

Because just because we have GETDATE here, we can’t have the function get inlined. But we can change the function a little bit. Again, this might be where you’re looking at me and saying, Erik, why don’t you just write the whole function as an inline table-valued function?

And then we don’t have to think about it. And like I said earlier, when I pose the same dilemma, a lot of the times when I’m rewriting stuff and trying to help clients get things to go faster, there might be very complex logic inside of a scalar UDF.

And the only thing that might be holding that UDF back is the invocation of GETDATE in one or many places in the UDF. And sometimes it’s a little bit easier to just get that out of the way so that SQL Server can do the rest of the inlining work rather than me spending a lot of time writing a very complex inline table-valued function trying to get everything working.

So let’s create or alter this to make sure that it is properly created or altered. And let’s run this version of the query now, where we pass in GETDATE from outside of the function, right?

Notice that GETDATE is now out here rather than in there. And you would think that that wouldn’t be a meaningful enough distinction to have anything happen, but it is.
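The rewrite pattern being described, sketched with illustrative names (the Users column names follow the Stack Overflow database used in the demo): the internal GETDATE() call becomes a parameter supplied at the call site, so the function body contains nothing time-dependent.

```sql
/* Same logic as before, but the fallback date is now a parameter */
CREATE OR ALTER FUNCTION dbo.total_days
(
    @start_date datetime,
    @end_date datetime,
    @fallback_date datetime /* replaces the internal GETDATE() call */
)
RETURNS integer
AS
BEGIN
    RETURN DATEDIFF(DAY, @start_date, ISNULL(@end_date, @fallback_date));
END;
GO

/* GETDATE() now lives in the query, outside the UDF,
   leaving the function eligible for inlining */
SELECT
    u.DisplayName,
    days_around = dbo.total_days(u.CreationDate, u.LastAccessDate, GETDATE())
FROM dbo.Users AS u;
```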

It’s wild. It’s wild what SQL Server has gotten up to. Because now when we look at this query plan, we will see two things. We will see parallelism. And we’ll still see a compute scalar, but the compute scalar has these weird, funny extra branches in the query plan.

And this compute scalar down here with these nested loopy joiny things in it, this is where the function is invoked inline in the query. So this function no longer ran 1.1 something million times. It was inlined into the query and just kind of ran normally.

I mean, I guess technically since it’s a nested loops join, it may have actually run that many times, but it wasn’t like the same black-boxed UDF invocation stuff before. And we got a parallel plan out of it.

So now the whole thing finishes in about 3.7 seconds rather than about 16 seconds. So the moral of the story here is that GETDATE in your scalar UDFs will prevent your UDF from being inlined.

If you can replace the GETDATE invocation from within the UDF with a parameter where you can pass GETDATE in from outside of the UDF, SQL Server can then inline your UDF and maybe make performance better for you too.

If you could shave 12 seconds off a query just by adding a parameter to a UDF and passing in GETDATE from the outside, wouldn’t you do that? Wouldn’t that be a nice, simple, easy thing to do? I mean, this is a simple enough function, I suppose.

I suppose you could just put that case expression in somewhere, but sometimes you have to deal with the cards you’re dealt. And sometimes in more complicated situations where maybe this wouldn’t be so readily available to you, just replacing GETDATE in the UDF with a parameter that is assigned GETDATE outside of the UDF is a better option.

So, sometimes it’s so simple you just want to smack yourself silly or your brain melts on the sidewalk. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. I hope that this was one of the finer experiences that you’ve had watching a SQL Server video. If you do care to like, comment, or subscribe, or become a member, or buy my training, or hire me to do consulting, we could have magical times together just like this, you and I.

Because we are best friends after all, aren’t we? All right. Great. Anyway, thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Why Expressions Are Better Than Local Variables For Performance In SQL Server




Thanks for watching!

Video Summary

In this video, I delve into a common issue many developers face: the preference for using local variables over expressions in SQL Server queries. I explain why expressions are often superior for performance and cardinality estimation, providing practical examples to illustrate the differences between using expressions versus local variables. The discussion includes how these choices can affect query plans and execution estimates, especially when dealing with date columns. Additionally, I address common pushback from developers who worry about repeating themselves or maintaining consistent values across their code. I also touch on membership options for my channel, offering ways to support the content and upcoming events, while emphasizing that even without financial contributions, viewers can still engage by subscribing, liking, and commenting on the videos.

Full Transcript

Your best friend, Erik Darling here with Darling Data. Magnificent, the wonderful, the wondrous, omnipresent Darling Data. Up in all your data business. In today’s video, we are going to talk about expressions versus local variables. Because this is one that comes up a lot with clients where, you know, in programming in general, people have this fear of repeating themselves. Deep fear of repeating themselves. And, apparently, they’re still under the weird impression that typing something once and assigning it to a variable is going to do something magical and magnificent for them. But it’s not going to. Right? And I know that I’ve recorded a lot of stuff about local variables, but this is a very specific case of comparing date, datetime, whatever columns. And so we’re going to talk through all of that good stuff. Before we do that, if you feel so obliged, so kind, if you feel just so beneficent in your ways, you can get a membership to my channel. There’s a link for that in the channel description. It is the one that is the name of my channel with slash join at the end.

Then you can choose just how generous you’re feeling every month when it comes to a little bit of channel support here. At some point, there will be member exclusive stuff. I’m just a little bit crunched right now. I’m preparing for, well, we’re going to talk about it in a few slides, but I’m preparing for a few events in the near future that just have a lot of my attention right now on top of the normal client work, on top of having a wife and kids and all that other stuff. So I haven’t had a lot of time to put thought into how I want to structure the membership stuff here, but I will probably be getting to that probably around the holidays here. So if you get in now, you’ll probably get some sort of bonus multiplier thing. I don’t know yet. I got to figure it all out. So if you have no money, if your pockets are empty, if you cannot pay for a cheeseburger today or next Tuesday, you can do other things that let me know you care and that you love me and that you’ll never leave me.

Like you can like and comment on, you can like my videos. You can comment on my videos. You can subscribe to my channel and join an excess of 4,300 other data darlings out there in the YouTube-averse. God, I feel awful saying that. I don’t know why it came out of my mouth. So, and subscribe to my channel for free and just watch the stuff and maybe, you know, enjoy yourselves, learn something, that song and dance. If you need consulting, I am pretty good at this stuff. I can also do other stuff, SQL Server related. If you need something not SQL Server related, I would wonder what it is.

But we can always work on that. And as always, my rates are reasonable. If you want some cheap, well, you know, not cheap like quality, but you know, if you want some cheap money-wise SQL Server training, you can get about 24 hours of it from me and me alone for about 150 US dollars with those discount codes and stuff. There is beginner, intermediate and expert level content in there. So no matter where you are, you can learn something and enjoy yourself.

Thank you for watching. Let me just end it right here, right? Like I said, this is a ChatGPT image. Deal with it. I do have upcoming live events. Friday, September 6th, I will be at Data Saturday in Dallas giving away pokey treats and, oh, where are they? Sticky treats.

So I have all sorts of treats for people who show up. November 4th and 5th, I will be at PASS Data Summit in Seattle co-hosting two banger days of performance tuning content with Kendra Little. And you can spend both of those days with us because it’ll be the best two days of your SQL Server life.

All right. And that’s all there is to it. And with that out of the way, let’s begin the festivities. Let’s get to partying here because that’s what we do best.

All right. So what we’re going to talk about, again, is why expressions are better than local variables in SQL Server for performance and cardinality and all sorts of other important things. So what I’m going to do is I’m going to create a table called Express Yourself. Sort of a little bit of a joke there, a little play on words, you know.

Express yourself. But because we’re talking about expressions. If that wasn’t obvious, I don’t know what is.

And that table is pretty simple. It’s just two columns. And really the important thing is that we have this thing in there. Right. Great. We did it.

We rule the world. We rule the seas and the skies and the dirts and all that stuff. And if we query that table and we look at this query plan, because that’s largely what we do around here at Darling Data is look at query plans and go, oh, not going to make it. When we look at the execution plan, you will see that we got a rather good estimate when using expressions in our query.

I actually framed this probably badly for that particular zoom-in. When we use expressions in our WHERE clause, we get pretty good estimates about what will happen for things. And I know I’ve talked about it before.

If we use local variables rather than expressions, we get… actually, let’s just do both of these at the same time, because they’re both equally annoying. And also, I have two slightly different expressions in here for the later local variable.

One adds one year and the other adds, oh boy, ZoomIt is struggling today. One adds one day. Or maybe it’s just my mouse that’s struggling. It’s tough to tell right now.

Struggles abound. We get the same kind of goofy estimate for both of them of 171,024, regardless of how many rows are actually liable to come out. Now, we saw when we used the expressions that this, at least the first one, I didn’t try the second one, obviously.

I was just trying to get to this demo. But we got a good estimate before. Now for this one, we are off by 606%.

And by this one, well, this one is just way lower, or rather, way higher compared to what actually comes out of the table. But notice we get the exact same number for both of them. The same thing is true if we use a stored procedure.
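The comparison just described can be sketched like this; the table and column names are my guesses at the demo’s Express Yourself setup, not its exact code:

```sql
/* Expression in the WHERE clause: the optimizer can evaluate it at
   compile time and use the column histogram for a decent estimate */
SELECT
    c = COUNT_BIG(*)
FROM dbo.ExpressYourself AS ey
WHERE ey.start_date >= DATEADD(YEAR, 1, '20130101');

/* Local variable: the runtime value is invisible at compile time,
   so the optimizer falls back to a fixed, density-based guess */
DECLARE @start_date datetime = DATEADD(YEAR, 1, '20130101');

SELECT
    c = COUNT_BIG(*)
FROM dbo.ExpressYourself AS ey
WHERE ey.start_date >= @start_date;
```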

Now this cardinality estimate is going to be a little bit different because the start date in this one is a parameter, whereas before they were both local variables. So we’re going to get a slightly different estimate on this one, but it’s still going to be pretty wrong compared to when we just put the expression in there. So if I run this, right, and if I just go get the minimum start date in the table, we get the same result back for both of these.

But notice that the one that we use the expression in, right, is a lot closer to reality than the one that we use the local variable in. Now, before we had that guess of 171,000 some odd something there. This change is because we have a parameter for start date.

SQL Server is able to do good cardinality estimation for start date, but it does bad cardinality estimation for end date because it’s a local variable. So having the expression up here yields better results than storing something in a local variable like we did down here. Where I run into pushback sometimes with these things is someone will inevitably say, well, yeah, but I need everything to be the same date going across all of them.

The good news is if you operate with your start date parameter and you add time to that, it’ll always be the same no matter when a query runs. But where it won’t be is if you use, of course, one of the non-deterministic built-in functions like GETDATE, right? If you use GETDATE over and over again, it’ll be blah.

If you build off the start date, it’ll be blah. That was a good blah, not a bad blah. I thought the smile kind of would help with that.

Maybe it did. But now if we run this, we can sort of illustrate the problem generally. We’ve got to wait like five seconds because there’s a five-second WAITFOR in there.

So let’s put these a little bit closer to each other and let’s zoom in. So the start date parameter for both of these, before and after the five-second WAITFOR, is the same, which means that if we use an expression to calculate later using the start date parameter, that’ll be the same for both of these two.

If we set later to a value based on start date, that will also be the same for both of these. Notice that these two blocks in here line up across both of these columns. Where that doesn’t work is when you build something off GETDATE over and over again, right?

So if we just add five seconds to start date, that’s a lot different than adding five seconds to GETDATE before and after the WAITFOR. There is indeed a five-second difference here between 41 and 46. So that’s where people get kind of weird and screwed up.
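The drift demo boils down to something like this sketch: values derived from the parameter stay fixed across the delay, while values derived from GETDATE() move by the length of the wait.

```sql
DECLARE @start_date datetime = GETDATE();

/* Before the delay: both columns agree */
SELECT
    from_param = DATEADD(SECOND, 5, @start_date),
    from_getdate = DATEADD(SECOND, 5, GETDATE());

WAITFOR DELAY '00:00:05';

/* After the delay: the parameter-based value is unchanged,
   the GETDATE()-based value is roughly five seconds later */
SELECT
    from_param = DATEADD(SECOND, 5, @start_date),
    from_getdate = DATEADD(SECOND, 5, GETDATE());
```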

Now, another thing that people will generally do in these procedures is have some sort of handling in here for start date, right? They’ll say, oh, if start date is null, set it to GETDATE. I don’t know.

I guess that’s reasonable. But you’ll get the same results from this, right? Like this won’t change anything. This will make for a weird cardinality estimation, though. And the reason why this is going to be weird is because if you pass in start date up here with a null value, SQL Server doesn’t do cardinality estimation based on you fixing it here.

SQL Server does cardinality estimation based on start date being null. So if this is the kind of thing that you have to do, well, we’re going to talk about your options in a few minutes. I promise.

But you would end up with the exact same situation here as you did before, where if instead of passing in the start date with a value at, you know, when you execute the stored procedure, you pass it in and you correct it within the stored procedure, everything turns out the same, you know, including this column being five seconds different, right?
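The NULL-fixup pattern being described looks roughly like this (procedure and table names are illustrative); the important point is that cardinality estimation happens against the value you passed in, not the value you assign inside the procedure:

```sql
CREATE OR ALTER PROCEDURE dbo.null_fixup
(
    @start_date datetime = NULL
)
AS
BEGIN
    /* The plan was compiled for @start_date = NULL;
       this assignment does not change that compiled estimate */
    IF @start_date IS NULL
    BEGIN
        SET @start_date = GETDATE();
    END;

    SELECT
        c = COUNT_BIG(*)
    FROM dbo.ExpressYourself AS ey
    WHERE ey.start_date >= @start_date;
END;
```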

So, okay. Let’s say, for some reason, you want to use that local variable to insert data into a column. Let’s say it’s like the run date for a batch process and you, for all the rows that you put into a table for a particular run of that batch process, you want all the times to be the same.

It’s totally cool to use a local variable for that. It’s also cool to have a parameter for that. If you need to update a column, like you need to do a bunch of updates across a bunch of tables to say, hey, these tables all changed at the same time, it’s totally cool to use a local variable for that.

You’re not going to mess with performance there, right? That’s not where local variables cause problems. If your local variable is getting used as a filter somewhere, that’s kind of where you should sweat it, but only if you’re having a performance issue.

You know, while it is nice to go around and correct all of these problems so that your code base is just, you know, a golden god where there’s nothing wrong with it and everything runs perfectly, most of the time when you’re trying, when you’re like me and you’re a consultant trying to make something go faster, you have to kind of set a threshold with things.

And a lot of the times if, you know, queries just aren’t going slow, I’m not going to spend a lot of time typing to fix them. Excuse me. Because it’s just not worth it because I have other slow queries to go fix.

I’ve got to fix the slow queries before I can circle back and start, you know, sort of fine-tuning everything. Your options when you have this problem and if there’s a performance issue are things that I’ve talked about a whole bunch on this channel before. You can use OPTION (RECOMPILE) at the statement level.

You can use parameterized dynamic SQL and pass your local variables into that. Or you can pass those local variables to another stored procedure, because that other stored procedure or the parameterized dynamic SQL will, of course, accept your local variable as a parameter, right?

Because those are different modules and they sort of change the context of things enough for you to get the better cardinality estimates from that. You can also just use a recompile hint and not bother with all the extra typing and testing that dynamic SQL involves. If you do go the route of turning local variables into parameters and passing them into stored procedures or to dynamic SQL, this is where you have to be careful, especially with time-based ones, right?
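The first two options just listed can be sketched like this (table and column names are illustrative); both let the optimizer see the runtime value of what was a local variable:

```sql
/* Option 1: statement-level recompile sniffs the variable's value */
DECLARE @start_date datetime = GETDATE();

SELECT
    c = COUNT_BIG(*)
FROM dbo.ExpressYourself AS ey
WHERE ey.start_date >= @start_date
OPTION (RECOMPILE);

/* Option 2: parameterized dynamic SQL turns the local variable
   into a real parameter inside the inner module */
DECLARE @sql nvarchar(MAX) = N'
SELECT
    c = COUNT_BIG(*)
FROM dbo.ExpressYourself AS ey
WHERE ey.start_date >= @start_date;';

EXECUTE sys.sp_executesql
    @sql,
    N'@start_date datetime',
    @start_date;
```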

With, you know, if you have an integer or a string and you’re looking for equality predicates, you know, that’s where you have to worry about skewed data within the column as a problem. When you’re dealing with date, datetime, stuff like that, datetime2(0) through datetime2(7), when you’re dealing with that, you need to worry about the skew being in the range that people look for.

And that’s not something that the parameter-sensitive plan optimization that came up in SQL Server 2022 addresses. Parameter-sensitive plan optimization only works for one parameter and only if it’s an equality predicate and only if the heuristics kick in to say, yes, you should do this.

You have to meet those three things. And then you get, I think not coincidentally, three different execution plans based on the small, medium, and large buckets of values. But range values don’t get that.

So equality predicates, yes; greater than, no; less than, no; greater than or equal to, no; less than or equal to, no. If you are worried about parameter-sensitive plan stuff and you have date range predicates, one thing that you can do is build your procedure with dynamic SQL sort of like this.
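Since the demo code isn’t visible in the transcript, here’s a hedged sketch of the bucketing pattern being described; the procedure name, table, and month boundaries are all illustrative. Appending a harmless, bucket-specific predicate makes each date-range size hash to its own query text, so each bucket gets its own plan and cardinality estimate.

```sql
CREATE OR ALTER PROCEDURE dbo.bucketed_count
(
    @start_date datetime,
    @end_date datetime
)
AS
BEGIN
    DECLARE @sql nvarchar(MAX) = N'
SELECT
    c = COUNT_BIG(*)
FROM dbo.ExpressYourself AS ey
WHERE ey.start_date >= @start_date
AND   ey.start_date <  @end_date';

    /* Pick a suffix by range size so each bucket compiles separately */
    SELECT @sql +=
        CASE
            WHEN DATEDIFF(MONTH, @start_date, @end_date) <= 1
            THEN N' AND 1 = (SELECT 1);'
            WHEN DATEDIFF(MONTH, @start_date, @end_date) <= 6
            THEN N' AND 2 = (SELECT 2);'
            ELSE N' AND 3 = (SELECT 3);'
        END;

    PRINT @sql; /* handy for seeing which bucket fired */

    EXECUTE sys.sp_executesql
        @sql,
        N'@start_date datetime, @end_date datetime',
        @start_date,
        @end_date;
END;
```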

And I’m going to highlight this and run it and then I’m going to talk to you about what I did. So normally, when you have a date range, you’re worried that someone will be searching for a small range and then suddenly start searching for a large range or be searching for a large range and then start searching for a small range.

And either way, in either direction, sharing the query plan between those can be pretty bad. One thing that you can do with your dynamic SQL is set it up so that you execute a slightly different query based on how many, like, you know, based on some spectrum of things like this.

So for this, I’m just saying if it’s less than or equal to one month, add AND 1 = (SELECT 1) onto the end of the query. If the difference is between two and six months, add AND 2 = (SELECT 2).

And this is just to keep things simple. These aren’t necessarily the date ranges that I would always use or the month differences that I would always use. This is just to make things kind of simple when we’re looking at stuff and testing.

And then if it’s more than six months, then we want to tack AND 3 = (SELECT 3) on. And what that gives us when we execute the procedure is this. Let’s start with one in here.

And let’s highlight this. Remember to highlight this whole thing and run it. What I want you to pay attention to is the cardinality estimate in here. We’re just doing a simple single table query.

We already have a good index in place. It’s not like SQL Server has a big variety of plan choices to choose from here. SQL Server is just going to seek into this index no matter what I do. I’m just getting a count.

There’s no need to think about lookups or anything else. What I want you to pay attention to here is the cardinality estimates. Because under normal circumstances, if we weren’t adding that and one equals or and something equals select something onto the end of the query, SQL Server would reuse the cardinality estimate for all three of these examples that I’m going to show you.

Because we’re adding those slightly different strings on with slightly different literal values, those are going to hash out to different queries. And SQL Server is going to come up with different cardinality estimates for each one. Within that, if we were to execute the same one multiple times, SQL Server would totally reuse those query plans for each individual one.

But it’s not going to reuse the plan for one month and the one for six months. You’re not going to have that parameter sensitivity in there. So now if we change this to do three months and we rerun this.

Actually, before I do that, let me come over here and show you the messages tab. Because we’re printing out our dynamic SQL. And there’s our one equals select one.

And then if we change that to be three months, which falls into our two and six bucket, then in the messages tab, we’ll see our two equals select two in here. And in our execution plan, we’ll have our accurate cardinality estimate for that.

Right? So good news across the board there. And now let’s change this to nine months. Because that will get us outside of the final bucket of six there.

And if we rerun this and we look in the messages tab, we will see the three equals select three at the end here. And if we look at the execution plan, we’ll have an accurate cardinality estimate for that. And this is generally, like the name of the game here is to just find a generally good number for cardinality that makes sense across whatever, however many buckets you want to create with your different things in here.

Are there more complicated schemes for this? Yes. Could you build your own histogram for this? Yes. Could you figure out all sorts of other things to do with, like, optimize for? Absolutely.

You could totally do that. But this is one thing that I use quite a bit when working with clients because this helps me come up with a fairly quick and dirty solution to generate different but reusable execution plans based on how big of a date range we’re dealing with. So anyway, in general, for most things that you will do in SQL Server, using expressions is better than using local variables for performance because of cardinality estimation.

You can get very wrong estimates, and depending on, like, where those estimates are used, you could have all sorts of trickle down bad plan choices, bad, you know, bad choices in, you know, how many rows or pages we decide to start locking, all sorts of things. And you can have bizarro performance issues because of that. I can’t say it enough that you should just avoid that situation altogether.

Most of the time, what I tell people is if you’re using local variables and you’re having trouble with a query that’s using local variables and a join or where clause, slap a recompile hint on it and see if SQL Server comes up with a better execution plan. If it does and you’re cool with the recompile hint, leave the recompile hint on. If it does, but you want plan reuse, then turn that into dynamic SQL that’s parameterized, and you will be able to do all sorts of neat tricks to get the good execution plans across the board.
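That progression, from a local variable, to a recompile hint, to parameterized dynamic SQL, might look something like this sketch, using a hypothetical Comments table:

```sql
/* Local variable version: the optimizer can't sniff @UserId at
   compile time, so it falls back to a density-based guess */
DECLARE @UserId integer = 22656;

SELECT COUNT_BIG(*) AS records
FROM dbo.Comments AS c
WHERE c.UserId = @UserId
OPTION (RECOMPILE); /* quick test: does a sniffed value help? */

/* Parameterized dynamic SQL version: the value is sniffed like a
   stored procedure parameter, and the plan is reusable */
DECLARE @sql nvarchar(MAX) = N'
SELECT COUNT_BIG(*) AS records
FROM dbo.Comments AS c
WHERE c.UserId = @UserId;';

EXEC sys.sp_executesql
    @sql,
    N'@UserId integer',
    @UserId = 22656;
```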

So, with that out of the way, from your best friend here at Darling Data, me, Erik Darling, thank you for watching. I hope you enjoyed yourselves. I hope you learned something new.

I hope that you will continue to watch all of these, all of this wonderful free content, regardless of if you sign up for a membership or hire me for consulting or buy my training. Come to see me perform live at Jocko’s Comedy Hut, Paramus, something, I don’t know. Anyway, these lights are getting hot and my brain’s starting to fry, so I’m going to call this one and I’m going to figure out exactly which video I’m going to record next.

So, as always, thank you for watching. You’re too kind.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Why You Shouldn’t Use The Plan Cache To Figure Out CPU Load In SQL Server




Thanks for watching!

Video Summary

In this video, I delve into why the CPU per database query can be misleading when trying to identify which databases are using the most CPU on a SQL Server. I share my frustration with the endless stream of regurgitated advice on LinkedIn and other platforms, particularly around SQL performance tips that seem to lack substance or context. The video explores common pitfalls of this query, such as plan cache disturbances from various factors like restarts, memory pressure, and recompile hints. I also highlight how recompile hints can make the query even less useful by removing plans from the cache. To address these issues, I suggest adding more detailed information to the query, such as the oldest and newest plan ages, total execution counts, and worker time statistics. Additionally, I share a humorous yet instructive story about my interaction with ChatGPT, which turned out to be hilariously wrong about optimized for ad hoc workloads queries showing up in dm_exec_query_stats. This experience underscores the importance of validating AI-generated information before relying on it.

Full Transcript

It’s your friend Erik Darling here with Darling Data. And in today’s video, we’re going to talk about why that CPU per database query is pretty stupid. I didn’t realize until I signed up for LinkedIn how much awful content there was out there about like, I mean, well, SQL Server in particular, because I can sniff that out. But just data stuff in general, like every day, my feed is just absolutely inundated. I’ve been fascinated with these chat GPT posts that are like, like, top 10 rockstar SQL performance tips, avoid select star, dude. And you’re like, what? Like, I, if I ever catch one of these people in person, I’m going to smack them. Like, it’s just the dumbest, most regurgitated advice that I’ve ever heard in my life. And I don’t know, I don’t understand how it gets so much engagement. Like, I feel like LinkedIn has an algorithm that looks for, its own, like, like, like, its own, like, like, AI nonsense and boosts it. Like, anything that has, like, like, crappy emojis to as bullet points. It’s like, the world needs to see this. So check out this AI advancement. You’re like, oh, God. This is why I start drinking early. Anyway, before we talk about that, let’s, let’s do the housekeeping stuff.

If you feel monetarily motivated to support my channel, you can sign up for a membership. There were, there were some comments in the comments recently about how someone couldn’t see a way to sign up for memberships. And it really, so if you’re watching it, like, using the YouTube app on a device, you won’t see it. Apparently, you have to be using a web browser. So what I’ve done to circumvent all these issues is I scoured the depths of Google, and I found the URL that you need to use, which is apparently your channel URL slash join. As a database person, I feel particularly stupid that I did not try to join. But that, that link will go in all of the videos going forward. And I also am going to backport that to older videos so that everyone can use the join sign up link and be happy.

If you, if you, if you find yourself in a situation where, oh, you don’t have an extra four or so dollars a month in your pocket that you feel like sharing with little old me, you can, you can support my channel in other ways. You can like posts and videos, and you can comment on them. And you can also subscribe to the channel and join now 4,300 data darlings out there in the world who also subscribe to my channel and get pleasant little tinkly notifications every time I publish something.

If you are in need of SQL Server help, if you need a health check, performance analysis, hands-on tuning, if you are having a SQL Server emergency, or you want me to train your developers so that you have fewer SQL Server emergencies, as always, my rates are reasonable. If you want just some training, you can get a whole bunch of it for about 150 bucks US. If you go to that URL or click on the link in the video description and you use the discount code SPRINGCLEANING, you can get all that for 75% off, which is pretty, pretty good.

Once again, this image is generated by ChatGPT. I did not make spelling mistakes. I think it’s funny how bad ChatGPT is at things. And I will talk some about something that I caught ChatGPT being wrong about this morning when we’re talking through the demos. But if you want to see me live and in person, and you are the type of person, you can see the dates and places there.

And you are the type of person who needs bribes to show up to things. I have Darling Data pins, which are a little hard to see because of the glare and the bright lights. And I also have these hot SQL action pins, which are very hard to see because of the glare and the bright lights.

But I can assure you they are absolutely magnificent. I also have sparkly Darling Data stickers. Ooh la la. Look at that. Look at the sparkles.

So if that didn’t hypnotize you into coming to see me, I don’t know what will. So I have pointy bribes and I have sticky bribes. And I hope at least one of them works on you.

Anyway, now with my brand new redesigned ready to party slide, let’s do this. Got sick of all the white space and decided to spruce things up a little bit. So I want to make sure everyone out there knows that it says let’s party.

These are not fencing videos, so it does not say let’s parry. And these are not baking videos, so it does not say let’s pastry. We are going to party.

We are not going to parry or pastry or any mix of the two. All right. Anyway, at some point in your long, glorious database career, you probably will have seen a query that looks something like this.

Where you go and look at sys.dm_exec_query_stats. And you look at sys.dm_exec_plan_attributes to get the database ID out. And you group the CPU by the database ID.

And then you do some fancy math to figure out what percent of the CPU on the server is occupied by query plans that do this stuff. Right?

Or that did stuff in a database. This can be very weird for a lot of reasons. And we’re going to talk about all of them because it needs to be talked about lest you think that this is a good idea. I guess if there’s nothing else for you to do, you’re that bored in your life.
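The query being described usually looks something like this sketch. The exact shape varies from blog to blog, but the moving parts are always sys.dm_exec_query_stats, the dbid plan attribute, and some percentage math:

```sql
/* "CPU per database" from the plan cache: the query being critiqued */
SELECT
    DB_NAME(CONVERT(integer, pa.value)) AS database_name,
    SUM(qs.total_worker_time) AS total_cpu_time,
    CONVERT(decimal(5, 2),
        SUM(qs.total_worker_time) * 100. /
        SUM(SUM(qs.total_worker_time)) OVER ()) AS cpu_percent
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) AS pa
WHERE pa.attribute = N'dbid'
GROUP BY CONVERT(integer, pa.value)
ORDER BY total_cpu_time DESC;
```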

I guess if you’re like the average NOLOCK enjoyer, this might be an okay query for you to run to get a crappy idea of which database uses the most CPU. This can be wrong for a number of reasons. If anything has disturbed the plan cache, either in total or in part, you know, restarts, memory pressure, changing settings, queries with recompile hints, someone running DBCC FREEPROCCACHE, updating statistics and having plans get invalidated, and all sorts of other things, they can really hide stuff.

So this really isn’t a great query. Actually, another thing that can make work look weird is if queries originate in a different database than the database they’re querying that can also cause some issues. So this is, you know, not a very good query practically for most people to use.

If you’re on a server and you really have no idea which database uses the most CPU, that’s a little bit silly. On my server, rather on my laptop, it’s a rather stable place. You know, I don’t, at least at the moment because I’ve been preparing for this, I don’t have a lot to do on here that would mess things up.

But we could add a few things to this query to make it a little bit more useful. I still probably wouldn’t use this as like the ultimate arbiter of truth about which database uses the most CPU, but you could at least spruce things up a little bit so that there’s some better information in there.

Stuff like getting the oldest plan and the newest plan and the total query execution count for each database can get you somewhat better information about what’s going on, like on the server as a whole. Or at least give you some contextual information to help you, you know, figure out if this is actually a good piece, a bit of information for you.
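Tacking that contextual information on might look something like this sketch, hedged the same way as the one above:

```sql
/* Same plan cache query, with context columns so you can judge
   how much of the workload the cache actually represents */
SELECT
    DB_NAME(CONVERT(integer, pa.value)) AS database_name,
    SUM(qs.total_worker_time) AS total_cpu_time,
    SUM(qs.execution_count) AS total_executions,
    MIN(qs.creation_time) AS oldest_plan,
    MAX(qs.creation_time) AS newest_plan,
    /* how much history the cache holds for this database */
    DATEDIFF(SECOND, MIN(qs.creation_time), MAX(qs.creation_time))
        AS plan_cache_age_seconds
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) AS pa
WHERE pa.attribute = N'dbid'
GROUP BY CONVERT(integer, pa.value)
ORDER BY total_cpu_time DESC;
```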

Excuse me. Don’t know where that came from. I may have swallowed a fly. So for this, we have the total CPU.

We have the total query execution count. I don’t know that this is necessarily the order I would present this in if I were trying to, you know, make this a pretty report, but it’s pretty okay, at least for this video.

And, you know, it might help a little bit to figure out, you know, it’s like, well, do I have a few really hard-hitting queries? Do I have a lot of queries that just kind of do stuff?

Maybe help you figure that out. But then this plan cache age column up here, that’ll tell you the difference in, well, days, hours, minutes, seconds, and milliseconds between the oldest plan and the newest plan in your cache.

If that number is very big, you might have a little bit more confidence than if that number is very small. But there’s one thing that can always mess this up, and that is recompile hints or stuff getting removed from the plan cache. So if I run this, this should remove one of the plans from the cache for the Stack Overflow Clean database, at least if I did my math right.

And now Stack Overflow Clean, which used to be second up here, is now way down here with almost nothing on it. So a plan getting removed from the plan cache can really pull that out. Where that’s also true is if you use recompile hints.

So if I run this query with a recompile hint, right, and this thing will take about four seconds, but it uses a pretty good chunk of CPU. That was the big chunk of CPU that was in the results before that’s not there now.

And then we look here. Oops, you know what? I forgot. Okay, actually, this is illustrative. So this is all the crap.

I actually meant to show you this just later. But this is all the crap that you get in the plan cache that this query will count as being responsible for the databases using CPU. So there’s just like a whole bunch of like background system stuff that can happen in there.

But focusing in on the query that we care about, which is the one that has that funny looking GUID in the text, if we run that query with a recompile hint, it will not be in the plan cache. And we will not know that anything ever happened with it, right?

So if we come up here and we look, and please don’t fail me, please don’t fail me now, and we look at this, Stack Overflow Clean, even though we just ran a query that took four seconds of wall clock time and about 23 seconds of CPU time, that doesn’t show up in here. So recompile hints alone can make this query pretty useless.

If we come down here and we run this without the recompile hint, we just run this thing. And that’s another four seconds. And if we look at the query plan, I’ll show you what I mean about the, oops, go away, tooltip.

We don’t need you. We need the properties. And we look at the query time stats. I said query time stats. There is about 22 seconds of CPU right there, which with a recompile hint is totally unaccounted for.

If we come and look at this query now, where we’re searching for that funny looking GUID thing in there, now it’ll show up and we’ll see the execution count and we’ll see the worker time. But without the recompile hint on it, that’s fine.

But with the recompile hint, that all disappears. So coming back up here and looking, we should see our friend up in here. Now Stack Overflow Clean has an additional 22 seconds of CPU time associated with it.

But again, if we remove that query from the plan cache, it’ll all disappear. So this is not a great query for actually figuring something out. Again, if you have absolutely nothing better and you just want, like, the worst, again, like the average NOLOCK enjoyer style of returned data.

If you really just don’t care that much and you just want like a bad idea of which database uses the most CPU without anything else contextually around it, you can use that query. Then you can probably get a bunch of wrong information back, just like the average NOLOCK enjoyer.

Now, I promised you a funny story about ChatGPT, so here it is. While I was working up to this, one of the things that, you know, because I love picking on optimized for ad hoc workloads, one of the things that I was looking into was if the compiled plan stub queries would show up in dm_exec_query_stats, which, you know, they did.

But since it was something that I had to like sort of answer for myself because I had never specifically looked at it, I decided to say, hey, ChatGPT, do queries, if I turn on, in SQL Server, if I turn on optimized for ad hoc workloads, do the queries that get a plan stub, will their CPU, like, will their resource usage show up in sys.dm_exec_query_stats?

And ChatGPT was like, no, won’t do it. They won’t be in there until the second execution when they get a full-blown plan, and then they’ll be in dm_exec_query_stats, and then you’ll be able to see them.

And I’m sitting here staring at SQL Server, completely contradicting that. And I was amused by it, and I was like, hey, ChatGPT, can you write me a query that would prove that? So it prints out this query that looks at, you know, sys.dm_exec_query_stats and cached plans, and, you know, sort of like up above, it gets the database ID out, and then in cached plans, it matches the plan handle there to sys.dm_exec_query_stats and the database ID for both of those.
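A query along those lines, matching plan stubs in sys.dm_exec_cached_plans to sys.dm_exec_query_stats by plan handle, might look like this sketch:

```sql
/* Do 'optimize for ad hoc workloads' plan stubs show up in
   query stats? If this returns rows, they do. */
SELECT
    cp.cacheobjtype,
    cp.usecounts,
    qs.execution_count,
    qs.total_worker_time,
    st.text
FROM sys.dm_exec_cached_plans AS cp
JOIN sys.dm_exec_query_stats AS qs
    ON qs.plan_handle = cp.plan_handle
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE cp.cacheobjtype = N'Compiled Plan Stub';
```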

And it was like, yeah, here you go. And so I ran it, and of course I had a bunch of hits between cache plans and query stats. And so I came back, it was like, hey, ChatGPT, your query’s returning rows.

I think they do show up in there. And ChatGPT was like, you’re right, I was totally wrong about that. Compiled plan stubs do show up in there.

Here’s a query that’ll prove it, and it reprinted the query that it had just shown me when I asked it to prove that they wouldn’t show up in there. So for God’s sake, stop caring about AI.

It’s always wrong. It’s always wrong about these things. You should think of any AI as being sort of like, pretend you have an administrative assistant, and you might ask an administrative assistant to write up a short summary of something for you, or take some notes, but then you would also validate it.

If you asked your administrative assistant to say, hey, come up with a report that shows me these things, you would probably want to make sure that you wouldn’t grab that report and then run to a board meeting with it. You would probably check the numbers first.

So if you’re out there in the world really sweating AI and thinking that it’s going to change your life and it’s going to take your job, it’s not quite there yet. You don’t have a lot to worry about.

It’s actually a rather sad state of affairs. So anyway, that’s my amusing slash depressing ChatGPT conversation from this morning. And of course, ChatGPT is what backs Copilot.

You know, there’s other ones out there like Claude and stuff. And Claude is a little bit better, but Claude got this one wrong too. So, you know, AI.

What can you do except laugh at people who think it’s good for anything? All right. Well, that’s about enough here.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. For the record, I will not be publishing any of the queries from this because I still find them equally useless. I should actually, before I wrap up, I should give you a couple ways that are better at doing this.

If you’re on Enterprise Edition of SQL Server, you can use Resource Governor, assuming that different databases have different logins associated with them. And that would be one way to sort of get a rough idea of which workload groups for which database are using the most CPU because it does track that. The other thing that you could do is if you are on a reasonable version of SQL Server, and by reasonable, I mean it’s at least 2016, you could also use Query Store.

And Query Store is a bit more historical than the plan cache. And you could go to each database and, you know, SP, whatever, in each, for each, around each, some preposition each database. And you could hit the Query Store DMVs to figure out which ones have the most kind of CPU usage in there.

That would give you a much closer idea to reality because, you know, Query Store usually generally has, like, I think by default, like 30 days of data in it, which is probably a much better indication of which databases are using a lot of CPU. So the plan cache query these days, real stupid, old and busted. You know, I think, you know, resource governor and Query Store are two much better options.
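A rough Query Store version of the same idea might look like this sketch; run it in each database you care about, since the Query Store views are per-database:

```sql
/* Top CPU consumers from Query Store's historical runtime stats.
   avg_cpu_time is in microseconds, per the runtime stats view. */
SELECT TOP (10)
    qsq.query_id,
    SUM(qsrs.avg_cpu_time * qsrs.count_executions) AS total_cpu_time,
    SUM(qsrs.count_executions) AS total_executions
FROM sys.query_store_runtime_stats AS qsrs
JOIN sys.query_store_plan AS qsp
    ON qsp.plan_id = qsrs.plan_id
JOIN sys.query_store_query AS qsq
    ON qsq.query_id = qsp.query_id
GROUP BY qsq.query_id
ORDER BY total_cpu_time DESC;
```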

So with that finally out of the way. Once again, I hope you enjoyed yourselves. I hope you learned something. I hope that you will join the 4,300 other data darlings who subscribe to this channel and get notified when I drop videos because that’s – it’s probably a pretty good group.

It’s probably better than the group of people who generate AI content for LinkedIn and all that. People who think that they can use that plan cache query to figure out which databases have the most CPU work on the server because ship of fools, isn’t it? Ship of fools.

Anyway, thank you for watching. I have other videos to record, as you can see by the numerous tabs up top. So I’m going to close this depressing one and we’re going to do something else. I don’t know what’s next.

I haven’t decided yet. You’ll find out when we get there, I guess. All right. Thank you for watching. My rates are reasonable. Thank you.


Join Me And @Kendra_Little At @PASSDataSummit For 2 Days Of SQL Server Performance Tuning Precons!

Last Year


Kendra and I both taught solo precons, and got to talking about how much easier it is to manage large crowds when you have a little helper with you, and decided to submit two precons this year that we’d co-present.

Amazingly, they both got accepted. Cheers and applause. So this year, we’ll be double-teaming Monday and Tuesday with a couple pretty cool precons.

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

Here are the details!

Day One: A Practical Guide to Performance Tuning Internals


Whether you’re aiming to be the next great query tuning wizard or you simply need to tackle tough business problems at work, you need to understand what makes a workload run fast– and especially what makes it run slowly.

Erik Darling and Kendra Little will show you the practical way forward, and will introduce you to the internal subsystems of SQL Server with a practical guide to their capabilities, weaknesses, and most importantly what you need to know to troubleshoot them as a developer or DBA.

They’ll teach you how to use your understanding of the database engine, the storage engine, and the query optimizer to analyze problems and identify what is a nothingburger best practice and what changes will pay off with measurable improvements.

With a blend of bad jokes, expertise, and proven strategies, Erik and Kendra will set you up with practical skills and a clear understanding of how to apply these lessons to see immediate improvements in your own environments.

Day Two: Query Quest: Conquer SQL Server Performance Monsters


Picture this: a day crammed with fun, fascinating demonstrations for SQL Server and Azure SQL.

This isn’t your typical training day; this session follows the mantra of “learning by doing,” with a good dose of the unexpected. Think of this as a SQL Server video game, where Erik Darling and Kendra Little guide you through levels of weird query monsters and performance tuning obstacles.

By the time we reach the final boss, you’ll have developed an appetite for exploring the unknown and leveled up your confidence to tackle even the most daunting of database dilemmas.

It’s SQL Server, but not as you know it—more fun, more fascinating, and more scalable than you thought possible.

Going Further


We’re both really excited to deliver these, and have BIG PLANS to have these sessions build on each other so folks who attend both days have a real sense of continuity.

Of course, you’re welcome to pick and choose, but who’d wanna miss out on either of these with accolades like this?

twitter: “pretty, pretty, pretty, pretty good”

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

See you there!


Query Plan Patterns For Windowing Functions: Aggregate Function Atrocities




Thanks for watching!

Video Summary

In this video, I delve into the fascinating and sometimes perplexing world of aggregate window functions in SQL Server query plans. After wrapping up a trilogy on these complex queries, I explore some of the more peculiar behaviors you might encounter, such as unexpected range clauses and disk-based window spools that can significantly slow down your queries. I also discuss how to leverage batch mode for better performance and introduce non-clustered columnstore indexes as a powerful tool for optimizing paging operations. By understanding these nuances, you can improve query efficiency and overall database performance.

Full Transcript

Erik Darling, one man party here, Darling Data. In this video, we’re going to round out our Peter Jackson-esque trilogy on window function query plans by talking about some weird stuff that you might see when you start looking at aggregate window functions. So stuff like count, sum, average, stuff like that. when you add windowing clauses to them. But before we do that, my spiel of appeals, if you like my channel enough to subscribe, you can do that for like four bucks a month. If you don’t have four bucks a month, or you just hate me, which I understand, you can engage with me in other ways, like liking and commenting and subscribing. All great ways to let me know you hate me. The best way to let me know you hate me is to give me money, though. So if you really don’t like me, sign up for lots of money. If you’d rather spend more money hating me, you can hire me to do actual factual in person SQL Server consulting. Not exactly in person, kinda on Zoom. But I mean, I would show up to offices if anyone had offices anymore. Which is sort of a funny thing to think about. But if you need health checks, performance analysis, hands on tuning, or you’re having some sort of SQL Server emergency, or your developers are causing SQL Server emergencies, and they need training, you can hire me for any of those things. If you just want some cheap training, you can get 24 hours of it for about 150 bucks with that discount code. It’s a great deal. 150 bucks over the rest of your life unless you are ancient or terminally ill, or you sail a little too close to the winds with your personal habits.

That 150 bucks will go a long way. If any of those other three things apply to you, though, what do you care about what you do with 150 bucks? You’re on your way out anyway. I’ll probably be around for at least 150 bucks more worth of fun. Might be nice to leave something behind. If you want to see me in person, I have live events coming up with more ChatGPT-created images. Friday, September the 6th, I will be at Data Saturday Dallas doing a full day pre con and then talking way too much at the regular event. And November 4th and 5th, I will be at PASS Data Summit in Seattle, Washington, where you can catch me and Kendra Little coming together to give two days of co-hosted SQL Server performance tuning pre con fun. So with that out of the way, let us begin the partying with Erik Darling, the one man party.

All right. We managed to hit the escape button almost flawlessly. I think we’re off to a good start. We can probably just call it here, right? I should have put that down first. So I’ve got a slightly different index here than I had previously. So this index is on user ID, creation date and score because we are doing this with our windowing function and we are summing this.

And sometimes it helps to have data in order for things like aggregates, right? Ordered data goes into an aggregate pretty quickly. Kind of a nice touch. But what you should know is that by default, when you use aggregate window functions, you get this range thing happening.

And the range thing is usually not what anyone wants when they’re using these windowing functions because it sets up weird, you know, sort of, what do you call them? Like segments of stuff. I’ve always found it counterintuitive what range does. At least, you know, by default, it’s always been kind of odd to me.

So it also, when you use range, you have this nasty side effect. And that is that you use a disk-based window spool to do your work. And those disk-based window spools can be quite slow.

If we look at the query plan up here for this first query, and this is a query that I use a parallel hint on because without a parallel hint, I get a single threaded plan that runs for about two minutes. Now, granted, this thing running at DOP 8 for 30 seconds is not like a completely linear scaled-up improvement, right? 33 seconds at DOP 8 versus two minutes at DOP 1.

Those seven DOP threads may be not pulling their weight so great in this one, right? But we have, in this query plan, we have this window spool right here, which is really a large part of our issue. As you can see, we jump from about 17 seconds here to about 30 seconds there.

So this disk-based window spool is not our friend. Another time you will see the disk-based window spool pop up is if you use rows, but you have an absolutely gigantic, enormous framing of those rows, you will probably never need this. I think this is where Microsoft was like, stop here.

We’re going to punish you if you go beyond this. So if you look at the query plan for this query, it is equally as ucky as the last one at 38 seconds. Well, actually, it’s uckier by about six seconds, I guess.

And this query has not one, but two window spools in it. We have one window spool all the way over here. Yeah, no good.

And we have one window spool all the way over here. No good. Now, what’s funny is if you write your query the way most normal human beings would expect to see the data that comes out of this, rows between unbounded preceding and current row. So not 9,999 preceding and current, but unbounded preceding and current.

Unbounded. We are rows everywhere. It’s wonderful. You still get a window spool, but it’s not a crappy one.

It’s not the disk-based window spool that adds a whole bunch of time to your query. There’s no real physical indicator about what kind of window spool you’re getting. The only indicator is crappy performance.

So if you get really crappy performance, you’re probably getting a disk-based window spool. If you get mildly acceptable performance, you are probably getting a not disk-based window spool. So windowing functions, man.

World’s a crazy place. Be careful out there. Now, batch mode, once again, what you should be doing with windowing function queries, utilizing batch mode as much as possible, does help with the range issues. If we run this query and we allow SQL Server to use a parallel plan in the most recent compat level currently available, we will get batch mode on rowstore, and this query will not suck.

This query will not suck because we go from using a crappy disk-based window spool to using a wonderful window aggregate, and the whole thing takes about 1.3 seconds. If the best you can do in row mode is like 30-something seconds for a query that you’ve got things pretty well set up for, you should be using batch mode. It’ll probably be better.

Unless you’re on standard edition and you haven’t paid Microsoft’s fee for liking you and you only get DOP 2 batch mode queries. Isn’t that nice of them? What batch mode will not fix, though, is this query.

I’m not going to rerun the whole thing and make you sit here for nearly 40 seconds. I’m just going to show you that this query plan, the estimated plan for this, is the same as the actual plan for the one that we did run. Batch mode did not change this.

We are forcing, we are saying SQL Server, pretty please here. We do not get any joy there. Another thing where batch mode is a real savior is if you need to do count over. Now, I’m specifically hinting the parallel plan here because I need to show you just how big of an effect count over can have on queries.

Where I see people use count over the most is when they’re doing paging queries. They put count over somewhere in the select list because they want to know the total number of rows that qualified so they can say, you’re on page whatever of whatever that has whatever many results.
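That pattern usually looks something like this sketch (the table and column names here are assumptions, not the video's demo):

```sql
SELECT
    u.Id,
    u.DisplayName,
    /* the windowed count runs over the whole qualifying set;
       in row mode it executes in a serial zone of the plan */
    COUNT_BIG(*) OVER () AS total_rows
FROM dbo.Users AS u
ORDER BY u.Reputation DESC
OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY;
```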

So they stick count over in there and then they wonder why performance stinks. So this took 30 seconds. And the reason why it took 30 seconds is because count over happens in a serial zone in the query plan.

Much like if you use a recursive CTE and you have a serial zone in your plan, this is almost exactly the same concept as that. All of these operators run in serial.

Run single threaded. They don’t run in serial like your breakfast cereal is on the table and they’re just like stomping through it. They all run single threaded. Big problem in here, 25 seconds in this lazy table spool.

Why is it so lazy? We don’t know. Impossible to say. But this thing takes about 30 seconds.

And then, of course, when we’re done doing all that hard work, we distribute streams and go parallel. And it has absolutely no effect on it. No effect on making things better.

So that sucks. And you shouldn’t use count over anywhere. Of course, you know, batch mode is very useful when we’re doing count over stuff. Because batch mode fixes nearly everything with crappy windowing function problems.

If you look at this, we have 319 milliseconds versus 33 seconds from before or whatever. Much better arrangement here, I think. So, once again, batch mode to the rescue.

If you are doing paging queries and you would like to improve their performance almost immediately, one of the best things you can do is add a non-clustered columnstore index to, like, the main table that you do your paging from.

I realize that many paging queries have joins, maybe even dynamic joins, to other tables that might not always get touched. But if you have, like, that one main users table or something like that, stick a non-clustered columnstore index on it.

Encourage SQL Server’s query optimizer to use batch mode or batch mode on rowstore. And you will get not only much better performance from the query, but for the windowing function stuff like count over, or if you’re using row number to do the paging, you will also get better performance almost immediately from that.

All right? Now, there are some, like, maintenance considerations and there are some, you know, design considerations around using columnstore indexes. I’m not going to get into all that here because that’s a pretty big subject.

But most of the time, you can just YOLO a non-clustered columnstore index on a table and not have to think twice about it. It’ll really help queries that do paging and stuff because batch mode is a much better execution engine for, you know, again, large amounts of data and especially anything that requires windowing functions because it really speeds those up by introducing the batch mode sort and window aggregate.
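The YOLO version of that looks roughly like this (the index and column names are made up for illustration):

```sql
/* a nonclustered columnstore index on the main paging table
   makes batch mode available to the optimizer */
CREATE NONCLUSTERED COLUMNSTORE INDEX nc_users_paging
ON dbo.Users (Id, DisplayName, Reputation, CreationDate);
```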

And you can really help those queries along by doing that. You also might find yourself in a particularly cheery scenario where you can get rid of a lot of the non-clustered rowstore indexes that you’ve created over the years to support different variations on your dynamic paging queries because you just don’t need to account for all that anymore.

The columnstore index can just take the place of all of those where SQL Server can read from the individual columns rather than having to use all sorts of different index configurations to help different queries be fast at different times. So, we have now completed our three-part series on windowing function query plans.

If you notice the names of some of the other tabs up here, you can see that we have many other query plan patterns to talk about. These will not take three parts, though.

These will all be one-parters. Some of these videos are not even going to be terribly long because it is just such a fun, obvious thing what happens when different things in query plans go on. So, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. I hope that when you watch, by the time you’re watching this video, I hope that you are fabulously wealthy.

I hope that, maybe you already are fabulously wealthy. I don’t know. If you are, you should definitely sign up for a membership, though, because the fabulously wealthy have no excuse. The day someone fabulously wealthy watches my YouTube videos, that’ll be something.

I don’t know. Perhaps you’re rich in love. I don’t know. Anyway, it’s time for me to go do some actual work now, so I’m going to go do that, and I will start recording some of these other videos when I’ve finished my works.

My works are vast. All right. Cool. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Query Plan Patterns For Windowing Functions: Indexes And Memory




Thanks for watching!

Video Summary

In this video, I dive into the world of windowing function query plans from a unique perspective: indexing and memory management. I explore how indexing can impact these queries, particularly in batch mode, and highlight some of the downsides when using row mode for windowing functions. I share practical examples to illustrate why parallel execution plans often disappear once you start indexing for windowing functions, and discuss the implications of this change on query performance. Additionally, I delve into memory grants required by sorts within these queries, emphasizing how they can significantly impact overall plan efficiency. By sharing insights from my experience as “the most SQL Server Consultant,” I aim to provide valuable tips and tricks for optimizing your own windowing function queries.

Full Transcript

Wouldn’t you like to get away? Bum bum bum. Erik Darling here with Darling Data. Eh? Eh? Look at that. Look at that logo.

Recently voted by BeerGut Magazine to be the most SQL Server Consultant. I’m not sure if that was an incomplete thought, but I’ll take it. I think being the most anything would be probably pretty good.

Maybe the most dead would kind of suck, but I’ll take being the most SQL Server Consultant. We’re going to continue talking about windowing function query plans, this time from the perspective of indexing and memorying. Because both of those things are important to windowing function query plans.

Without them you would be in quite a bit of trouble. You would not have a good time. So we’re going to do that today.

Important stuff. Alright. Cool. If you like me to the tune of like four or five bucks a month, you can sign up for a membership here. It’s kind of a nice way to say thank you for the countless hours of entertainment and knowledge about SQL Server that I pass along.

If you are somehow short five bucks at the end of every month, I don’t know, maybe you have a gambling problem. Maybe you have a shopping problem. Maybe you just booked too many vacations.

I don’t know. If you just don’t have the five bucks, you can interact with my channel in other ways. You can like. You can subscribe.

You can comment if you’re feeling real crazy. If you’re feeling real engagey, you can do that. And I will engage back with you. Probably.

Sometimes it will just be a thumbs up because, you know, there’s not a lot to respond to. And other times I will personally write you a message. It’s not, I have no interns working for me writing these messages.

So, you have that to look forward to always. If you’ve got SQL Server problems, I’ve got SQL Server answers, questions, answers, solutions. Yeah, I guess I get a little bit of all that.

Health checks, performance tuning, performance analysis. If you’ve got something immediately wrong with your SQL Server or if your developers just need training because they keep putting worse and worse stuff into SQL Server. I may not look like it, but I’m just the guy to help.

And with recent accolades, like being the most SQL Server consultant, it’s hard to say no to that sort of level of confidence that you can have. So, these things, my rates are reasonable. If you need cheap SQL Server training, you can get 24 hours of it for the rest of your life for 75% off.

Think about all of the dumb things you have spent 150 bucks on that you’ve never used. Think about all the hobbies you’ve picked up. All the things that you pretended to get into.

The things you bought that are broken immediately and there was no return policy. They were like, nah, it’s yours. You broke it. You already bought it. You keep it.

Think about all those things. Then think about this. You can actually get better at something in your life by buying this training and using it. Once again, you do have to participate in the training for it to be effective.

It’s like medicine. If you don’t take it, it doesn’t work. So, yeah. If you are so enamored with my physical being, the specimen that I am, you can catch me live and in person.

This Friday. Not this Friday. But the Friday that will be September 6th in Dallas as part of Data Saturday Dallas. I am putting on a full day pre-con.

All about SQL Server performance stuff. Naturally. Because what else would you pay to hear me talk about? So, Monday and Tuesday, November 4th and 5th, I will be at PASS Data Summit in Seattle, Washington, with the lovely and talented Kendra Little.

We will be co-hosting two days of kick-ass pre-cons. All about SQL Server performance stuff. Because again, what else would you pay to hear us talk about? It’s a short list.

I don’t really know what else I have going on for me that you would be interested in. If you let me know in the comments, I will do my best to oblige. That’s as much as I can promise.

And I have officially changed on with the show to Let’s Party. I think it’s a far better lead-in, especially to the wild and crazy material we talk about in SQL Server. Okay.

Anyway. In the last video, we talked about sort of the physical appearance differences between row mode and batch mode plans. In this one, I want to talk a little bit about, well, indexing for batch mode, for windowing function queries. And some of the downsides of indexing for batch mode, for windowing function queries.

I don’t know why. I’m so excited to talk about batch mode that it just keeps jumping out of my mouth. But really, it’s just windowing functions.

So I believe this index should already be created. Good, good, good. We’re set there. Now we need to run these two queries. Now, when you start indexing for windowing functions, something kind of disappointing is very likely to happen.

Your beautiful, magnificent parallel execution plans processing those millions and millions of rows cooperatively go away. So before, we didn’t have an index to support the windowing function. I’m going to freeze frame this about right here, because we created this index on the comments table on user ID and creation date, which matches exactly the partition by order by that we’re using in the windowing function.
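Written out, that index would look roughly like this (the index name is my own placeholder, not the demo's):

```sql
/* key columns match the window function's
   PARTITION BY UserId ORDER BY CreationDate */
CREATE INDEX ix_comments_userid_creationdate
ON dbo.Comments (UserId, CreationDate);
```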

If we were selecting more columns from the comments table, it is very likely that we would need to have includes for this index with those columns in them. Otherwise, SQL Server wouldn’t choose them. They would think that the effort to do lookups for all that stuff would be far too much.

And it would probably be right. And we want our indexes to be used. Otherwise, they are useless. So now that we have that index in place running these two queries, we are stuck now with these sort of annoying serial queries.

All right. We, come on, do the grabby thing. There we are.

If we look at these two things, they’re not terribly fast. So I know I’ve said this in other videos, but we’re sitting here looking at it. So an important difference between row mode up here and batch mode down here, query plans, are the way operator times are accounted for. In row mode plans, parent operators include the time spent in the child operators.

So they sort of just add up going from right to left, starting with the index scan and going to the segment and the sequence project in the filter. Notice we no longer have to sort data for this one. We also don’t have to sort data for this one.

Sorry, my hand is getting lost in the void. Which is, you know, I mean, I guess why we created the index. So we don’t have to sort data. We don’t have to worry about memory or spilling or any of that stuff.

These slow down a bit though. In the batch mode plan, each operator accounts for its own time. So the index scan that takes 3.5 seconds, the window aggregate that takes about 100 milliseconds, and the filter that takes about 40 milliseconds.

That’s the total time for the plan. A quicker way of seeing that is to go into properties and to go to query time stats. And then that will tell you just about in total the CPU and elapsed time.

And since this is a serial plan, those numbers are equal for this one. Alright. Cool. Except it’s not cool.

And what’s doubly not cool is what I’m about to show you. We rerun these two queries. Remember the first one took about 5.5 seconds.

The second one took about 3.5 seconds. We’re going to rerun these. And of course we’re going to look at the query plans, because that’s the whole point here, isn’t it? Looking at query plans.

This one ends up being a full second slower. Weird, right? Parallelism is supposed to make stuff faster.

Why did it go a full second slower? Because once again, windowing function stuff in row mode sucks. It’s awful.

Don’t do it. Use batch mode as much as humanly possible. I beg of you. It’s a bad time in row mode.

Something kind of interesting happens with the batch mode query though. No, this wasn’t there before. For some reason, the parallel version of this plan needs to sort stuff.

Perhaps it’s a side effect of parallelism in batch mode. I’m actually not entirely sure at the moment. I’m going to have to dig into that a little bit more and get back to you.

I’m happy saying I don’t know right now. It’s just something that I’ve noticed and I haven’t had time to dig in further into. So, it’s weird, right? Not knowing everything the second you look at it.

It’s strange. Strange stuff, right? But we still have to sort user ID and creation date for that one for some reason. Perhaps it’s a function of reading from the table in batch mode that disorders things.

I’m not sure. Perhaps this is one of those crazy things like when people create columnstore indexes with a MAXDOP 1 hint to get better compression and ordering and segment stuff.
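(For reference, that trick is usually written like the sketch below; this isn't the video's code, and the index name is a placeholder.)

```sql
/* building a clustered columnstore index single-threaded
   can preserve existing sort order within rowgroups */
CREATE CLUSTERED COLUMNSTORE INDEX cci_comments
ON dbo.Comments
WITH (MAXDOP = 1);
```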

Maybe, perhaps it’s related to this sort of thing. I need to find out. I need to educate myself here. I need to learn that so I can teach you.

And then someday we’ll all be smart together. Now, what I think is interesting, specifically focusing on the batch mode version, is that the reason why…

Let’s quote that out. Let’s run this. Doop-a-doop-a-doop-a-doop-a-doop-a-doop. And it ran.

And it’s still 3.5 seconds with a cost of… Oh, why’d you leave me? I was hovering. 77.3616 query bucks. If we…

Which beats my cost threshold for parallelism. My cost threshold for parallelism for this is 50. So, 77, definitely higher than 50. If we rerun this with the parallel hint back in there… Again, unsupported, undocumented, not safe for use in production, blah, blah, blah, blah, blah.
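(As far as I know, the hint in question is the undocumented one below; the query body here is just a placeholder, and since Microsoft doesn't document the hint, treat its behavior as an assumption.)

```sql
SELECT c.UserId, COUNT_BIG(*) AS records
FROM dbo.Comments AS c
GROUP BY c.UserId
/* undocumented, unsupported: nudges the optimizer toward
   a parallel plan regardless of cost threshold */
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```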

I use it in production, because it works and it makes things faster. And I know what I’m doing. I’m qualified.

I am the most SQL Server consultant. Don’t you forget it. This thing has a cost of 124, almost 125 query bucks. So, the addition of the sort in the parallel version of the plan is why SQL Server doesn’t choose the parallel plan for this.

Even though the parallel plan for this does a whole heck of a lot better time-wise, the serial version is 3.5 seconds. This is 1.1 seconds.

I will take any improvement I can. Now, again, batch mode for windowing functions is what you should be aiming for. You should not be relying on row mode windowing functions, except for maybe very small amounts of data.

So, there’s that. Realistically, though, if we’re doing this sort of stuff, we want to have a columnstore index available in order for… Did that throw an error? No, it’s running.

I was like, wait a minute. I thought I broke something. Realistically, if we’re doing this sort of work, we want a columnstore index to help our windowing functions query… windowing function queries. The reason for that, columnstore, great compression.

Minimize I.O. Native batch mode stuff. We don’t have to worry about batch mode for rowstore thinking we’re special. We get the full beauty of batch mode and columnstore together.

They’re wonderful interactive things. They interoperate quite well together. The thing is that, you know, like when we looked at the parallel batch mode, windowing function query up above, we had to sort data.

You know, columnstore indexes don’t really, you know, have data like sorted sorted the way that rowstore indexes do. So, things are a little bit different here. If we run both of these, and chug, chug, chug, chug, chug, there we go.

So, what I have here is one query running in compat level 140, where batch mode on rowstore is not available. And one query running in compat level 160, where batch mode on rowstore is available. What’s nice here is that these both get fully batch mode plans out the gate.

Of course, they both have to sort data, which isn’t awesome. But the presence of a columnstore index gives us enough batch mode goodness that we don’t have to worry about all the performance issues that come from rowstore queries. Everything in these two queries is close enough to being the same time for me that I’m not going to sit there and worry about it.
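A sketch of how you might set up that comparison per query, using the documented compat-level hints on a placeholder windowing query (whether each optimizer feature kicks in is still version-dependent, so treat this as an approximation of the demo):

```sql
/* compat level 140: no batch mode on rowstore */
SELECT
    c.UserId,
    SUM(c.Score) OVER (PARTITION BY c.UserId ORDER BY c.CreationDate
                       ROWS UNBOUNDED PRECEDING) AS running_score
FROM dbo.Comments AS c
OPTION (USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));

/* compat level 160: batch mode on rowstore is available */
SELECT
    c.UserId,
    SUM(c.Score) OVER (PARTITION BY c.UserId ORDER BY c.CreationDate
                       ROWS UNBOUNDED PRECEDING) AS running_score
FROM dbo.Comments AS c
OPTION (USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_160'));
```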

If you can, if you’re able, adding a non-clustered columnstore index to a table, in real life, you know, you’d probably want to have way more columns in your columnstore index. Because you never, you know, if I know anything about your queries, they are not reliable as to what people want to select and filter on and row number by and sort by and all that other stuff. So, you really want to have more columns in there.

So, SQL Server’s choice of using that columnstore index is far more apparent. It’s a far easier decision to make from a costing perspective. You don’t have to worry about awful lookups involving columnstore indexes.

So, it’s a good thing to avoid. So, as much as possible, you want to pair columnstore indexes with your windowing function queries. Because that’s where you get the biggest gains, perf-wise.

If you take a look at these two things, look at the time that it took to read from the columnstore table, versus the time that it took to read from the rowstore table, even in batch mode. Right? Far, far, way less time. Almost nothing on that. So, these two queries end up pretty good, at least as far as I can see.

I do apologize for the more frequent coughs today. It is incredibly dry in here for some weird reason. Another thing that you need to be very careful about, and this is, we talked about the indexing, now we’re going to talk about the memorying.

Another thing you want to be very careful about with windowing function queries is which columns you’re selecting. Because, you know, a big topic of stuff that we’ve talked about in this video and in the video before it is sorts. Sorts require memory.

Sorts require an amount of memory based on the number of columns going into this sort and the data types of those columns. So, the width of the rows and the length of the results. So, if we run this query, and we don’t need the legacy cardinality estimator hint for this one.

We will use it for the other one. By the way, if you are query tuning and you’re using compat level, like, what is it, 120 or higher? One of the first things you should always try, like, aside from recompile, is forcing the legacy cardinality estimator.
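As a sketch, those two first-try hints look like this on a placeholder query (both hint names are documented):

```sql
SELECT c.UserId, COUNT_BIG(*) AS records
FROM dbo.Comments AS c
GROUP BY c.UserId
OPTION
(
    RECOMPILE, /* rules out parameter-sniffing noise while tuning */
    USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION')
);
```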

Because, it works, it truly works wonders on a lot of queries. So, if we look at this thing, it takes, it takes almost 16 seconds, right? That’s about as close to 16 seconds as you can get.

If you blinked, this would have been 16 seconds. This thing asks for a whopping 16 gigs of memory. Now, remember, memory grants for SQL Server can be up to 25% of your server’s max server memory setting.

Unless you use resource governor to nerf that a little bit. So, these queries with these sorts can ask for lots and lots of memory. And, of course, the reason why this sort asks for a lot of memory is if we come over here.

Come on, give me the tool tip. There we are. SQL Server thinks the estimated data size is 17 gigabytes. So, it asks for 16 gigs of memory.

It is probably not correct. It is probably incorrect. The reason for the inflated memory grant is because one of the columns, so, we’re only ordering by user ID and creation date. But, one of the columns that we have to sort is the text column in the comments table.

String columns will make memory grants huge. One way to get around this sort of stuff is to… Actually, we’re going to throw this on the pile for this one.

I like to show off this hint for this query. We’re going to structure this query a little bit differently. In our common table expression, we’re going to do all our row number stuff, only selecting the ID column. And then down here, we’re going to join the comments table to itself.

And we’re going to select everything from the joined copy, C2. Notice this is dbo.comments is C2. And this is just comments.

The common table expression is C. And if we run this, I mean, not only is this going to be faster, I think twice as fast if I’m remembering correctly, but it’s also going to use a whole lot less memory. Because we don’t have to pass the text column through the sort.
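The shape of that rewrite is roughly this (the ROW_NUMBER definition and the final filter are my assumptions about the demo, not a copy of it):

```sql
WITH c AS
(
    /* the CTE sorts only the narrow Id column for ROW_NUMBER */
    SELECT
        c.Id,
        ROW_NUMBER() OVER
            (PARTITION BY c.UserId
             ORDER BY c.CreationDate) AS n
    FROM dbo.Comments AS c
)
SELECT c2.* /* wide columns, including Text, come from the join */
FROM c
JOIN dbo.Comments AS c2
    ON c2.Id = c.Id
WHERE c.n = 1;
```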

Now, God bless Microsoft for putting operator times in query plans. Really, thank you. I appreciate it. But can you fix it?

Can you please fix it? Like, can I beg you to fix this code? Because if you look at what happens in here, and again, this is all row mode. There’s no batch mode operators in here.

None of these operators are only timing themselves. We start out at 478 milliseconds. We’re going to ignore the repartition streams because it makes things too weird. Then we go to 8.2 seconds, and then 8.7 seconds, and then 8.9 seconds, and then 9 seconds.

And then repartition streams makes things weird. 7.8 seconds. I don’t know where we lost 1.2 seconds in the shuffle.

But we did. And then we remain at 7 seconds here, and 7.8 seconds here, and 7.8 seconds here. Alright. Okie dokie.

And then somehow this ends up at 16.9 seconds. Again, parallelism stuff making things weird. If we go to the properties of this thing, and we go to query time, the elapsed time is only about 9 seconds.

The CPU time is 29 seconds. But the elapsed time is 9 seconds. Alright. So the elapsed time really closely matches what we had here, which the rest of the query plan completely screws up.

Like it drops by a second where it shouldn’t, and then it goes up by like 8 seconds. Or 7 seconds. Something like that.

Where it shouldn’t. Microsoft, please fix it. We want to be able to tune these queries and know what things mean. This doesn’t help us one bit. But this query only asks for 2200 megs of memory.

Not 16 gigs of memory. Again, because with the sort operator here, we don’t have to manage as much. Right?

We don’t have, we’re still ordering by the same two columns, user ID and creation date. But we only have to deal with selecting the ID column. So that asks for far less memory. And the size of this data is far smaller.

Alright? So 538 megs is the estimated data size. There is some additional memory requirement, of course, because we have a hash join back to this thing. If we really wanted to experiment, and this truly is an experiment, because I didn’t try this one before.

If we put a loop join hint on this thing… I don’t know, maybe we’ll ask for less memory. Because the hash join was asking for something. I don’t know.

But, but, but, but, but… Oh, the time is about the same. Oh my goodness! Look! It’s right this time!

The parallelism gathers streams did not screw us up! Great! We can actually see how long that took. Oh, but our sort spilled! Boo! What happened? How much memory did we ask for?

A little bit less. So we went from 2200 to 1800 megs. Um, oh, I don’t know. The loop join asked for a little bit less memory. Kind of a mixed bag there. Maybe it needed that other memory so it wouldn’t spill.

Either way, I’m kinda happy to have shown you that. It was a fun little diversion for me. So anyway. You can see the little hint on the bottom there. That we’re gonna talk about aggregate windowing functions next.

So, uh, we’re gonna stop this video here. Cause we’re over 20 minutes! Good lord! What do we do with ourselves in these 20 minutes? What do we even talk about?

I don’t know. Stop knowing. Anyway. In this video, we looked at indexing for windowing functions. How indexing for windowing functions can lead to serial plan disappointment. Uh, and continued, continuing the effort to get you to only use batch mode for windowing functions.

And of course, uh, be careful which columns you’re selecting because the columns you select directly affect the amount of memory that a sort can ask for, or that a sort will ask for, uh, which is based on max server memory.

So, sometimes splitting queries up into what you need to order and what you need to show people is a good way to tune memory grants. We also learned, or continued to push, that the Legacy Cardinality Estimator is usually the best one.

It’s my favorite anyway. I don’t know. Maybe I’m sentimental. Nostalgic old man, but that Legacy Cardinality Estimator gets a lot of things right. Anyway, uh, thank you for watching.

Hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video where we will talk about aggregate windowing functions. Big excitement, big party, big party energy. We’re going to do that as soon as this one is uploaded.

And I have had some water because, uh, I’m, I’m a little raspy. Anyway, thank you for watching. Thank you for watching. Thank you.


Query Plan Patterns For Windowing Functions: Row Mode vs Batch Mode




Thanks for watching!

Video Summary

In this video, I delve into the intricacies of query execution plans when window functions are involved, breaking it down into three parts to ensure that the content is digestible and comprehensive. Starting off by discussing the differences between row mode and batch mode for these queries, I highlight how critical it is to use batch mode for optimal performance, especially with large datasets. I also explore the impact of having or not having a `PARTITION BY` clause in your windowing functions, noting that the absence of this clause can significantly alter the query plan’s parallelism and execution efficiency. Throughout the video, I share practical insights on how to recognize these patterns and optimize queries for better performance, emphasizing the importance of proper indexing and predicate usage.

Full Transcript

Erik Darling here with Darling Data. Feeling real happy at the moment. Extra happy. Probably the happiest I’ve been in, I don’t know, 37, 38 years? No particular reason. Feeling kind of peppy at the moment. And so this is going to be split up into three parts. Not six parts, not one part. Three. Only three, not double three. Because the material’s a little too dense for a single video. I don’t think anyone would stick around for the whole thing. So we’re breaking this one up into three parts. This is going to be talking about query execution plans when window functions are involved. Because it struck me when I was recording something a while back. And I was talking about how I’ve, over time, learned how to predict what will be in a query by looking at the query plan. And I think someone commented that that would be a cool thing to talk about. So I’m going to talk a little bit about query plan patterns for different queries that I can sort of recognize. There’s going to be a bunch of those videos, but there’s three on windowing functions. There’s just about enough to talk about with windowing functions that I think it is worthwhile. There are essentially three different kinds of windowing functions. There’s the ranking ones, like ROW_NUMBER, RANK, DENSE_RANK. There’s aggregates, like SUM and COUNT and AVG. And then there’s the analytical ones, like CUME_DIST and whatever the discrete percentile one is. I can’t keep track of all those things. I don’t think I’ve ever really used them. I guess NTILE would be in there. NTILE is the only weird one that I’ve used pretty regularly. I dig NTILE. I think NTILE’s a cool cat. If you don’t use NTILE, I don’t know, you should. Team NTILE over here with Darling Data. Anyway, what was I going to say? Oh yeah, some crap that apparently is useful, because since I started saying this stuff at the beginning of the videos, things have been picking up.
So good job, everyone, for paying attention early on. If you like this content and you feel like it is worth like four bucks a month, you can sign up for a membership to my channel. That’d be cool. If you don’t have the extra four bucks a month, if that’s going into your Happy Meal fund, Lord knows inflation has really messed with the prices of Happy Meals. Maybe that’s why I’m happy. I started thinking about Happy Meals and that brought me back to my youth. Never can tell. Other ways to show your love, support, and enduring allegiance to Darling Data. Humble leader. Likes, comments, subscribes, all nice things that you can do.

If you are in a real pickle and you need SQL Server consulting, I am available to do just about anything that doesn’t involve replication. These are the things that I excel at. I don’t use Excel much, but I do excel at these things. If you need anything outside of this sort of stuff, well, my rates are reasonable. Anyway, training. It’s another thing. Good to have. Good to buy and actually use. Good to purchase training and then actually go through the training.

Not just say, I’ve got the training. Because it does not work through osmosis. Like Edgar Cayce, when he would say he could put a book on his stomach and absorb the knowledge. I’m not quite sure that SQL Server training works in the same way. You do have to participate in order for it to be effective. All right. Good thing to do there. If you are so smitten with me that you want to come see me do things live and in person, where you will see me in something probably other than an Adidas shirt, you can catch me in these places on these dates.

Friday, September the 6th, I will be in Dallas, Texas, doing a full day pre-con for Data Saturday Dallas. And then also on the 7th at the regular event. And then November the 4th and the 5th, I will be at PASS Summit for two days of pre-cons with Kendra Little as my charming co-host.

And that’s not co-host in like Johnny Carson, Ed McMahon sort of way. We are on equal footing. This is not like, you know, there’s like the show leader and then like the charming sidekick.

We are equal partners in these pre-cons. So I look forward to seeing you there with maybe some more ChatGPT created images. I don’t know. That seems like fun.

Anyway, let’s talk about these windowing function query plans. So what I want to go through in this one is sort of two things. One, the difference between row mode and batch mode for window function query plans.

And the other thing is difference in query plans when you have and when you do not have a partition by clause. Because that does change things a bit. Also, the type of predicate that you have on your windowing function, whether it’s equality, less than, greater than, can also change the query plan a little.

So, with all that in mind, gosh, let’s get started. All right, let’s move it on this thing. So I’ve got these two queries already run.

And what I want to show you here is without a good supportive index, both of these queries have to do a bit of work to generate the windowing function stuff. The thing is, the row mode query just plain stinks at it. When you do a windowing function over a lot of data, you just absolutely should not be doing anything in row mode with it.

So, this query takes 9.3 seconds. This query takes 1.4 seconds. Big difference there. And it’s all because of batch mode.

Now, one thing that you should know about window function query plans is that in row mode, you will always see these two operators, the Segment and the Sequence Project. Segment is responsible for the partitioning and the ordering.

And then the sequence project is responsible for the numbering. So you put the data in the order that it needs to be in. If you don’t have an index that already has the data sorted the way it needs to be for the windowing function, then you have that sort there, right where my finger ends, right there.

That puts the data in order. Then the segment sort of does the grouping and then the sequence project sort of does the numbering. At least that’s how it was explained to me.

If someone lied to me, I will find them. Remember, beaucoup Delta miles over here. Diamond Medallion Darling Data. So we have a sort that puts the data in the order that we need it in.

We have this thing that sort of does the grouping, you know, put stuff, you know, maintains the order. And then the sequence project that does the numbering. So, all well and good.

This is what you will generally see in windowing function query plans in row mode. In batch mode, it will look like this. We still have a sort, but notice that sort doesn’t quite, you know, have the problems that the other sort did. And then we will see a window aggregate.

I’ve said this in other videos, but one of the very cool things about window aggregates in SQL Server query plans is they are the only operator currently that is able to read from a batch mode sort on multiple threads. All the rows come out of the clustered index on the comments table, go into the sort on multiple threads.

The sort occurs on multiple threads, but any other operator would have to read from that sort single threaded. So, kind of a cool thing there about window aggregate operators. So, this is kind of the general physical appearance difference of query plans between row mode up top and batch mode at the bottom.

Batch mode gets the window aggregate. And then row mode has the segment and the sequence project. Because we don’t have an index in place, both of these have to sort.
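For reference, a minimal sketch of the kind of query being compared here. This is my reconstruction rather than the exact demo code, and it assumes the Stack Overflow `Comments` table mentioned above:

```sql
/* Without an index on (UserId, Score DESC), this needs a Sort.
   In row mode, the Sort feeds a Segment (grouping) and a
   Sequence Project (numbering); in batch mode, it feeds a
   Window Aggregate instead. */
SELECT
    c.UserId,
    c.Score,
    ROW_NUMBER() OVER
    (
        PARTITION BY c.UserId
        ORDER BY c.Score DESC
    ) AS n
FROM dbo.Comments AS c;
```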

So, let’s move on a little bit and let’s look at what queries look like when we don’t have a partition by clause. Notice that we are just ordering in this one and this one and this one down here. I’m going to run all three of these at once so we can compare and contrast these wacky query plans.

These wacky dacky doos. One of those things was a little slower than the others, wasn’t it? Now, the big difference between partition by windowing functions and ones that don’t have the partition by clause, that only have the order by clause, is that SQL Server ends the parallel zone a lot earlier on in the plan.

The query plans that we looked at before were parallel the entire way through to the very end where there was a gather streams. In this one, we gather streams really early on. Both of these things gather streams right after the, well, the row mode.

Both of these things, here I go. Being a knucklehead. The row mode plan gathers streams immediately after the sort and does the segment and the sequence project in a serial zone.

The batch mode plan keeps the parallelness through right after the window aggregate. And, you know, the batch mode thing is a little bit faster here anyway, but whatever. This one’s a little bit different because, you know, we really just have to, we don’t have to do as much work as we do with the partition by, especially in row mode.

Curiously enough, omitting the partition by in row mode actually gets it pretty close to on par with batch mode. But, you know, most window function queries that you’re going to use, they’re probably going to have to partition by something. So definitely use batch mode on that.
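Same idea as before, but with only an `ORDER BY` in the `OVER` clause. Again, this is my sketch of the query shape, not the exact demo code:

```sql
/* With no PARTITION BY, the parallel zone ends right after the
   Sort in row mode; the Segment and Sequence Project then run
   in a serial zone before the plan finishes. */
SELECT
    c.UserId,
    c.Score,
    ROW_NUMBER() OVER (ORDER BY c.CreationDate) AS n
FROM dbo.Comments AS c;
```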

Now, one thing that I wanted to show you that sort of emphasizes the serial zone in the plan is when you use a greater than filter on a windowing function like we’re doing here. This one is uniquely designed to continue to return zero rows. We have a, what we saw in the other plans where we have, sorry, my head’s in the way.

I’m going to like Mario, boop, the clustered index scan. Then we sort, then we gather streams, then we segment, then we sequence project. And then after that, we reopen a parallel zone.

We have a Distribute Streams operator here that puts things back out on parallel threads, and then filters them. You’ll notice that this thing really slows down. This is not a good time.

This greater than predicate, whatever happens in here, it is no good, right? We are not having fun. We do a lot more work in this query. So, you know, be aware of, you know, how you do that.

Of course, if you use batch mode like any sane normal person would, you don’t have those same problems. Again, row mode query plans for windowing functions are nothing but headaches. I, you know, I will use any trick in the book to get batch mode for windowing function queries because in row mode, they just eat it so often that it’s mind-boggling.
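The video doesn’t show which trick is in play, but one common way to coax batch mode on SQL Server 2016 and 2017, before batch mode on rowstore existed, is an empty filtered nonclustered columnstore index. Treat this as an assumption about the technique, not the video’s actual demo code:

```sql
/* The contradictory filter means the index can never contain
   rows, so it costs almost nothing to maintain, but its mere
   presence makes the optimizer consider batch mode operators. */
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode
ON dbo.Comments (Id)
WHERE Id = -1 AND Id = -2;
```

On SQL Server 2019 and up, compatibility level 150 can get you batch mode on rowstore without this index, which is one less thing to remember.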

I don’t know how we ever got away with windowing function queries without batch mode. It is truly a dismal experience a lot of the time. So, again, this very fast, even with batch mode in place.

And we don’t have the same sort of weirdness where the parallel zone ends, the segment and sequence project happen, and then the parallel zone reopens. We just have a parallel plan the whole way through, which turns out a lot better for us from a performance perspective. So, in this video, we have gone over some just kind of physical appearance differences between row mode and batch mode window function plans.

Again, in the row mode plans, you will see the segment and the sequence project, with an optional sort if you don’t have an index that supports the partition by and order by (or just order by) clause. We looked at queries that only have the order by and saw that, while they did have all the same operators, the parallel zone in the order-by-only windowing functions ended a lot earlier. And we also saw that SQL Server has to do some extra work when you have a greater than predicate in your windowing function filter.

Think of it like this. If you just have a where row number equals one, or equals zero or something, it’s real easy for SQL Server to figure out where that row number would be, right? It’s going to be at the start of every partitioned chunk, or at the very start of the results for a query without a partition by.

Sorry. For a query that only has an order by, there’s only one row that can match the equality predicate. When you do greater than, SQL Server has to spend a lot more time putting rows through and generating those row numbers to compare them, because it basically has to go to the very end of the results to do it.
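A sketch of the two filter shapes being contrasted, assuming the same `Comments` table. The greater-than version here mirrors the “designed to return zero rows” demo:

```sql
/* With n = 1, SQL Server can stop numbering early in each
   partition. With n > some huge value, it has to number rows
   all the way through every partition before filtering. */
WITH numbered AS
(
    SELECT
        c.UserId,
        c.Score,
        n = ROW_NUMBER() OVER
            (
                PARTITION BY c.UserId
                ORDER BY c.Score DESC
            )
    FROM dbo.Comments AS c
)
SELECT *
FROM numbered
WHERE n > 2147483647; /* designed to return zero rows */
```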

So that’s why there’s just a lot more work involved. Anyway. Hope you enjoyed yourselves as usual.

I hope you learned something as usual also. Thank you for watching. And I will see you in the next video where we are going to talk about, if I sneak down a little bit here, we’re going to talk about indexes and index usage and memory usage in windowing function query plans, which is just going to be a thrill ride.

Edge of your seat. Hold on. Better hope you have your wood screws in, type stuff.

So. All right. I’m going to go. I’m going to go prep for that. We’ll upload this. It’ll be a good time. All right.

Cool. Great. Good job, everyone. We barely stumbled over words we’ve been saying for all of our lives. Phenomenal on that.

All right. Great. Thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Troubleshooting Security Cache Growth In SQL Server (USERSTORE_TOKENPERM And TokenAndPermUserStore)




Thanks for watching!

Video Summary

In this video, I delve into a peculiar issue that has plagued several client servers—security cache growth leading to various transient problems such as CPU spikes, plan cache issues, and memory-related anomalies. I explain how the security cache, meant to simplify login and permission handling, can balloon out of control if not managed properly. To help you tackle this problem, I share a detailed walkthrough on identifying and troubleshooting security cache issues using specific queries and scripts. Additionally, I discuss trace flags that might offer some relief, but emphasize that they need to be set as startup flags for effective management. For those looking for more proactive solutions, I provide stored procedures and agent jobs that can automatically clear the cache when it exceeds a certain size, ensuring ongoing performance stability.

Full Transcript

Erik Darling here with Darling Data. And I remembered to turn my microphone on, so we’re off to a great start here. In today’s video, we’re going to talk about a very weird problem that I’ve seen crop up on client servers a number of times now and lead to all sorts of weird transient issues. CPU spikes, plan cache stuff, just weird memory things, stack dumps, you name it. I’ve seen this thing be the root cause of all sorts of weird problems. And of course that weird thing is the security cache. It sounds like this nebulous little doohickey that is supposed to make life easier by caching security information about logins and whatnot, but if it grows unchecked, it can cause some real big problems. Before we get into all that, of course, we need to talk a little bit about you and me. And things that I like when people sign up for memberships and contribute just a little bit to making sure that this channel stays free and open source for everyone to use. It’s like four bucks a month at the low end. If you don’t have four bucks a month, I totally understand.

You know, there are things that I’d probably rather spend four bucks a month on too. But if you don’t want to do that, liking, subscribing, commenting, all that good stuff is just another way to make my little heart go all aflutter. Shut up, Intel drivers. If you’re in need of SQL Server consulting, that’s my job. Apparently, I do all this stuff and more and my rates are reasonable. So you can hire me to do what I’m going to show you today for you. It’s fun. It’s really great, fulfilling, really just life-affirming work. If you need training in the SQL Server performance tuning space, you can get about 24 hours of it for about $150 when you use the discount code SPRINGCLEANING. If you look in the video description, there’s a link with SPRINGCLEANING baked right into it and you can use that. It’s amazing technology. The advanced URL technology here at Darling Data.

If you want to see me live and in person, and who wouldn’t? I’m even better in person because you can throw things at me. And, I don’t know, give real likes and comments. You can comment in real life, in real time. I think that’s called a conversation. Weird. Weird. I’ll be in Dallas Friday, September the 6th, doing a full-day training session.

And, November 4th and 5th, I will be at PASS Data Summit in Seattle with Kendra Little doing two days of SQL Server performance pre-cons. So you should come see us at those and you should come see me in Dallas if you happen to be in the neighborhood. So now, let’s get on and talk a little bit about how we can troubleshoot security caches.

Now, my good and dear friend, Josh Darnell, who is an application developer, was able to figure out this part of the demo. I don’t take a lot of credit here aside from doing some nice formatting on it, even though there are a couple things that could use some work, apparently. You know, it’s hard to find good help these days.

And, the whole point of this thing is to inflate our security cache. So, that’s what I’ve done. I have inflated the security cache by using `sp_setapprole` over and over and over and over again in a loop.

I actually had this loop run. Actually, the first run of it got me to about, like, a gig. So, I ran this a few times to get it up a little bit higher.

Just because it made things a little bit more interesting for me. Not because it’s, you know, really all that fun or interesting or cool for you. But, that’s what I did.

So, what we’ve got here are a couple queries that will help you look at security cache stuff. If you look at this one, you will see that things were cruising along going just fine for a while. And, then at some point, the security cache grew.

So, that’s about 2.3 gigs plus about another gig from the ACR cache store. So, that’ll be about 3.2 gigs total from there. So, this is a tough query to remember.

It’s not very portable. It’s not very interesting. I mean, it’s kind of interesting. Actually, you know, if you look at it and you actually click on the XML column, you can get a lot more information out. I don’t like parsing this stuff out from the XML to show in the tabular result because it makes a lot of, like, duplicate lines that are just kind of messy.

I generally just zoom into where, like, things grew or when things grew or, like, if they, you know, spike up from, like, a lower number to a higher number or a high number to another higher number. And, then I sort of just dig around in here because you can see all sorts of interesting stuff about, you know, entries getting put in but not removed and the size of things. And, it is, you know, mildly interesting if you have this particular fetish.

If you want an easy way, excuse me, an easy way of figuring out if your token and perm user store is growing a lot, you can use my free stored procedure sp_PressureDetector. I’ve got it set up here to only look at memory and to skip some other stuff that’s not really pertinent to us. But, if you run that, right at the very top you will have this section here, and you will see USERSTORE_TOKENPERM is about 3.2 gigs total, which I believe is about what we talked about it being from the XML.

When we did the XML query, it was, like, you know, 2.3 plus 0.9-something gigs. So, that’s the size of the token perm store there. Now, you can clear this out manually by running `DBCC FREESYSTEMCACHE('TokenAndPermUserStore')`.
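Checking the clerk size and clearing the cache manually might look like this. A minimal sketch, assuming a modern SQL Server where `sys.dm_os_memory_clerks` exposes `pages_kb`:

```sql
/* How big is the security cache right now? */
SELECT
    mc.type,
    size_gb = SUM(mc.pages_kb) / 1024. / 1024.
FROM sys.dm_os_memory_clerks AS mc
WHERE mc.type = N'USERSTORE_TOKENPERM'
GROUP BY mc.type;

/* Clear it out manually. */
DBCC FREESYSTEMCACHE('TokenAndPermUserStore');
```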

So, the thing is, if this is something that happens regularly because of your application either using `sp_setapprole` or doing impersonation stuff, like `EXECUTE AS`, which is another thing that can really pump this up. Switching users back and forth in queries for different reasons. I’ve seen a bunch of applications that, you know, log in as one user, switch to another user to do something, switch to another user to do a different thing.

Like, they have different permissions and schemas and stuff. All those things will inflate the security caches. So, you can totally run this to clear that out.

If this is a big long-term problem for you, there are a couple of trace flags that can help. The thing is, they don’t help if you just do this. These have to be startup trace flags for them to really make a difference.

So, if you want to look into what 4610 and 4618 do, if you’re having this problem, go crazy. They’re pretty useful if you’re having the issue, but only as startup trace flags. They don’t fix a problem if you just enable them globally.

If this is a problem that you’re having a lot and the trace flags don’t help and your security cache is still growing, over in my GitHub repo, which I’ll have a link to in the video description, I’ve got a few scripts in there that can help. One of them is a stored procedure that will run, look at the size of your security cache, and there’s a parameter that you pass in to say how big of a security cache you care about.

If it grows beyond a certain size, it’ll run that DBCC free system cache call and clear it out for you. I’ve also got an agent job to set that up to run. The schedule, I think, is baked in for like every hour or something.
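The core of that threshold logic can be sketched like this. The names here are illustrative, not the repo’s actual code, and it assumes the `pages_kb` column in `sys.dm_os_memory_clerks`:

```sql
/* Clear the security cache only when it crosses a size threshold. */
DECLARE @threshold_gb decimal(10, 2) = 1.0,
        @current_gb   decimal(10, 2);

SELECT @current_gb = SUM(mc.pages_kb) / 1024. / 1024.
FROM sys.dm_os_memory_clerks AS mc
WHERE mc.type = N'USERSTORE_TOKENPERM';

IF @current_gb > @threshold_gb
BEGIN
    DBCC FREESYSTEMCACHE('TokenAndPermUserStore');
END;
```

Wrapped in an agent job, that gives you the periodic, size-aware cleanup described here.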

If that’s not often enough, you can, of course, adjust the schedule. But all of this stuff, you can just hit F5 on. And, of course, if you want to inflate your security cache for some reason, or you just want the standalone analysis scripts here, you can use that.

Also in my GitHub repo is sp_PressureDetector, right down here, which you can also get totally for free. You don’t have to like or subscribe or comment on that, but you can get that and also view the biggest memory consumers on your server.

And if that user store token perm stuff is up there, you might want to think about running the DBCC command to clear it out, maybe enabling the trace flags, and maybe using this code to set up a job to clear it out on a regular basis, because you might be having all sorts of weird performance issues and reliability issues because this thing grows out of control.

As for like how big it has to be before I worry about it, generally, once it gets past the 2 gig mark is when I see signs of trouble. If it gets up past like 4, 8, 16, 20 gigs, somewhere in there, then you’re just about guaranteed to have some issues.

So I’m usually pretty aggressive on this, and I usually set that to be around like 1, 1 to 2 gigs to clear out for the store procedure there, because like really anything beyond that, you’re just kind of asking for trouble in the long term.

So I hope that this is not a problem that you have. I hope that you don’t have applications that blow out your server’s security cache, because memory is precious, right?

And if you have, you know, 8, 16, 24 gigs of security cache, that’s memory that your server can’t use for other stuff, like caching data pages or query memory grants or having a plan cache or other things like that.

So it’s a bad problem to have. If you are having that problem, you’ve got some trace flags that you can look into. You’ve got some scripts that you can run to clear it out.

You know, I honestly say try the trace flags, but a lot of people can’t just restart SQL Server with new startup trace flags in place. It might be safer for you to just use the scripts there.

Excuse me. So look at your SQL Server with sp_PressureDetector. If you see a high USERSTORE_TOKENPERM, or anything over like the 2 gig or so mark, you might want to think about, you know, clearing that out, and see if the problem comes back.

If it keeps coming back, I’ve got you on the scheduled stuff with the stored procedure right there. The stored procedure also does some logging, so you can see, like, you know, which runs cleared stuff out, and how big the security cache was when the run cleared it. So there’s some diagnostic data in there too that’s pretty helpful.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that all your dreams come true. I hope that you just get everything you want from life.

It’s a short endeavor and feeling like you are missing out on stuff is never a good feeling. So I hope you’ve got no FOMO. I hope that you get everything that your heart desires, including this video coming to an end.

That’s what I desire right now because I feel like I’m sticking the landing a little bit here. Anyway, thank you for watching. I’m going to upload this and figure out what to do with my life next.

All right. Cool. Thank you.


Catch Me On The Azure DevOps Podcast: Technical Debt

Professional Rambler


I met The Azure DevOps Podcast host Jeff Palermo (X|L) at Red Gate Summit in NYC, where we were on a panel discussing database stuff, ‘natch.

A few days later, he invited me to be on his podcast. Here’s the episode:

If you prefer to listen another way, here’s a link to the episode with a bunch more options.

If you’re an Appler, you can listen here.

Thanks for listening!


Join Me And @Kendra_Little At @PASSDataSummit For 2 Days Of SQL Server Performance Tuning Precons!

Last Year


Kendra and I both taught solo precons, and got to talking about how much easier it is to manage large crowds when you have a little helper with you, and decided to submit two precons this year that we’d co-present.

Amazingly, they both got accepted. Cheers and applause. So this year, we’ll be double-teaming Monday and Tuesday with a couple pretty cool precons.

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

Here are the details!

Day One: A Practical Guide to Performance Tuning Internals


Whether you’re aiming to be the next great query tuning wizard or you simply need to tackle tough business problems at work, you need to understand what makes a workload run fast, and especially what makes it run slowly.

Erik Darling and Kendra Little will show you the practical way forward, and will introduce you to the internal subsystems of SQL Server with a practical guide to their capabilities, weaknesses, and most importantly what you need to know to troubleshoot them as a developer or DBA.

They’ll teach you how to use your understanding of the database engine, the storage engine, and the query optimizer to analyze problems and identify what is a nothingburger best practice and what changes will pay off with measurable improvements.

With a blend of bad jokes, expertise, and proven strategies, Erik and Kendra will set you up with practical skills and a clear understanding of how to apply these lessons to see immediate improvements in your own environments.

Day Two: Query Quest: Conquer SQL Server Performance Monsters


Picture this: a day crammed with fun, fascinating demonstrations for SQL Server and Azure SQL.

This isn’t your typical training day; this session follows the mantra of “learning by doing,” with a good dose of the unexpected. Think of this as a SQL Server video game, where Erik Darling and Kendra Little guide you through levels of weird query monsters and performance tuning obstacles.

By the time we reach the final boss, you’ll have developed an appetite for exploring the unknown and leveled up your confidence to tackle even the most daunting of database dilemmas.

It’s SQL Server, but not as you know it—more fun, more fascinating, and more scalable than you thought possible.


We’re both really excited to deliver these, and have BIG PLANS to have these sessions build on each other so folks who attend both days have a real sense of continuity.

Of course, you’re welcome to pick and choose, but who’d wanna miss out on either of these with accolades like this?

pretty, pretty, pretty, pretty good

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

See you there!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.