T-SQL Scalar UDF Plan Caching and Reuse in SQL Server

Video Summary

In this video, I delve into the intricacies of scalar UDF parameter sniffing, specifically focusing on how different parameters can lead to varied execution plans. I walk through a detailed example using two versions of a scalar user-defined function—one without and one with a recompile hint—to illustrate why parameter sniffing is crucial in understanding query performance. By leveraging extended events and dynamic management views, I demonstrate the nuances of plan caching and reuse, highlighting how these functions can significantly impact execution times based on input parameters. This video aims to provide insights that go beyond just stored procedures, emphasizing the importance of considering function behavior when optimizing SQL Server queries.

Full Transcript

Erik Darling here with Darling Data, and today’s video is going to be about scalar UDF parameter sniffing. There was a question in a recent office hours where I was like, I’ve got blog posts about that. And then I went back and I read the blog posts. It’s funny reading old content, because you’re like, oh, I could make that better, oh, I could fix that. But updating old material always seems to get pushed away in favor of forging ahead and creating new material. So I went back and actually reviewed the file that I had on that stuff, and realized I had to get deeper into some of it. I ended up completely rewriting the functions and coming up with much more interesting tests for these things. So the next three videos are going to be about scalar, multi-statement, and inline user-defined function plan caching and parameter sniffing. We will endeavor to do our best covering those. Down in the video description, you will find all sorts of useful links. You can hire me for consulting. You can buy my training. You can become a supporting (that means with money) member of the channel. You can also do other fun things with your life, like ask me office hours questions. And of course, if you like this content, please do like this video.

Subscribe and tell a friend. PASS Data Community Summit! Oh boy, these probably sound like the last pitches I can do for this thing, but: November 17th to 21st, Seattle, Washington. As usual, me and Kendra Little are going to do two days of the absolute most magnificent T-SQL pre-cons that have ever existed, over the entire span of human existence. So I hope that you’ll be there, getting swag and other things from me: t-shirts and stickers and buttons and all the other stuff. So anyway, let’s start this database party up here. There are a few things that I need to walk through before I show you exactly what happens when these functions start executing.

The first one is a view that I’m going to use to look at scalar UDF details. It hits a few dynamic management views, and the reason it’s a view is because I don’t feel like having this code interspersed throughout the presentation and having to reference the entire thing over and over again. We’re going to look at the sys.dm_exec_function_stats view, brand spanking new in SQL Server 2016. Glad you’re keeping up. We’re going to use this to look at scalar function performance metrics. The second thing is, of course, the scalar UDF itself.
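The video doesn’t show the exact view, but a minimal sketch of that kind of helper, built on the real sys.dm_exec_function_stats DMV, might look like this; the view name and column choices are my assumptions, not the code from the video:

```sql
/* A sketch of a helper view over sys.dm_exec_function_stats.
   View name and column list are assumptions; the DMV is real.
   Worker and elapsed times are reported in microseconds. */
CREATE OR ALTER VIEW dbo.function_stats
AS
SELECT
    function_name = OBJECT_NAME(fs.object_id, fs.database_id),
    fs.type_desc,
    fs.execution_count,
    fs.total_worker_time,
    avg_worker_time = fs.total_worker_time / fs.execution_count,
    fs.total_elapsed_time,
    avg_elapsed_time = fs.total_elapsed_time / fs.execution_count,
    fs.cached_time,
    fs.last_execution_time
FROM sys.dm_exec_function_stats AS fs;
```

Querying a view like this between tests shows execution counts and timing for any scalar UDFs that currently have cached plans.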

All right. So this is the scalar UDF that we’re going to use. There are actually two of them, and there are two of them for a reason that I’m going to explain in a moment. But the idea here is that, based on someone’s reputation, we find people whose reputation is up to 25% lower or 25% higher.

We can, of course, manipulate that percentage if we wanted to, but it seemed like a good starting number. And we sum up all the post scores for people who have a similar reputation and look at those. It’s kind of silly, but it sounded like a fun function to write, so I wrote it.
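The function text isn’t shown in the transcript, but based on that description, a sketch might look like the following. Everything here is an assumption: the function and parameter names are made up, the dbo.Users and dbo.Posts tables assume the Stack Overflow sample database, and the real function takes a second parameter (the video calls out that the reputation one is what drives cardinality estimation, so that’s the one sketched here):

```sql
/* A sketch only, not the function from the video: sum post scores
   for users whose reputation falls within ±25% of the input. */
CREATE OR ALTER FUNCTION dbo.similar_reputation_score
(
    @Reputation bigint
)
RETURNS bigint
AS
BEGIN
    DECLARE @Score bigint;

    SELECT
        @Score = SUM(CONVERT(bigint, p.Score))
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
      ON p.OwnerUserId = u.Id
    WHERE u.Reputation >= @Reputation * 0.75
    AND   u.Reputation <= @Reputation * 1.25;

    RETURN @Score;
END;
```

Note the range predicates: there’s no equality on Reputation, which matters later when the histogram experiment mostly produces the same plans.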

So I don’t really care if it delivers actionable business insights for anyone, because the idea is to show you a behavior, not actionable business insights. I’m not a dashboard. My eyes are up here, as they say.

So there’s this one. But then I also had to create a version with a recompile hint inside it. The reason I had to do this might surprise you: if you run a query, say a select from wherever, that references a scalar or multi-statement user-defined function, and you have a recompile hint on the outer query, it does not recompile the scalar or multi-statement function in your query.
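A sketch of the recompile variant, using the same assumed shape and names as the description above; the only difference is the statement-level hint inside the function body:

```sql
/* Same assumed sketch, with OPTION (RECOMPILE) on the statement
   inside the function body. An outer query's recompile hint does
   not reach this statement, so it has to live here. */
CREATE OR ALTER FUNCTION dbo.similar_reputation_score_recompile
(
    @Reputation bigint
)
RETURNS bigint
AS
BEGIN
    DECLARE @Score bigint;

    SELECT
        @Score = SUM(CONVERT(bigint, p.Score))
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
      ON p.OwnerUserId = u.Id
    WHERE u.Reputation >= @Reputation * 0.75
    AND   u.Reputation <= @Reputation * 1.25
    OPTION (RECOMPILE);

    RETURN @Score;
END;
```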

Fun, right? The reason that’s important is because over here, I had written a loop for both versions. This one happens to have the multi-statement function call in it.

But I had written a loop. And the reason that I wrote a loop is because I needed to find out which of these values generated different plans. And the easiest way for me to do that was to grab the histogram from an index, put that into a temp table, and then have a cursor go over the temp table.

I did this in two different ways. One was just grabbing all of the range high keys, and one was grabbing all the range high keys plus one. I was trying to see if the in-between numbers gave me any different plans.

Mostly they didn’t, and I think that’s because I don’t have a direct equality predicate in this; it’s a greater-than-or-equal-to, less-than-or-equal-to predicate.

So this was maybe a little bit of a wasted effort adding the plus one, but it was a fun experiment anyway. And then basically what I do is I cursor over the table, grabbing the range high key, and then execute.
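The loop described above might be sketched like this. The sys.dm_db_stats_histogram DMF is real (SQL Server 2016 SP1 CU2 and later); the table, statistics, and function names are assumptions:

```sql
/* A sketch of the plan-hunting loop: dump RANGE_HI_KEY values from
   a statistics histogram into a temp table, then cursor over them,
   calling the recompile version of the function for each value so
   every value has a chance to get a fresh plan. Object, statistics,
   and function names are assumptions. */
CREATE TABLE #histogram
(
    range_hi_key bigint NOT NULL
);

INSERT #histogram (range_hi_key)
SELECT CONVERT(bigint, h.range_high_key)
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.Users')
AND   s.name = N'ix_users_reputation';

DECLARE @rep bigint,
        @result bigint;

DECLARE c CURSOR LOCAL FAST_FORWARD FOR
SELECT h.range_hi_key FROM #histogram AS h;

OPEN c;
FETCH NEXT FROM c INTO @rep;
WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @result = dbo.similar_reputation_score_recompile(@rep);
    FETCH NEXT FROM c INTO @rep;
END;
CLOSE c;
DEALLOCATE c;
```

The “plus one” pass described above would be a second insert of the same keys with one added to each value before the cursor runs.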

Well, you can see there’s a recompile in the name of this function. This is clearly not the scalar version of the function, but this is the multi-statement version. And basically it’s just me trying to find different plans getting generated.

The way that I examined those plans getting generated was with extended events. And of course, I used my fabulous stored procedure, sp_HumanEvents, to spin up an extended event session that grabs actual execution plans, so that I could see what plans the function generated on each call.

Because getting execution plans normally, you don’t see that, right? Like when you get an actual execution plan for a query that calls a scalar UDF or a multi-statement table-valued function, the actual execution plan for the function doesn’t show up in the query.

That only happens for functions that get inlined into the calling query. So if you have a non-inlineable scalar UDF, as mine is, you get nothing, right?

It doesn’t show up in there. You can get an estimated plan, but the estimated plan often doesn’t match the actual plan. That’s another funny thing that I ran into quite a bit with this. It was a bit of a head scratcher at first.
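If you don’t want to use sp_HumanEvents, a bare-bones extended event session that captures post-execution plans might look like the sketch below. The session name and database filter are assumptions; this event is expensive to collect, so keep it filtered and short-lived on a test box only:

```sql
/* A minimal sketch of an XE session that captures actual plans,
   including the ones generated inside function bodies.
   Session name and database filter are assumptions. */
CREATE EVENT SESSION function_plans
ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan
(
    WHERE sqlserver.database_name = N'StackOverflow'
)
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION function_plans ON SERVER STATE = START;
```

Watching the session’s live data in SSMS is what surfaces the per-call function plans discussed here.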

So that’s what this session over here is. And there are some filters on it; that’s why you might see “Displaying 0 of filtered events.” Displaying zero of filtered zero events total? What? Okay, whatever. Microsoft English. It’s a nightmare.

Anyway. So that’s the option recompile version of the function here. What I want to begin by showing you (we can turn off actual execution plans for the moment, they’ll only get in our way) is what happens when I run the recompile version of the function. This was my big finding with the loop that I ran. If I run this and we keep an eye on this window over here, I might have to tinker with the filter, so don’t be mad.

If I run these three queries, all of these three queries will generate different execution plans because they’re going to recompile. There’s an option recompile on this each and every time.

So that’s running this version. And if we come over here and look at the live query data, wow, it worked. We have three query plans for this. And the reason why we generated three different plans is notice this number right here changed.

That’s the second parameter in the function, the reputation one. This is the one that really drives cardinality estimation for the query plans that we get. Running these for one, three, and five, clearly I didn’t need that whole loop through 195 histogram steps, because the first three or four lines in the histogram gave me everything.

So great. Anyway, for reputation one, we get this plan, which takes about 1.2 seconds. We can see that it is a big parallel plan. We scan an index over here, we scan an index down here, and the whole thing takes, well, actually, about 1.3 seconds.

The second query plan looks like this, where we have some constant scans and some compute scalars and blah, blah, and a nested loops join.

And then we go parallel over here and we do some stuff. The third plan looks like this, where this one’s a little bit different from this one, right? This one, we have one, I think it’s one set of constant scan stuff over here. But this one, you know, we still have one, but it’s different, right?

Because it’s a different, like, join over here. This one is nested loops, nested loops, nested loops. This one is nested loops hash join, right?

So we change things a little bit here, right? Because there’s a key lookup in this plan. Oops, this one. There’s a key lookup in this plan. And this one, we just scan the index on the post table.

This one takes about 735 milliseconds. And the one prior takes about 75 milliseconds. All right. So let’s keep those numbers in mind as we forge forward in our path to figure all this stuff out.

So there’s some stuff in here about the histogram that is unimportant. And there’s also some stuff in here about getting some frequencies. When I used to work in market research, you would call getting counts of various things frequencies.

How frequently did this happen? What’s the frequency of this? I’m going to throw the whole thing into the garbage. But this stuff is not pertinent to us.

This was pertinent to me writing the demo, so we can move on from that. But what I want to show you here is: let’s come over here and clear the data out of this thing, and let’s run DBCC FREEPROCCACHE, because we’re going to run the non-recompile version of this first scalar UDF.

And what we’re going to do is we’re just going to get the top one row for where reputation equals one. And that takes about 1.3 seconds. Crazy, right? If we come over here and look at live query data, well, I guess it was more like 1.2 seconds.
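The shape of that first test, with assumed function and table names standing in for the ones in the video:

```sql
/* Clear the plan cache on a test box (never casually in
   production), then call the non-recompile function once
   for a Reputation = 1 user. Names are assumptions. */
DBCC FREEPROCCACHE;

SELECT TOP (1)
    u.DisplayName,
    total_score = dbo.similar_reputation_score(u.Reputation)
FROM dbo.Users AS u
WHERE u.Reputation = 1;
```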

I’m sorry. I misled you by 0.1 of a second. I hope you can forgive me. I hope you can. Find it in your heart.

Now let’s run this query. And let’s get everyone with a reputation over 800,000. Run this. And this is going to return four rows in about five-ish seconds or so.
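That second test might look like the sketch below (names again assumed). The function runs once per qualifying row, and each call reuses whatever plan is already cached for the function:

```sql
/* Four high-reputation rows, four function calls, one cached
   function plan reused for every call. Names are assumptions. */
SELECT
    u.DisplayName,
    total_score = dbo.similar_reputation_score(u.Reputation)
FROM dbo.Users AS u
WHERE u.Reputation > 800000;
```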

And if we come over here and look, there are four rows and four new execution plans. We reuse that hash join execution plan every time, right? And some of the times are a little bit different: this one is 952 milliseconds, there’s 990 milliseconds, there’s 993 milliseconds, and there’s 1,003 milliseconds, or 1.003 seconds. So every time that we ran the function, the function reused the plan that was cached for it.

All right? Let’s clear that out. And let’s repeat our experiment. First, let’s see in here. Let’s validate some stuff.

Here is our function. Here are the five executions. And we have the total and average worker time, and the total and average elapsed time. Something that I didn’t talk about up there that is kind of funny: all of the function plans that we saw live inside of a scalar UDF, right?
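Checking those numbers straight from the DMV might look like this (the function name is an assumption):

```sql
/* Totals and averages for one function from the DMV.
   Times are in microseconds; function name is assumed. */
SELECT
    function_name = OBJECT_NAME(fs.object_id, fs.database_id),
    fs.execution_count,
    fs.total_worker_time,
    avg_worker_time = fs.total_worker_time / fs.execution_count,
    fs.total_elapsed_time,
    avg_elapsed_time = fs.total_elapsed_time / fs.execution_count
FROM sys.dm_exec_function_stats AS fs
WHERE fs.object_id = OBJECT_ID(N'dbo.similar_reputation_score');
```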

And one thing that a lot of people mess up about scalar UDFs, and I’m actually going to come back to these because it’s interesting enough to backtrack a little bit: what everyone says about scalar UDFs is, oh, they prevent parallelism. Which is true, right?

A non-inlineable scalar UDF will inhibit parallelism for the query that calls it. But the body of the function is completely free to go parallel, right? Like this is clearly a parallel execution plan.

There is clearly parallelism in this execution plan, right? We can see the parallelism, right? We can smell it. It smells like fast. And we can see the parallelism in this one.

So the body of the scalar UDF is completely free to go parallel, but the query outside that calls that function is not free to go parallel, right? So I’ll actually show you that in the next demo just in case you don’t believe me.

Anyway, let’s free the proc cache and let’s run this now for reputation three, right? So let’s run this. And actually, we’ll turn on query plans for this one.

We’re going to have an extra execution of this query, but that’s okay. If we go to the properties over here, we’ll see this little thing in the query plan: NonParallelPlanReason, TSQLUserDefinedFunctionsNotParallelizable.

So they inhibit parallelism for the query that calls them, but the body of the function is perfectly fine to go parallel, right? Good for us. Anyway, if we come over here and look at the live query data, we’ll see the two executions of this and the two query plans for it, which do in fact use parallelism in quite a bit of the plan.

And now, remember the last one took, I don’t know, three or four seconds? That was because the query plan for each execution took around 991 milliseconds to run.

These function calls specifically take about 68 or 69 milliseconds to run. So when we come over here and we run that 800,000 query for the same four rows, it comes back a lot faster, because we reuse a faster execution plan.

So we can already see that T-SQL non-inlineable scalar UDFs get a plan cached for them, and further executions will reuse that execution plan, right?

So we come over here now and look. Remember, the first two in here were the ones from our executions; the next four are from what our query returned, right? So one, two, three, and four, these all took around 90 to 100 milliseconds or so.

But we got faster results back, because we were using a generally faster execution plan for this query. So the plan reuse thing here can be a pretty big deal. Not a lot of people think about functions when they think about parameter sensitivity.

Everyone looks at stored procedures and sp_executesql and goes, ah, you parameter sensitive jerk. But functions are just as likely to have it happen. Fun stuff, right?

And just to sort of prove things out, let’s come over here and we’ll notice that the total and average times for these things are way down, right? From the initial invocation there. And just to sort of prove things out a little bit further, we’re going to do this for five as well.

So if we run this, did I clear this? I didn’t. Let me make sure this is cleared out. Oh, oh, I didn’t mean to clear all filters. I just meant to clear data. There we go.

Anyway, we’ll just deal with it. All right. So there’s our first execution. And we can turn execution plans back off. And this is why I had things filtered out. So here is our query plan for that.

There’s one other query plan in here for the calling query, right? But this is the one that we cared about. Now, if we come back over here and run this for the 800,000 people, it’s going to take a few seconds, but we get results back.

And then we have, if we scroll a little bit, we will have the query post execution plan for all of our executions of that. Oh, we’re up at the top.

So let’s scroll down a little bit, but you get the point. Oh, that’s not the point. Point is down here a little bit further. So there’s one reuse of the plan. There’s two.

There’s three. There’s four reuses of the same plan. So T-SQL scalar UDFs most definitely, most definitely cache execution plans and reuse them.

And if we look over here, then we will see, of course, the times on that. And these times went up a little bit. That query plan is a little bit slower for the results that we were looking at there.

So as always, I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where most definitely we are going to talk about the multi-statement variety of UDF.

And we will find out how plans are cached and reused for these. So we have that to look forward to, don’t we? All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.