Rewriting T-SQL Scalar UDFs So They’re Eligible For Automatic Inlining In SQL Server

Rewriting T-SQL Scalar UDFs So They’re Eligible For Automatic Inlining In SQL Server



Thanks for watching!

Video Summary

In this video, I dive into the intricacies of scalar UDF inlining and how it can impact query performance. You’ll learn about the limitations that prevent a UDF from being inlined, such as invoking time-dependent functions like `getDate`, and why sometimes rewriting a scalar UDF to accept parameters instead is the best workaround. By sharing practical examples and detailed walkthroughs using the Stack Overflow database, I illustrate how making small changes can significantly improve query performance. Whether you’re looking for tips on optimizing your SQL Server queries or just curious about the nuances of UDF inlining, this video offers valuable insights. So, if you found it helpful, don’t forget to like, comment, subscribe, and maybe even join my channel as a member for more exclusive content!

Full Transcript

Erik Darling here with Darling Data, voted by BeerGut Magazine to be the SQL Server consultancy most likely to be run by your best friend. So, isn’t that wonderful? Don’t you love being best friends? I’m having a nice time with this relationship. In today’s video, we are going to talk about function rewrites for UDFN. UDFN lining. Why are we going to talk about this? Well, I do have to go to a website after I get out of the slides, but we’re going to talk about this because scalar UDFN lining, for all of the good that it tries to do with your awful UDFs, has some kind of silly limitations that have some kind of silly workarounds, and sometimes you have to employ those in order to get UDFN lining to work correctly. One actually may wonder, if you have to say, if you have to write a little bit more out loud in front of the world in front of 4300 data darlings, if you’re going to rewrite the scalar UDF so that it can use UDF inlining, why not just rewrite it as the inline version of a UDF? The answer is, sometimes that’s hard. Sometimes a UDF is perfectly inlineable just making a small change rather than having to make all of the changes required for writing an action.

Inlineable inlineable inlineable inlineable inlineable function. And boy howdy, sometimes the easy way out is the best way out. Maybe that’s a little grim, but I don’t know. Anyway, before we talk about that, let’s do the housekeeping. If you like this channel, if you like my content, if you’ve already liked and commented and subscribed, but you want to do more, you just have this deep yearning to say, Erik Darling, thank you for the hours and hours of content that you provide for free. You can sign up for a membership. The link to do that is in the description of this video. It is the name of this channel with slash join at the end. I didn’t make that up. That is not a database thing. That’s not like a database joke that I customized for this. That’s just how Google chose to do it with YouTube. I don’t know. It works for me. I’m fine with it. If you need help with SQL Server that is beyond what you can find for free on YouTube or blogs, and you would like a young, good looking consultant to come in and appraise your SQL Server for what it’s worth, I can do all of these things and more. And as always, my rates are reasonable.

If you would like some low-cost, high-quality content, there are 24 hours of it available at my website, training.erikdarling.com. There is beginner, there is intermediate, there is expert slash advanced, whatever you want to call it. I’m not going to say that I’m going to melt your brain because I’ve had my brain melted in the past standing in front of a microwave in the 80s, and it was not pleasant. So there will be no melted brains on my watch. There will be plenty of smart, learned people using SQL Server, though.

You can use the discount code SPRINGCLEANING to get 75% off. There is also a link to that in the video description. Would you believe it? Marketing wizard that I am. If you would like to see me live and in person, either in Dallas or Seattle, I realize maybe not the two most convenient locations for a lot of you watching, but that’s what I got so far. If you would like me to come to a more local venue, tell me when your local venue is having a thing where I could do a thing, and I’ll show up to the thing.

There’s always that. I like going to things. Sometimes. I mean, it’s nice to just, you know, drink alone for a little bit. But Friday, September 6th, I will be at Data Saturday Dallas, and I will be bringing both pokey bribes and sticky bribes, sparkly sticky bribes.

Ooh, you are feeling sleepy because it’s lunch, and you just ate a sandwich and a cookie out of a cardboard box. Now you’re not going to pay attention for the rest of the day. You can come catch me at either one of these events, Past Data Summit or Data Saturday Dallas.

I’ll be there. Past Data Summit, I’m especially looking forward to. One, not because it’s an election. It’s my birthday, and I’m co-hosting two days of performance tuning content with Kendra Little, and that’s going to be a great time.

I don’t think that the election is going to be a great time, especially in Seattle. It can’t go well either way. So with that out of the way, let’s begin the pastry.

I mean, the party. All right. I think I used that joke before. I can’t remember now. Anyway, we’re off. We’re off to the races. Oh, that hat fits much better when it’s not zoomed in than when it is zoomed in.

I should probably shrink that hat down a little bit. But objects in hat may be larger than they appear. Anyway, so I told you we had to look at a website, and so that’s what we’re going to do first.

And one of the limitations on UDF inlining is here. The UDF does not invoke any intrinsic function that is either time-dependent, such as getDate, or has side effects, such as new sequential ID.

Either of these arrangements will prevent your UDF from being inlined. It sounds terrible, doesn’t it? Well, why do we want our UDF to be inlined?

It’s also a good question. It’s also a very good question. So Scalar UDFs and SQL Server have two main performance drawbacks. One is they force the query that calls them to run single-threaded.

They are ineligible for parallel execution plans, and that can make a very big mess for big reporting-type queries when you call them. The other downside is that Scalar UDFs are not invoked once per query. They are invoked once per row that has to get processed through the function.

So if your function has to run over a query that needs to do something with a lot of rows, your function could run a lot of times. And if your function is eligible and is inlined, just like if you wrote an inlined table-valued function, your query will suffer from neither of those fates. It may suffer other performance maladies and whatnot, but those two won’t be it.

Unless you do something else that causes that to happen. Then I can’t help you. Well, I mean, I can help you.

My rates are reasonable. But I’ve seen your code. You seem to practice doing things bad. It’s a thing.

It’s a thing that I’ve noticed. All right. So let’s get into our Stack Overflow database, and let’s set the compat level to 160. 160 is not explicitly necessary for this. We could also be in compat level 150.

Isn’t that just a dream and a joy? We could be in either one of those. As long as you’re on SQL Server 2019 or 2022 or in Azure somewhere. If you’re on SQL Server 2017 or lower, none of this is available to you.

So sorry about that. You should maybe think about upgrading rather than watching YouTube videos. Go download and install it.

I don’t know. So here’s our function. Let’s create or alter it just to make sure that it is properly created or altered. And here’s the sticky part of our function. We have this fallback where if end date is null, we pass in the get date built-in thing.

It’s pink, so I think it’s some kind of function, right? All the functions that are built-in in SQL Server are pink as far as I know. I don’t have anything against that.

I rather like the contrast. If you look up here, it’s like case when blue date diff pink. It’s kind of nice. Mentally, if date diff were also blue, it would all blend.

You wouldn’t know what you were doing. It would be impossible to tell anything apart. So I guess that’s also true when you nest a lot of functions. That’s why I have to separate things out like this so that all this pink does not run into one just dolly-esque mess of drippy things.

So let’s run this query. And I’m going to be honest with you. This query is going to drag on for a little bit. Oops.

Query is going to go for about 16, 17 seconds. It’s not going to be a good time for anyone. All 16, 17 seconds of it. Because, I mean, let’s be honest.

Waiting 16 seconds for rows to return is not anyone’s idea of a good time. I don’t think my microphone picked up on it, but the fans on my laptop, they’re all flustered right now.

Spitting like crazy. If we look at the query plan, we are going to pay attention to two things. One is if we look at the properties here, we will have, as is usual, when we have a scalar UDF in our query plan, we have this non-parallel plan reason.

Since I’m on SQL Server 2022, I actually have a reason here. If you’re on an earlier version of SQL Server, it will probably just say non-parallel plan reason could not generate valid parallel plan.

But since I’m on SQL Server 2022, this actually shows up. And you can figure out exactly why your query could not generate a parallel execution plan. If we peruse some of the plan operators, we will see that, well, we do spend some time in other places.

Oh, geez, that didn’t go well. We spend about four seconds in here. We spend about two seconds in here. There are a few hundred milliseconds in here. But really where things get boggy is in this compute scalar. This compute scalar is what is responsible for the execution of our scalar UDF.

We can see the number 1166178 in here. All right. That’s an important number because when we go…

Oh, man. I just… Hang on. I got to fix my zoom-in. It changed to a red cursor. And I do not want a red cursor. I want a pink cursor. When we go look at this query, it’ll either be exactly that number or it’ll be twice that number.

I’m dying slowly because I practiced this demo and there might have been more stuff. Anyway, here we are. We can see the execution count of our sneaky function matches the number of rows that were returned by the previous query.

And what’s important here is that the total worker time for the function was not the problem. The main problem was the sheer number of executions that the function had to be invoked for and that it forced the rest of the query to run on a single thread.

All right. So this function didn’t actually take up a whole lot of CPU time, but it just dragged on and on and on because it had to execute so much, which is obviously not what we want. This is not an acceptable situation for us.

If you’re on SQL Server 2016 or better, you will have access to this wonderful dynamic management view called sys.dmexec function stats. Sys.dmexec function stats will tell you information specifically about your scalar UDFs.

It does not track inline table valued functions or multi-statement table valued functions. This is purely for the inline, for the scalar UDFs. And I have a feeling that this thing coming around was maybe part of what Microsoft wanted for doing this scalar UDF inlining thing.

I have a feeling. Maybe. You know, it’s just a guess. Just a little guess. Yes. So like I said, this is a very silly reason to need to rewrite a function in this way, right?

Because just because we have get date here, we can’t have the function get inlined. But we can change the function a little bit. Again, this might be where you’re looking at me and saying, Eric, why don’t you just write the whole function as an inline table valued function?

And then we don’t have to think about it. And like I said earlier, when I pose the same dilemma, a lot of the times when I’m rewriting stuff and trying to help clients get things to go faster, there might be very complex logic inside of a scalar UDF.

And the only thing that might be holding that UDF back is the invocation of get date in one or many places in the UDF. And sometimes it’s a little bit easier to just get that out of the way so that SQL Server can do the rest of the inlining work rather than me spending a lot of time writing a very complex inline table valued function trying to get everything working.

So let’s create or alter this to make sure that it is properly created or altered. And let’s run this version of the query now, where we pass in get date from outside of the function, right?

Notice that get date is now out here rather than in there. And you would think that that wouldn’t be a meaningful enough distinction to have anything happen, but it is.

It’s wild. It’s wild what SQL Server has gotten up to. Because now when we look at this query plan, we will see two things. We will see parallelism. And we’ll still see a compute scalar, but the compute scalar has these weird, funny extra branches in the query plan.

And this compute scalar down here with these nested loopy joiny things in it, this is where the function is invoked inline in the query. So this function no longer ran 1.1 something million times. It was inlined into the query and just kind of ran normally.

I mean, I guess technically since it’s a nested loops joint, it may have actually run that many times, but it wasn’t like the same black boxed UDF invocation stuff before. And we got a parallel plan out of it.

So now the whole thing finishes in about 3.7 seconds rather than about 16 seconds. So moral of the story here is that get date in your scalar UDFs will prevent your UDF from being inlined.

If you can replace the get date invocation from within the UDF with a parameter where you can pass get date in from outside of the UDF, SQL Server can then inline your UDF and maybe make performance better for you too.

If you could shave 12 seconds off a query just by adding a parameter to a UDF and passing in get date from the outside, wouldn’t you do that? Wouldn’t that be a nice, simple, easy thing to do? I mean, this is a simple enough function, I suppose.

I suppose you could just put that case expression in somewhere, but sometimes you have to deal with the cards you’re dealt. And sometimes in more complicated situations where maybe this wouldn’t be so readily available to you, just replacing get date in the UDF with a parameter that is assigned get date outside of the UDF is a better option.

So, sometimes it’s so simple you just want to smack yourself silly or your brain melts on the sidewalk. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. I hope that this was one of the finer experiences that you’ve had watching a SQL Server video. If you do care to like, comment, or subscribe, or become a member, or buy my training, or hire me to do consulting, we could have magical times together just like this, you and I.

Because we are best friends after all, aren’t we? All right. Great. Anyway, thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.