Why Does Erik Talk So Much About Scalar UDFs?

Thanks for watching!

Video Summary

In this video, I delve into why scalar User Defined Functions (UDFs) are a frequent topic in my content. I explain how they can severely impact query performance by preventing parallel execution plans and causing excessive row-by-row processing overhead. By walking through simple examples, including a demo of a UDF that simulates waiting for milliseconds, I illustrate the non-trivial time costs associated with function executions, especially when called repeatedly. The video also explores scenarios where scalar UDFs are used in WHERE clauses or joins, highlighting how these can dramatically increase query execution times due to their lack of sargability. Through this exploration, viewers gain a deeper understanding of the performance implications and learn strategies to avoid unnecessary UDF usage in their SQL queries.

Full Transcript

Erik Darling here with Darling Data. And in today’s rock’em, sock’em episode? Is it an episode? Am I just having an episode? You may never know. I’m going to answer an email question that came in asking me, Erik why do you talk so much about scalar UDFs? And well, boy, boy, as a consultant, I could go on for ages about the problems I’ve seen UDFs cause. But, you know, and this is something that’s in all the videos where I talk about scalar UDFs. The problems that you run into are that they disallow a parallel execution plan from the query that calls them. So whatever query invokes the UDF cannot go parallel. That’s not a big deal for every query, but for a lot of the queries that I see, that is a big deal. Another very, very big problem, problem is that scalar UDFs do not run once per query. They run once per row that the scalar UDF needs to process to return a result. This is of course for any UDFs that are not inlineable. By inlineable, I mean if you are on SQL Server 2019 and you are in Compat Level 150 or better, SQL Server has a feature called UDF inlining that will make an attempt to basically manually re assemble your assemble your scalar UDFs is inline code to the query, which can have awesome results when it works, but there are a lot of limitations and restrictions on it. Going a little bit further than that, one thing that I see quite consistently in client code is not only like a scalar UDF that does something, but that scalar UDF might reference other scalar UDFs.

That scalar UDF might reference multi-statement table valued functions and that UDF might actually be recursive to some degree. So it might, the UDF might actually invoke itself. All of these things pile up and cause very, very real, very, very awful performance issues for queries, especially when those queries need to return a lot of rows. So in this video, I’m going to go over just a very sort of simple way to understand just how much overhead there can be in scalar UDFs, even relatively simple ones that don’t do like, like if you think about like a scalar UDF and I’m going to sort of, I’m going to use a bit of a shortcut here verbally to explain what I mean. But let’s just say that the scalar UDF executes in memory. I don’t mean using Hecaton or an in-memory feature. I just mean, let’s say the scalar UDF accepts an input and just does something to that input.

Like, let’s say it accepts like a date from a date time column from a date time column or something. And it reformats that date time to fit some description. Or it chops the time off the date and just returns a date. See, before the, before SQL Server had a date data type, it was very common to see functions that did that and like, like convert the date to like a VAR car 10 or 11 or something to remove the time portion of it. There’s a lot of stuff that scalar UDFs do that would might be like in memory, but even getting like entering that UDF code and performing whatever in memory processing you have to do to whatever data gets fed into the function has overhead to it.

That will get worse when your scalar UDFs start reaching out to other, to user tables, like, like objects in the database and accessing those to either, you know, fetch some data or set variable values inside of the UDF. Or even worse, I see UDFs written with, I saw not as recently as it is, well, I mean, it’s November now, so it was October when I saw it, but I saw a 3,500 line UDF with four different cursors in it. And people couldn’t figure out why this query was slow. As soon as you commented out the UDF call in the select, that query finished instantly.

You left that in there, it took 35, 40 minutes for this query to finish. This thing was awful. So a lot of people think that there is a freeness to writing and invoking a UDF that just is not there. They’re real big problems when it comes to query performance.

But here I’ve rambled on long enough. Here I go just getting off on my pretty pony, babbling, babbling, babbling. So let’s talk a little bit about this channel and your interaction with this channel, either emotionally or financially. You can, if you click the link down in the show description, you can get a very low cost membership to support my work doing these videos, sharing my knowledge, you know, for free.

It’s not free if you, if you, if you chip in, but it’s like four bucks a month at the low end. And we’ve got like 30 people on board for that so far. So that’s pretty good.

If, if you are not the type of person who is, is keen on, you know, supporting people who, who try to help, help you for free. All of this content can be liked and commented on and subscribed to, which are equally good numbers to see on the up and up every time I hit refresh on YouTube 72 times a day. If you need help with your SQL Server, even awful scalar UDFs, boy, I, I, I have a lot of practice rewriting them.

I am very good at this stuff. Probably the best in the world at this stuff. Uh, if you want to hire someone who’s like crappier at it than me or cheaper than me or whatever, go ahead.

Eventually our paths will cross. Uh, but this is also a good time to mention that my rates are reasonable. So suck on that.

Uh, if you would like some very high quality, very low cost SQL Server training, uh, even cheaper than anything that you will find on those other sites telling you that they, they’re having these massive black Friday blowout sales. Boy, howdy. You can get all mine for life for about 150 us dollars, uh, at that URL with that coupon code.

Um, and, and gosh darn it, that’s a link in the show description too. I’d make life, I make things so easy for you. Uh, since I am, I am back home now from past data summit, I’ve got nothing on my calendar, which for the winter months is pretty good.

I don’t think there’s like a SQL Saturday in St. Kitts or a, sorry, a data or, or a data Saturday in St. Kitts. I don’t want to upset anyone, but, uh, if there is, let me know. Come down there for free.

I, well, I mean, I, I, I’m not going to like stow away on a cargo plane or something or like swim, but, uh, that’d be hell on the laptop. But, uh, you know, I, I, I do like going places. So anyway, with that out of the way, let’s talk about why Erik Darling of Darling Data with his reasonable rates hates scale our UDFs.

So, um, I, I’ve talked about this before, and this is a sort of fully fleshed out demo, demo that I presented during, uh, one of the past pre-cons that I did with Kendra Little, which went wonderfully, by the way. You, you should have been there. It was my birthday after all, and, uh, kind of rude that you weren’t, but whatever.

Uh, so I started, started talking about how you can’t put a wait for delay in a function, which obviously stupid, right? But, uh, this is the function that you can write in order to simulate waiting for some duration of time. So what this function does is it accepts a delay in milliseconds, which is an integer.

And I’m not giving you a big int for this, you sick, sick, twisted, foul thing you. Uh, and then we go into a wait for a while loop. And while the sysdate time is less than, uh, the end date, than adding the delay in milliseconds up here, um, you know, we, we, we spin in a loop, which is, again, sort of like, in memory processing, right?

We’re just doing this stuff without touching physical database objects. Fine. Good. So let’s turn on query plans and let’s just validate that this function works the way we expect it to. Uh, I’m going to run this and it’s going to run for just about one second, even there.

Uh, notice that there’s a little bit more than one second on that though, right? It’s teeny tiny bit more. And if we, uh, execute this for three separate calls, um, of 100, 300 and 600 milliseconds, uh, notice that we, we, we hit the, we hit the one second mark accurately, right?

We did a hundred milliseconds and then 300 milliseconds and then 600 milliseconds, which is about a second. But now that, that went up a little bit higher than just entering the code once and going, uh, through one execution there. So now let’s create a table and let’s put some data in this table.

And basically we’re going to stick a thousand rows in the table. And in the wait for column, we’re going to put the number one, right? So that’s a one millisecond wait for all thousand rows.

When we call the, when we call this function or rather that’s a one millisecond wait for however many rows we take out of the table. So, uh, let’s do this and make sure that we have no function executions in there. And I’ve got a little helper view that is, that I use, that I’m going to use here just to keep the script a little bit shorter.

That will tell us, um, uh, how many times the function executed, uh, the script for that looks like this up here, um, where I’m just going to look at the sys.dmexec function stats view. And, uh, do some stuff to get useful data out of that. So anyway, uh, right now there’s nothing in there because I just cleared out the plan cache.

But if I run this now for 100 rows and I don’t even, I don’t even have an order by on here. I’m not doing anything particularly interesting. To get out a hundred rows, we end up waiting 1.5 seconds.

Note that nothing in this plan has any other time associated to it. It’s trivially, trivially, trivially easy to get a hundred rows out of this table with a thousand rows in it. Uh, but really where all the time gets spent is in the compute scalar.

If we right click on the compute scalar and we go to properties and we go to defined values, we’ll see that the compute scalar is what’s housing, uh, that wait for function, right? This is what’s, this is the compute scalar is responsible for, uh, executing or invoking the scalar UDF, right? So all the time in the plan, 1.5 seconds for a hundred rows waiting for one millisecond per row.

That’s kind of weird, isn’t it? I mean, it’s a bit strange to take, for that to take 1.5 seconds. If we look at the, uh, the function DMV now, we’ll see that, you know, we spent a total of 1.5 seconds in there.

And even though the only, we, we only ever passed in, uh, a one millisecond wait, look at the max worker time, 15 milliseconds. So we spent, we actually spent 15 times, like 15 times longer in there than we passed in for a millisecond wait, right? That’s very strange, right?

And it’s very, very odd. Just because getting into that function code path is not free either, right? So the more times you need to enter a function’s code path, the worse this sort of gets. If I were to do this for all a thousand rows, now for a hundred rows, it was 1.5 seconds for a thousand rows.

I’m going to share something with you, something awful. This is going to take 15 seconds, one five, right? That’s for, for waiting one millisecond per row for a thousand rows.

That’s a thousand milliseconds. That’s one second. Where does 15 seconds come from? Doing all that, doing all that function stuff. Look at this thing.

We spent two milliseconds, uh, scanning the clustered index pretty quick in the, in the compute scale are 15.617 seconds. All right. A lot of extra overhead in there, isn’t there?

A lot of weirdness, a lot of strangeness. All right. And if we go and look at the, the function DMV now, we will have the thousand executions because we invoke that function 1000 times or once per row.

We spent a total of 15 seconds in the function. And with a, again, with a max CPU time of almost 16 milliseconds now. So like getting into that function code path is not free.

Now, what I want to do is, um, one thing that I talk about a lot is there’s a, like, uh, how functions differ, right? So if like, what we just did was we invoked the UDF and the select list that made a difference for how many rows came out of the query. Right?

So for a top 100, it ran a hundred times getting all thousand rows out of the table. It ran 1000 times. I’m going to update the first hundred rows of the table. And I’m going to just, uh, update, uh, this column called did it with the, with the results of that, with the results of the function.

Right? Which this function always returns the number one. Not a big deal.

Like not doing anything terribly interesting. And if we look at the query plan for this, well, 1.5 seconds in the, in the compute scalar. Again, just invoking that function.

Now, when I talk about, um, different ways that functions can get called in queries, there’s obviously the select list thing that we just looked at up there. But sometimes people might stick functions in other places. You might have a function in a where clause or a, sorry, a scalar UDF in a where clause or in a join.

Now, if, if I just put the scalar UDF in the where clause here, right? And so this is going to produce, uh, a hundred rows where the did it column, which is for a hundred rows is set to one. And, uh, where it’s equal to the scalar UDF.

Something really interesting is going to happen. Really interesting. I mean, something really slow.

You’re not going to like it. You’re not going to like it one bit. I’m also not going to like it one bit.

Not, not a good time. So what we have is, um, this thing happens, right? The clustered index scan, four milliseconds, not bad.

Um, but now rather than the compute scalar taking up most of the time, remember this is a row mode plan. So every parent operator also includes its child operator’s time. The compute scalar does have 17.1 something seconds under it.

But this filter operator is now 15.6 seconds. Because this scalar UDF is, uh, I mean, if we have the easiest way to think about it, if you’ve been traveling around SQL Server circles long enough is this UDF is not sargable. SQL Server cannot like has to run that function for every row, produce a result, and then compare it to whatever our comparator is up here.

If we go to the properties of the UDF and we go to the predicate and just to make life easy, I’m going to click on this little ellipsis over here. And what this does, or rather what this shows us is the column that we searched on, that’s this thing. But then we can see where it equals the result of this wait for function.

So we have to feed the wait for column into the UDF 1,000 times and then produce a result 1,000 times and compare it to what’s in the wait for column. Since we only set 100 columns, or rather 100 rows equal to one, only 100 rows qualify and get hit by the compute scalar. So that’s why the compute scalar is only about a second.

If I said like where this equals zero, uh, that would be the other 900 rows. This, that, like this would all take a lot longer, right? And I don’t want to keep you sitting there, uh, dealing with all that. And if we look at the executions, we will see now that there were 1,100 executions.

There were 1,000 executions for the where clause to filter out, to run, run 1,000 rows through the function, produce a result, compare it to the predicate. Uh, and then another 100 executions to, um, uh, produce the function in the select list, right? So 1,000 on the where clause and 100 in the select list.

So putting scalar UDFs in the where clause can be particularly painful, be a particularly painful thing for your queries. This gets, this of course is, I mean, it’s sort of unlikely that your scalar UDF is going to be the only predicate in a query though. So what I want to show you is we’re going to run that exact same query, except this time we’re only going to get the first hundred rows that we own, that we set to one for this anyway.

When we do that, we have an additional, we’re going to have an additional sort of filter in the, or I’m sorry, an additional predicate for this query. So if we look at the execution plan for this SQL Server is able to seek into the clustered index defined where IDs are less than or equal to a hundred. So we have an initial seek that trims the, the rows that leave the clustered index down, but those hundred rows still have to go into this filter.

There’s 1.5 seconds, like we’ve seen over and over again for 100 executions of the function. And then just about another 1.5 seconds in the compute scalar, right? Because the, we have the filter is responsible for its time.

And then the compute scalar is its time plus the filter time. Remember, it’s a row mode plan. The compute scalar, like essentially 1.5 on the filter, 1.5 on the compute scalar, right? So this query doubles in time because not only do we, do we filter the result like we did before, right?

And if we look at the predicate down here, it’s just like what we saw when we looked into the properties of things. And so this, this is like a, just a really sort of awful, painful thing. This time we only have 200 executions of the function though, because only 100 rows left the table based on the initial seek that we did.

That would be true even if we, even if we scanned the clustered index and a hundred rows came out of it, those hundred rows would have been processed still by the filter for the scalar UDF. So this is just in like queries, get getting data.

Sometimes people will put scalar UDFs into computed columns or check constraints. Now for queries running that do this stuff, SQL Server does have the UDF inlining feature available, 2019 plus, compat level 150 plus. That feature does not work with scalar UDFs in computed columns.

And that feature does not work with scalar UDFs in check constraints. At least as of the time I’m recording this, maybe SQL Server vNext will make improvements on that. I don’t know.

I don’t have any futuristic information there. So let’s create a slightly different table. And what we’re going to do here is we’re going to put that UDF into a check constraint. Right?

And we’re going to look at how this impacts queries. So let’s put a thousand rows in here. And putting a thousand rows into this table is going to go a lot slower than it did that first time.

Because for all those thousand rows, guess what we have to do? We have to run that check constraint to make sure it did something. Now, it’s a bit different in this query plan because that happens in the assert operator.

Right? So like this happened quickly and this happened quickly. But we spent a lot of time in here now.

And I forget exactly how useful this assert is in showing us what it did. But there we go. That’s actually pretty good.

So you can actually see the scalar UDF invocation by the assert right there. Okay? So that’s not good. You know, maybe if you have inserts into a table like this that go like, you know, run for a long time, this might help explain why.

Now, the other thing that’s kind of annoying with scalar UDF. So I’m just going to run these two queries back to back.

These happen very quickly. Even selecting the wait for column. Right? That’s not that like you see that there. Right?

We selected the wait for column in this one. We did not select it in this one. For the query where we do call the wait for column, rather, the one that has the check constraint with the scalar UDF, we get this T-SQL user-defined function, not parallelizable warning in the query plan properties. If you’re on a version prior to 2022 or you’re not in the cloud or managed instance, that will just say could not generate valid parallel plan.

We got some more information out of this thing recently. The one down here, the query down here that doesn’t call the wait for column, does not have this problem. This is a lot different from scalar UDFs in computed columns, where unless you have that computed column persisted and unless you have trace flag 176 enabled, this would have caused problems for both queries.

Okay? This would have been an issue for both of them. If we look at the number of executions, we only have the thousand executions from when we did the initial insert, but we didn’t get additional executions of this scalar UDF here because there’s no reason to execute it.

But its mere presence for this table inhibits or rather disallows a parallel execution plan. So even if, like, you know, you’re just, like, say you’re selecting from your table and you’re joining it off to a bunch of other tables and doing all your, you know, fancy join-y stuff, you could end up with real performance problems. Even though this function doesn’t execute every time, this function will force you to use a serial execution plan.

There’s no parallelism allowed there. So you have to be really, really careful where you, both in how you write and how you use and how you call and basically how you do anything with scalar UDFs. Again, putting them in check constraints or computed columns makes for, like, workload-wide performance issues.

Putting them in queries usually makes for pretty bad query performance issues. So it’s, like, even though, like, SQL Server 2019, again, 2019 plus, compat-level 150 plus, does help a pretty decent chunk of functions that I see. Again, the number of limitations and restrictions on it, it doesn’t help.

It can’t help every single function. And, you know, it’s, like, the amount of pain that scalar UDFs cause in, like, the grand scheme of things really does make me wonder, like, if you’re unaware of, like, how bad they are at this point, should you even be touching a database? Should you be anywhere near a SQL Server?

Should you just stick to Excel? Should you collect rocks? Is there a better opportunity for you elsewhere in the world? Because I just see them constantly, and people wonder why I make the face that I make when I’m, like, looking through a query that runs slow, and there I see it.

Some dbo.fn underscore this and that, and I’m like, ah, here we go again. It’s you. It’s you.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I’m going to hopefully record something a little bit less depressing next time for the next video. I’m not sure what it’ll be yet, but I’ll know when it comes to me.

So, anyway, goodbye for now.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.