Cursors In Scalar UDFs, and Other Performance Pitfalls In SQL Server

Video Summary

In this video, I delve into some fascinating aspects of scalar User Defined Functions (UDFs) in SQL Server. After a busy week teaching at the New England SQL Server User Group and dealing with client work, it was great to finally record something for you all. This session focuses on how scalar UDFs can impact performance, especially when they are not optimized or used inefficiently. I walk through an example where a cursor-based UDF significantly slowed down query execution, demonstrating the importance of using set-based logic and avoiding cursors whenever possible. The video also explores how to convert non-inlineable scalar UDFs into inline table-valued functions (TVFs) for better performance, showing that sometimes just adding a `DISTINCT` keyword or making minor tweaks can drastically improve efficiency. By comparing different approaches with query plans turned on, I aim to provide practical insights and solutions for optimizing your SQL Server code.

Full Transcript

Ahem. Ahem. Ahem. Ahem. Erik Darling here with Darling Data. And it would be completely inappropriate for me to tell you just how much I’ve missed you. This is my first chance that I’ve had to record this week. Last week I was gone, the last couple days, because I had to teach a class up in Boston for the New England SQL Server User Group. Fine bunch of people who, let me show up and get paid to talk about SQL Server for a full day. I had about 40 people there. I had someone fly all the way from Nigeria just to hear me speak, which is wild, because Nigeria, at least as far as I can tell from most maps, is FAR. And, like, I’m only used to people showing up from far away, either around the world, or around the world, or around the world. When I speak at past Summit, because it’s expected, right? Like, people just show up there from everywhere, like Mars. And when I’ve, when I’ve, I’ve actually, there’s been a few times when I’ve, I’ve done talks at SQL Bits in, in the UK, and I’ve run into people there from America who I’ve never run into in America. So, like, just seeing random people in the UK is always funny, and that, that happens. But, uh, this has been very busy.

Since I got back, uh, a lot of client stuff, you know, um, the nice people who pay me so that I can record these videos for free, because I don’t, like, get paid for this. So, uh, if any, if you see anything in this video, and you’re like, hey, maybe we could pay Erik Darling to do that, uh, you can pay me to do that, even if it’s standing here babbling. Which I apparently excel at. Uh, so that’s fun. Uh, in today’s video, ahem, we’re gonna talk about some interesting stuff with scalar UDFs, because recently, as recently as this week, I had to help a client with some scalar UDF rewrites, because, you know, if you’ve been paying attention to me, or really anyone who talks about SQL Server over the last, I don’t know, 15, 20 years, you may have heard that scalar UDFs have some performance issues.

You can’t put all those rows on a single thread, which sucks. The other bad thing about them is, of course, that they don’t run once per query.

They run once per row that the scalar UDF has to process, and depending on where in the query you put the UDF, and how many rows have to pass through that UDF for query correctness, for the sake of query correctness, you could end up with a lot of rows going through that UDF and that UDF executing over and over and over again. So if you, like, have a select top thousand query with a scalar UDF in the select list, you could execute that UDF 1,000 times, maybe more, in order to produce a result from every row that passes through that UDF.

And if you do something extra stupid, like put a UDF in a where clause, and that where clause has to process rows from a table with, like, I don’t know, let’s just say a million rows in it, that UDF may execute a million times, or more, to produce a result which to compare against the predicate in your where clause. So you can imagine, my great amazement, relief, just really breathtaking-ness, when SQL Server 2019 introduced scalar UDF inlining, which is a cool feature. It’s a very neat thing. It’s a very novel idea.

I don’t think there’s, like, one other not quite, it’s like RocksDB, maybe, or DuckDB that can inline scale our UDFs, but it was very exciting when Microsoft SQL Server did it, because, you know, I work with Microsoft SQL Server. I’ve never gotten paid to work with, like, anything else.

Literally anything else. So, at least database-wise. I’ve gotten paid to do a lot of other stuff. But database-wise, SQL Server is it. So I was really, like, dumb, like, wow, they’re fixing it, finally.

But there are a lot of restrictions on it. I’m not going to go to the KB article about it, because it’s depressing a little bit. But anyway, so this is kind of what the UDF that I had to fix started as.

It’s sort of a gaps in island problem, and I’ve retrofitted it to the Stack Overflow database. And the idea that I’m using in the Stack Overflow database is to find the longest streak of consecutive days that a user has answered questions, right?

So if they answered questions from, like, 2008 to 2012, one every single day, that would be a lot of days. I don’t know who does that. You have no life. Sorry. It’s just the way it is.

But the idea is to find the longest streak of consecutive days that a question was answered. And so the form of the UDF that I came across was something like this. Again, this is me retrofitting things to the Stack Overflow database, because Stack Overflow is not one of my clients.

But if you work at Stack Overflow and you’re watching this video, my rates are reasonable. I have several references, both inside and formerly at the company, who would be happy to tell you that you should pay me to do things.

So there is that. Anyway, this is what the UDF does. It declared a cursor. It selected some rows into a thing. And then it had all this weird logic to figure out if the streak was consecutive or if we had to restart the streak and, like, hold on to the highest current streak and then return that streak.

I’m just going to take this one quick moment here to say that if you are the type of person who has to declare cursors to do things, please always make sure that you at least include local in your cursor definition.

There are a lot of issues with global cursors, especially if you have two people try to declare them simultaneously, because you can’t.

They will clash and they will error out. So please at least, if you’re going to use cursors, declare them as local. That’s my spiel there. So there are, like, a bunch of things that I want to show you today. But the thing that I want to start with is actually how big of an impact this distinct keyword has in this particular UDF.

Now, I just want to make sure that we have no other indexes in this index created. And I have three testing queries over in this window. The three testing queries are, well, really the same query over and over again.

So again, I’ve got lined up for you today to look at a cursor version of this scalar UDF, a non-cursor version of the scalar UDF, and then an inline version of the UDF. And I think it’s kind of important to see the progression there, because not every scalar UDF has to have a cursor in it in order for it to be horrible.

So without the distinct keyword, right? We have the index now. We have this thing ready to go.

So this thing takes a real, real long time. Like a disappointingly long time. When we look at how long, it’s about 15 seconds of time this thing takes to run.

And this is absolutely zero buenos, as the kids say. And this was sort of the first thing. Well, not the first thing.

The first thing that I did when I opened up the cursor was just, like, hang my head. Because, you know, who would put a cursor on a UDF and expect something good to happen? You’d have to be one of the more foolhardy individuals that I’ve ever met in my life.

So that takes 15, well, let’s see, I guess 14 seconds. But if I move the right way, you can see 14 way down here somewhere, maybe. Oh, it’s on the other way.

No, there it was. Hang on. Where did you go? Where’s the 14 seconds? Next to the six. There we go. 14 seconds, right? Obviously not good.

We don’t like not good things at Darling Data. We have a strong HR policy against good things, against not good things. So let’s just recreate this function with the distinct keyword in there.

And let’s rerun this. And it’ll be about three times faster with the distinct keyword in there. It should take about four or five seconds, right?

So we cut 10 seconds off this thing just by adding distinct into the cursor. So if you’re afraid of making big code changes, sometimes a little distinct goes a long way. Don’t tell anyone I said that.

Usually I make fun of people who are just like distinct, distinct, union, distinct, distinct, union, union, union, distinct. Because, you know, they’re crazy. And they don’t probably just screwed up joins or didn’t use not exist or not exist when they should have. But I digress.

So you really do. So that was that, right? And obviously, you know, we are SQL Server data professionals. And the first thing we do whenever we see a cursor is we stand on our hind legs and we ring a bell and we say, have you tried thinking in sets?

And then we get a little treat. We get to be in the in crowd when we say, have you tried thinking? Have you thought about writing a set-based solution? Okay.

The thing is, and this is a big thing, is that, like, from a, let’s just call it programming perspective, because I don’t know what else to call it. Writing a set-based thing is kind of hard sometimes.

Like, I don’t know a lot of people who can bang out, like, a true gaps in islands solution, like, first try flawlessly without, like, a half a day of tinkering, depending on, like, how many weird, like, edge cases and outliers you might have to deal with. So this is the set-based solution, right, where we have to use window functions like lead, and we have to, like, calculate date diffs, and we have to use sum with, like, a real windowing function sum, not just, like, hey, sum column, like, legit sum with, like, an over clause.

And we have to remember to use the rows between unbounded proceeding and current row, because if you use range in here, your life is going to be nothing but pain. Everyone who loves you will leave you alone, right? You don’t want to use range.

And then we have to do all this stuff, right? So this is the inline version of the function, right? And, of course, we can have, let’s just move that one over there where it makes a little bit more sense. This is the scalar UDF version written using set-based code, where we would declare a variable up here, and we do all this stuff.

Now, the funny thing about this set-based solution is that when we get to thinking in sets, when we think really hard about our sets, really focus our brains on sets, we mess up scalar UDF endlining. So the whole CTE crowd out there who’s like, yeah, CTE, readable, ha, got your readable query here.

They are, they are, we’re going to be disappointed. If we had used derived tables in there, we could have avoided this unpleasant scenario. But with all the CTE, if I just, let’s just get an estimated plan for this.

We look at the properties here. We are going to see that our T-SQL UDF function, not parallelizable, reason shows up because CTE make, break scalar UDF endlining, make scalar UDF endlining not work.

Okay. So the cool thing is, though, is that if you’ve got cursor code, like there might be something you can tweak in there to make it less awful.

Not great, but less awful. There are cursor options you might tinker with. You might throw a distinct down there. Might do all, try all sorts of tricks.

But if you’ve already got sort of set-based code like this, it’s very easy to strip away the things that make this function not inlineable, right, and write it as an inline function.

Now, just keep this in mind, like, in your head when we’re looking at this code. The only, like, this is the, this is a scalar UDF. The only different, the only real, like, actual code body differences are in the scalar UDF, we declare this variable, we write CTE, and then we return this variable.

The inline version of this function, well, it’s, obviously it skips over declaring a variable, and obviously it skips over returning a variable, but it does the same thing otherwise, right?

Like, we don’t declare a variable up here, and we don’t set that variable equal to the final result here, and then return that variable, but the code is exactly the same otherwise.

Like, all those CTE are exactly the same. I changed nothing except, like, just those things in the code, and of course I told SQL Server that I’m returning a table rather than an integer over here. All right, so that’s it.

That’s all that’s different. All right, so let’s come over to our, our answer street testing window. We’re going to give these things a real thrill ride, and let’s run all three, and we’ve got query plans turned on, so we can see what the query plans did.

All right, this and this and this. So all three run, and sort of as expected, as is wont to happen when one trifles with scalar UDFs, we have, up here, we have the cursor version, which runs for, again, about four seconds, and what’s, again, something really nice about modern, execution plan analysis, is that we get operator time, so we can see that the clustered index scan of the user’s table, which took 174 milliseconds, was pretty quick, even though SQL Server’s like, oh, we need an index to make this fast, 85%, or almost 86%, we’ll improve the query by.

The thing is that that index is on the user’s table, the user’s table has absolutely nothing to do with this compute scale R, the compute scale R, if we subtract the 174 milliseconds from here, just runs for about four seconds even.

Right? Close enough. There might be like a 50 millisecond difference or so. So, obviously, like, we did better in here without the cursor.

Right? That’s fairly obvious. The cursor UDF took about 4.2 seconds, and the non-cursor UDF took about nine seconds, even not being able to produce a parallel plan anywhere along the way, yada, yada, yada, yada, yada.

We still did a lot better. We spread this query up by 4x. In some cases, that might be good enough. All right?

So, let’s compare that with the inline scalar UDF. Sorry, the inline table valued function version of the query, which finishes in 164 milliseconds. And I know what you’re thinking.

It’s an unfair comparison, Eric. You should drop clean buffers and run it at max.1 so it’s fair with the other query. Whatever other nonsense people bark at me when they’re, like, mad that I did something faster than them. I don’t know.

It’s the thing people have. So, yeah, this does have the somewhat unfair advantage. This does get a parallel execution plan. This does finish, I don’t know, 800 milliseconds faster, which is, you know, another good, depending on what you care about.

It’s a good improvement. One other thing you might notice looking at the query plans is that these two have very small query plans over here. These just show us hitting the user’s table.

This one actually shows us going to the post table. If we follow this trail long enough, we’ll see that we actually do touch the post table way over here, which we don’t see when we look at the scalar UDF versions.

And, of course, if we were to get the estimated plans for the UDF versions, we would see the UDF query plans in here and in here. Right?

And this query plan for the, you know, the set-based UDF is very close to the query plan for the inline table valued function. It’s just hidden all the way inside, buried in the UDF. Now, if you’ve watched my videos before, you might have heard me explain why.

It’s because if you’ve actually listened to this video closely, you might understand why. It’s because every time we pass a row through a scalar UDF, we have to run that UDF.

So, for this query where, let’s see, let’s just run this one real quick since it’s easy to, since it’s pretty quick to run. This returns 613 rows. If we were to get an actual execution plan for every iteration of the function, we would have returned 613 execution plans back to SSMS.

SSMS would have fallen over and died in its 32-bit misery. So, we’re probably thankful that we don’t get that. We just get a compute scalar that says, don’t worry, I took care of it.

Right? Okay. So, fair enough. All good there. Now, to close this video out, to my grand finale here, what I want to tell you is that either scalar UDF inlining, the feature, assuming that you have right functions that are eligible to be inlined, if you’re on SQL Server 2019, Enterprise or Standard Edition, I don’t know about Web Edition, because who cares?

You’re not serious. It’s like if you use Hyper-V in production, you’re just not serious. He’s like, you must be joking. So, if we have scalar UDFs that can be inlined automatically, or if we go through the great mental strain, the tremendous difficulty, peril, of rewriting scalar UDFs as inline UDFs, sometimes we may find that our query slows down quite a bit.

And the reason why is because if we get the estimated plan for this one, what you might see in your query plans are eager index spools.

Now, I’m not going to sit here and make you wait for this to run, because an eager index spools coming off the post table is fairly disastrous. It usually runs somewhere around 45 seconds to a minute.

So, what the scalar UDF inlining or rewriting scalar UDFs as inline table-added functions can often expose as really bad indexing or just insufficient indexes to help your queries.

So, if you rewrite a query that has a scalar UDF in it, you rewrite the scalar UDF and you find that the rewritten query suddenly slows way, way, way down, make sure you get that actual execution plan.

Make sure that you are on the lookout for eager index spools and make sure that you create indexes that adequately satisfy the needs of your scalar UDF so that SQL Server doesn’t haul off and create a whole index for you on the fly every single time.

It’s not a good time. Not a good time at all. You don’t want that. And, of course, that won’t happen with this version of the UDF because this UDF takes one row and goes and executes it because this one takes all the rows and comes up with the query plan and inlines it.

SQL Server thinks, well, I don’t want to do that nested loops join that many times. That’s madness. Absolute madness. I’m not going to… I need an index. You want me to do that.

So just be careful there and be prepared to create indexes if you run into that scenario. Of course, I have lots of videos about eager index spools on this channel, so if you search my channel for eager index or spool, probably just search for spool.

That’d be good enough because you’ll learn about other types of spools too. Might as well increase the entire surface area of that smooth brain of yours, get some good wrinkles in there. Right?

Half the battle and all that. So anyway, this was a small portion of the client work that I did this week. got a real bad UDF, rewrote it, and just because I had something to sort of work off of, I figured, why not give you three examples over here and show you some of the downfalls of cursors in UDFs, UDFs in general, and something that you might run into, an eager index spool.

If you rewrite a UDF as an inline UDF, and performance, for some reason, gets worse. So you have all of, you are equipped with all of the knowledge you need to go from point A to whatever your end point is.

So anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that you are better prepared to do your SQL Server performance tuning job because I care about that.

If you like this video, I do adore a thumbs up. I do. I do. I sometimes like comments depending on, depending on what’s in them.

Sometimes they’re good. Sometimes they leave, leave a bit to be desired. Sometimes they hurt my feelings, my deep, deep feelings. And of course, if you like this sort of SQL Server content about performance tuning, you can subscribe to my channel so that every time I unleash these nuggets before you, you are notified promptly and you can see them before anyone else.

You can be the first person to see them. Wouldn’t that be great if you were the first person to ever see them? Like viewer number one every time? Surely you’d win some sort of prize if you could prove that sort of thing.

Anyway, thank you for watching. I’m going to turn off all these really hot lights now because I feel a patina beginning to form and I do not like feeling patined.

So, thank you. Thank you.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.