A Little About Scalar UDFs and Read Committed Snapshot Isolation In SQL Server

A Little About Scalar UDFs and Read Committed Snapshot Isolation In SQL Server


Video Summary

In this video, I delve into an interesting quirk involving recommitted snapshot isolation and non-inlineable scalar UDFs in SQL Server. Specifically, I explore how these UDFs can affect the row versions read by a query under recommitted snapshot isolation, leading to seemingly inconsistent results. By walking through a demo, I illustrate why this happens and explain that it’s not an issue with snapshot isolation itself but rather a consequence of scalar UDFs executing once per row. If you’re curious about more details on SQL Server isolation levels or want to dive deeper into the topic, be sure to check out my playlist titled “Everything You Know About Isolation Levels is Wrong.”

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we’re going to talk about a little oddity with, excuse me, recommitted snapshot isolation and non-inlineable scalar UDFs. I don’t think this goes for snapshot isolation.

This is only recommitted snapshot isolation. And this could also like just as easily go for a multi-statement table valued function. But the basic gist of it is, is that like when you run a select query and you read row versions, like the normal select query, the row versions that you read are from if under snapshot isolation, when the transaction begins and you first access the row versions or under RCSI, if you like, you know, when you, when your query starts.

And the thing that, um, scalar UDFs mess with is that the query inside of them. Remember, because like we’ve talked about scalar UDFs a bunch of times here where they don’t like the non-inlineable ones don’t run once per query. They run once per row that the query has to like process and return in some manner.

I’ve gone into detail on that in another video, so I’m not going to redo that here. But the main thing is that like in your select list for every row that your UDF has to execute for, you’re going to read data for the point in time that the UDF starts reading data.

So like your outer select, you know, that’ll, you know, do its thing. But the UDF that executes once per row, that’s going to start reading data for each time that the UDF executes. So you could read different rovers. It’s almost like going back to read committed where, um, this is again, something, if you have questions about any of this, if you want to learn more about this stuff, I have a whole playlist on everything you know about isolation levels is wrong.

And that has a lot of background on this, but it’s like almost like with read committed where like the, like when you start reading data, it’s like a whole bunch of different points in time as you like read along an index. So that’s, that’s kind of what the gist of the demo is.

So you’ll see it in action in a moment. If you want to click links, I’ve got some links down in the video description. Uh, you can hire me for consulting. You can buy my training. You can become a member of the channel that, that donates a tiny fraction of money, like, like the office space, penny stealing thing, uh, every month and say, thank you for the high quality SQL Server content that I produce.

Uh, you can ask me a question for office hours. And of course, if you, if you like this content, uh, enough to, um, maybe spread the good word, uh, about darling data, uh, you can like, like subscribe. And of course, tell a friend or two or 10, uh, once again, I will be at past data community summit, Seattle, November 17th to 21st, uh, two huge days of T-SQL pre-cons with Kendra Little.

It’ll be the best T-SQL you’ve ever learned. So there’s that. Anyway, let’s get back to the database party. I love this picture. So spooky and database-y Halloween-y. I’m almost not looking forward to changing it when Halloween’s over. Maybe, maybe I’ll leave my Halloween decorations up until Christmas.

Who knows? Who knows what’ll happen? Anyway, uh, let’s make sure that, uh, recommitted snapshot isolation is on and we need to make sure that we can, uh, reload these tables here. So we’re going to create a couple of tables and you, you may have seen this demo before in other videos that I’ve done. If you have, I promise this one is a little bit different, but I am going to, uh, show the sort of basic, um, stuff first and then we’ll go back and look at, um, we’ll look specifically at the function thing to see, to see why it’s different. So if we look at what we run this query and we look at what we get back, um, we get rows one through 10 and all total is a thousand for these. Great. Uh, over in this window, I have an update statement. Then I’m just going to run in a loop to add one to the totals column. Where that’s going to get interesting, is with this query. So that update is not yet running. And if I, let’s put, let’s do the read committed one first. So we have that fresh in our brains, uh, right now read committed is telling us that, you know, the same thing as before, but if we start running this update, right. And we come and run this query. The main thing to keep in mind is that for every row that comes back, uh, we’re going to get slightly different numbers. It’s like 100, 200, three, four, five, six, seven, eight, nine, up until we move into a new one. And this will happen every time we run this, because like I said, um, under read committed, as you’re reading data, like, like, like, especially like seriously, if you haven’t watched the isolation levels playlist, go watch it. If you’re, if you’re, if you’re lost on that, as you’re reading data, data can change all around you. Read committed doesn’t guarantee a point in time. Read it guarantees many different points in time that you’re going to read from. So the, the totals for that are all whack, right? So if we, if we run this without the update running, we get a stable result back. So we’re going to do is quote out this read committed lock end, and I’m just going to reset this table to something, right? And show you, you know, just start from sort of zero. Now we get all one thousands, right? So this is our starting place.

And if I start running this update again, right, if I kick this off and we come over here under read committed, these numbers are all stable, right? So because we read a snapshot, read committed snapshot isolation, we read a snapshot of the data. Every time we run this, we’ll get a consistent result back, right? That wasn’t true under read committed. All right, cool. So why does the UDF change things? Well, first let’s create a simple UDF in here. And this UDF is, we’re going to do things slightly differently for this because it just to make it clear what’s happening.

Inside the UDF, we’re actually getting the sum here. So I’m going to change my select query a little bit to rather than sum a sum, I’m just going to get the max from a sum. So inside the this column right here, we are getting this is where our UDF kicks off. If I show the estimated execution plan for this, we get as usual for non-inlineable scalar UDFs, we get two execution plans back, we get the execution plan for the query that’s running and doing stuff, right? This is our outer query, but then we get the plan back for the for the UDF in here. This is where we’re grabbing the sum of stuff from table two. So what I’m going to do is we don’t have any locking hints on this one, right? So we’re not using read committed lock this because this query is executing without any locking hints. It’s using row versions when it runs. But if I come over here and I start running this update again, we’re almost going to see something as weird as when we were using read committed snapshot isolation, where every single row in here is going to be slightly different.

It’s like two 300 500, 879 94, 31, blah, blah, blah, blah, blah. It’s like every time we run this, we’re going to get what seems like non snapshot inconsistent results back from the query that we’re running in there. But it’s really because every time that query executes is once per row, right? Every time that query executes, it’s reading the row versions from a different point in time.

So this can make things look all skewed. So this is just another weird problem that you can run into with queries that that have non in lineable scalar UDFs in them. This is not the fault of read committed snapshot isolation. This does not mean read committed snapshot isolation is bad. This is just, uh, this is a byproduct of the, uh, sort of procedural black boxy, like row by row things that happen, um, when you use, uh, scalar UDFs and SQL Server. Um, so that’s fun, right? Great. It sucks.

Uh, scalar UDFs, stop using them. Get away, right? Just run screaming, go do something else with right functions that different. Don’t do it. Uh, it will cause nothing but problems and headaches for years to come. Your children will inherit these problems. Uh, they’re a nightmare. So don’t do it, right?

No scalar UDFs because they ruin everything except demos for consultants who, um, find weird things about scalar UDFs and isolation levels, which I suppose I should be somewhat thankful for, but, um, you know, it’s a little tough to find gratitude for things that are so annoying, but I don’t know.

Maybe it made a good video. Maybe it didn’t. We’ll find out. We’ll see how many people give this a thumbs up, won’t we? All right. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video. Au revoir and all that stuff.

Bye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.