How To Use A Numbers Table To Replace WHILE Loops In SQL Server Functions

How To Use A Numbers Table To Replace WHILE Loops In SQL Server Functions



Thanks for watching!

Video Summary

In this video, I dive into the world of scalar UDFs that contain while loops—something I frequently encounter when working with older applications on SQL Server. These functions, often used in legacy systems from a time when developers didn’t fully grasp set-based thinking, can wreak havoc on performance as databases grow and evolve. To address this issue, I demonstrate how to rewrite these functions using numbers tables or tally CTEs, which allow for more efficient set-based logic that significantly improves query performance. By sharing practical examples and tips, I aim to help you tackle similar challenges in your own projects. Whether you’re a seasoned SQL Server professional or just starting out, this video offers valuable insights into optimizing function performance through modern techniques.

Full Transcript

Erik Darling here with Darling Data. And continuing with this week’s theme where I will no longer have to apologize for the length of my videos, we’re going to have another short one about how to rewrite functions with while loops in them. Because this is something that I end up having to do a lot for clients who have, you know, these, you know, I don’t know, 10, 15, 20, 25 year old applications. that have been on SQL Server since it came on floppy disks. And back then, people just didn’t know any better. I guess data was all small and easy enough that, you know, functions didn’t really do much of anything. And, you know, now that their databases are all growns up, performance is a wreck and scalar UDFs are often the case and cause for why. And we’re going to, we’re going to talk about that. So, if you want to join, like 20 other people who have memberships to this channel, click the link in the video description, it’s like four bucks a month on the low end and like 10 bucks a month on the high end, you can, you could spend that money in much worse ways. But I understand not everyone is charitable. And with the holiday season coming up and inflation being what it is, if you have to choose between buying your kids a new sock and giving me four bucks, well, I understand.

If you don’t feel like doing that, like, comment, subscribe. It’s all fun. Especially the commenting part makes me feel a bit less lonely. Of course, the comments are usually where I have to apologize for the length of my videos. If you need any SQL Server consulting help, this is the stuff that I’m great at. If you need anything else, I can probably do that too. And as always, my rates are reasonable. If you want some very low cost, very high quality training, you can get about 24 hours of it for about 150 USD at that link with the combination of that link and that, that coupon code, which is also available in the video description.

It’s your lucky day, I suppose. This November 4th and 5th, I will be doing full day pre-cons with Kendra Little at Past Data Summit. That is coming up very soon and it’s going to be a lot of fun. And I hope to have lots of interesting content from Past Data Summit for you. Live! Coming to you live from Past Data Summit. Anyway, let us get on with our party here.

Let us not accidentally click one more time and fade to black like amateurs. Let’s make sure we do this right. Now, a lot of what I end up having to fix for clients with scalar UDFs is some kind of, I mean, it could be a multi-statement table diode function too. They both have sort of the same problems when it comes to wrecking query performance.

You know, a lot of the, a lot of the multi-statement table diode function ones that I have to fix are string splitters, which are, you know, always a fun challenge. Some of them are string concatenators. There’s all sorts of fun things that happen when, when you start getting into that stuff and the, and the particulars of that.

But, you know, a lot of it too is, you know, stuff like this. So I’m just going to actually just show you because it makes things easier. In this case, you know, like a lot of these functions might be used to like strip characters out.

And if you go to my GitHub repo, which there’s a link to it somewhere, I’m sure, you’ll find some functions from me that can strip letters, strip numbers, or match a pattern and strip that pattern out. So there’s a few good things there.

But this is just kind of an abridged version of that where, you know, there will be some function from the, you know, from the beginning of SQL Server time that has a while loop in it. And it’ll iterate over every, you know, character in a string and, you know, figure out if the character in the string matches some pattern or whatever.

And, you know, back in the day, a lot of people found that a very easy and approachable way to do things. Because, you know, not a lot of people were hip to the whole, you know, databases, think in sets, don’t write procedural code and functions and then expect it to perform well thing.

You know, there were a lot of people who were missing that part of their brain because it hadn’t been invented yet. It was just an empty space in their head where they were just like, oh, we’ll get a part in here someday. But, you know, here’s the result of the function.

And I just want to point out, if you’re ever, like, rewriting functions, this is the wrong way to benchmark them. Because, you know, if you’re just, you know, fixing one single string, you’re not going to see an appreciable difference. You really want to incorporate the function into the queries that are calling it.

Not just, like, you know, do stuff like this to, like, unit test it. Test it for correctness. But this isn’t a good performance test right here. We’re not going to do a performance test here because I’ve done a million function videos about performance testing.

This is just an example of how you can get out of, like, how you can rewrite functions with while loops in them so that they are less awful. So if we run this, we will get a count of five and five. One for the count of the letters, which is, we’re saying, where everything is not a number zero through nine.

And then one that is a count of the numbers where everything is a number zero through nine. And, of course, we’re using pat index to do that pattern matching. Because what else would you do to match patterns if set views at the pattern index function?

I don’t know. I don’t know. You’re crazy. But, like, an easy way of doing it is to get a numbers table involved in your database. It does not have to be a particularly large one.

It does not have to be a bajillion rows. You could have a simple, you know, like, 10,000, 20,000, maybe even 100,000 row numbers table of just the numbers one through 100,000. And you could go a really long way with doing this sort of thing, right?

You can do a lot of, you can get through a lot of characters with a 100,000 row numbers table, specifically about 100,000 characters. If you have strings longer than that in your database, obviously, you’re going to have to compensate for them somehow. But with a larger numbers table, perhaps.

But, you know, for most people, having more than 100,000 characters that you would have to do this sort of thing on is perhaps a bit much. So what we’re going to do is we’re going to use a numbers table to our advantage. And we’re going to use this function to completely replace the while loop that we had before.

And let’s just make sure that this thing is in there. And, of course, I’m going to write the query that calls this a little bit differently, where I’m going to put the patterns that I care about in this values clause and then cross-apply the function with the values of the values clause over here.

And we will get, excuse me, for each of these, we will get the correct answer. So for the ones that are numbers, we had five. The ones that are not numbers, we had five.

And, of course, these things both finished instantly, just working on a single string. Pay no attention to what you saw down there. That was something else that I was working on for a different demo for a different day. Pretend that didn’t happen.

All right. It’s our secret. Doesn’t it make you feel special that now you and me have a secret together? You’re honor-bound until death to keep that secret, you and me.

So no tattling, as they say. Anyway, if you find yourself having to rewrite functions that have while loops in them, numbers tables are very good for that.

There are great examples of trading numbers out there in the universe. I believe Aaron Bertrand has a bunch of posts about it. There are a lot of people who use them for a lot of good things. They’re not always an awesome performance boon when used directly in queries, sort of like date tables and date dimension tables or whatever.

They can be very useful for things, but they can also be involved in weird performance issues because joining to those is often kind of awkward. But in this case, where we just have to use that numbers table as a sort of utility, like fake row number type thing, it’s pretty easy to do this here.

If you’re not allowed to create a numbers table in your database, you can, of course, use CTE and nest those to create a large row set of the numbers, one through whatever, in order to do this instead.

This just makes the code a lot nicer and a lot more compact. So this is one way that I help people when we need to rewrite functions with while loops in them by using either a numbers table or a tally CTE, as it’s sometimes called, so that we can do what we need to do with set-based logic rather than with procedural looping logic because that usually helps query performance a whole lot.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that this video was a satisfactory length for everyone. Not too big.

Not too small. It’s the Goldilocks length of the video. Because I don’t think Goldilocks ever had to apologize for her length. But, I don’t know. That’s a different sort of fairy tale.

Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.