How To Get Only Letters Or Numbers From A String For Free In SQL Server

How To Get Only Letters Or Numbers From A String For Free In SQL Server



Thanks for reading!

Video Summary

In this video, I delve into three underappreciated functions from my GitHub repository that can significantly enhance your SQL Server string manipulation tasks. Often, when working with clients, I encounter subpar versions of these functions—clunky and inefficient code that could be easily replaced by the well-optimized solutions I present here. The focus is on `get letters`, `get numbers`, and `strip characters`—functions designed to streamline common string operations like extracting specific character sets or removing unwanted characters from strings. These inline table-valued functions are efficient, avoiding the pitfalls of scalar UDFs and multi-statement table valued functions, making them a valuable addition to your SQL toolkit. Whether you’re dealing with phone numbers, usernames, or any other text data that requires cleaning, these functions can save you time and effort. So, give them a try and see how they can improve your workflow!

Full Transcript

Erik Darling here. Still alive. Amazingly. Still talking about SQL Server. Probably a little less amazingly. In this video I want to talk a little bit about three functions that are in my GitHub repo that don’t get enough, if any, attention whatsoever. Look, you know, look, look, I got a GitHub repo. Ooh, everyone has a GitHub repo. It’s not anything crazy to have a GitHub repo. Most of what gets attention in my GitHub repo are the stored procedures I write to help with SQL Server troubleshooting. Things like the stuff over here that starts with the SPs like SP Health Parser and SP Human Events and SP Log Hunter and SP Pressure Detector and SP Quickie Store. There are a lot of SPs in there. These string functions get very, very, very, very much, attention. And it’s sort of a shame because when I’m working with clients, I often see like really crappy versions of these functions like copy and pasted from some website from the year 2000. Inevitably scale our UDFs, every single one of them. Some of them multi statement table valued functions. But like always like while loops and other crap that make your query easier.

Keep those queries awful and hate you slow and make you need to hire me. Maybe I shouldn’t be talking about it. Keep those queries slow. Call me. I’ll do something. I’ll wave my hands at them. But the three basic things that I see a lot of people trying to do in queries a lot has to do with like string manipulation. Microsoft has given us some help with string manipulation lately, sort of. We got a string split and a string ag and I don’t know. I guess that’s it. You get what you pay for. It’s only $7,000 a core. Why develop anything useful? Apparently we’re getting regex. Azure SQL DB has regex in preview, which can only mean SQL Server vNext is getting regex in preview. So, um, can’t wait to start seeing regex in a where clause. To be honest with you, gonna make the, revive the entire SQL Server community having regex in where clause. Everyone’s coming back. Everyone’s coming back.

All right. Not a sink and ship at all. So, uh, these are the, what the three functions do. Uh, they are called get letters, get numbers, and strip characters. Uh, they all do nearly the same thing. I’m going to talk about what’s in these files a little bit before I show you the, the stuff. So in every file, there’s two versions of these. Uh, there’s one version where you can use a numbers table and the numbers table does help a little bit with performance. There’s also another version that uses a CTE to build up a sort of, uh, internal numbers table, uh, that looks like this. Right? So there’s a little bit more work in there. Uh, the numbers table is a bit faster. It’s not, you know, depending on what you’re doing, the, this, the speed difference is, uh, somewhere between negligible and, um, uh, profound.

So, uh, you know, make sure that whatever version of these you’re using suits your use case appropriately. Uh, so what these things both do is, uh, this one is get letters. So what this does is it basically, uh, uses that string splitting thing to do some XML-ing. And what we do is look for, uh, basically any single character in a string that matches the AZ-AZ, uh, uh, uh, pattern, right? So that’s all the, you know, characters between A and Z. Um, so that, that, that, that’s what that does. Uh, if you need out weird characters, I don’t know, write your own function. Uh, and then the, the get numbers function does just about the same thing, except it only looks for where, uh, the single character is like zero to nine.

The strip characters one is a little bit different because what this does is it seeks to remove, uh, some, some matched expression from, uh, from your strings. All right. So they, they all function slightly differently. Um, I think maybe, I mean, sort of theoretically, technically, strip characters could replace both get letters and get numbers. But, um, I, I sometimes find that it is, it is, it is helpful to write code that is geared towards a specific task.

Uh, the, the more, um, the more that your code might have to do or, like, the more generalized your code is, sometimes the less efficient it is. Uh, overly generalized code, big, bulky, slow, lots of thinking, decision-making, uh, things to check on. Uh, more focused, narrow-casted code generally tends to be faster and work better.

So, uh, with that out of the way, let’s go over to Management Studio and let’s just see a couple examples of these things working. So, uh, this is the get numbers, uh, function. And if I run this, uh, we’ll see, uh, a bunch of nulls where some usernames don’t have numbers.

And then for lines where the usernames do have numbers, we will only get the numbers from them. Things like 4614. Without the user, right?

So, that’s, that’s, that’s get numbers, which is pretty handy. Uh, get letters does the exact opposite of get numbers, where, uh, when we get down to this batch of users, where there, there, there were numbers there at some point, uh, the numbers have been completely stripped out.

All right? So, fun stuff there. Uh, the strip characters one, well, like I said, there’s a, there’s a solid case for strip characters replacing both. So, both of those, uh, in different ways.

But, um, well, so like there are some weird, there are some weird characters in some of these. It’s a screwy Unicode thing. Uh, the results look a little weird sometimes with this.

But if you look at, uh, the, the, the display name column in the Stack Overflow database has some, has some very weird things in it. It’s an Envarcar and strange things happen. So, but we, when we want to, uh, strip out characters, um, well, this is, this is the one where we, where we were removing numbers.

And we can see the numbers get removed in here. Like that. And for the one where we were looking to, uh, do the opposite, which is one where we were trying to remove all that stuff.

Uh, so there’s, like I said, there are some strange things in here. But the, the important thing is that for these, for these rows, uh, we only get the numbers back from those. Uh, for, for these, uh, I would have to, like, go through and, like, do, like, an ASCII or Unicode check to figure out exactly what is odd in here.

But Tomek Melissa, I guarantee you, has some weird Unicode character somewhere in that name. Uh, we just, I, I just didn’t really do all that much digging on it. So, if you, if you have a need in your database to either, um, a lot of, I see a lot of this stuff with, like, phone numbers or, you know, um, something along, something along those lines.

Uh, and it’s usually pretty good for, these are pretty good for that stuff. Uh, they are inline table valued functions, so they don’t have the same problems that scalar UDFs or multi-statement table valued functions would have. So, if you want to give these a shot and see if they fit your use case, you know, I got get letters.

I got get numbers. And I got strip characters. And they all do, you know, just exactly what they sound like. So, you should try them.

And if you, I don’t know, like them or you find bugs or you find things that can be improved from a performance perspective, well, this is what GitHub is for. We collaborate. We’re a community.

We all high-five each other. No one gets paid. No one gets paid. All right. Cool. Uh, thank you for watching.

I hope you enjoyed yourselves. I hope you’ll try my handy little functions out. Uh, I hope you learned something. Uh, what else? If you like this video, I like thumbs.

I love thumbs. Thumbs are the best. Comments are nice, too. Uh, and I also like subscriptions. Subscribers.

I like when people say, I want to hear from you more often, Erik Darling. Uh, because, you know, that’s how I make friends. I yell at my camera on YouTube.

So, anyway. Like. Subscribe. Hang out. Spend some time with me. Mm. Try some functions out. They’re free.

The first function’s free, kid. All right. Uh, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.