A T-SQL Query To Get The Text Between Two Delimiters In SQL Server
Thanks for watching!
Video Summary
In this video, I delve into the intricacies of using the `SUBSTRING` function in SQL Server to extract text between two delimiters. Erik Darling from Darling Data shares his insights on a common pitfall: misunderstanding how the arguments work. He explains that while many assume `SUBSTRING` works with start and end positions, it actually requires specifying the length after the starting point. To illustrate this, he uses examples from the `sys.messages` view, highlighting the importance of being precise when defining these parameters to avoid errors like “invalid length passed.” The video is a must-watch for anyone who frequently deals with string manipulation in SQL Server, as it demystifies a function that, despite its simplicity, often trips up even experienced users.
Full Transcript
Erik Darling here with Darling Data. Of course, who else would have me at this point? I can’t imagine. I can’t imagine at all. In today’s video, we’re going to talk about how to get the text between two delimiters in a SQL Server query. The reason why we’re going to do that, and I know this may sound dull and pedestrian to many of you, the reason why I’m going to do this is because every month I get this email from Google, and it’s all about my website, search results, performance, and stuff like that. And every month, the blog post that I wrote about this is the number one post. All the stuff that I write and record about, all the things that I do, how to get the text between two delimiters is the big winner. Alright, fine. Okay, you get a YouTube video too. That’s your prize. For being the number one blog post on erikdarling.com, formerly ericdarlingdata.com, you get a YouTube video. Memorialized, canonized, itemized, you get a YouTube video.
So the first thing that everyone gets wrong about substring is what each of the arguments does. And this isn’t purely in the context of SQL Server. There may be other implementations of substring that do things differently. But in SQL Server, the substring function has three arguments, and they are the input expression, in other words, the thing that we are going to get a substring of, the start position of that string, and then the length of the string. It is not start and end, right? This third one does not meant to find the end position of a string.
This is where you tell substring how long the string is after the starting point. Alright? So the way this works, and I’m going to be using the sys.messages built-in view for this.
And I’m going to specifically be searching for entries in the text field that have at least two semicolons in them so that this works. If you don’t do that, or if you don’t have something, if you don’t, like, if you’re not really confident about the substring stuff of the character presence or position in the string, you’re going to write a lot of really defensive, like, code around this.
Otherwise, you’re just going to get a bunch of, like, you know, invalid length passed to whatever errors. And those suck, and I hate debugging them, and it’s miserable, and I don’t want to do it. So I try to be as defensive as I can when I write these things.
So anyway, this is sort of a basic way of doing stuff. The fully parsed string portion is down here. This is the only section…
Zoom it, you are on my last nerve. This is the only section that’s going to actually do what I’m trying to teach you how to do. There are a few other columns in here to enumerate what the actual first, second… Come on, zoom it.
…and third arguments are as they’re passed in. All right? So we’re going to run this. And we’re going to look at the results. And what we’re going to see is…
Well, I mean, exactly what I was telling you about here. The first argument is the text of the error messages from sys.messages. The second argument is, like I said, the starting point of the string.
And the third argument… Notice that this is not always a higher number than this, because all this is is the length of the string after this argument, right?
So for this one, we go from the 71st byte and then we go 6 bytes over. This one, we start at byte 59 and we go 441 bytes over. This one, we start at byte 31 and we go to a 13, right?
So again, first argument, the input string. Second argument, starting position. Third argument, length of the string.
That’s why when you’re doing this stuff, what you have to do to make sure that you get all that positioning right is a lot of jumping around.
And it’s really annoying and I hate it. And I’ve had to do this for my entire SQL Server career. And I almost never remember how to do it correctly the first time around. And if it’s one of those situations where we need to do like a reverse and find like the last thing in a string and then jump, forget it.
I have to look for like 15 different blog posts before I can get that one right. So for the parse string one, though, the thing that we’re actually learning how to do today, the first argument, pretty self-explanatory, just m.text, right?
That’s all we need there. The second one, we want the car index in the text column of colon plus the length of whatever you’re searching for. So if we were searching for something that were two characters here, if we were searching for like double colon, then the len would have to change to two, all right?
Or we’d have to change plus two or plus double colon here. Otherwise, we would have an inaccurate starting position. The third one for the length of the string, this is where things get annoying and complicated, right?
The first two things, fairly self-explanatory, right? The thing that we want to split, the thing that we want to substring rather, and then the car index of the first thing that we care about plus the length of that first thing that we care about so that we start on the thing, like right next to it, right?
We like colon right next to the colon. You get what I mean. The third one, this is the tough part.
So one of the nice things about car index as a function is that it, where that, so pad index does not have this, car index does have this, is car index supports a third argument optionally, where you can tell it the starting position that you want to start looking at.
So for this is going to look exactly like it did up above. We’re looking for the first colon in the string, but the third argument is going to tell our query to look for, is going to start looking after it finds this.
So basically, after it finds this thing plus the length of this thing, right? This line up here is basically just a copy and paste of this. We want to find the very next colon in text after we find a colon in the text plus the length of the colon right there, right?
So because remember, we got to bump up one over so we don’t capture like anything stupid. Then, and this is where, this is the part that always murders my brain. After that, we have to subtract, right?
We have a little subtracty thing here. That means subtract. The subtracty thing means subtract. We have to subtract the length of the thing that we’re looking for. And we also have to subtract the car index of the first position that we find, right?
So essentially, we have to subtract this from this so that we get that, we get the correct string length. Because remember, we’re not, substring in SQL Server is not expression start end, it’s expression start length.
So in order to figure out how many characters or how many bytes over we need to go from the thing we care about, we need to subtract the first position that we find and the length of the thing from the second position that we found, right?
Because that’s what gives us the text between delimiters and not just text from delimiter to the end of the string. I know.
This stuff isn’t fun. Most sensible people, if they have a job and they work for a place, they’ll probably write an inline table valued function to do this because remembering this is awful.
But it is the most popular post on my blog, a popular blog on my entire website, so it gets a YouTube video.
And that’s about it. So apparently a lot of people always forget how to do this. Myself included. Anyway, thank you for watching.
I hope you finally learned how to use substring in SQL Server. I hope you enjoyed yourselves. I hope you learned something, like maybe what substring actually does in SQL Server.
If you like this video, lots of thumbs, lots of comments. I’ll take anything.
Send me a letter. Just kidding. I don’t want, I can’t read, I can’t read. And if you like this sort of SQL Server content, maybe not exactly this SQL Server content.
Apparently a lot of people really like this. Maybe this is what I’m going to blow up on. If you like this sort of SQL Server content, you can subscribe to the channel, like 4,000 almost other people have.
I don’t know what I’m going to start saying when over 4,000 people have subscribed. I’m optimistic that I’ll survive, I’ll live to see that.
But anyway, thank you for watching. Happy, happy strings, happy substringing. Oh, yeah.
Okay, goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.