Why You Should Always Use Unicode For Dynamic SQL

Why You Should Always Use Unicode For Dynamic SQL



Thanks for watching!

Video Summary

In this video, I delve into the importance of proper string typing when working with Dynamic SQL in SQL Server. I highlight common pitfalls, such as using `varchar` instead of `nvarchar` for Unicode characters, which can lead to data loss and incorrect results. By emphasizing the need to preserve Unicode-ness throughout your dynamic SQL strings, I aim to help you avoid these issues and ensure that your queries handle international characters correctly. Whether you’re crafting complex Dynamic SQL statements or designing robust database schemas, understanding how to properly manage string types is crucial for maintaining data integrity and preventing unexpected behavior.

Full Transcript

Erik Darling here with Darling Data. And I think the theme of this week’s videos is going to be length. Some people have taken to complaining about my length in the comments. And so rather than apologize for my length, as I apparently often have to do, I’m going to do some short ones this week so that everyone can be comfortable with my length. Now, this video is going to be about Dynamic SQL string typing. Since we’re going to continue with the theme of apologizing for lengths, using the right type and length and strings for Dynamic SQL is very important. You don’t want to apologize for having too short of a length when you’re crafting Dynamic SQL. So always be careful there. But the subject of this video is more along the lines of make sure that you know exactly what kind of data is going to end up in your Dynamic SQL strings.

Because if you have Unicode characters in there, they could disappear in a variety of different ways when you are drafting and crafting and concatenating all your Dynamic SQLs together. So, if you like this channel, you can join like 20 other people in getting a membership for like $4 a month. I know. Big money. Watch out. Erik Darling might get a new Adidas shirt soon. Lord knows I’ve been wearing this one for three years or something. If you are unable to fork over a Paltry $4 a month, you can do likes and comments and subscribes. Because, you know, you gotta have options, right?

If you need SQL Server help from a consulting perspective, these are all things that I am very good at. Just ask anyone. And you can, you’ll get an answer, I’m sure. If you need some low-cost, high-quality training, you can get about 24 hours of it for about $150 USD with that discount code at that URL. The link for all this stuff is down in the video description. So, you can hang out in there if you’re feeling clicky.

I will be live and in person at Past Data Summit, November 4th and 5th in Seattle, Washington. That week, I will most likely be, what do you call it, intoxicated? No, I’m gonna put up some weird, like, livestream-y videos from Past Data Summit. I might try to interview some smart people. I don’t know yet. I’m sort of getting that figured out because I am enjoying the YouTube video content thing.

I’m just not sure if I want to have, like, a bunch of pre-baked stuff going on during Past or if I want, you know, live, fun Past content. Who knows? Maybe I’ll give Steve Jones a wedgie. Anyway, let’s get on with… Oh, we faded to black. Ah, crazy. All right. So, let’s talk about Dynamic SQL. And this is all, you know, a lot of the videos that I record are based on experiences and interactions with clients of mine, the nice people who pay me to have, you know, the spare time in the day to record these videos.

So, we’re just gonna make sure there’s nothing going on. Even though indexes and database contexts have nothing to do with this demo, everything you see here could happen anywhere, even in Azure SQL DB, if you’re stupid enough to use Azure SQL DB. So, excuse me. Here’s some Dynamic SQL, right? And if we run this, what I want you to focus on is that we did not correctly type.

So, like, when you run Dynamic SQL, if you’re gonna use SP Execute SQL, or even if you’re gonna do exec, you probably should use Unicode strings because Unicode strings will do a better job of preserving Unicode data. But if you’re running, if you’re using SP Execute SQL, you have to have the input parameter typed as Unicode.

It will not accept a non-Unicode string or parameter or variable or anything when you’re, you know, executing it. And say, no, it has to be Unicode, dummy. So, what I see a lot of people do is, so that they don’t have to worry about n-prefixing every single string concatenation block, is they’ll declare a varchar max variable or parameter, and they’ll do all their Dynamic SQL concatenating into that thing.

And then, at the end of it, they’ll set their Unicode max string like this. Sorry, I circled the wrong one there. I squared the, rectangle the wrong one. They’ll set their Unicode string equal to the non-Unicode string, and then execute Dynamics SQL with that.

The problem is, like you might have seen because I hit execute on all that stuff, is that you lose the Unicode-ness of your stuff in there. So, these are some Japanese characters in this string, and you can see right off the bat that we just end up with a bunch of question marks for there.

And even converting the string back over to Unicode does not resuscitate them. So, when we select that string, we get a string of question marks. Not good. Not what we wanted there.

Now, pay a little bit more attention to this one, right? Because in this one, we’re actually doing almost everything right. Where we preserve the Unicode stuff here.

Oops. Oh no, it’s red. Ugh. Let’s do that again. Let’s fix that. Let’s make my pointer pink again. So, here we do have the Unicode characters, and here we do have the Unicode characters.

But when we actually go and select that string, this happened. And this happened because the string inside of the string was not correctly typed. This stuff can happen in all sorts of weird places.

So, this one down here is actually done correctly, right? So, if we look at this, and we run this, now we preserve our Unicode-ness all throughout, right? We get that good stuff there. We get that good stuff there. We get that good stuff there.

We returned everything that we should have. And, you know, just to sort of keep going on the same thing, it’s going to be fairly obvious since I’ve showed you all this stuff.

But, you know, making sure that all of your stuff is correctly typed is really important. Even if you do something like this, the fact that you do this is just VAR card does not, it doesn’t help that you end prefix this.

SQL Server is like, oh, that’s actually Unicode data. We have to change that. SQL doesn’t do anything to help you. It does absolutely nothing. You get question marks back from this. You really do need, if you’re going to be handling Unicode anything, to have everything be Unicode.

Now, this isn’t, you know, just dynamic SQL. Dynamic SQL is just an easy vehicle to show you what I mean. But, this is something that comes from a lot of different places, right? So, like, database design, you know, people used to really hem and haw and nitpick stuff like, oh, are we actually going to use Unicode?

Oh, will this table actually get that big? And, you know, they would make kind of dumb decisions and they would, you know, start the database off with non-Unicode strings and do things like make identity columns integers instead of big ints. And, I feel like that kind of stuff really comes back and bites a lot of application developers and a lot of, you know, architect type people because they made bad choices at the outset based on, you know, things that really shouldn’t have, you know, like application specifications that really should override like, oh, this is just the best practice.

So, you know, when I’m, you know, consulting with people and trying to help them with something that they’re making a first pass at, usually a couple of the things that I make sure we go over is that, or rather are that, you know, if you’re not sure, if you don’t have like a particular domain assigned to an integer column, like, you know, there’s only going to be like 16 or 20 like valid statuses for something, make that a big end because you don’t know how big that table is going to get.

You make an identity column or a sequence column or even if you’re generating IDs in the application to make sure that they are completely in order and monotonically increasing with no gaps, you got to be careful, right? But if your application is a runaway success and all of a sudden you hit 2 billion rows and go boom, what if, you know, you start expanding into foreign markets where people, you know, use Unicode characters for a lot of stuff, all of a sudden those name fields and those address fields and a lot of other things might go boink.

So, not just Dynamic SQL, but application design everywhere. I think, you know, if you’re going to do things safely for the long haul, for the long term, you should use Unicode as much as possible. There are just a few specific things about Unicode that for Dynamic SQL that you have to be real careful with.

Like, you know, you can’t execute SPExecute SQL, which is the only safe way to execute parameterized Dynamic SQL so you don’t get SQL injected. Everything has to be Unicode and you have to really mind all the concatenation stuff to make sure that all your strings are correctly prefixed with that uppercase N so that you don’t lose anything in the concatenation.

You’re probably not going to get like the whole string implicitly converted over to VARCAR because that would be, you know, kind of ludicrous. But, I don’t know. Some people find my pronunciation of VARCAR ludicrous. So, I don’t know. I don’t really know what to tell you there.

But anyway, yeah, Unicode. Be safe out there in the database world. Whether it’s Dynamic SQL or, you know, table design or anything like that, Unicode generally is the better choice to make sure that your application is safe and sound in the long run.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I hope that you will be out there with all your full Unicode self. So, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.