The SQL Server Performance Tasting Menu: How Optional Parameters Hurt Query Performance

NOOOOOOOO

Video Summary

In this video, I delve into the ins and outs of SQL injection by exploring common query implementations that can lead to security vulnerabilities. Starting off with examples of poorly written stored procedures meant to allow flexible searches across multiple columns, I demonstrate why these approaches are not only inefficient but also dangerously expose your database to SQL injection attacks. By walking through a series of flawed queries, including the infamous “universal search string” and dynamic SQL implementations, I highlight how these can be exploited to gain unauthorized access to sensitive data. To address these issues, I then introduce safer methods for implementing dynamic SQL that prevent such vulnerabilities while maintaining performance and security.

Full Transcript

And now, my friends, we talk about SQL injection. And of course, the way that we talk about SQL injection is by talking about where SQL injection usually stems from, and that is queries where someone decides against all better judgment in the universe to give users a flexible search query. Users are just the worst. So, what happens is, someone decides you need to be able to search across a number of columns, but only if we care about them for a particular search query, right? So, you know, you have a whole bunch of columns in your table, and you don’t know which one a user might want to search in for a value. And so, sometimes they might search in one or two or one and three or two and three or two and three or two and four or whatever. You get the point. It’s optional.

If someone isn’t searching for something in a column, we still need to be able to return results. And a lot of the ways that these queries start off is by writing a stored procedure like this. And I have to be honest with you that this is… I am not a fan of people who overuse the word literally for things. But this is literally the worst possible implementation of this kind of query.

Do not do this under any circumstances. This will not perform well. It might work okay if you have like a hundred rows in a table. But anything, any amount of real data is going to cause problems here. Now, if I create a stored procedure that looks like this, and I just do a simple search, this query will run for a long time because the post table is rather large.

If I remember correctly, it’s about 17 or so million rows. This query will run for 17 minutes and 43 seconds. 43. You see that right there? 17 minutes, 43 seconds.

Select top 1000, all this other stuff. If we look at the query text, you can verify that that is the query that I just showed you. If we look at the search parameters, we can see the parameter list. That is what I just showed you, 35004. And distressingly, if I show you the predicate that we searched on here…

Oh wait, we don’t have one here, do we? No, we don’t. Isn’t that unfortunate? We don’t actually filter anything here. We do a doot, a doot, a doot, a bomb, and we filter everything here.

Look at that ugly, ugly predicate. No wonder this thing takes 17 minutes and 43 seconds. No wonder. So let me now show you the second worst possible implementation of this type of query.

Where rather than having this alt string and doing this isNull doodad, we say where each column is like the search string or the search string is null. The one universal search string thing always sucks. Always sucks. Don’t do it. If I run this store procedure with this setup instead, it will turn out just as horribly as this one.

I’ve tested it, but I’m not going to sit here for 17 minutes and stare at you. I mean, I would if we were paying by the minute, but you’re not paying anything for this. So again, forgive my limited patience.

If we wanted to start to make this a little bit more sane… I’m not sick. It’s okay. If we wanted to make this a little bit more sane, what we would have to do is move on from the universal search string to separate searchable parameters that match the data type of the column that we’re searching for.

Which starts to get a little bit better, but you will still get pretty bad performance from this. Now, you could potentially make this a little bit better with a recompile hint, but I’d rather teach you the right way to do things. After I show you the wrong way to do things.

Because there are a number of wrong ways to do things. There is a fourth worst possible implementation of this, literally. Let’s look at the fourth worst possible implementation, where we use very unsafe dynamic SQL.

And we start off on the right foot. We start off doing what could potentially be considered the right thing, but we muck it all up over here. We just… this is not our friend over here.

Now, what I hate… or really, what I hate is that little Bobby Tables cartoon. I know many of you are fond of it, but I think it’s stupid because it doesn’t show you the real risk of SQL injection. Yes, you can prevent someone from dropping a table.

Amazing, I know. I’m not a security expert, obviously. I stopped bouncing years ago. But you can stop someone from dropping a table.

But it gets a little bit harder to manage more granular permissions. Especially when you have applications that might need fairly escalated permissions to run normally. There are lots of applications that I’ve seen that need to create and drop objects.

That need to query system views, to look at table layouts, primary keys, column data types, create agent jobs, create databases, add databases to availability groups. I’ve seen applications that need to do basically everything that a sysadmin would do. These applications are written to, I mean, sort of take the place of a DBA needing to do anything because they don’t trust a DBA to be able to do the right thing.

The application owners are like, look, I know what I need to do. I’m going to write my application to do it. Screw whatever happens, right?

Wherever you install this, it’ll take care of itself, right? Self-healing, something like that. Now, if we were to run this query, like this, with this implementation, this would actually run pretty quickly. Let me just make sure that I actually created that sort of procedure.

I sort of forget what happened five minutes ago. So let’s make sure that we have that created. And let’s run this. And that actually runs pretty quick. If you look at the query plan, that’ll happen about as quickly as we can imagine.

We get this one very smart person back who is interested in when they should build their own buffer system for IO in C++. Gosh, you’re smart. You are very, very smart.

How? How do you even come up with that? Building your own buffer IO subsystem in C++? My goodness! Your brain must just explode out of your skull whenever you have a thought. The problem with this kind of search procedure is that it’s not that we could drop a table.

It’s fairly easy to audit or to recover from a table being dropped. It sucks, but you can do it. What is worse and mostly undetectable is that people can start poking around in all sorts of system views that applications, the many applications that I see would have completely unrestricted access to.

So let’s take this body parameter and let’s add a little bit of extra T-SQL in there just to go look at sys.tables. Alright, so let’s go run that and we’ll get back fairly quick results. I don’t know why that took three seconds this time.

Kind of funny. But now our execution plan, well, let’s actually, let’s see. What took three seconds? I’m so confused. Nothing in there should have taken three seconds. Nothing took three seconds.

That finished quickly. Something is unlocked. Something is strange in here. But anyway, now we get back our one row. We had a successful search for this person who wants to build their own buffer system for IO in C++. But now we get back a list of tables.

Hmm? Yes? List of tables. Correct? Yeah? Yeah. And we even get back this one particularly interesting looking table called hackme. Now if we were to take this a step further, we could write another query to go look at sys.columns.

And we could have sys.columns focus in on that hackme table. Yeah? So if we run this, hopefully this won’t take three seconds.

Ah, that was much faster. Thank goodness. But now we get back again our very smart person who wants to build their own buffer system for IO in C++. And now we get back also a list of columns in the table hackme.

I mean, pretty bad. It sucks. And now we can even go a step further.

And we could just say to union all to the columns that are in that hackme table. Right? We could run this query and whoo, that finished quick too. Thank goodness our query finishes quickly.

That’s a primary concern. But now not only do we get, again, this very smart person back. But we also get back this additional row that has someone’s social security number, date of birth, name, account number, all sorts of things that we probably wouldn’t want falling into the wrong hands. A lot of SQL injection, you know, jokes focus on data being defaced, right?

Dropped, deleted, updated to have some quirky value, stuff like that. But I think most of what we’ve seen in the real world is being like what’s threatening to people, like what the bad outcome is, is data being stolen. Someone like walking out the door with a bag full of your user data.

It could be customers. It could be, you know, health patients. It could be credit cards. It’s like so many, like financials, like so many different things that people can leave with.

And the more personal your data is, the bigger risk that that poses. So now we have to write this store procedure in a way that is both fast and does not let someone steal your data. And we can do that by writing safer dynamic SQL.

Now, if you call in this store procedure, the way that we got everything was to concatenate these parameters directly into the string. So there’s a break in the string, right? The string breaks here.

We have an and sign here. String breaks here. We have an and sign here. String breaks here. Blah, blah, blah, blah, blah. You get the point. If you go down to look at how this dynamic SQL is written, there is no break in the string. It is one contiguous thing.

This is all red text here. There’s no break into normal text. So we have owner user ID. We have all these parameters inside of the dynamic SQL. This is a much safer way to implement dynamic SQL.

This is not always a perfect way to implement dynamic SQL, but it’s much better than the alternative, which is probably losing your job and a bunch of personal information of your customers. And the reason that I think, I mean, one reason why I see a lot of people not do this is because there’s extra typing involved, at least from what I can tell. So we have to have variables or parameters, rather, inside of the dynamic SQL.

And we have to assign those parameters to the dynamic SQL when we use sp execute SQL rather than just exec. And then we have to assign values to those internal parameters over here. You don’t have to use at I.

I just use it so that we can differentiate easily between what’s taking place inside here and what’s being passed into there. But now with our stored procedure written like this, using, I think, probably the safest form of dynamic SQL, what happens is if you try to run a search query that looks like this, we no longer get results back. And we no longer get results back because all SQL Server sees here are parameters, right?

There’s our internally passed in parameter. There’s our another internally passed in parameter. And we only get results back if we send in valid searches.

Now we get back our very smart person who wants to build their own buffer IO system in C++. Now I’m going to backtrack a little bit because I realized I forgot to show you what the dynamic SQL looks up, looks like up here for this stored procedure. So let’s go set this one back in.

And five that. Let’s come down here to where we execute that first query that goes and hits sys.tables. And let’s go over to the messages tab.

And what you can see in the messages tab is that rather than seeing parameters like we saw in the safe dynamic SQL, we see the actual full text of what got passed in. So now you can now you I think you can understand the difference when you use passed in parameters like this SQL Server just sees whatever gets passed in when you use parameterized dynamic SQL like this SQL Server only sees the parameter about only sees the parameter. It doesn’t see the value of the parameter when it executes this when it executes the string.

So you can use it to search data, but you can’t use it to execute execute more commands on top of whatever you are going to pass in. So this is much better from a number of point of views. This store procedure a finishes relatively quickly.

B doesn’t allow people to steal data. Now there’s other stuff that can get weird with dynamic SQL. SQL I I understand you can get plan cash blow. There’s there’s other stuff that my my dear friend Erlen Summerscog covers in his article.

I’ll link to that in the video. But if you if you have to settle on a way to write dynamic SQL, if SQL dynamic SQL is truly what you need and is truly the path you have to take for this kind of query, then his article about dynamic SQL is probably I think where everyone I know learned how to write dynamic SQL.

So it’s probably a safe bet that you could learn a fair bit by going there. Well, so that’s all I have to say here. I’m going to get a refill and I’m going to record another video, which hopefully you’ll watch.

Maybe maybe you’ll maybe you’ll learn something. Anyway, thanks for watching. See you over the next.

So very soon. Thank you.

NOOOOOOOO

Video Summary

Full Transcript

Related Posts