How String Columns Can Mess Up Query Performance In SQL Server
Thanks for watching!
Video Summary
In this video, I delve into how string data in your tables can significantly impact query performance, particularly focusing on memory grants. I explore why SQL Server estimates that half of a string column’s byte length is full, leading to potential issues when you select multiple text columns or have non-ideal indexes for sorting operations. By running through two example queries—one including the text column and one not—I demonstrate how this can cause substantial differences in execution time and memory usage, with spills to tempDB being a common issue. I also highlight more extreme cases where selecting large string data can lead to severe performance degradation, even causing SQL Server to run out of temporary disk space and fail queries.
Full Transcript
Erik Darling here with Darling Data. And, you know, I’m having a nice day. I returned from my family vacation. I say family trip because when you travel with kids, it’s not a vacation, it’s a trip. Just careful distinction there. There is not a lot of rest and relaxation. There is just playing referee in a different place. So, you know, here I am. And in this video, we’re going to talk about, well, basically how string data in your tables can mess up query performance. Now, there are a couple other ways that we’re going to discuss today. But today, this one, we’re going to focus on how string, columns can mess up query performance in the area of memory grants. Now, if you’ve watched other videos of mine, you may recall me talking about how when SQL Server estimates memory for a query, it starts with a serial execution plan. And it figures out how much memory would take for a single thread to execute your execution plan, right? Whatever the executable plan for the query is.
The things that primarily impact memory grants are sorts and hashes. Okay? It’s because of the way SQL Server estimates string column fullness, where whatever the byte length of your string column is, SQL Server assumes that about half of it will be full. So, for a VARCAR 100, SQL Server assumes that every row will have 50 bytes full. The more rows you select and the more text columns you select, or the more string columns, I’m going to say text because text is an actual data type. The more string columns you select, depending on their unicode-ness and other things like that, you can really blow up memory grants.
Now, I have a pretty non-ideal index for the query I’m going to run, but that’s because I want to show you what happens when we need to sort data, right? So, I have two different, well, I mean, basically, I have the same query twice. The only difference is up here in the select list. I am including the text column from the comments table, and the text column isn’t in VARCAR 700. And the second query, even though I’m doing a select C dot star here, since I’m not pulling the text column out in the final select, SQL Server is free to ignore it. So, and I have recompile hints on these queries, just because I don’t want any sort of automation or, you know, behind the scenes, you know, intelligent query stuff messing with my demos, right? I just want to show you, like, out of the box, you run this thing, what stinks about it, right?
So, the first query runs, and we select the text column, and the second query runs, we don’t select the text column, and let’s look at some differences between these two execution plans. Well, right off the bat, you might notice that this query runs for about twice as long as this query, right? Let’s just call that close enough to twice as long, right? 4.7 and 9.1 something, good enough. A couple hundred milliseconds off, who’s counting?
The databases, do we care about milliseconds? Yes, we do. We care about everything. And the primary source of the slowness is in the sort operator, where we spill here, right? And that takes about 8 seconds, and this takes about 3.8 seconds. And of course, it’s going to be because the spills are different, right? This one, oops, I didn’t go down far enough. Story of my life.
And this one, we have the warning. We spilled to level 2, and all 8 threads went kaboom. And that’s a big number of pages. If we put some commas in there, because SSMS, I don’t know, comma allergy. Let’s just put a little dot here. Oh, wait, that didn’t dot. That is close enough to a dot. And then 3 more, and that’s 411 million pages or something like that. I can’t even count that high. I’ve run out of fingers very early on. So that’s a pretty big spill.
And if we compare that to the spill down here, even though this is still spill level 2 and 8 spilled threads, that’s a lot fewer pages. And the reason we have more pages spilling is, of course, because we have that text column. And even though, well, let’s look before we say even though, right? Let’s make sure that we understand.
This one asks for a query of 4,296 KB. And this one asks for, well, the exact same memory grant. SQL Server asked for the exact same memory grant for both. And this one spilled a lot more, didn’t it? Interesting stuff there, right?
Even though we don’t have the text column in this one, right? Creation date, post ID score. And this one, we have our text column. Even though SQL Server knew that there was that text column in there, it asked for the same memory grant. That memory grant just didn’t do well for either of these.
Now, if we redo the index a little bit, and the point of redoing the index is to help out our windowing function on post ID and score. And just keep in mind, of course, that the order of the sorting for the windowing function makes a huge difference. Now, granted, I don’t, I mean, at least I’m pretty sure.
In all of the documentation I’ve read, you can’t partition by something descending. You can only partition by something ascending, probably. You can order by things descending.
So if we really do need the score descending here in order to make this index effective for the query that we’re running. And what sucks about both of them, too, is that if you don’t include all the columns that you’re selecting out of the table, you really, really knock down the chance of SQL Server choosing a narrow, nonclustered index to help with this stuff.
Because it just straight up doesn’t want to do lookups for row number a lot of the time, right? Just forget about it, right? Unless it’s a very, very small number of rows.
So looking at these two queries again, right? And this time, we’re going to completely get rid of the sort and the need for, well, the need to sort data to apply that dense rank windowing function in here. There we go.
With that in place, both of these queries take just about the same amount of time, right? We don’t have any big difference here because we don’t have that sort that asks for the same amount of memory in spills and all hell breaks loose, where one spill took twice as long as the other spill.
So we completely got rid of the portion of the sort and the query plan that used to live in this section, right? And now we just have the seek and the building of the windowing function stuff in there. So a lot of the times, you can fix problems with these sorts of things just by having the better indexing in place, right?
So not all that, like, you know, you can’t always just go tweaking indexes on the fly. But if you spot these things, then, you know, indexing can be one way to help out with the pains of string columns, especially when sorting data. The other thing that we want to look at is a more egregious example.
So I have two different queries here against the post table. And now the text column in the comments table is an envarkar 700, which is pretty small beans in the grand scheme of things, right? 700, you know, I guess 1,400 bytes, seven gizzard unicodes, and blah, blah, blah.
But the body table in the post table is an envarkar maximum. It’s a big one. Big boy.
Big boy. And if we run these two queries, we also might be able to spot some differences here. Some significant differences. So that first query finished.
And now we’re waiting on this second query to finish, aren’t we? We’re going to wait a little bit longer on this. Not going so well.
Second query. But that first query finished, I don’t know, it seemed like about eight, nine seconds maybe. I don’t know. This one’s really dragging on with that body column. I’m starting to think that strings are a mistake.
Especially, you know, maybe, you know, storing entire potential two-point-something gig strings in a database was maybe not an awesome idea. Maybe we shouldn’t be doing this. Maybe we should, you know, put them somewhere else.
Let’s put them in a text file on disk, right? And just have a pointer to the text file on disk. Or I don’t know.
Perhaps there is some other storage engine that stores that sort of large text more effectively. But holy cow, we have some differences here, don’t we? We have some big differences.
We have another very big spill difference. Let’s edge these two things a little bit closer together. There we go. We have one sort that took 5.5 seconds and another sort that took almost 49 seconds. Well, I mean, not technically 49 seconds.
Because we did some stuff over here. We did some stuff over here. And we’re going to ignore the time in the repartition streams because the timing code for parallel exchanges is absolute garbage. It just looks dumb and stupid and weird everywhere.
So just ignore that. Just forget that part for this example. Now, let’s compare memory grants. All right.
1214 megs for the one where we just select ID. And for the one where we also select the body column, we end up with a 17 gig memory grant. 17 gigs of our precious memory.
All dedicated to a sort. All right. Just one sort operator was like, hey, 17 gigs, pretty please. I could use that for sorting things.
You have an Envarcar Max. What do I know? Right? That’s pretty foul. And the differences in the spills are, of course, you know, spill level 1, 8 spilled threads, 40,000 pages. And here we have spill level 1, 8 spilled threads, but 3881976.
So 3.8 million pages. But this has had a significantly worse time because we have that body column involved. And we have to write that body column, this sort of stuff out to disk and reread it back in.
And gosh, what a pain that is. Now, at the far more extreme end of that, I’m going to show you two queries that ran over here. And if I lift up my left armpit and we focus in on the time down here, we have three minutes.
Let me move over. Three minutes and 30 seconds of execution time. Now, you may also notice that there is a little message down here that says query completed with errors.
What errors, pray tell? What happened? Hmm.
I don’t know. Let’s find out. Let’s go to the messages tab. And let’s see. Wow, that’s a lot of red text. I’m going to have to duck a little bit. But so basically what happened is that first query spilled so much to tempDB that I blew up tempDB.
I exploded it. I ran out of tempDB. I ran out of drive space on my VM.
I didn’t just run out of like, I didn’t just like fill up tempDB files because I have like an unreasonable max size on them. No, no, no. If you look at, I have a load disk space message on my VM.
And if we look at Windows Explorer and let’s look at, oh, let’s go to this PC where we see it. You know, I’ve got a 400 gig drive on this thing. And I have filled up tempDB nearly to the size of the drive.
Right. So I blew up a lot of tempDB space with this query. Now, of course, because of that, I don’t have an execution plan for this query. But I can show you, of course, that in the second query, if I quote out the body column, right, just I green text that body column.
Again, sometimes green text is very, very good in SQL Server, right? Sometimes we have wonderful uses of green text by getting rid of envarchar max columns from our select list. Well, go to the execution plan over here.
This query still ran for a minute and 14 seconds. All right. It’s a pretty long running query. So if we subtract the minute 14 from the 330 over here, that other query ran for like, you know, I guess a couple minutes and then failed.
Right. Filled up tempDB and failed. All because we were, you know, selecting that body column.
And that body column had to pass through a couple different sort operators. Well, actually, just one sort operator. The votes table down here. I mean, you know, I think there was another video that I recorded where I talked about a very, very offensive execution plan.
Where SQL Server chose a merge join with the votes table and chose to sort the entire votes table. All right. But it’s not this sort’s fault.
This just sucks in general. Right. This just makes all sorts of awful things happen. But the big problem is in the final sort. Because this final sort is where the body column from the post table would be involved.
The body column from the post table has nothing to do with sorting the votes table down here. It’s all this sort where, you know, even without the body column, we still have a spill. And we wrote, let’s see, 1904184.1.9 million pages out to disk.
And that’s just without the body. So, like, SQL Server just, you know, kind of chose a, I don’t know, just not good enough memory grant for this thing. Let’s see what it chose.
17 gigs again. Again, so, why 17 gigs? Well, on my server, which I believe, I forget what I have max server memory set to. I think it’s like 90 something gigs.
SQL Server is willing to give out 25% of your max server memory setting to any query, any single query running. So, 17 gigs is about the max memory grant that I can hand out to anyone. And 17 gigs is not big enough for some of these queries.
All right, just 17 gigs, we still spill out to disk. Whether that’s a function of, you know, the way memory is shared between operators and query plans, which, you know, it totally can be, or something else, just, you know, the amount of data we’re selecting.
You know, that’s a fairly large number of rows and a fairly big number of columns. I mean, from the post table, I mean, realistically, we do have a couple other text columns in here, like tags and title and last editor display name.
And if we quoted those out, you know, maybe we would reduce the spill. It might still spill. I don’t know.
It’s too lazy to check on that right now. Got other things to do. But, yeah, so when you run into these things, a lot of the times the worst spills that you will find are because you, either you did something that required SQL Server to sort data.
You have a presentation order by. You have top with an order by or offset fetch. Or you have a windowing function without an index to support the partition by order by part of the windowing function in order to get you this data in the order that you need it to, you know, apply whatever windowing you need to your data.
You know, those are things that will add, you know, sort problems in. Now, I think what’s really interesting with this one is that, and I sort of wish I had saved it, but it wouldn’t really have done you any good.
This query will still error out if I put a hash join hint on here. Because, again, the only part of this query that, you know, had like a dumb injected sort is when SQL Server chose a merge join here, right?
So SQL Server costed a merge join as being the more effective join type here, which requires sorted data. So SQL Server sorts the votes table, and so this does sort and spill, and this does require memory, but the problem sort is the one up here because we don’t have an index to support the windowing function that we are applying to the post table.
So this is the real big one because this is an absolutely phenomenal number of rows, and a lot of this stuff spills out. So even with just a couple text columns in there, we can still have some pretty significantly bad query performance.
So, in summation, as much as you can, as much as possible, you know, if you’re going to put strings in your database, avoid max data types wherever you can, right?
I understand some queries require them. So rather, some data structures require them. You might be storing XML or JSON or, like, you know, notes, like, you know, whatever. Some, like, open text entry field.
You know, there are reasons why you might need in VARCAR or VARCAR max columns in your data or in your tables, but, you know, just don’t use them unnecessarily. Please try to right-size your string columns as much as possible.
And when you’re doing things like if you really need sorted data for some reason, the two things that you can try to do are pre-sort your data and then fetch text columns later. I do have videos about that, informational versus relational columns.
It’s a very important concept. You know, like, you know, usually in order to provide a result set for someone, your joins, your where clause, your sorting can be done in a different stage of the query processing than retrieving all the columns you need to show someone.
So, like, not doing everything in one big swoop like this is, you know, usually a pretty good idea for query performance. You know, sometimes temp tables are a good use.
Sometimes you can use derived queries in some manner, whether it’s a CTE or, you know, it’s a, you know, sort of a derived table expression. There are different ways to separate those things out so that you don’t have to, you don’t have to select every single column and apply sorting to all of those columns.
You can do the initial set of data joining, filtering, sorting, and then go get the columns that you’re interested in displaying, right, the informational columns after you process the relational columns.
I have videos about that somewhere on my channel. I’ll go try to find them when I publish this. But no promises because, I don’t know, searching is hard, isn’t it? A little difficult.
So, anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something about how awful strings are in databases. I hope that you have now started a project to move all your strings to text files on the file system and just have pointers to the text files.
Kidding. I mean, if they’re PDFs or, like, Word docs, fine, like, file stream type stuff, fine. But, you know, if it’s just something for the table, I guess, I guess you could deal with that. Yeah.
Anyway, I have things to do, so I’m going to go do them. Again, well, I guess, I was going to think of one another. Oh, yeah. If you like this video, you can subscribe to my channel. And you can join the nearly 4,000 other data darlings.
Actually, maybe by the time this thing actually gets published, there will be 4,000 data darlings out there in the known universe. If you like this video, give it a thumbs up.
If you have anything to tell me about this video, you can comment on it. Good, bad, or ugly? Just don’t call me ugly. That’s personal.
Then we have to fight. But, yeah, you know, strings and databases are a very perilous thing. The number of problems that they can cause. So I’m going to be recording another video soon about other problems that max data types and string filtering can cause in queries, especially query performance issues.
We’re going to have one of those coming up soon. It’s maybe even in the text file right over here called don’t ignore filters. Might even be right in there.
Yeah, but, you know, strings and databases are a perilous thing. The bigger they are, the harder your queries fall. And just like we saw in this query where I was able to completely blow up 10DB with a sort, you know, it’s probably not why you want your queries failing.
You probably don’t want queries failing at all. Right? So, yeah, watch out for those string columns, George. Boom.
All right. Goodbye. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.