A Performance Mystery With Parameterized TOP In SQL Server

A Performance Mystery With Parameterized TOP In SQL Server



Thanks for watching!

Video Summary

In this video, I dive into an intriguing and somewhat obscure performance issue in SQL Server related to parameterizing the `TOP` clause within stored procedures. You’ll see how using a literal value for `TOP` can lead to more efficient query plans compared to parameterizing it, especially when dealing with small numbers under 101. This behavior is due to exchange packet sending rules that allow earlier data transmission in certain scenarios. I explore this issue through various examples and demonstrate the impact on query performance, showing why this problem might go unnoticed until it causes significant delays. By the end of the video, you’ll understand how to mitigate this issue with better indexing strategies, using `OPTIMIZE FOR` hints, or by rewriting your queries for optimal performance.

Full Transcript

Erik Darling here with Darling Data, cleaning up a few silly little things in my way. And in this video, we’re going to talk about the dangers of parameterizing top in SQL Server. And this is maybe the weirdest query performance problem. that I have ever run across that wasn’t like a bug. It was actually sort of known behavior by like three people in the world. So that was fun. So what I have here is a store procedure. And this store procedure really only takes one parameter because that’s all it has to. And that one parameter is used here. where we select a top number of rows from the users table. And then we cross apply to the badges table to find the top one. Well, actually just mean to put this into English, to find the most recent badge that a user has gotten. And for this, the purpose of this query, we only care about users who have a reputation over, I think that’s 10,000. And we were going to order by reputation descending. All right, cool. So that’s the store procedure version. We also have the just the plainly written version with a literal value here for 38. And then we have the sniff top version here where we’re saying at top equals 38. The query plans for these, well, I had to run these ahead of time, because I don’t want to stand here for a minute and 13 seconds waiting for this query to finish. Much like we’re probably going to stand here for a minute and 13 seconds while I try to grab the right part of the query plan to mush things together. So looking at these two queries, and I I think, I mean, you know, probably the funniest part about all this is if like, if you saw this query, like in the plan cache or something, or in query store, monitoring tool or something like that, it would be really hard to decipher, maybe why one time it ran for seven seconds, and the other time it ran for a minute and 13 seconds. You know, cost costing being what it is and all that.

Nothing, nothing is very clear about this. And I know that there’s some green text up there. And the green text isn’t totally wrong. The green text is right. This, this, both of these would be faster if I had an index on the badges table. But I need to show you the behavior without the index, where things are really terrible and crappy, so that you understand what’s happening and what you need to fix. All right. So yes, an index would be helpful here. But that’s not that’s what I’m trying to explain the behavior to you. Not teach you another, not teach you another, this is not just, this is not another stupid video about adding an index. Any, any half-witted consultant in the world can make another post about just adding an index.

Me, I try to give you a little bit more, because you’re nice people and you deserve more than half-witted consultants just telling you to add indexes. So I need to move over a little bit to make sure all the green stays on the screen. Unofficially, when a top uses a constant, and the constant is a small number under 101, the exchange packets are allowed to send earlier than normal, as long as the exchange is below the top operator. The exchange packets are allowed to send as soon as they have a row, rather than waiting for those exchange packets to fill up completely.

This can only happen with constants or another thing that I’m going to show you in a minute. And this behavior is true going back to SQL Server 2005. I’m of course doing my testing on SQL Server 2016. Sorry, 2022 version 16, which you can see if you look right there under my armpit.

It may change in the future. So if you’re reading this at some far-off date and Microsoft has decided to fix this stupid issue, then I’m wrong all of a sudden. Well, you know, sorry. I was right for a long time, though. So I had that going for me.

When you parameterize top, it’s not considered safe to send exchange buffers early anymore, because you could stick any number in there up through the big int max, because top takes up to the big int max for a number of rows.

In cases where you’ve got a big top, say the big int max, which is this nine, this is that number right there that starts with a nine and has one, two, three, four, five, six commas in it. A lot of commas.

If you were sending that many rows one row at a time, it would be significantly more awful than sending over a smaller number of very full exchange buffers. If you’re surprised to hear that parallel exchange buffers can send at different times, depending on different situations, you’re not alone.

I was also very surprised. Not in a good way. Not in a, I get a golden ticket. It was more like, like when the doctor calls you on a weekend.

You’re like, what happened? So in the second query, where the exchange buffers are only sent when full, we spend a lot longer waiting for those exchange buffers to fill up so they can send.

This isn’t exposed anywhere in the query plan. You’re not going to be able to see this anywhere else in the world, unless you have a good, unless you’re handy with a debugger or you pay attention to things I say.

So obviously that’s not a good situation. And there are a couple ways to fix it. Of course, there actually, there are a few ways to fix it.

One way, you know, in my case, better indexing would mask a lot of the problem, but not completely solve the problem. One way of doing it is to add an optimize for hint so that your query, so that you tell SQL Server to optimize the query plan for top being equal to one.

And if you do that, this isn’t going to make the query faster than the original version with the literals. It’s going to make it on par with the original version with the literals.

That’s all. This, this thing is still going to take seven or eight seconds because I don’t have any good indexes, but this at least gets us a plan that doesn’t take like almost a minute and a half. Right.

And then another way of doing it is to rewrite the query to. And so, you know, like one thing that I try to stress whenever I’m teaching people about SQL Server stuff is there are a number of different ways to write a query.

And if you’re not getting very good performance with one of them, you should try another one. That’s query tuning. Top one is, you know, a sort of a shortcut, you know, top one with an order by.

You know, it’s sort of a shortcut for like getting a max or something or using row number. In this case, if we use a row number and we pump up the optimizer cardinality compatibility level, it’s not, not cardinality compatibility level to 150.

or 160. So that we get batch mode on rowstore. Oh, this, this formatting murder, murder on me.

Did I fix this one? I didn’t. Oh boy. SQL prompt is really letting me down. I’m going to mess everything up here. Another. So if we rewrite the query to get the top one using a slightly different method, and we allow SQL Server to use batch mode, this is, this will completely negate the need for an index because this finishes just about as quickly as one could hope for.

Sans and index, right? Cause this, this gets batch mode. You can tell by the window aggregate there. That’s a good thing. All sorts of good things happen in this query plan using batch mode that weren’t happening in the other plans.

So, uh, you don’t always need an index, do you? You don’t always need an index, do you?

You know what else you don’t always need? Half-witted consultants. You need full-witted consultants. Full of wits. Lots of wits. So, if you’re out there in the crazy borderlands of SQL Server, and you run into a query with a top that’s parameterized, and the execution is real weird, the execution time is real weird, well, you might be running into this.

And if you’re running into this, things you can do, one, um, explore, uh, better indexing scenarios so that, uh, you can at least mask the problem a little bit.

Uh, two, uh, stick an optimize for, uh, on the query, uh, optimize for the top being equal to one. That will at least help you understand if you’re hitting this problem.

Uh, that actually, that would really help you understand if you’re hitting the problem, and it might actually, uh, improve performance. And of course, number three, you can rewrite the query in a way that gets you, uh, you know, things differently, right?

Like one thing, like I said, I always try to teach people. There are many ways to write queries. Uh, if the first way isn’t fast enough, try the second way. The second way for me here was using row number rather than, uh, top one, because the row number, uh, gives you different sort of set of, uh, optimizer stuff that the top one thing, usually won’t get you, like the batch mode on rowstore for the row number thing there.

So, a few things you can try if you’re running into this problem. I hope you never run into this problem. It is a terrible, awful, no good, very bad problem.

Uh, uh, and when I ran into it, I almost cried. But, then I remembered, Robert Smith told me, boys don’t cry.

And so, well, didn’t, didn’t cry. Never, never had, never will. It was, I don’t, I don’t want to let Robert Smith down. He seems, he seems very sad.

if he, if he, if he knew I was out there crying, I don’t know, I don’t know that he could take it. I don’t know how long he would be for the world. So anyway, hope you enjoyed yourselves.

Hope you learned something. I, I actually, I truly hope that all of you have learned this from this video. Cause if you knew about this before, I don’t know.

I, I would, I would feel terrible. It’s, God, this sucks. Uh, if you like this video, which you might be, if you, if you like watching videos of car accidents and, um, like early, early road runner videos, uh, you, you can, you can give me a thumbs up.

Uh, you can, you can leave a, uh, motivationally, uh, I don’t know. Nice. Just a nice comment. Something, something sweet roses and stuff.

Uh, if you like this sort of SQL Server performance tuning content, uh, you can subscribe to the whole channel and you can get what can watch all the videos and you can get notifications every time I post a new one, which will be pretty frequently because I have nothing else to do except post videos for you.

Cause I love you and I miss you. And it’s been, it’s been too long. We haven’t hung out. Anyway, I’m going, I’m thinking I’m going to find something to do. It’s not standing in front of my computer for a little bit because I think the lights are beginning to affect me.

So thank you for watching. the industry is. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.