Why DISKANN Indexes Are So Slow To Create In SQL Server



Video Summary

In this video, I delve into the intricate process of creating vector indexes in SQL Server, specifically focusing on the background tasks and query hints involved. By examining the detailed query plan generated during index creation, I uncover a series of unique trace flags, parallel plan preferences, and other hints that are not typically available to users. This exploration reveals how these elements contribute to the complexity—and current slowness—of vector index creation, while also hinting at potential optimizations for future versions of SQL Server. Through this analysis, viewers gain insight into the inner workings of SQL Server’s query optimizer and the advanced techniques used during index builds that could impact performance and usability in various scenarios.

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we are going to endeavor to answer an important question. Hopefully a temporarily important question: why are DiskANN indexes so slow to build? These are, of course, the new vector search indexes that Microsoft has given us in SQL Server 2025. And of course, the usual spiel applies here: this is all pertinent to RC1, release candidate one of SQL Server 2025. I believe this will be the last release candidate until we get an RTM, since in about a month both Microsoft Ignite and the PASS Data Community Summit will be going on, and usually that’s about when an RTM build drops. So things could change by then, but at least for the next three or so weeks, this is what happens in the background. Before we get into that, down in the video description are all sorts of helpful links. You can hire me for consulting, buy my training, support this channel. You can also ask me office hours questions. That is totally free; I answer five of those every Monday. And if you enjoy this content, you should probably like, subscribe, and tell a friend. You may not want to pass this one on to anyone who works at Microsoft for now. Speaking of the PASS Data Community Summit, like I said, that’s happening in beautiful Seattle, lovely this time of year, November 17th to 21st, where I will be delivering two days of T-SQL pre-cons with Ms. Kendra Little. They are going to be the best T-SQL pre-cons that have ever taken place in all of human existence. So you have that to look forward to. Anyway, let’s go.

Look at this. I am officially in two-SSMS-spawns territory with how busy I am, but that’s a good thing, right? Busy is good. So I want to go over a couple of things and then show you the real spicy stuff. To start things way back: my dear friend, my bosom buddy Joe Sack has been doing a lot of work in the AI space. He’s a bit more keen on it than I am, but he’s been messing around with the vector stuff in SQL Server, and so all credit for this database existing goes to Joe Sack. I’ll have a link both to his Substack, where he talks about all this interesting stuff, and to the GitHub repo where he keeps this database, if you want to play along, play around, or whatever form of play you’re into. I’m not sure.

But Joe had a post, actually yesterday. Well, I mean, yes, yesterday as I’m recording this, but temporal elements get all screwy when we start publishing things online. It was about a TOP (N) query with vector search. It wasn’t slow; it was just that you had to ask for a much bigger TOP than you thought in order to get results back.

Anyway, I saw the query and I was like, wow, that’s messed up. There’s got to be a way to fix that. And it turns out there wasn’t, because of the way the vector search thing works, but it all sort of led to this. So, what I’m going to do is this: Joe has a table in his database called Properties.

And I forgot an S there, didn’t I? There we go. All good. Brand new script, you know, not even saved with a file name yet, so we’re all a little dicey there. I’m just going to create a 20,000 row table based on Joe’s table that has stuff in it.

Right. So let’s drop and do that. Right, so 20,000 rows. And now what I’m going to do is create the vector index. One thing I want to point out here that’s sort of interesting is that I have actual execution plans enabled.

So when I run these two queries, I of course get a query plan for creating the table and selecting data into the empty table (whatever SELECT INTO really does), and then for adding the primary key to it. Those are the first two things I do, because you need a primary key if you want to create a vector index.

So now I’m going to create the vector index. And keep in mind, this is 20,000 rows. While this is not the biggest, beefiest VM that’s ever existed (it’s about 10 gigs of memory and four cores, something like that), this is 20,000 rows. Right.
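For reference, the setup being described looks roughly like this. The table, column, and index names are illustrative stand-ins rather than copies of Joe’s actual schema, and the vector index options reflect the RC1 preview syntax, so treat this as a sketch:

```sql
/* Build a 20,000 row copy of the source table.
   A vector index requires the table to have a primary key. */
SELECT TOP (20000) p.*
INTO dbo.TinyProperties
FROM dbo.Properties AS p;

ALTER TABLE dbo.TinyProperties
ADD CONSTRAINT pk_TinyProperties
    PRIMARY KEY CLUSTERED (PropertyID);

/* RC1 preview syntax; the metric can also be euclidean or dot. */
CREATE VECTOR INDEX vix_DescriptionVector
ON dbo.TinyProperties (DescriptionVector)
WITH (METRIC = 'cosine', TYPE = 'diskann');
```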

And then we’re waiting, and we’re at 10 seconds. If you look up here, you can see 14, 15, 16... 20 seconds for 20,000 rows is pretty slow. I did get to wondering why 20,000 rows was so slow, and I’m going to show you in a moment what’s going on while a vector index builds.

But what I wanted to show you before that are maybe some other things that you won’t be so thrilled about, and maybe something you will be thrilled about, with vector indexes. This all took about 30 seconds for 20,000 rows, which is not a good time, to be honest with you. But of course, this is RC1; all this stuff gets better, changes. Just currently, things are not so hot.

So just like columnstore indexes when they first came out, adding a vector index to your table does not just make the indexed column read only. It makes the entire table read only. So if I try to update a column in the table (it’s called bedrooms, right?), and remember, I created the index on this description vector column... I guess SSMS still hasn’t quite worked out all the new stuff going on in there.

But at least SSMS 21 has it; maybe SSMS 22 does, I should give that a shot someday. It makes the entire table read only, right? The data modification statement failed because TinyProperties has a vector index on it. So the entire table is read only, not just the vector index column, which is, I don’t know, maybe not great if you have a transactional system.
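To make the read-only behavior concrete, here’s a minimal sketch of the kind of statement that fails once the vector index exists, even though it never touches the indexed vector column. Names and values are hypothetical, and the error text is paraphrased from memory:

```sql
/* Fails while the vector index exists, even though the
   Bedrooms column is not the indexed vector column. */
UPDATE tp
SET tp.Bedrooms = tp.Bedrooms + 1
FROM dbo.TinyProperties AS tp
WHERE tp.PropertyID = 1;
/* Error along the lines of: the data modification statement
   failed because the table has a vector index on it. */
```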

But two nice things that I do want to say about vector indexes: they do not support rebuild or reorganize. So you cannot rebuild or reorganize a vector index, and I suppose Ola Hallengren will have to get to work making sure his maintenance scripts exclude vector indexes, at least as they currently live.
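In other words, both of these maintenance commands error out against a vector index as of RC1 (the index and table names here are my stand-ins):

```sql
/* Neither is supported for vector indexes in RC1;
   the only option is to drop and recreate the index. */
ALTER INDEX vix_DescriptionVector ON dbo.TinyProperties REBUILD;
ALTER INDEX vix_DescriptionVector ON dbo.TinyProperties REORGANIZE;
```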

But anyway, what I really wanted to show you today, the real meat and potatoes stuff, is what’s going on in the background while a vector index builds. Okay. So we’re going to start creating this, and we’re going to run a little query over here that shows us what’s happening while the vector index creates.

sp_WhoIsActive strangely did not pick up on this, but, you know, whatever. So we get a query plan and some query text back, and the query plan actually shows us what’s going on.
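The watcher query isn’t shown on screen for long, but something along these lines will surface the background session’s text and plan while the index builds (this is my own version, not necessarily the exact query from the demo):

```sql
/* Peek at what other sessions are running, including the
   background work spawned by CREATE VECTOR INDEX. */
SELECT
    der.session_id,
    der.status,
    dest.text AS query_text,
    deqp.query_plan
FROM sys.dm_exec_requests AS der
CROSS APPLY sys.dm_exec_sql_text(der.sql_handle) AS dest
OUTER APPLY sys.dm_exec_query_plan(der.plan_handle) AS deqp
WHERE der.session_id <> @@SPID;
```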

When I created the index over there, there was no query plan. I said, hey, make this index, and the query plan was missing. So this is, like, the query plan.

And if we zoom around, we’ll see there’s TinyProperties, and here’s the vector index seek. We’re doing something over there, and there’s TinyProperties again. Then if we scroll over this way a little bit, we have a clustered index seek into this graph edge table, 120-something-or-other.

And apparently we’re missing some statistics on this table. I don’t know, Microsoft, you ought to create some statistics on that one, I think. But anyway, the query that’s running while all this happens is not a typical create index query.

No, no, no. It is this whole wacky wild adventure. So while that index creates, right.

I’m just going to format this stuff a little bit so it fits on screen better. These are some parameters, I guess, that get passed into whatever we’re doing. We declare a dop variable, which is an integer and equals four.

I think that might be related to my system, particularly since I have four CPUs in there. And then we start running this query. So we have an IF: if the random start ID equals zero, we do this, and we get some stuff in here.

And this, to me, is interesting. The query that’s running in the background uses the ENABLE_PARALLEL_PLAN_PREFERENCE hint. This hint has, up to now, been completely unsupported, almost disavowed by Microsoft.

It does not show up in the sys.dm_exec_valid_use_hints DMV. So maybe this is a sign that this hint is now safe for all of us to use in production. Right.
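If you want to check for yourself which USE HINT names Microsoft considers supported on your build, the DMV is queryable directly:

```sql
/* ENABLE_PARALLEL_PLAN_PREFERENCE is absent from this list
   as of RC1, despite being used by the index build itself. */
SELECT duh.name
FROM sys.dm_exec_valid_use_hints AS duh
ORDER BY duh.name;
```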

That’s exciting stuff, right? Big news there. Then, let’s see, we select a TOP (1) thing here. Microsoft might need to get itself a copy of SQL Prompt, because some of this formatting is an atrocity. I’m willing to share my license with you, Microsoft, if you’re really feeling it; I know there’s been a lot of layoffs lately.

So if you’re really feeling a budgetary pinch, we can talk later. You just let me know. So we get the property ID, and we’ve got a TABLESAMPLE here with REPEATABLE (1).

Hmm. All right. And then this is where I guess things start getting interesting. This is that graph edge table, or at least one of the ones we saw in the query plan.

And we insert some stuff into there, right? We select some properties and, ooh, cast that as binary(192). Very exciting stuff.

And then we do this rows-per-page thing and, oh boy, we are using %%physloc%% in here. My goodness. Huh? There we are. We are physloc-ing things. That’s a fun one.
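For anyone who hasn’t run into %%physloc%% before, it’s the undocumented virtual column that exposes a row’s physical file:page:slot location. A quick illustration, using my stand-in table name:

```sql
/* %%physloc%% returns a binary file:page:slot locator;
   fn_PhysLocFormatter (also undocumented) makes it readable. */
SELECT
    tp.PropertyID,
    %%physloc%% AS phys_loc_binary,
    sys.fn_PhysLocFormatter(%%physloc%%) AS file_page_slot
FROM dbo.TinyProperties AS tp;
```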

All right. And then, apparently down here... so the query plan we saw was most likely the stuff that’s coming up down here, because there is a bit of a loop or something that goes on, I think. Or at least in RC0 there was a loop.

I forget what they do in this one. Maybe... oh yeah, there’s a batch start and batch end. Anyway, there is some interesting stuff that goes on in here. What was it? Maybe it’s down here.

There’s a function that we can’t use: a DiskANN prune function. This has got a whole WITHIN GROUP thing on it, so this is probably some kind of internal windowing function that Microsoft has access to from a system context that we don’t.

I haven’t tried any of that yet. And then down here we have a query with a bunch of hints on it. We’ve got a merge union, very exciting stuff. We have our parallel plan preference hint again.

I don’t know, this is feeling very production ready to me now. And then we’ve got some trace flags. Some of these trace flags I am familiar with: there’s stuff for disabling nested loops batch sort, there’s stuff for disabling spools.

There’s stuff for disabling nested loops prefetch. Yeah. But this one, 7454? I don’t know what that one does, to be honest with you. I just don’t. I don’t know every single trace flag. But then we EXEC a vector index build update-stats procedure.

We update stats after doing that. And then we have a batch start, where we select the max property ID and some batch size stuff in here, right? All good things.

And then... I don’t know, there’s more batch size stuff. And then, where was it? There was something else in here that was very interesting to me.

Where was it? It’s not this thing. I mean, this is sort of the index creation stuff that goes on in the background, right? When you create a vector index, you tell SQL Server whether you want cosine or one of the other mathy metrics they have, but then there are some options that, at least currently, I don’t see a way to specify: stuff like this, that shows up in the sys vector indexes DMV, whatever it’s called, but doesn’t show up as a creation option right now.
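The metadata being described lives in a catalog view; per the SQL Server 2025 preview documentation the name appears to be sys.vector_indexes, though take both the name and its columns as subject to change before RTM:

```sql
/* Preview-era catalog view; shows per-index options such as
   the distance metric that have no visible CREATE-time knob. */
SELECT *
FROM sys.vector_indexes;
```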

But then, actually, this one was interesting as well. So: USE HINT, disallow batch mode. Again we have enable parallel plan preference, but then there were some other funny ones, and I doubt these are going to be available to us mere mortals out in SQL Server world: hints for enabling parallel DML for the vector index build, and for parallel DML using independent transactions.

These ones I have not seen before. I doubt these are valid USE HINTs for us, but I might go and try them. And then we have the same QUERYTRACEON stuff that we saw before, so I’m not going to explain that again, but then we are optimizing for certain values of the batch start and end.

And then there was some other interesting stuff in here, like this pack int function. Do we get to use that? I don’t know.

I’m just seeing all this, so I want to figure that out too. And, good Lord, come on, put CROSS APPLY on a new line. All right.

So then, well, there’s unpack int. So we get to pack an int and then unpack an int. That’s a fun pairing of things. And down here there was... well, there was the DiskANN prune function, but then there was... oh, where was it?

I don’t know. This stuff all gets rather difficult to read. But, you know, when people tell you to be careful with query hints... my Lord, Microsoft. My Lord.

We have some of the same things that we saw before: disabling batch mode, enabling the parallel plan preference. I apparently deleted something there that I shouldn’t have. There we go.

Line that back up. And then the parallel DML stuff that we saw before. And here’s a fun one: Microsoft turns off eager index spools in this query. They’re like, no eager index spools for you.

SelToIndexOnTheFly: that is the optimizer rule that would build an eager index spool in your query plan. So Microsoft is apparently anti eager index spool as well. High five, Microsoft.

We share a mutual loathing for the eager index spool. And then at the end, of course, we run the vector index build update-stats procedure again.

So right now I would imagine that, since this is all very loopy stuff going on, this kind of thing is why building vector (DiskANN) indexes is so slow currently.

I would also imagine that this is why they currently make your whole table read only. Because if you ran a command that changed stuff in the table, and SQL Server had to run all this stuff over again to rebuild or update your vector index during regular transactional processing, that would be kind of a nightmare.

So I understand why we do not currently have writeable DiskANN indexes, because that wouldn’t be fun. But I really do hope this stuff gets sorted out, if not by RTM in three weeks (no pressure, guys), then hopefully in the near future, because it would be a terribly uncompetitive thing for SQL Server to not have writeable vector indexes.

And again, trade-offs and stuff, right? Oracle has writeable vector indexes. Postgres has them.

Elastic has them. I forget about Mongo, but lots of other database engines have writeable vector indexes. I don’t know what trade-offs they made in order to accomplish that. Perhaps there were certain trade-offs Microsoft was not willing to make in doing that, or perhaps there are other limitations involved that we, who do not have the source code, are unaware of.

So anyway, if you’ve heard people talk about DiskANN indexes being slow to create, hopefully this answers your question as to why.

I think it might be worth looking at this. So this is the full Properties table, or, well, our full Properties table. There were some problems earlier.

I don’t want to talk about that, but there are 400,000 rows in the Properties table; I just reinserted some extra ones. The one that you download from Joe only has 100,000 in it.

So you have to beef the table up a little bit. But when I tried to create a vector index on the full Properties table, the query looked a little bit different. I think it was doing more to batch things, but it might not be terribly interesting to go find more of that in here at this point.

But I don’t know, that looks kind of the same to me. Anyway, at least now we have some answers as to why building a DiskANN index is not terribly fast as of RC1.

Okay. As of RC1, disclaimers and all that. Thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I will see you in the next video, where we will... I don’t know yet. I don’t have anything else planned.

So it’ll, it’ll be a surprise to you and me. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Weird Thing With NOEXPAND



Video Summary

In this video, I share a humorous and intriguing experience I ran into while writing training material on indexed views and the NOEXPAND hint in SQL Server. I discuss how using the NOEXPAND hint can sometimes prevent parallel execution plans when you might expect it to allow them, highlighting an interesting quirk of the optimizer’s behavior. This video aims to provide insight into why this happens and explore potential reasons behind it, while also emphasizing that indexed views and the NOEXPAND hint remain valuable for many scenarios. If you’re interested in diving deeper into SQL Server optimization or need help with your database performance, consider supporting my work through donations or by subscribing to my channel to stay updated on future content.

Full Transcript

Erik Darling here with Darling Data. And today’s video is about a little funny thing that I ran into. I was writing some training material and ran into a funny thing involving indexed views and the NOEXPAND hint. Now, there are many good reasons why you may want to use a NOEXPAND hint when dealing with indexed views, particularly in Standard Edition. Perhaps you don’t want SQL Server’s optimizer to attempt to expand the indexed view to the underlying definition, which might give you worse performance. You might also want statistics automatically created on your indexed view. Good reasons to use the NOEXPAND hint. But I ran into something funny where, when I used the NOEXPAND hint, I was not getting the parallel plan that I wanted. And when I dropped the NOEXPAND hint, all of a sudden: happy, fun, parallel query. So we’re going to talk about that today. Down in the video description, you will find all sorts of links that are helpful to both of us, right? It’s a symbiotic relationship. You can give me money and I can make your SQL Server faster, or you can give me money and receive training from me, or you can give me money and continue to receive YouTube content, which, I mean, I don’t get paid much by the hour for the YouTube content, to be honest with you. So, you know, that’d be nice. But apparently I’m like 10 million subscribers away from that being lucrative, so we’ll just have to punch harder and get those numbers.
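As a quick sketch of the hint being discussed (the view and column names here are made up for illustration), NOEXPAND pins the query to the indexed view itself instead of letting the optimizer expand it back to the base tables:

```sql
/* Without NOEXPAND, Standard Edition in particular may expand
   the view to its base tables and ignore the view's index. */
SELECT v.OrderYear, v.OrderTotal
FROM dbo.OrderTotals AS v WITH (NOEXPAND)
WHERE v.OrderYear = 2025;
```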

You can also ask me office hours questions for free; I answer five of those every Monday. And of course, if this content is up your alley, you should like and subscribe. And if you know someone whose alley this might also be up, you should tell them to watch it and hopefully like and subscribe as well, because how else am I going to hit 10 million YouTube members so I can finally stop doing all the other stuff and just stream all day long? Wouldn’t that be great? This November, I will be in Seattle with Ms. Kendra Little, hand-delivering two days of the best T-SQL pre-conning, pre-connery you will ever see in your life. So you should come to Seattle on the appropriate dates, come to the pre-con, and get some stuff from me and Kendra. It’d be great for everyone, right? We all win. We all continue to win.


The point of the stored procedure is to allow me to inject hints and options easily without having to rerun the query that hits the indexed view, and I’ll show you why here. So let’s first run this stored procedure with the NOEXPAND hint. Thank you, SQL Prompt, for making yourself known.

And let’s look at the messages tab. In the messages tab, we’re going to see the query that hits the indexed view. This is why I don’t want to have to keep highlighting the whole thing; this is much more compact. You’ll see that I have the NOEXPAND hint supplied here, just like in the hints parameter up here. And the execution plan for this irks me. Why?

Well, it’s a serial plan, and I do not want a serial plan. I want a parallel plan. The problem is, in this case, the NOEXPAND hint does something kind of funny. It doesn’t prevent parallelism in all cases, of course; that’s not my point. I’ve certainly seen parallel queries with the NOEXPAND hint applied to them. That’s not what I’m getting at here. This is just a curious case, just a weird thing. Now, if I rerun the stored procedure, right?

We execute this with no hints and no options. You’ll notice that it was a little bit faster, and of course we got a parallel plan. Because of that, if you go over to the messages tab and scroll down to where the table gets hit, we now have a blank spot where there was once a hint supplied. So like I said, the NOEXPAND hint does not prevent parallelism. If I run the query with the NOEXPAND hint and I also use the hint ENABLE_PARALLEL_PLAN_PREFERENCE, well, SQL Server comes up with a... actually, let me show you one thing first.

If I run this... oh wait, let’s come back to the original one. This is sort of a funny thing about it. If I run this, when we get the serial execution plan, it has a cost of 22.6722 query bucks, right? So the serial plan was not expensive enough; the estimated cost of the plan did not break my cost threshold for parallelism setting. Which is weird, because when we run it without the NOEXPAND hint, we get a parallel plan. So why would the NOEXPAND hint end up here? Did SQL Server just not go find some more stuff? It’s a good question, and we’re going to answer it. Now, if we look at the cost of the parallel plan that we get without the NOEXPAND hint, this, of course, has a cost of 9.3-something query bucks. So SQL Server did find a cheaper parallel plan. At some point, without the NOEXPAND hint, the optimizer did some more stuff. What more stuff did it do?

Maybe we’ll be able to answer that. What SQL Server did is it went out, found a maybe-more-expensive serial plan, then found the cheaper parallel plan and said, oh, I’m going to go with the cheaper parallel plan. So the exploration space is there; for some reason, we just don’t hit it when we use NOEXPAND. Now, like I said, it is possible to get a parallel plan with NOEXPAND hints. I’ve seen plenty of times where it happened completely naturally, but in this case, we don’t get it. So I’m going to run it with NOEXPAND and the use hint ENABLE_PARALLEL_PLAN_PREFERENCE, so that we get a parallel plan even with NOEXPAND there. And you’ll see that it is possible; it does happen for us. Sort of curious. And this is the very curious thing: if we look at the cost of the parallel plan with that hint supplied, it is 5.2 query bucks. So that’s even cheaper than when it found the parallel plan naturally, without the NOEXPAND hint. So what happened in there? Well, I was able to get some answers; not full answers. So I’m going to show off some rocket-science-y trace flags here. The first one is 8675, which will show us how many tasks the optimizer applied to a query while it was optimizing it. So this is without the NOEXPAND hint, and without the NOEXPAND hint, we get the parallel plan and we go over to the messages tab.

This is where SQL Server will show us the stuff that it did, right? So, the number of tasks. If we scroll down a little bit... actually, that was important: at the end of search 1, we had a cost of 136.73 query bucks. So there we go; that was the cost of the serial plan. And then if we keep scrolling down, we’ll have more tasks. Look at that, 7,200 tasks. And if we keep scrolling down, we’ll have this end exploration, and this is where SQL Server found the parallel plan that costs 9.3-whatever query bucks. So keep that number in mind: 7,239. Okay. Now, if we run this with the NOEXPAND hint and those same trace flags, what we’ll see is something a little bit different. We get the serial plan again. If we scroll through messages: 1,900 tasks at the end of search 1, with a cost of 22 query bucks. So SQL Server did fewer tasks here; the other one had like 7,200, and this one only had 1,900 at the end of search 1. And remember, there were two end-of-search-1 lines: the last run ended search 1 with a cost of 130-something query bucks, and this one ended search 1 with a cost of 22 query bucks. So we just didn’t go into that additional exploration space for some reason with the NOEXPAND hint applied. There were a couple other trace flags that I tried, or rather, there was one other trace flag that I did try to look through a bit to figure out where things were going weird in here. But the output of 8615 is not very easy to manage. You’ll notice that the scroll bar over here is quite a small little rectangle scrolly thing, and there is quite a bit of information in here to get through.
I just couldn’t make enough sense of this to figure out where the actual change was; where parallelism came in, or rather, where the same search space wasn’t explored. So, a little bit of just an oddity, and a little bit of how I looked into it further. That’s about it. I’m not telling you not to use indexed views, and I’m not telling you not to use the NOEXPAND hint, but if you ever run into this situation, maybe this is the reason why. I don’t know if NOEXPAND in general limits the search space more than not using it; maybe it does. I don’t have that much insight into the NOEXPAND hint’s interactions with the optimizer, but it might be something interesting to delve further into, as time allows. But anyway, thank you for watching. I hope you enjoyed yourselves.
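For anyone who wants to reproduce the investigation, the combination looks roughly like this. The view name is a stand-in, and trace flags 8675 and 3604 are undocumented, so this is strictly a curiosity for test servers:

```sql
DBCC TRACEON(3604); /* route trace flag output to the client */

SELECT v.OrderYear, v.OrderTotal
FROM dbo.OrderTotals AS v WITH (NOEXPAND)
WHERE v.OrderYear = 2025
OPTION
(
    USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'),
    QUERYTRACEON 8675, /* optimizer phases and task counts */
    RECOMPILE
);
```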

I hope you learned something, and I will see you in tomorrow’s video. I believe we’ll talk about query correctness or something like that. I don’t know, whatever. Anyway, thank you.


A Bug In Batch Mode Adaptive Join Plan Timings



Video Summary

In this video, I delve into a fascinating issue with adaptive join plans and operator times in SQL Server. I highlight discrepancies between the actual execution time of queries and what is displayed in the query plan when adaptive joins are used. Specifically, I demonstrate how the time spent in certain operators is not accurately reflected, leading to confusion and potential frustration for database administrators and developers. To address this issue, I’ve opened a user voice item urging Microsoft to rectify these inaccuracies. If you’re interested in supporting this effort or learning more about SQL Server performance tuning, be sure to check out the helpful links provided in the video description, including options to hire me for consulting, buy my training materials, and become a paid member of the channel.

Full Transcript

Erik Darling here with Darling Data. And this is just a quick video to promote a user voice item that I opened up about, what do you call it there, bugs with operator times in adaptive join plans. That’s, I guess, what it says, isn’t it? Anyway, down in the video description, all sorts of helpful links. You can hire me, you can buy my training, you can become a supporting paid member of the channel. You can ask me questions for free for my Office Hours episodes, and of course, as always, if you enjoy this content, even in the most modest increments, please like, subscribe, tell a friend, and all that stuff. I am still speaking, as far as I know, at the PASS Data Community Summit coming up in Seattle, November 17th to 21st, where Ms. Kendra Little and I will be delivering the two most fantastic days of T-SQL pre-cons that you have ever seen in your entire life. So, hope to see you there. And as just a little teaser note on that, I sent the order for my new line of promotional T-shirts out to the printers yesterday, so I will have brand new, very cool, very funny T-shirts to give to attendees of the pre-cons. We all have that to look forward to. Anyway, oh, I clicked one too many times. Let’s take a look at a couple things.

So, these are the two queries that I’m going to start with, right? So, one of them, they’re both the same query. The only difference is one of them has this USE HINT('DISABLE_BATCH_MODE_ADAPTIVE_JOINS') thing. If we look at the operator times for these two queries, we’ll see that they both take, I mean, close enough to the same amount of time. 38.7 and 38.57. So, like 500 milliseconds difference, but that’s not really the issue.
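The exact demo queries aren’t reproduced on screen here, so the tables and predicates below are stand-ins against the Stack Overflow database — but the shape is just the same query twice, with the second copy using the documented hint to disable batch mode adaptive joins:

```sql
/* Same query twice; only the hint differs.
   Table and column choices are illustrative, not the exact demo queries. */
SELECT u.DisplayName, COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Comments AS c
    ON c.UserId = u.Id
GROUP BY u.DisplayName
ORDER BY records DESC;

SELECT u.DisplayName, COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Comments AS c
    ON c.UserId = u.Id
GROUP BY u.DisplayName
ORDER BY records DESC
OPTION (USE HINT('DISABLE_BATCH_MODE_ADAPTIVE_JOINS'));
```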

The thing is that in the second plan, you’ll notice that the hash match inner join, like, so this is a batch mode plan, right? Mostly. We didn’t get a batch mode adaptive join, but we did get a batch mode hash join, which I’ll show you in a second. But you’ll notice that this thing correctly accounts for about 30 seconds of time spent in the batch mode hash join, right?

If we hover over this thing, you’ll see that it is batch mode. And because it’s batch mode, that means that it is only accounting for the time spent in that operator. And if we sort of trundle along through the query plan a little bit further, like we have, you know, the milliseconds that we spent reading from the tables, the 29 seconds in here, about two seconds here, about seven and a half seconds here.

So, it all pretty much adds up close enough to 38.574 once you account for, like, the little, like, eight milliseconds and stuff in other places. And the adaptive join plan, that’s a lot different, right? Because if we look over here, we have 400 milliseconds and 700 milliseconds.

And, of course, because this is the unexplored branch of the adaptive join, zero milliseconds, right? We have zero rows that came out of this thing. All of the row action came out of this part of the plan where we hit the comments table, not this one.

But the adaptive join says that it only ran for about 900 milliseconds, which is a little bit funny. Because if we look at the rest of the operator times, they line up pretty closely from one plan to another. It’s 1.6 versus 1.9, 7.2 versus 7.5, and then seven milliseconds versus eight milliseconds.

So, the time in this plan is completely unaccounted for. Now, of course, I checked all sorts of things like latches and spin locks, thinking, like, I must be crazy because, you know, Microsoft would never mess up this badly. But the other thing that I looked at is, sorry, I’m holding back a sneeze at the moment.

We might have to break recording or something. I don’t know. But if you look at the wait stats, the top one for both of them is CXSYNC_PORT, and it’s got just about the same amount of time spent in both of the plans. There’s really nothing else interesting in the other waits for these plans that would account for the amount of time spent, nothing else that says, hey, 38 seconds, right?

So, a while back, so the reason I’m bringing this up is because both of these plans end with a batch mode sort, right? Batch mode sort here and batch mode sort here, right? The reason I bring this up is because a while back, I had posted something, and Paul White wrote an article called, what’s it called up here?

Impossible Execution Plan Timings, where he actually referenced me by name, which was, you know, of course, an emotional moment in a young man’s life. But in the article, Paul goes into detail about the batch mode sort and the threads and all the other stuff that happens in there. But he also brought up some workarounds.

And the reason why I bring this up is because I also tried the workarounds, and I’m going to show you those in a moment. So, we had one which was putting data into a temporary table, right? So, like, you know, this stuff here. The other, which was an eager index spool, right?

And you get that by doing this TOP (@@SPID - @@SPID + 100) PERCENT query. And the other was reducing parallelism, so MAXDOP 2. And then a row mode sort, using trace flag 9347 to get a row mode sort instead of a batch mode sort.
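Sketched against a generic Comments query (again, the real demo queries aren’t shown, so these are stand-ins), the four workarounds from Paul’s article look roughly like this:

```sql
/* 1. Materialize one side into a temp table first. */
SELECT c.UserId
INTO #c
FROM dbo.Comments AS c;

/* 2. A TOP ... PERCENT expression built from @@SPID that the optimizer
      can't fold to a constant, which changes the plan shape. */
SELECT TOP (@@SPID - @@SPID + 100) PERCENT
       c.UserId
FROM dbo.Comments AS c
ORDER BY c.UserId;

/* 3. Reduce parallelism. */
SELECT c.UserId
FROM dbo.Comments AS c
OPTION (MAXDOP 2);

/* 4. Trace flag 9347 disables batch mode sorts, so the final sort
      runs in row mode instead. */
SELECT c.UserId
FROM dbo.Comments AS c
ORDER BY c.UserId
OPTION (QUERYTRACEON 9347);
```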

So, and this isn’t to say that Paul was wrong about any of this, because when is Paul ever wrong? This is just to say, I tried this stuff. This isn’t what’s happening, okay?

So, this is just to say, I’m not crazy. Paul’s still right. I’m just not, I’m not the one with the problem, right? So, if we come back over to Management Studio, we’re going to see all of the things that I mentioned in Paul’s post that I have here. So, one of them is putting stuff into a temp table.

One of them is saying TOP (@@SPID - @@SPID + 100) PERCENT. The other one is running the query at MAXDOP 2. And then finally using trace flag 9347 to get rid of the batch mode sort. So, for all of these query plans, you’ll notice that there’s still 30-some-odd seconds in each of them, right?

Some of these, like, you’re going to see the operator time show up differently because of the different plans that we got. But none of them that use an adaptive join will have the time correctly assigned to them. So, like, the first query where we dump data into a temp table, at least I think that’s what we did.

Oh, come on, zoom it. You’ll still see about 30 seconds represented in that regular hash join, right? So, SQL Server didn’t choose an adaptive join for this one.

So, this one actually does still look okay. For the next one down, this is that funny TOP (@@SPID - @@SPID + 100) PERCENT query. And, you know, the operator time does show up earlier because we have the gather streams here, right?

So, this is where the batch mode stuff pretty much ends. None of the parallel exchanges support batch mode. So, this is where we go into row mode.

So, we see, like, 40 seconds in here, but the adaptive join still shows the wrong amount of time here, 938 milliseconds, which, strangely, is exactly what the one over here showed, too. Or, actually, am I on crack? Yeah, I’m on a little bit of crack.

Okay, 894. All right, never mind. Not exactly. Close, though. If we scroll down, this is the plan where we reduced MAXDOP to 2. And this runs for a lot longer because we have a much lower DOP, right?

The other query ran at DOP 8. This is at DOP 2. But we still don’t see the correct operator time show up really across here. The adaptive join shows a little more time because it ran at a lower DOP, but it’s only 2.5 seconds, right?

So, this thing ran for two full minutes. We see 22 seconds spent here. We see 5 seconds spent here.

We see 2.5 seconds spent here. But we don’t see anything that fully accounts for the two minutes anywhere, right? We can pretty much guess that it’s going to be happening at this point. And if we scroll down, finally, we’ll have one more query that has an adaptive join.

This was the one that disabled the batch mode sort. So, the sort at the end is in row mode, right? It is in row mode.

This did happen in row mode. This is not a batch mode sort. But still, the adaptive join shows about 940 or so milliseconds of time. And the full query runs for just about the same 38, 39 seconds. So, I think this is a bug.

And because I think this is a bug, I have opened up a user voice item to say, Hey, Microsoft, I think something is wrong here. The link to that will be down in the video description. So, on top of all the other helpful links in there that we talked about before, you also have that link to click on.

And you can upvote this item and hopefully get this fixed. Because, like, you know, I do a lot of query tuning work. And so, to me, when I saw this, I said, this smells fishy. And dug into it.

But if you were doing this kind of work, you might just, like, give up and, like, run away. Like, I don’t know. Just, like, not be so psyched on things. Anyway, go vote for it. This is a problem.

It should be fixed. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video. All right. I think that’s good here. Yeah?

All right. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Default Constraint UDFs in SQL Server



Video Summary

In this video, I delve into the fascinating world of default constraint user-defined functions (UDFs) in SQL Server, exploring how they behave differently from other UDFs when used as default constraints. I walk through creating and testing these functions to understand their performance implications, especially in scenarios where parallelism is involved. By examining both a simple processing function and a more complex badge check function, I highlight the nuances of how SQL Server handles these UDFs under various conditions. Whether you’re looking for insights into default constraints or just curious about the intricacies of T-SQL functions, this video offers plenty to explore and ponder.

Full Transcript

All right, Erik Darling here with Darling Data, your pal, your bosom buddy, your friend till the end. And today’s video, we’re going to talk about, there we go, thank you, PowerPoint, default constraint UDFs. Because I got an office hours question about these at some point in recent memory, and I was not prepared for the level of sickness that exists in this world.

So, I had to look at this more, and I did, and I regret every moment of it. So, if you look down in the video description, you’ll see all sorts of helpful links, where you can hire me for consulting, buy my training, become a paying, supporting member of the channel, like PBS or something. You can, for free, ask me office hours questions, and if you like this content, please do like, subscribe, and tell a friend, because, well, what’s the point of making these if no one watches them, I guess?

Good question, right? So, I’ve mentioned before, I do have a few announcements that I will be making, hopefully in short order, about upcoming speaking events. But the only one that is publicly available at the moment is PASS Data Community Summit, taking place in Seattle, November 17th to 21st, where I have two days of the best T-SQL material with Kendra Little that you will ever see in your life.

And, you know, you should just be there, right? So, that’s fun. Anyway, it is still Rocktober, so we are going to continue to be spooky.

Anyway, we’re going to look at two different kinds of functions here today. We have this first function, which doesn’t do anything. We’re going to call this loosely processing in memory.

I have another one that performs data access, which is a little bit more interesting. But we’ve got this function here called serializer. It takes an input here and then spits that input back out here. Well, very simple stuff.

And what we’re going to look at is, you know, kind of how that works. Now, when I first started looking at this, I was like, man, like this could get real bad, right? Because like, you know, like a lot of the UDFs I have in the Stack Overflow database are ones that like, you know, you put in like a user ID or a post ID and you go get information about that user ID or that post ID, right?

So, I was like, man, if I can pass a column to a default constraint UDF, we’re in big trouble. Thankfully, we can’t do that, right? Thankfully, Microsoft was like, nah, right?

So, we’re not allowed to put a column in there. We can only put, let’s see, expressions, constants, constant expressions, and in some contexts, variables. But for us, we’re going to not do most of that stuff.

We’re going to, whatever. Anyway, that is not a name resolution thing. So, like if I create the table like this without anything, and then I try to add that constraint, right?

If I say, hey, why don’t we make that thing in here? SQL Server is still like, nah. All right.

So, same error there. Good. So, like that’s actually a load off my mind immediately. So, the best we can put in there is a literal value. So, if we just say the number one, which means this is going to just spit the number one out no matter what, and we create the table, we are now successfully able to do that. We’re all set, right?
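Reconstructed from the description — the object and column names here are my guesses, not the exact demo code — the setup looks something like this:

```sql
/* A scalar UDF that does no data access: takes an input,
   spits the same input back out. */
CREATE OR ALTER FUNCTION dbo.serializer (@input integer)
RETURNS integer
AS
BEGIN
    RETURN @input;
END;
GO

/* Only constants and constant expressions are accepted as arguments
   to a UDF in a default constraint; a column reference errors out. */
CREATE TABLE dbo.default_check
(
    id integer IDENTITY PRIMARY KEY,
    some_date datetime NOT NULL,
    bad_default integer NOT NULL
        CONSTRAINT df_bad_default DEFAULT dbo.serializer(1)
);
```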

SQL Server is happy. We’re happy. So, on and so forth. Anyway, if we, like, we have a default constraint, right? And obviously the column is not nullable.

So, we can’t like put a null in there. But, if we look at this and we actually supply the column with a thing to put in there, what happens is kind of neat. SQL Server actually doesn’t deal with the UDF.

Now, this will only be obvious when we look at the other plan, but usually when there’s a non-inlineable UDF in a query, a computed column, a check constraint, and now a default constraint, holy cow, we would get a non-parallel plan warning somewhere in here. We don’t have that here, but we will in a minute.

So, when you supply a value for the column, SQL Server does not run the UDF because it doesn’t have to produce a value, right? So, if we look at sys.dm_exec_function_stats, a brand new DMV in SQL Server 2016. Wow.

We will see that there are no executions of our UDF in there. But if we rely on the default constraint, so we’re just inserting into this sumDate column. We’re not inserting into the column directly that has the default constraint UDF in it.

If we do this, we’re going to have something slightly different happen. We’re looking at the query plan. Now, we have this T-SQL user-defined function is not parallelizable thing in the query plan.

SQL Server is telling us it cannot generate a parallel execution plan. Not that we need one to insert a single GETDATE() row, but you see my point, right? Like we can tell the UDF was invoked there.

And if we look in our wonderful brand new 2016 DMV, we will see that we have one execution of the UDF. Okay? All well and good.
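The check against that DMV is a quick one — something along these lines:

```sql
/* How many times has each scalar UDF actually executed?
   sys.dm_exec_function_stats was added in SQL Server 2016. */
SELECT OBJECT_NAME(fs.object_id) AS function_name,
       fs.execution_count,
       fs.total_elapsed_time
FROM sys.dm_exec_function_stats AS fs
WHERE fs.database_id = DB_ID()
ORDER BY fs.execution_count DESC;
```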

Now, the same pattern is going to follow if we say insert a bunch of rows, right? So let’s put a hundred rows into the table. And let’s look in here, right?

So since we supplied a value for bad default, we’re not going to have any additional executions of the UDF after doing the insert, right? So even though we put a hundred rows into the table, we had a value for every row. So SQL Server was like, we can bypass invoking the UDF for those rows.

If we put more rows in, but we rely on the default constraint, right? We’re not, we don’t have that bad default column in here. So we’re relying on SQL Server to do its thing.

And we run this, we put a hundred rows into the table. But now we look at this. We only have two executions, right? So SQL Server really only invoked that UDF once. Since we’re supplying a literal value to that thing and the value that it produces is, you know, like always like a static value almost.

SQL Server is like, well, I just need to figure out what this is once and I can do all the inserts with that. Right? So we get, even though we put a hundred rows into the table again, right?

We did that. SQL Server only, only executed the function once when we didn’t reference it in the insert list and supply of value. Now, normally, if there’s a computed column in a, sorry, if there’s a UDF in the computed column or a check constraint, there are some weird, there’s weird stuff that happens around parallelism.

For a computed column UDF, it’s like, unless the column is persisted and you use trace flag 174, you’ll get a fully, fully serial execution plan no matter what. If it’s in a check constraint, you get a plan that’s forced parallel if you reference the column that has the check constraint, right? So for these, it’s kind of funny because all of these queries, whether I reference the whole table or just the bad default column in any way, SQL Server is allowed to use parallel plans for all of these.

All of these plans have all the visible indicators of parallelism, whether it’s parallel exchanges happening in here like these, or whether it’s the little, you know, icons that say, hey, you went really fast, right? So all of these queries are allowed to go parallel despite the UDF default constraint. Now, if we run this and look just to make sure, we still only have the two executions of the UDF from the inserts that we did.

Good stuff. All right. Let’s look at a different type of function. This one is called badge check.

And what this one does is a little bit different. And this was actually kind of tricky to come up with because, like I said, a lot of the UDFs that I have in Stack Overflow accept something like a user ID or a post ID. And then they go out and find some information about that user or that post.

So it was a little hard to figure out something that I might be inserting into a table that was something I could go out and look and like, like also insert like a value that exists in the Stack Overflow database as the same thing. So what this thing does is like you supply a badge name to the UDF. SQL Server goes and figures out if that badge exists in the badges table.

And if it exists, then you put the badge name in. If it doesn’t exist, then you just, you say, then there’s no badge, right? So pretty simple stuff there, but this is the best I could come up with.
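As described — and again, reconstructed, since the exact code isn’t reproduced here — the data-accessing function looks roughly like this:

```sql
/* A scalar UDF that performs data access: check whether a badge name
   exists in dbo.Badges (the Name column has no index in this demo,
   which is deliberate), and return it if so. */
CREATE OR ALTER FUNCTION dbo.badge_check (@badge_name nvarchar(40))
RETURNS nvarchar(40)
AS
BEGIN
    IF EXISTS (SELECT 1 FROM dbo.Badges AS b WHERE b.Name = @badge_name)
    BEGIN
        RETURN @badge_name;
    END;
    RETURN N'no badge';
END;
```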

So, I intentionally do not have an index on the Badges table on the Name column. So if I run this query, it’s going to take around 500 milliseconds.

It’s not going to return anything, which is fine because that’s, that’s what I expected. But I wanted to do this to sort of like exacerbate the issue with the UDF. So let’s clear up the plan cache and let’s redo our bad default constraint table with this.

Right. So we’re still supplying just a static value in here. Like we’re not like, like this isn’t going to execute, like do something different for every row. It’s always going to be nope. Right.

So, for every row that goes into the table, we’re going to go look at the badges table and say, hey, does a badge called nope exist? If not, insert whatever other value was in there. So it’s a little bit different from the last one.

And the almost the same thing happens, at least at first. So if we insert into the table here and we look at the query plan. Oh, thank you, tool tip.

I definitely wanted you to show up and not let me click on properties. We won’t get the non-parallel plan warning. Right. So all the same so far. If we look at the sys.dm_exec_function_stats DMV, we see no executions. And then if we insert, but we make the default constraint do some work.

Ah, you notice that took about 500 milliseconds. Right. And you might notice that there’s a compute scalar right here that took about 500 milliseconds because we had to go look at the badges table.

Now we’re going to see that we could not generate a valid parallel plan because the function plan is not parallelizable. Right. So good, good, good. Right. So just about the same thing. And if we look here, we’re going to see that we have one execution.

And now we’re going to insert 100 rows, but we’re going to supply a value for the column bad default. Right. So just the same thing as before. You do this. That happens instantly. Right. So this, this, you know, the zero milliseconds. And we will not have the go away tool tip. I hate you.

We don’t, we don’t have that non-parallel plan warning over here. It doesn’t exist. So all like all kind of good stuff there. Right. And if we go look at our function stats DMV again, we still only have the one execution from when we did the single row insert.

But, so you might notice that there’s not a hundred rows in here because 500 milliseconds times a hundred is like 50 seconds. Right. We don’t want to sit here for that long, proving that, proving these things. So we’re just going to insert 20 rows here. And this is a little different, right?

Cause now it’s 500 milliseconds times 20 rows, which I forget how long that is. It’s maybe 10 seconds, maybe 20 seconds. No, 10 seconds. All right, cool. So now we have our execution plan. And if we right click here and look at the properties, we have our non-parallel plan warning over here.

But far more interestingly, we have a compute scalar that takes up 10 and a half seconds of time almost. Right. Like there’s no time here in any of these, there’s no extra time. Oh, there’s a millisecond added there. And there’s a, well, there’s no time added there. So all like basically everything except one millisecond of execution time is in this compute scalar.

And of course this compute scalar. Oh, thank you. Tooltip. They keep showing up. You’re really helpful. If we dig in a little bit to through the defined values, oops, that’s the wrong one. That’s also the wrong one. We want this one. We’ll see our scalar string for badge check with nope. Right.

So that this, this compute scalar was responsible for invoking our UDF. Right. And if we look in here, right. And we say, Hey, what happened? Now we have 21 executions. So once per row that UDF executed to go look stuff up.

So depending on what your scalar UDF does, if it just processes data, let’s say in memory, right? Like it, like you take an input and you return some other input based on that without going out and touching user tables, it’s all fine. But if your scalar UDF goes out and looks at another table, right?

Then you’re going to get this really crappy, like once per row execution and maybe even very, very slow execution because of what your, if your function body is slow. And then for the parallelism stuff, well, let’s, let’s go look at that. Do, do, do, do, do, do.

We get three plans and all of them are allowed to go parallel. It’s all the same visual indicators of parallelism, whether it’s the parallel exchanges like gather streams or repartition streams, or whether it’s the little racing stripes on all of our little operators here. Right.

So all of these things. So at least it doesn’t hurt select query performance. But if you write a really monstrous T-SQL UDF that goes out and touches user tables, well, you might be in for a bad time for other things. And of course, if we go look at this, we should still have just the 21 executions of the function from the single row insert.

And then the 20-row insert where we did not supply a value for the column that has the default constraint on it. So anyway, I thought this was all very fun and interesting. I hope you also thought it was fun and interesting or maybe just fun or interesting.

Maybe one of the two, maybe you just found it incredibly sad and depressing, which is also a good outcome because now you won’t do this. Right. Or maybe you never would have. Maybe you’re just that much smarter than whoever asked me the question. I don’t know. But anyway, that’s good here.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in tomorrow’s video where I think we’re going to talk about a bug in adaptive join plans with query execution time. So fun things to look at there. Anyway, thanks. Goodbye.


SQL Server Performance Office Hours Episode 38




Questions:

  • 3. What are your thoughts on using AI (non-Microsoft) for query optimization and index maintenance? I’ve had 50-70% success tuning queries/indexes with 2-3 LLMs by providing schema/index/stat context first. Considering AI automation: identify poor-performing queries, add env context + tuning rules, use MCP tooling to test in read-only dev env; 3 models propose optimizations, best performer wins and goes to DBA for review. Think this is worth pursuing now, since AI will get better over time too? Not worried it will replace my job, performance tuning is only about 5% of my work and changes, generally, only make it to prod if prod reports performance issues. Even when I took the worst offending query and reduced runtime by 12X (no exaggeration) using indexed temp tables and a CTE to do aggregates then join on the temp tables instead of repeated joins on the same tables over and over. The change was, *I think*, “too good to be true” and “might have a logic flaw that we can’t properly test for in dev” and thus hasn’t made it to prod… sorry for the rant.
  • Let’s say there is a table with a key column of number but it has trailing zeros (maybe one or two zeros). I had that column as varchar, but business would never put zero when searching for that column. All our queries had a LIKE clause with ‘%%’, and that killed the performance. How would you approach the problem?
  • I’ve never seen anyone use a plan guide, not even in a blog post or quick video. Any idea why? When do you use them?
  • We have a UDF as a default expression for a column on a table. Does SQL Server still evaluate the UDF in default expression even when I explicitly insert values into that column?
  • Hey Erik. It took me many months, but I’ve finished watching all of your videos and turned your lessons into a drinking game. What do you think? I’m playing it right now https://dbfiddle.uk/CQbdBNH5 

To ask your questions, head over here.

Video Summary

In this video, I dive into a lively session of Office Hours where we tackle five community-submitted questions on various SQL Server topics. From exploring the use of non-Microsoft AI tools for query optimization and index maintenance to discussing the pitfalls of storing numbers as strings with trailing zeros, we cover a wide range of scenarios. Additionally, I delve into the lesser-known topic of plan guides and address some humorous yet insightful queries about UDFs used as default expressions and even the personal experience of someone who has binge-watched all my videos—turning them into a drinking game! It’s been an engaging session, and I hope you found it both entertaining and informative.

Full Transcript

Erik Darling here with Darling Data. And today’s video is, of course, everyone’s favorite. My favorite, your favorite. Of course, that is everyone, isn’t it? Office Hours, where I answer five community submitted questions to my fancy little Google spreadsheet. And hopefully everyone is happy with the answers. If you look down in the video description, there are all sorts of helpful, useful links for you and me. You can hire me for consulting, buy my training, become a paid member of the channel. Of course, you can ask me Office Hours questions for free. They are not insert coin here. But we’ll see, we’ll see how this, see how this next year’s economy is. And if, of course, if you like this material, you, I would be ever so pleased, ever so chuffed. If you would like, subscribe and tell a friend or two or 10. I think I’ve suggested in the past, just taking people’s laptops and hitting subscribe in their YouTube accounts. PASS Data Community Summit coming up in Seattle, November 17th to 21st. Kendra Little and I doing two bang-up days of T-SQL pre-cons, arguably the best two days of T-SQL pre-cons that have ever taken place in the world. I think I’ve mentioned before, I do have a couple other announcements coming up soon, hopefully sooner than later, so that you can see what you’re doing.

The fine folks of the world can start purchasing some tickets. Be nice there. Anyway, let’s go answer these gosh darn questions, shall we? Oh, we have a third. So if you watched last week’s episode of Office Hours, you might remember our friend with the high volume fintech OLTP something or other on SQL Server 2017 Enterprise Edition. Third question has arisen from our friend. Hopefully we can give our fintech friend enough free advice here. What are your thoughts on using AI non-Microsoft? Well, geez, I’m convinced. For query optimization and index maintenance.

We’re having AI rebuild our indexes? I don’t know. What is that? What are you on about? I’ve had 50% to 70% success tuning queries. Okay, well, I have 100% success. How’s that? With two to three LLMs by providing schema index that…

I’m just going to say something here. You work in fintech and you’re sending schema index and stat context to different LLMs. That’s interesting. Okay.

I hope they’re local. Considering AI automation. Identify poor performance queries. Add environment context and tuning rules.

Use MCP. Holy Moses. Okay. Performance tuning is only 5% of your work. Boy, oh boy.

Look, if it’s working for you, go ahead. I don’t know. Performance tuning is 5% of your work. If you need an LLM to tell you to use indexed temp tables and a CTE, you’re probably better off doing that.

All right. I need to move on before my head explodes. Okay. Let’s say there is a table. Yes, let’s say there’s a table. I would love to say there’s a table.

With a key column that’s a number, but it has trailing zeros. I don’t know why a number with trailing zeros is strange. Many numbers have trailing zeros.

Numbers with leading zeros are a different matter. I had that column as varchar. Oh, great way to store numbers.

Varchar. Good job. But business would never put zero when searching for that column. All our queries had LIKE clauses with a double wildcard that killed performance. How would you approach that problem?

Well, I wouldn’t. That is just a fundamentally unsound design. You have allowed business users to trample over the database entirely, and they’ve won now. You can’t take that away from them.

I’m saying you because you asked the question. But whoever designed this screwed it up from the beginning. Even if you put a columnstore index on that, the double wildcard, assuming this is going to be a varchar of at least 10 or 11, right?

Because most people, when they store numbers as strings, at least account for the maximum width of whatever data type the number is supposed to start with. So if it’s an integer, then it would be like, what, 10, 11 or something, right? If it’s a bigint, the nine quintillion number, it’s like 19 digits.

So, like, you know, it would be like a varchar 10, 11, or 19, or whatever. So, varchar 10, sorry, because it’s the same number of digits as a phone number. It’s a 214, three other digits, four other digits.

So, you’ve just ruined the whole thing from the get-go. You might be able to do some very complicated stuff with trigram searches, but I don’t know if you want to implement all that just to help these people out. But the smarter thing to do would have been to give them more restrained search protocols so that you did not end up here from the beginning.
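To make the complaint concrete — the table and column names here are hypothetical — the unsargable pattern versus a more restrained search looks like this:

```sql
DECLARE @search varchar(11) = '123';

/* The anti-pattern: a leading wildcard means every search scans the
   whole index; no seek is possible. */
SELECT t.key_column
FROM dbo.SomeTable AS t              -- hypothetical table
WHERE t.key_column LIKE '%' + @search + '%';

/* If users only ever omit *trailing* zeros, a prefix search finds
   those same rows and can seek on an index over key_column. */
SELECT t.key_column
FROM dbo.SomeTable AS t
WHERE t.key_column LIKE @search + '%';
```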

There you go. Anyway. I’ve never seen anyone use a plan guide, not even in a blog post or quick video.

Any idea why? When do you use them? I’ve seen many people use plan guides in blog posts and videos. So I have no idea why you’re not able to find them.

Perhaps you do not search well, whoever you are. When do you use them? Well, you use them when you need to guide a query to a different query plan that it is not finding naturally. The name is somewhat self-explanatory in that regard.

I’m not going to pretend that they are very easy or ergonomic to use. There are many times when even yours truly has struggled to get the query to pick up on the plan guide. But with enough trial and error, you can usually get it working.
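For reference, a minimal plan guide looks something like this — the query text is a made-up example, but the procedure and parameters are the real ones:

```sql
/* Attach a hint to a specific ad hoc statement. The @stmt text has to
   match the submitted batch essentially character-for-character,
   which is a big part of why plan guides are fiddly to get working. */
EXEC sys.sp_create_plan_guide
    @name  = N'guide_users_maxdop',
    @stmt  = N'SELECT * FROM dbo.Users WHERE Reputation > 100000;',
    @type  = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (MAXDOP 1)';
```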

All right. Let’s see here. We have a UDF as a default expression for a column on a table. Does SQL Server still evaluate the UDF in the default expression even when I explicitly insert values into that column?

Oh, I see. It’s a default. Okay.

It shouldn’t, no, but there’s no telling what strange things might happen when you start doing awful things like using UDFs as default expressions. I’d actually never even considered that level of terrible.

So good job you. You might make it into a future demo with that level of bad. Holy Moses. All right.
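For anyone curious, here’s a hypothetical repro of the scenario from the question; the function and table names are made up:

```sql
/* A scalar UDF as a column default. The default expression only
   fires when no value is supplied for the column; an explicit
   value should skip the UDF entirely. */
CREATE FUNCTION dbo.AwfulDefault()
RETURNS int
AS
BEGIN
    RETURN 1;
END;
GO

CREATE TABLE dbo.DefaultDemo
(
    Id  int IDENTITY PRIMARY KEY,
    Val int NOT NULL DEFAULT (dbo.AwfulDefault())
);
GO

INSERT dbo.DefaultDemo (Val) VALUES (42); -- explicit value: UDF skipped
INSERT dbo.DefaultDemo DEFAULT VALUES;    -- no value: UDF evaluated
```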

My Lord. Hey, Erik. It took me many months, but I finished watching all of your videos. You should go outside.

You should go enjoy life. There is a world out there. I don’t know why you’d want to watch all of them. Oh, well, that’s why. You’ve turned my lessons into a drinking game. Well, I changed my mind.

I wholeheartedly approve of you. What do you think? I’m playing it right now. Well, what do I think? I fully endorse drinking games. I think that they are one of the last vestiges of proper modern society and that you should continue to watch my videos and drink.

I’m not going to that link, but I don’t want to know because I have a feeling if I go there, I’m going to know how much you drank and I’m going to know how much drinking it took you to get through my videos and that might make me a little sad.

So I’m not going to go there, but I like that you have the kind of go-getter attitude that you create drinking games for me. All right.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in next week’s office hours, but also in tomorrow’s video, whatever that may be.

I’m sure it’ll be a humdinger. Actually, we’re going to go through my Row Goals presentation that I recently finished giving at the PASS on tour events. So we’ll have that to look forward to, won’t we? Row goals.

Yay. Fun. It would be awesome if I talked about row goals and no one compared me to other people who have talked about row goals.

That would be fantastic. It’d be nice if I could talk about something without someone just pointing out that, hey, someone else has talked about that. Wow. Cool. All right.

Row-goaling away all this week. Thanks for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Row Goals: Part 4



Video Summary

In this video, I delve into the world of query tuning with row goals, those pesky optimizer behaviors that can sometimes cause performance issues. We explore how to identify when and if using a TOP or row goal is appropriate to optimize queries. By analyzing a specific example where SQL Server scans 11 million rows unnecessarily, I demonstrate how adding a FORCESEEK hint, or a TOP operator, can significantly improve query performance. Additionally, I discuss the nuances of using row goals in conjunction with CROSS APPLY and OPTIMIZE FOR hints, showing how they can help SQL Server make better decisions about execution plans without resorting to overly aggressive hints that might not always be allowed by strict DBAs.

Full Transcript

All right, welcome back to the fourth and final installment of the Row Goals talk that I’ve been recording and hopefully presenting to the adoring masses. We’ve looked at a number of different things about row goals: what they are, some good stuff, some bad stuff. We’re going to look at some cool query tuning stuff that you can do with them now, and ways that you can figure out if row goals are appropriate for you.

All right, so we’ve got this query right here, and this query is kind of funny because it only returns one row. Again, going back to the rows per second ratio thing, and it takes about 900 milliseconds to run. It’s not disastrous, but like, I don’t know, maybe we’re just unhappy with it for whatever reason.

The reason why I would be unhappy with this query is because I’ve got a perfectly good index on the post table that we could seek into and life could be grand with. Okay, but no, we start off by scanning the post table, and it takes about 900 milliseconds.

If we look at the work that SQL Server did on that scan, it was 11 million rows. All right. Why would SQL Server scan 11 million rows when it could seek to them?

Well, if we add in the FORCESEEK hint here, we will get a different, arguably better and faster execution plan back, right? So we say, oh, well, that’s 15 milliseconds now.

We started seeking into the users table. We found some rows. We did a seek into the post table. We did a seek into the comments table. Note here, there’s a top above a seek, and this is not a disaster the way a top above a scan was. I mean, this one’s okay for us.
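The exact demo query isn’t reproduced here, so this is just a sketch of what the FORCESEEK table hint looks like, with approximated Stack Overflow-style names:

```sql
SELECT u.Id, u.DisplayName, p.Score
FROM dbo.Users AS u
JOIN dbo.Posts AS p WITH (FORCESEEK) -- demand a seek on Posts
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000;
```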

So maybe for whatever reason, we were like, ah, we’re better than FORCESEEK. We can get away without that. Maybe we have terrible DBAs who would yell at us for using query hints.

Because some DBAs are like that. Some DBAs who couldn’t actually tune the query themselves will be like, you never use a hint.

So a way that we could get around our DBAs’ prying, hint-hating eyes would be to use a TOP. Right? So just like we’ve done in other queries, we’re going to use a TOP with a big number here to tell SQL Server that we want to do a particular part of this query together.

And then we want to make sure that SQL Server sort of puts a little hug around this and does this stuff in its own sort of world, separate from the other stuff. So if we run this with the TOP in there, we get back another query plan. And again, just like using the FORCESEEK hint in the last one, this one also takes about 15 milliseconds.

Now we don’t have an outer TOP out here, right? We just have a TOP down here. There’s no root-level row goal, but there is one down here in the TOP.

And the TOP puts this part of the plan together. What this does that’s kind of cool is, again, it starts off with the users table, right? So we seek into the users table, we find the stuff we care about there, and then we go into our nested loops join.

And then for each row that comes through here, we go into this part of the plan and we seek into the post table without having that FORCESEEK hint. So it’s the combination of the CROSS APPLY with the TOP, plus the correlation within the CROSS APPLY here. If we take out the TOP and we rerun this, we’re basically back to the original execution plan, right?

This one’s back to about 900 milliseconds. SQL Server makes the same kind of boneheaded choice to scan the post table out here first, and then do a hash join to the users table, which is not what we want. But having the TOP in here is what gets SQL Server on board with doing what we want within the CROSS APPLY.
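As a rough sketch (the real demo query isn’t reproduced here, so names and predicates are approximations), the “hug” looks something like this:

```sql
SELECT u.Id, u.DisplayName, ca.Score
FROM dbo.Users AS u
CROSS APPLY
(
    /* The big TOP sets an interior row goal and keeps this branch
       together; the correlation drives a per-row seek into Posts. */
    SELECT TOP (9223372036854775807)
           p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
) AS ca
WHERE u.Reputation > 100000;
```

Remove the TOP and the optimizer is free to fall back to the scan-then-hash-join shape described above.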

One thing that’s actually kind of worth noting here, because someone asked me about this the first time I gave this talk, which was in New York, I think. God, it all feels like 10 years ago now. I’ve slept like five minutes in the last three months.

But it was about the star in here, right? So we have a SELECT p.* in this, but obviously the only thing that we’re getting in the outer part of the query is the Id column. So SQL Server is smart enough during some of the early optimization stages, like parsing, binding, algebrizing, simplification, all that stuff that we usually skip over.

Well, I mean, simplification is cool, but the point is that SQL Server figures out pretty early on that we only need the Id column in the outermost projection of this. And so it throws away all the columns from the star that we don’t need in here. So, kind of a nice thing.

Anyway, I’ve spent a lot of time in my videos making fun of the OPTIMIZE FOR UNKNOWN hint, because people use OPTIMIZE FOR UNKNOWN, or some equivalent of it, to fix parameter sniffing, right? They’ll either use OPTIMIZE FOR UNKNOWN or they’ll set some local variable equal to a formal parameter passed into the stored procedure.

And they will put a comment in there that says, hey, I fixed this whole thing. OPTIMIZE FOR is cool for a lot of reasons, but not that.

So OPTIMIZE FOR is neat because you don’t have to optimize for unknown. You can optimize for specific values. And one place that you can optimize for specific values is with TOP. Now this of course deserves the biggest shout-out in the world to Adam Machanic.

Because if you look through sp_WhoIsActive or some of his other code, you’ll see this pattern all over the place. And it’s a very cool one. So thank you, Mr. Machanic, for introducing the world to this one.

So all the credit goes there. Let’s say we have this query. We run this, and we have sort of a similar situation to the last one.

SQL Server, once again, chooses violence. It starts off by scanning the post table. And again, this is reading 11 million rows and it takes 900 milliseconds, right?

And we are, well, just about 900 milliseconds. 863 milliseconds, which we’re rounding up just to keep things in order, right? 900 milliseconds. This sucks.

We hate it. We’re mad at it. We don’t want this thing to keep doing what it’s doing. Now, we could, of course, like in the previous example, throw a FORCESEEK hint on the post table. But again, we have mean, crabby DBAs who do not allow us to use FORCESEEK hints. They only allow us to use code from crazy people.

So let’s say we want, in a different way, for SQL Server to start with the users table and the filtering that we’re doing on the users table, and then move on from there. All right. Let’s say that’s just our goal for this.

All right. It’s our goal, and we’re going to use a row goal to do it. So what we’re going to do is use some TOP magic again, right? We’re going to declare the TOP as a local variable and we’re going to set it to the nine quintillion number, right?

The big, big number, the hugest number in SQL Server. I guess some decimals and stuff could be bigger, but who has time for that? One could die thinking about these things.

Some probably have, you know, if you think carefully about it. But here we’re going to use TOP. We’re going to use this TOP local variable in this part of the query. So we’re going to cordon off this SELECT TOP (@top) FROM Users WHERE CreationDate blah, blah.

And reputation is greater than blah, blah, right? And we want SQL Server to give us the top nine quintillion rows for this. But maybe we don’t want SQL Server to come up with a query plan for nine quintillion rows.

Maybe if we gave SQL Server that big number, SQL Server would do something terrible. Maybe it wouldn’t use our index. Maybe it would freak out and just be like, I can’t wrap my head around it.

I’m going to give you the worst query plan I can. Ha ha ha. This is the most expensive one, sucker. Right? You don’t know. It just freaks you out when you think about it, keeps you up at night.

So let’s say we wanted SQL Server to do this. Oh, sorry about that. Do this. Right.

We want to get the TOP big number, but we want SQL Server to act like it’s a small number. And maybe we don’t want to add a FAST n hint to the outer query because, again, the crabby DBAs, the disciplinarian crabby DBAs, will come and pinch us at night. So what we can do is something that’s very cool.

We can tell SQL Server to optimize for that TOP being equal to one. So even though we have the TOP with the big number, we’re going to get all the rows in the world back.

We’re going to ask SQL Server to optimize for a query plan that only asks for one row. So it’s sort of like saying FAST 1, just with the TOP and the row goal. And when we run this, we get back a much snappier execution plan.

Right. And we indeed have a top operator here and we have a seek into our index on the users table and then a key lookup to get some other stuff. And then we go and get the rest of the query plan done, going into a nested loops, seeking into the post table and then doing the not exist portion of the post table thing.
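In code, the pattern looks roughly like this. Predicates and column names are approximations of the demo, but the shape is the sp_WhoIsActive-style trick being described:

```sql
DECLARE @top bigint = 9223372036854775807; -- the nine quintillion number

SELECT TOP (@top)
       u.Id, u.DisplayName
FROM dbo.Users AS u
WHERE u.CreationDate >= '20131231'
AND   u.Reputation > 100000
/* Ask for every row, but cost the plan as if only one comes back. */
OPTION (OPTIMIZE FOR (@top = 1));
```

You still get every qualifying row back; only the plan shape is costed for the small number.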

So row goals: fun, fun, fun. Row goals are sort of like promises from the optimizer about how many rows will be produced. You can set them, or rather you may set them, explicitly using TOP, OFFSET/FETCH, and FAST.

And those are the most common. There’s also the SET ROWCOUNT thing, but we don’t need to talk about SET ROWCOUNT. If you use that: no mercy and no pity.

And the optimizer may set them implicitly if you use things like EXISTS, NOT EXISTS, IN, NOT IN, or = ANY, = SOME, = ALL (= ALL is much different). And they can act at times like optimization fences to help you reshape query plans.

Why they matter is because they change how the optimizer costs and builds plans. They can make some pretty impossible queries very, very fast by forcing better join orders, or just by keeping the right things happening together, so that they reduce rows very, very early on.

That’s a very powerful thing that you can do to start getting better query performance without having to throw hints everywhere, because hints often break down over time. EXISTS will generally already have a row goal of one.

You don’t need to say SELECT TOP (1), DISTINCT, ORDER BY, GROUP BY, all the other stuff, because SQL Server just throws it right in the garbage. TOP (100) PERCENT?

Like we talked about, if you put this in a view, the optimizer completely throws it in the garbage, slam dunks on your head, gives you a wedgie. It’s a mean time. If you are going to use TOP and you are expecting data back in a specific order, make sure you use deterministic ordering. That is, among the ORDER BY elements that you have, there is at least one unique tiebreaker.
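A quick illustration of the tiebreaker point, with approximated column names:

```sql
/* Non-deterministic: ties on Reputation can come back in any
   order, so the 10 rows may differ from run to run. */
SELECT TOP (10) u.Id, u.DisplayName, u.Reputation
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;

/* Deterministic: Id is unique, so ties always break the same way. */
SELECT TOP (10) u.Id, u.DisplayName, u.Reputation
FROM dbo.Users AS u
ORDER BY u.Reputation DESC, u.Id;
```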

Otherwise you have non-deterministic ordering and you may have silent bugs in your results. There are some very cool things that you can do, especially with CROSS APPLY and TOP. You can get nice, easy nested loops joins.

You can get SQL Server to seek into stuff when you thought that you would never convince the optimizer to do the right thing. If you were hitting problems with row goals and stuff like IF EXISTS, or you see that top above a scan: for the IF EXISTS, we used COUNT_BIG to fix stuff; for the top above a scan, the fix is usually creating an index, so you have a top above a seek.

That will make life a lot easier. And if you want to ask SQL Server for a large number of rows, but have it come up with an execution plan as if it’s going to hit a small number of rows, then you can use a parameterized TOP and tell SQL Server to OPTIMIZE FOR the TOP variable being equal to some small number. It doesn’t have to be one, just a smaller number than nine quintillion.

So, that wraps things up. Thank you for watching. Again, use row goals wisely, and as always, I’m Erik Darling and my rates are reasonable. All right.

Goodbye.


Row Goals: Part 3



Video Summary

In this video, I delve into some interesting query optimization techniques using row goals in SQL Server, particularly focusing on how sub-queries can impact execution plans and performance. We explore scenarios where sub-queries lead to inefficient nested loops joins or top operations above scans, which can significantly slow down queries despite having good indexes available. By understanding these patterns, you’ll learn strategies to improve query performance, such as using COUNT_BIG() instead of a bare EXISTS in certain conditions and leveraging appropriate join algorithms like hash joins when necessary. The video also covers how to identify and address these anti-patterns through practical examples and real-world optimizations, helping you avoid common pitfalls that can plague your SQL Server queries.

Full Transcript

All right, so welcome back to Row Goals. We’re in part three now. If the three surprises you, there’s a playlist with all the others. I don’t know how you would stumble upon the third part without having stumbled upon the others, but welcome to the internet. Welcome to YouTube, I guess. Or wherever you found this. It could be Twitter, LinkedIn, other websites. If you got here from my OnlyFans, sorry. This is going to be much more difficult, and much more boring. I wish I had OnlyFans money.

Anyway, one place that row goals can be very useful for reshaping query execution plans is with sub-queries. Now, I want to say this query isn’t terrible, terrible, but it is a good example. So let’s look at the query, and let’s look at the execution plan for this query. What we’re doing is we are selecting some stuff from the users table where reputation is greater than or equal to 1000. And what we’re doing in here is saying, I want to find each user’s max last activity date, and then I want to figure out if that max last activity date is less than or equal to, I guess, Halloween of 2009. Right? And then we’re going to order by some stuff out here, I guess, for whatever reason. Seemed like a good idea at the time, right? How do most queries end up with an order by? Seemed like a good idea.

Seemed like a good idea at the time. So we’re going to run this, and we’ll get the execution plan back eventually. We get our rows, and aren’t we so happy to have gotten our rows? But we are not necessarily happy with the execution plan. We have the users table down here, and we might find it quite odd that SQL Server did not start with the users table, maybe find just whatever rows it cared about there, and then go find rows in the post table that matched, because on the users table we have the clustered primary key on Id, and we currently do have an index on the post table on OwnerUserId. I didn’t show the indexes that I created for these demos, but you’ll understand what they are eventually, right? You’ll get it. So it’s a little bit strange that SQL Server chose to take this part of the execution plan, the SELECT MAX(LastActivityDate) WHERE p.OwnerUserId = u.Id, and to formulate this aggregate first: scan the post table, aggregate the data, and then filter out the rows that we don’t care about. The filter here is going to be Halloween of 2009. That’s our filter predicate in the query plan.

So it chose to get all this stuff: scan 17 million rows, aggregate them down to 1.4 million rows (1,435,072, yeah, 1.4 million rows), and then filter out the rows we don’t care about, down to 7,320. So it’s a little weird that SQL Server chose to do this, because it had alternatives.

Now, what’s one thing that you and I know about a query that does this? Even if we just focus on this part of the query: if we selected the max last activity date from posts, it could only ever return one single row.

If we selected the max last activity date from posts, even grouped by owner user ID, each owner user ID would only ever produce one row. You can’t have multiple maxes. I mean, let’s say the max last activity date was today, right?

Even if you had two or three or four or five rows with that date, it would still be one max, right? There wouldn’t be multiple maxes. We have one max value.

So this actually can only ever return one row per owner user ID, right? We have that pretty much guaranteed. So if we rewrite our query a little bit and we stick a TOP (1) in here, then all of a sudden SQL Server understands that too. Why doesn’t it understand that to begin with?

You know what? Maybe it does, and the optimizer just said, oh, I don’t like the cost of that. I don’t like how much that costs. It’s too expensive. We’ve got a max last activity date at home, Erik Darling. But if we do this, remember, the last query that we ran took about 3.7 seconds.

Okay. 3.7. That’s the mark to beat. If we put a TOP (1) in our little sub-query here and we rerun this, well, that was noticeably faster. This is 714 milliseconds, right?

And notice now here we have a top, and then we have a stream aggregate, and then we have another top. It’s all rather funny looking, isn’t it? But this ends up being a faster execution plan, even though we find more rows in the users table initially than we got when we joined to the post table the first time.

It took an extra three seconds the way that we did it last time. Now we’re getting one aggregate per user ID that comes out, and this ends up being a lot faster than getting all the aggregated user IDs at once.

This is a trick that you can use in sub queries in the select list, sub queries in a where clause that have an aggregate like this. You can use this almost anywhere and you can get a completely different execution plan shape that might be more efficient. It’s especially more efficient if you have an index that you can seek into down here.

I have an index that leads with owner user ID. So when this query runs, we can seek to each owner user ID that we care about, right? And that’s a very efficient thing to do.
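The rewrite being described looks roughly like this. The exact demo query isn’t shown, so names and predicates are approximations of the Stack Overflow schema it references:

```sql
SELECT u.Id, u.DisplayName
FROM dbo.Users AS u
WHERE u.Reputation >= 1000
AND
(
    /* TOP (1) tells the optimizer what we already know: a MAX can
       only ever produce one row, so run the aggregate once per
       user, seeking an index that leads with OwnerUserId. */
    SELECT TOP (1)
           MAX(p.LastActivityDate)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
) <= '20091031'
ORDER BY u.Reputation DESC;
```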

Even though we do the last bit of filtering over here for Halloween of 2009, this query still finishes a full three seconds faster. If you want to see kind of a cute little trick, we can add in a sort of an empty group by, right? We have a group by with these empty parentheses here.

And what this will do, this will actually remove the stream aggregate. It’s kind of a funny little side effect, right? If we run this and we get the, look at the execution plan. I mean, it saves a little bit of time, right?

The last one was 700 milliseconds. This one’s about 550 milliseconds. But now we just have two tops hanging out next to each other, right? What are they looking for? I don’t know.
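In code, the cute little empty GROUP BY trick is just the same kind of sub-query with empty parentheses added. Again, this is an approximation of the demo query, not a copy of it:

```sql
SELECT u.Id, u.DisplayName
FROM dbo.Users AS u
WHERE u.Reputation >= 1000
AND
(
    SELECT TOP (1)
           MAX(p.LastActivityDate)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
    GROUP BY () -- empty grouping: removes the stream aggregate
) <= '20091031';
```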

We’ll get to the bottom of it someday. Anyway, row goals tend to work well when data is relatively easy to find, either because it naturally occurs often or you have a reasonable set of indexes to search with. Sometimes, even if you have a reasonable set of indexes to search with, it’ll be slow, though.

Row goals can be slow. And this is a pattern that I end up troubleshooting quite a bit. Almost everyone I work with has some stored procedure that either has, like, maybe a loop that does stuff, maybe some batch things going on.

Or they’ll have, like, a conditional, like, update or delete, usually. So, what they’ll do is they’ll start a query with something like this. And they’ll say, if this condition exists, go do this thing.

So, in this example right here that we’re looking at, we’re joining posts to votes. And we’re looking for where VoteTypeId equals one and PostTypeId equals one. The reason why we’re looking for this is because a PostTypeId of one, which is a question, should never have a VoteTypeId of one, which is when a user accepts an answer.

Vote type ID of one should only happen for a post type ID of two, because only an answer can be marked as the answer. A question can’t be marked as the answer. So, we’re going to run this query.

And what we’re going to see is that even with good indexes available, we are unable to find this data. Right? We don’t find any matches.

But SQL Server thinks that it’s going to find matches, because SQL Server trusts you. One of the sort of conditions of the optimizer is that if you ask for rows via a query, SQL Server assumes that those rows exist. SQL Server is like, well, you seem like a nice person.

Why would you lie to me? Why would you send me on a wild goose chase looking for data that doesn’t exist? Right? And so, we look at all this stuff, and we see that SQL Server estimated three rows, but we ended up with 3.7 million rows. SQL Server set a row goal and said, I think I can find this condition with three rows.

But it didn’t, did it? Nah. Eh, did not.

We went into a nested loops join, because small navigational queries with nested loops and stuff work very well with row goals. Right? Because you’re just like, oh, I’m going to find this.

I thought, I’ll find it. Yeah, you get it. I’ll get it on the next row. I promise. Right? So, we have this nested loops join here, because SQL Server was like, oh, well, I can find this really easily. Right?

I’m just going to seek to the three rows that I care about here, and I’m sure I’ll find something down here. But we don’t. Right? No rows come out of the post table that meet that. So, if you have a stored procedure, or some other code that runs, that looks for data conditions that shouldn’t exist and seeks to correct them, or delete them, or update them so that they’re the right thing, or just flag them so you’re like, wait a minute, this is wrong.

We need to figure out what happened. Then you can run into this problem quite easily. So, if we look at the properties of the index seek, we will see something that we saw before: the Estimated Rows Without Row Goal property. Right?

So, SQL Server was like, if I don’t set this row goal, I’m going to have to go through 3.7 million rows. That’s no good. I think I can do this in three rows. Right?

Which is, I guess, 2.8-something, right? It’s like, I can do this, and it gets rounded up to three. Okay. But guess what? None of that happens. And at the end of all this, it looks a little funny that we get a lot of rows here, and then zero rows here, and then zero rows here, but then we end up with one row here. That one row is coming from this constant scan.

So, SQL Server is like, ah, well, I got to figure out if this exists or not. And I have to return something to say, did this exist or not exist? So, we get that.

One way that I fix this quite frequently is I make SQL Server count. Oh, oh, that’s tooltip hell. Ah, all right.

So, we can actually run this same IF EXISTS query, and we can just say, if this exists and it has a count greater than zero. Now all of a sudden, SQL Server has to do something differently. SQL Server has to count everything and then filter on that count. All right.

So, we went from, how long was that one? 6.464 seconds. All right. Now, let’s run this one where we make SQL Server count things. All right.

We make SQL Server do some extra work. That was noticeably faster, right? So, this is 1.2 seconds. And we get a much more appropriate execution plan for this. This filter is where the HAVING on the count comes in. But we notice that we get all 3.7 million rows here, and the whole thing just takes about 1.2 seconds total.

We get a nice parallel plan. It’s still nested loops, which, you know, you can think what you want about that choice. Maybe this would have been faster with a hash join.

I don’t know. There’s a lot of scanning around for stuff when you do that. But, you know, SQL Server is like, I’m going to get 3.7 million rows here, and then I’m going to probably get 3.7 million rows here, but I got none. All right.

So, maybe the nested loops join, not the best choice, but still a lot faster. All right. Not that crazy single serial nested loop row goal plan. All right. Avoid that.
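The two shapes being compared look roughly like this. Table and column names follow the Stack Overflow schema the demo uses, but the exact queries are approximations:

```sql
/* The slow shape: EXISTS sets a row goal of 1, and when no rows
   actually match, SQL Server crawls through a serial nested loops
   plan looking for a match that never comes. */
IF EXISTS
(
    SELECT 1
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON v.PostId = p.Id
    WHERE p.PostTypeId = 1 -- questions...
    AND   v.VoteTypeId = 1 -- ...should never have an accepted-answer vote
)
BEGIN
    PRINT 'Data condition found.';
END;

/* The fix: make SQL Server count, which removes the row goal and
   opens the door to a parallel plan. */
IF
(
    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON v.PostId = p.Id
    WHERE p.PostTypeId = 1
    AND   v.VoteTypeId = 1
) > 0
BEGIN
    PRINT 'Data condition found.';
END;
```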

Another big anti-pattern that I see quite a bit is a top above a scan. All right. So, top above scan, almost always a bad sign. When people talk about query plans, they often say, there’s no such thing as a bad plan.

There’s no such thing as a bad operator. They’ll make all sorts of, you know, I-know-a-lot-about-stuff proclamations. But there are certainly query plan patterns that I dislike seeing quite a bit.

And when I see these query plan patterns, I almost always say, hmm, I think you’re having a bad day because of this. So, this is an example of that. So, what I’m going to show you first is the estimated plan.

All right. And if you were, say, looking at this plan in query store or the plan cache or maybe some monitoring tool that correlates plans and stuff, you might say to yourself, look, how could this possibly take a long time? How could this, look at those tiny little lines.

What could go wrong here? Are you insane? This was slow. This had to get blocked. The server must have been, like, shutting down when this query ran. No way this query took however long, right?

So, it looks harmless, but it is not harmless. All right. If we run this and we get the actual execution plan, we might be shocked. We might be very shocked.

We might be very dismayed. We might take an extra long look at our phone. We might start doom scrolling some social media or another. We might, I don’t know, we might call about our car’s extended warranty.

We might find all sorts of things to do in the time that these queries take to run, in the time that it takes these things to finish, right? We can get a lot of things done.

Now, one thing that I joke about quite a bit in my stuff is that part of query tuning is a rows-to-seconds ratio. All right. How many rows per second should you be able to get back?

Well, we get back one row, but it takes us almost 24 seconds, right? That’s not a good time. And I mean, a lot of this query does still have rather thin lines to it, right? These are all tiny, thin lines.

That’s still a tiny, thin line. This one, you can barely see it; it’s barely a smidge. But look what happens, right? This is all 24 seconds. I mean, all 24 seconds is really right here. But we have this top above a scan.

And this top is just asking for one row over and over and over again until it finds something, right? So SQL Server said, I can do this in one row. Just one row.

I can knock this out. No problem. Clearly, that didn’t work out. It took 58 rows to knock this out. Now, what the top is doing is every time it gets a row, right? Because we didn’t find what we wanted, right?

We didn’t find that on the first row; it took until the 58th row to finally find this. So every time this ran, we did a scan of the votes table. How long that scan takes depends on where in the data the thing we’re looking for is, right?

So this one, for some of them, it probably would scan the whole table. For other ones, it would only scan part of the table. If we look at the properties over here, we’ll see the actual number of rows read is, let’s see, 244,689,052.

That’s a nine-digit number of rows. Nine is a very hard one to do. I guess I’ll just put a thumb down. So nine-digit number of rows.

So the votes table itself is 52 million rows. But 52 million times 58 is not that number. I don’t think it is.

Maybe it is. I don’t know. But I don’t think this, this, the votes table gets read in full for every one of those. Maybe, maybe, maybe not. Because sometimes, because we’re looking for not exists. So if we find that a row exists, we can exit out early, right? Because it’s a semi-join.

It either is there or it is not there. So sometimes it might find everything. Sometimes it might not. But, so we talked about this. So, of course, this would be fairly trivial to fix on our own.

We could just add an index on post ID. And then we would have a top above a seek. It would be much easier to do, let’s say, 57 seeks into an index on post ID. And figure out if a row is there or not.

Mostly not. And then, on the 58th row, we would find what we wanted. So, this pattern is fairly easy.
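The fix would look something like this — a sketch against a Stack Overflow-style Votes table, since the demo script itself isn’t shown here (the index name and the exact shape of the outer query are stand-ins):

```sql
/* Hypothetical index: keyed on PostId, so each probe under the
   top becomes a cheap seek instead of a clustered index scan. */
CREATE INDEX v_postid
    ON dbo.Votes (PostId);

/* The general shape of the demo query: a NOT EXISTS anti
   semi-join that can quit as soon as it finds, or rules out,
   a single matching row per post. */
SELECT TOP (1)
    p.Id
FROM dbo.Posts AS p
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Votes AS v
    WHERE v.PostId = p.Id
)
ORDER BY p.Id;
```

With the index in place, the top above a scan becomes a top above a seek, and each probe touches only the rows it needs.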

Like I said, fairly easy to spot even in an estimated execution plan, either from query store, the plan cache, or your favorite monitoring tool. You may need to get an actual execution plan to know that there’s a real problem with it.

But, getting one just like that was a fun and exciting time. We had 24 seconds to just relax, stretch, yawn. I don’t know, scratch ourselves if we wanted to do that.

Anyway, enough about those kinds of row goals. Let’s talk about some good kinds of row goals. There are two different types of row goals, to my mind. There are top-level row goals, which is like when you say select top, whatever.

And then there are sort of inter-query row goals, right? Where you might select a top within a query like this. So, let’s actually comment this out first, right?

So, we’re going to take this and we’re going to run this query. And we’re going to look at the execution plan. And we will see that we just have one top out here. And that SQL Server scans the users table.

And then it does a nested loops join. And then it hits the post table. And then it does another nested loops join. And it hits the comments table. This query is not disastrously slow. It’s about 1.3 seconds.

But look how the query plan shape changes when we put a top in here. We’re going to put that top there. And we’re going to run this.

And it takes, I mean, it’s still about 1.3 seconds. Like I said, this isn’t an example of fixing something with a row goal. This is just an example of top level versus inter-query row goals.

But now we have one top over here. And we have one top over here. And the shape of the query plan has changed a bit. So, now we start, we hit the users table out here.

But within this branch, we join posts to comments, right? So, things have changed in the shape of the query a little bit. So, in the next video, and I think the final video in this talk, we’re going to look at how we can use this sort of knowledge about how we can sort of get things to happen together in a query with row goals to fix some bad performance issues.
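The two shapes from this demo look roughly like this — the demo script isn’t shown, so the tables, join columns, and TOP values here are Stack Overflow-style stand-ins:

```sql
/* Top-level row goal only: a single Top operator at the root
   of the plan. */
SELECT TOP (1000)
    u.DisplayName,
    p.Score
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
    ON c.PostId = p.Id;

/* Inter-query row goal: a second TOP inside a derived table
   fences the Posts-to-Comments join into its own branch, and
   the plan picks up a second Top operator. */
SELECT TOP (1000)
    u.DisplayName,
    pc.Score
FROM dbo.Users AS u
JOIN
(
    SELECT TOP (1000)
        p.OwnerUserId,
        p.Score
    FROM dbo.Posts AS p
    JOIN dbo.Comments AS c
        ON c.PostId = p.Id
) AS pc
    ON pc.OwnerUserId = u.Id;
```

The derived-table TOP is what changes the plan shape: it gives the optimizer a fenced-off branch with its own row goal.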

All right. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you over in the next video.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Row Goals: Part 2


Video Summary

In this video, I delve into the intricacies of execution plans in SQL Server, particularly focusing on how row goals affect query performance. I explore various scenarios where row goals are introduced and how they influence the query plan, such as when using `TOP`, `EXISTS`, and `IN` clauses. Through detailed examples, I demonstrate how these elements can lead to unexpected behaviors and highlight the importance of understanding their impact on query optimization. Additionally, I discuss the nuances between setting a row goal versus a row limit, illustrating practical cases where large numbers or specific conditions can significantly alter the execution plan, offering insights that can help optimize SQL queries more effectively.

Full Transcript

All right, Erik Darling here with Darling Data. And we are going to, this is part two of the row goals playlist. Two of what? I don’t know yet. Hopefully four. I think I’m aiming for four. And if you have stumbled upon this and you’re hearing that it’s the second one and you are taken by surprise, well, there’s a playlist of these that you can find the other videos in. But right now, we’re going to do this. So, most people in the world, at least SQL Server people in the world, right? Because we have very nice graphic execution plans to look at. Not everyone is so lucky. Some people have text. And some people just have like weird tree looking things. But we’re lucky and we read execution plans kind of like starting over. Well, that’s, I guess this is mirror image from the right to the left, which is fine because that’s the way data flows. But when a query plan executes, it’s really, this is something that Paul White has talked about in a few blog posts and in a few videos where, you know, you have stuff that, sorry, my mirror image, stuff that opens up and starts executing over here and then sort of asks for rows down as it comes along. So, that’s part of why things like row goals and operators like Top kind of work in the way that they do is because they are just continuously asking for rows until the top specification is met or you run out of rows to produce. So, I have this table called high rep users. There’s only about 600 rows in it. But I want to show you sort of three different execution plans. Two of them look pretty close, but it’s interesting to note this stuff. So, one is just selecting the top 10 rows with no particular order. Now, what we’re going to observe here is that rows will get returned in clustered index order.

It’s not a guaranteed order. This is only an observed behavior for this query is the way it’s running. So, keep that in mind. Then we have another query that’s asking for the top 10, but with the rows ordered by ID descending. Okay. Now, there’s another, the third query we’re going to run is the top 10, but with, but this is going to be ordered first by reputation descending. We don’t have any indexes on the reputation column that are going to help have, they’re going to put this in order. So, we’re going to have to physically sort the data. This will all make more sense when we look at the execution plans. So, let’s run these three. And what we’re going to see is, of course, three execution plans. If SSMS will be so kind as to let me get a little more screen real estate here.

So, the first query just starts asking for rows and these, these rows could be returned in any order, right? Cause we didn’t specify an order by. So, if we use maybe a different index than the clustered index, then we might get rows in a different order for this, right? But for this one, notice that the 10 rows, just SQL Server reads 10 rows from here, right? And we get those back in whatever order the thread happened to find them. And it’s a serial plan. So, it’s most likely just going to be clustered index order. Again, observed behavior, not guaranteed. The second query ordered by the top 10 with ID descending is we still don’t have to sort this data because we can still read this in clustered index order just backwards. And we just go and get 10 rows from here, right? So, this is just where we get all 10 rows from.

The last query is a little bit different. So, where this one is different is that the other two just had top, right? Just a top operator. This one has a Top N Sort. And the reason why this query is different is because we actually have to read all 613 rows from the table into the sort operator. And then we just get 10 rows from the Top N Sort as we get the data from here. We sort it and then we just get the top 10 rows in the order that we asked for it in, which is reputation descending and then ID descending.
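The three demo queries read roughly like this — the table and column names (a ~613 row HighRepUsers table with an Id clustered key and a Reputation column) are assumptions based on the narration:

```sql
/* 1) No ORDER BY: in the demo this came back in clustered index
   order, but that's observed behavior, never a guarantee. */
SELECT TOP (10) hru.*
FROM dbo.HighRepUsers AS hru;

/* 2) Ordered by the clustered key descending: still no sort --
   the index is just read backwards. */
SELECT TOP (10) hru.*
FROM dbo.HighRepUsers AS hru
ORDER BY hru.Id DESC;

/* 3) Ordered by an unindexed column: all 613 rows feed a
   Top N Sort before the 10 survivors come out. */
SELECT TOP (10) hru.*
FROM dbo.HighRepUsers AS hru
ORDER BY hru.Reputation DESC, hru.Id DESC;
```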

So, while row goals do trickle down through the plan, sometimes you’ll end up with row goals that don’t quite make it down as far as other times, right? It all sort of depends on how your queries are written and what your indexes look like. Now, because exists is something that may introduce a row goal, one thing that I see a lot of people do is act absolutely ridiculous when they write exists subqueries.

So, when we run this query, right, we’re going to say select star from high rep users where exists — where one of those users has a badge — which of course is going to be all of them, because it’s impossible to get a high reputation without getting a badge, I think. Oh, wait, no, I lied. It’s only 609 rows. So, four people have a high reputation and no badge. All right. Maybe I’m wrong about this stuff sometimes. But if we look at this execution plan, it looks just like this, right?

We scan the clustered index of high rep users and then we have a left semi-join. In this case, the left semi-join does introduce a row goal. All right. I’m holding back a sneeze because I’m not wasting this five minutes.

And if we look at the properties of the index seek on the badges table, we’ll have this line right here, which tells us that a row goal was introduced, right? Estimate, estimate, estimate — not estimated — estimate rows without row goal. Okay. Not estimated. Okay. Just estimate. Okay. Whatever.

But this shows us that a row goal was introduced, which happens sometimes with exists because a semi-join only needs to find one row. So, what I want to show you next is something that I see people do quite a bit with exists subqueries, which is a lot of useless typing, right? They will throw a distinct, a top one, they’ll group by, they’ll order by, they’ll do all this stuff.

Now, if we were to, if we run this query on its own, I just want to show you the execution plan for this real quick. If we run this without that outer correlation, we will get sort of a predictable execution plan, right? We have a top because we had a top. We have, well, I mean, we did a distinct and group by like fools, but, you know, okay.

We didn’t need to distinct and group by, but, you know. Stupid. But you at least have an aggregate here that shows we did some aggregating of things. If we run the full query — rather, if we put the correlation back in and we run this — what we’ll see in the query plan is that SQL Server completely ignored all of the stuff that we did around the correlation, right?

The distinct, the top one, the group by, the order by. The execution plan doesn’t change one bit. We just go back to having a left semi-join.
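A sketch of that kind of over-decorated exists — table and column names here are assumptions:

```sql
/* Every bit of decoration in here -- DISTINCT, TOP (1),
   GROUP BY, ORDER BY -- gets thrown away by the optimizer.
   The plan is the same bare left semi-join without them. */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
WHERE EXISTS
(
    SELECT DISTINCT TOP (1)
        b.UserId
    FROM dbo.Badges AS b
    WHERE b.UserId = hru.Id
    GROUP BY b.UserId
    ORDER BY b.UserId
);
```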

There’s no aggregate, there’s no top, no nothing. Why? Because SQL Server is already sort of putting a top in there with the row goal. So, just a couple notes on some other T-SQL things. There is syntax that is logically equivalent to exists, which is the in clause, okay?

We can say select from high rep users where ID is in, select user ID from badges. And if we do this, we will get the same semi-join query plan, right? The exact same thing that we got before.

I mean, obviously it’s not like exact exact, because it’s a different query. Like other things will be different, like hashes and handles and things, but it’s still the exact same query plan shape and operators. Another way of writing a query like this is to say where ID equals any.

Okay. That’s a funny one. Right? Bet you didn’t know you could do that. But equals any, uh, is sort of like doing in, or a left semi-join, right?

You could say equals any, and you could get this plan back, right? Which is the same thing with the left semi-join. You could also write that as where ID equals some, which is equivalent to any, right? So any and some, uh, are equivalent in this case.

And this will give you the same left semi-join plan. So there are all sorts of funny ways to get this syntax. We don’t have, in T-SQL, a way to say like inner or outer semi-join or something like that. There’s no way to request a semi-join directly, but there are ways that you can sort of influence the optimizer to give you the semi-join behavior.
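Side by side, the three spellings look like this — all of them should produce the same left semi-join plan shape as the earlier exists query (names assumed):

```sql
/* IN */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
WHERE hru.Id IN (SELECT b.UserId FROM dbo.Badges AS b);

/* = ANY */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
WHERE hru.Id = ANY (SELECT b.UserId FROM dbo.Badges AS b);

/* = SOME (synonym for ANY) */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
WHERE hru.Id = SOME (SELECT b.UserId FROM dbo.Badges AS b);
```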

There’s also another funny way, uh, to write queries — equals all — but this is not the equivalent. This will just get a, well, not a strange execution plan, but we won’t get any rows back, because not all rows match across all of these.

So this is it. Um, not in is of course not equivalent to not exists. Um, I’ve written and recorded quite a bit about that. So I’m not going to go into detail on that here.

Good Lord. Just know that I’m dying and my dying wish is for you to get to the end of this video. Uh, so I’m not going to talk about not in here, but, um, we’re going to, uh, just point you to a handy little link that you can go to. All right.

Anyway, um, top 100 percent is absolutely useless in views. Um, if you see people do this, uh, yell at them. If we run this query on its own — just straight from the table, not from the view — you’ll see that there’s no top and there’s no Top N Sort. We do sort the data, because we asked for data in a particular order that we do not have an index to support, but there is no top. SQL Server ignores the top 100 percent.

So if we create a view that says select top 100 percent stuff, and then we select from that view, uh, you’ll note the execution plan has no top, no sort, no nothing. Right. So obviously the optimizer throws this away — but it does this to be helpful, right?

There’s no top operator. There’s no sort. So the optimizer does this to clue people in that they are potentially being dumb or potentially they are trying to rely on behavior that is not guaranteed. So we should all give the optimizer a round of applause for correcting our terrible, uh, I don’t know.

What is, what would you call that? Malfeasance? Right. Maybe correcting our terrible query writing.
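The pattern being yelled about looks something like this — view and column names are made up for the sketch:

```sql
/* A view like this is a trap: the TOP (100) PERCENT and the
   ORDER BY are both optimized away, so selecting from the view
   produces no Top, no Sort, and no guaranteed row order. */
CREATE VIEW dbo.PointlessTop
AS
SELECT TOP (100) PERCENT
    hru.Id,
    hru.Reputation
FROM dbo.HighRepUsers AS hru
ORDER BY hru.Reputation DESC;
GO

SELECT pt.*
FROM dbo.PointlessTop AS pt; /* order not guaranteed */
```

If you need ordered results, the ORDER BY belongs on the outer query that selects from the view, not inside the view definition.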

I don’t know. Something like that. Anyway, there are other ways that row goals can sneak into your queries. One interesting way — at least I think it’s interesting, but I think lots of things about this stuff are interesting that maybe other people don’t — is when you’re filtering on row number.

So if we say where the, the N up here — N equals row number, right? Uh, if we say where N is between one and 10, SQL Server will, uh, do something that we’ve actually kind of seen before in one of the other queries. Where, uh, we have a top, right?

And this top starts asking for 10 rows, but before it can, before it can ask for 10 rows, those rows have to get sorted. Right. So we get all 613 rows from high rep users, but we don’t, we don’t reduce that to the, the row goal of 10 until we’ve sorted them. Right.

And then the, of course, the two operators in here are the row mode representation of generating the row number: the sequence project and the segment. Uh, but if we were to change this to say 20, we would see the 20 in here instead of a 10 in here. Right.
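A sketch of the pattern (names assumed):

```sql
/* Filtering on ROW_NUMBER sneaks a Top into the plan: all 613
   rows get sorted first, then the Top asks for just 10. Bump
   the upper bound to 20 and the Top's row goal follows along. */
WITH numbered AS
(
    SELECT
        hru.*,
        n = ROW_NUMBER() OVER (ORDER BY hru.Reputation DESC)
    FROM dbo.HighRepUsers AS hru
)
SELECT numbered.*
FROM numbered
WHERE numbered.n BETWEEN 1 AND 10;
```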

We would see this creep up and say, oh, now we need 20 rows from the sort. So row goals can sneak in, in all sorts of ways that you may not expect. There are other things that will also set row goals, but this, there’s a difference, right? Because we talked about it in the first video between setting a row goal and setting a row limit.

Right. So, uh, a query like this, where we say, uh, select whatever from table offset zero rows, fetch next one row only. This will set a row goal of one, but also only return one row.

Right. We get Chris Jester-Young here. We can also set a row goal with a FAST n hint. Of course, you can put whatever number you want, up to, I think, the bigint max in here, but that would, I don’t know.

I don’t know at what point that would become counterproductive. But when we say option fast one, we’re still setting a row goal. We’re just not setting a row limit.

Okay. So fast, like the fast hints, will return all the rows, but with a query plan where the row goal is set to, uh, whatever number you put in there. Now, uh, the top operator — and of course this goes for offset fetch and the fast hints, too — accepts up to the bigint maximum.
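The two shapes side by side — the table name is a stand-in for whatever was in the demo:

```sql
/* A row limit: the plan gets a row goal of 1, and only one
   row comes back. */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
ORDER BY hru.Id
OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY;

/* A row goal without a row limit: the plan is optimized as if
   one row were needed, but every row still comes back. */
SELECT hru.*
FROM dbo.HighRepUsers AS hru
OPTION (FAST 1);
```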

There are a couple of neat shortcuts you can take. If you don’t remember the numbers involved, uh, using the power function. So, uh, two dot, the dot is very important.

If you forget the dot, you will get an error. But, uh, this one is two to the power of 31 minus one. And the big int max is two to the power of 63 minus one. But this returns those, those maximum numbers.
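The shortcut in code, assuming I have the demo right:

```sql
/* The dot matters: a plain 2 is an int, and POWER(2, 31)
   overflows with an arithmetic error. With 2. the math happens
   in decimal and works. */
SELECT
    int_max    = POWER(2., 31) - 1, /* 2147483647 */
    bigint_max = POWER(2., 63) - 1; /* 9223372036854775807 */
```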

Sometimes if I’m, uh, feeling very ambitious, I will set up a view that has, uh, various SQL Server data types in their full ranges. I’m not like, like the, not like 2 billion rows full range, but like the min and the max for them. Because it’s, it’s a very handy reference for me when I’m trying to figure out what I, what I want and what I want to do.

Um, I don’t know if you ever do things like that. But, uh, getting back to row goals a little bit. Uh, we can put big numbers into tops.

And those big numbers can, uh, again, sort of influence the optimizer in various ways. If I run this query where I’m asking for the top nine quintillion rows from high rep users where the display name equals Jon Skeet, we get this query plan where we seek into an index on high rep users that happens to be on the display name column. We do a key lookup for Jon Skeet.

But we still have a top, right? When we said top 100 percent, SQL Server said, well, I know better than that. But when we say top very big number, SQL Server does not know better than that. And SQL Server has to figure out a way to get that number of rows back to us.

So fun stuff there. What’s interesting, too, is what might start cluing you in to how SQL Server kind of fences things off around stuff like top: if we run a query like this, where we’ve sort of nested the top away under here.

And we have the display name filter out here that when we run this, we’ll get a different execution plan back. We still have a top, right? And we still have like SQL Server still gets the top nine quintillion rows from the 613 row table, but it has to do this first.

And then it can filter stuff out. This is more like logical correctness stuff, though, right? Because asking for the top nine quintillion users and then figuring out if anyone in the top nine quintillion users has the display name Jon Skeet is different from asking for the top nine quintillion Jon Skeets, right?
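Roughly, the two versions look like this (names assumed, and nine quintillion spelled out as the bigint max):

```sql
/* Filter outside the nested TOP: take the "first" nine
   quintillion rows, then keep the Jon Skeets -- the Top has
   to happen before the filter. */
SELECT u.*
FROM
(
    SELECT TOP (9223372036854775807)
        hru.*
    FROM dbo.HighRepUsers AS hru
) AS u
WHERE u.DisplayName = N'Jon Skeet';

/* Filter inside: the top nine quintillion Jon Skeets -- a
   logically different question, and a different plan. */
SELECT TOP (9223372036854775807)
    hru.*
FROM dbo.HighRepUsers AS hru
WHERE hru.DisplayName = N'Jon Skeet';
```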

Those are two logically different things. One thing that I see quite a bit and often bothers me is people using non-deterministic ordering and not really caring that what they’re doing is potentially going to introduce incorrect result bugs. Okay.

So if we run this — that’s like the top five by score. And again, like remember what we were talking about in some of the previous demos in this video where I said like we’re most likely going to return this data in clustered index order. We had a serial plan.

You know, SQL Server was just asking for the top 10 rows. So we could observe that as long as we use the clustered index and as long as we had a serial plan, SQL Server would return data in the clustered index order, right? Again, observe, not guaranteed.

If we run this query, which is selecting the top five rows from the comments table ordered by score. And we look at the IDs that get returned. Like notice a score here is all zeros, right?

So there are duplicates in here. We get these five rows back — nine, 11, whatever, 855, 19, blah, blah, blah. If we run this a few times, we’re going to see different rows showing up in here, because we have a parallel execution plan. And when there’s a parallel execution plan with a top like this, SQL Server is just sending out the threads, and whatever threads come back first are the top five rows that we get back.

Right? Right? So obviously, depending on timing and reads and all the other stuff, we might get back five different rows every single time.

If I run this query again and we look at the results, we can see that even the order of IDs has changed, and which IDs we get back. This all changes every single time. The only way to make sure that ordering is deterministic is to have some sort of unique column in there to break ties within the column that has the duplicates.

There we go. All right. We’re smart people.

We can do this stuff. So if we run this query a few times, we will see that we get back this data in the same order each and every time. Right? So this is deterministic ordering. The one we did before is not deterministic ordering.
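The fix is a one-column change to the ORDER BY — column names here follow the usual Stack Overflow Comments schema:

```sql
/* Non-deterministic: Score ties everywhere, so with a parallel
   plan these five rows can change from run to run. */
SELECT TOP (5)
    c.Id, c.Score
FROM dbo.Comments AS c
ORDER BY c.Score;

/* Deterministic: the unique Id column breaks the ties, so the
   same five rows come back in the same order every time. */
SELECT TOP (5)
    c.Id, c.Score
FROM dbo.Comments AS c
ORDER BY c.Score, c.Id;
```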

All right. So we have reached nearly the 20-minute mark. This is a good place to pause because I just finished a demo. And now we have another demo that we can do right after this. This is fantastic.

Right? And we can actually start looking at reshaping execution plans with row goals a little bit now that we’ve established some bare minimum fundamentals. All right. So thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video, which will hopefully be three of four. But we’re just going to have to see how things go. All right. Thank you for watching.


Row Goals: Part 1


Video Summary

In this video, I dive into the fascinating world of row goals in SQL Server, a topic that might not get as much attention these days but is crucial for understanding query optimization. I share insights from my presentation at the PASS On Tour events and co-present with Kendra Little on two days of SQL Server performance tuning pre-cons, promising to be the best T-SQL pre-cons ever. Alongside explaining row goals in detail—both those introduced by us and those set by the optimizer—I also discuss the difference between row goals and row limits, emphasizing why understanding these concepts is vital for effective query optimization. The video includes helpful links for consulting, training, and becoming a channel member, as well as my usual humorous advice on how to contact me correctly. Enjoy exploring this essential aspect of SQL Server with me!

Full Transcript

Erik Darling here with Darling Data. And in today’s video we’re going to get started going over the material that I presented at the Pass On Tour events, plural, about row goals. Why? Because they’re fun. And it was kind of amusing and no one really talks about them anymore. So I decided I’ll give it a shot. I don’t know. You tell me how I did. If you look down in the video description, there are all sorts of helpful links. You can hire me for consulting. You can buy my training. You can become a paying member of the channel. You can ask me questions for free on my Office Hours episodes. And of course, if you enjoy this content, you can like, subscribe, and tell a friend. In the honored tradition of liking, subscribing, and telling a friend, I’m going to tell you about Sport Drink. Not because they paid me to and not because they asked me to, just because I like it. I’ve got the orange, you glad it’s not chemicals flavor. It’s quite nice. There we go. This November, I don’t even, I don’t know when this video is even going to come out. That’s how much fun life is. But this November, this might be after November. I don’t know. Who can tell? Time is a strange thing, right? Time is a flat circle, as a wise man once said. I’m co-presenting with Kendra Little. Two days of SQL Server performance tuning pre-cons. Sorry, my back just did something strange. You get old and laughing hurts. Two days of performance tuning pre-cons with Kendra Little. They will be the best two days of T-SQL pre-cons that have ever existed. And you should come, or else. Right? Sorry. I got a text from Joe Sack. I can’t show you what it is, though.

Shh. Secret. Anyway. All right. Anyway, let’s go talk about row goals. So, row goals. Very interesting thing. Now, when we talk about query optimization, there are two ways that query optimization can be talked about. Right? There’s the query optimizer that goes and does stuff. Right? Makes a bunch of cost-based decisions. Thinks about your query. Says, hmm. I think this is a good way to go get that. And then there’s query optimization, which is kind of like what you and me do. Well, I don’t know what you do. I know what I do. And when we optimize queries, we think, hmm. I don’t like the way that worked the first time. I need to go fix it.

But row goals are one thing that sometimes the optimizer will do that will change your query and your query plan. And row goals are also something that we can do to change our queries and our query plans and hopefully get better ones. So, normally, this is where I would introduce myself to the audience because I’m a very professional presenter. I would tell them about all the great ways that they can hire me. I will tell them that I am a consultant with reasonable rates. And if they ever want to hire me or follow me or any of that stuff, these are all the ways that they can do that.

I usually also make a joke about making careful note of the fact that when you want to contact me, my name is Eric with a K. If you go look for Eric with a C, you might find the love of your life. You might, I don’t know, you might get married, you might have kids, you might have the greatest future that you could dream of, but it will not be with me. So, Eric with a K. Also, be very careful because whoever you meet, if their name is Eric with a C, they will likely be a sociopath because their parents spelled their name wrong.

So, be careful. Anyway, moving right along. I’m very funny in person. I told a very funny joke at the Pass on Tour event in the Netherlands. I told the Europeans that I reformatted all the material for a European audience by taking out the jokes.

They all laughed. The first time they’ve laughed. Crazy. Anyway, row goals come in two forms. There are ones that you may introduce, commonly with top, offset fetch, or FAST n hints.

And there are ones that the optimizer may introduce, commonly with not exists, exists, in, not in, stuff like that. You can think of a row goal as a bit of a short circuit for the optimizer. I don’t mean short circuit the way that it’s glitching.

I mean, sometimes you might think it is, but it’s not. But it’s sort of like the optimizer can say, hey, well, wait a minute. Normally, I would have to go get like 3 million rows and 10 million rows.

And I get to figure out a plan to get that many rows. But the optimizer can then be like, hey, well, wait a minute. That nice person at the end said, I only need to get the first 10 rows.

I don’t need to think of a way to get 3 million times 10 million. I just need to think of a way to get 10 from 3 million times 10 million. So we can kind of think about strategies a little bit differently about like there’s a quick, like probably maybe a faster, different way or different approach to getting 10 rows.

And there is some getting like 30 million rows, right? So that’s the first thing. You can also think of a row goal as sort of like a promise.

It’s like when you say, right, that only some number of rows will be produced or that you’ll run out of rows to produce, right? So you’ll either say, well, I want the top thousand rows from this expression. And SQL Server will be like, well, you look like a trustworthy person.

If you asked for a thousand rows, I’m going to go find a thousand rows. I’m not going to stop until I find a thousand rows or I can’t find any more rows. So you can kind of use them to fence things off.

If you use them in a table expression, be it derived or common. And you can often use that to force the optimizer towards a plan shape that you prefer. Of course, it does take some work to figure out what plan shape you prefer.

But once you find that plan shape, you’re golden. Now, one thing that it’s sort of important to establish up front is that there is a difference between a row goal. Say, I need you to hit this number of rows, right?

These rookie numbers — you hit this row goal — and a row limit, right? So the terms may feel interchangeable. But think about this.

Just about every other SQL dialect that you will come across uses the word limit instead of the word top. T-SQL is the only one that uses the word top as a row limiting sort of device. Everyone else says like limit whatever, right?

It goes to the end of the query, right? It’s not like select limit something. It’s like select your stuff limit 10 or 100 or something. So there’s a difference between like saying I only want to get 100 rows from this and something like the optimizer saying, well, I only need to find one matching row. An example of that would be like a semi-join, right?

Because semi-joins don’t need to get all the rows. We’ll talk about that more later though. But some examples. This is a row limit, right? You’re saying I want the top 100 rows of some columns from some table, ordered by some column, right?

Top without order by is usually frowned upon, but we’ll talk more about that later. So this is limiting the rows. This is limiting the rows and the results to 100 rows or maybe, I don’t know, if the table only has 99 rows in it, you’re kind of naturally limited anyway.

But we don’t often run into those tables. This is an example of a row goal. So when you say option fast one, you are not limiting the number of rows in the output, but you are telling the optimizer to devise a query plan as if it only needed to find one row.

This is set at the very root of the query plan. This and that specific wording, that pedantic wording will come in handy later. This is an example of, again, potential row goaling by the optimizer.

So when you say select something from table where exists something else, well, exists and not exists and in and not in, we often see them expressed in the query plan as a semi-join. The reason it’s called a semi-join is because it’s not a full join. And I don’t mean full join in the sense of like full outer join.

I mean that when you have tables that have one-to-many or many-to-many relationships and you join them together, the many’s need to be respected. When you have a semi-join, you don’t need the many. You just need to know if one thing is there or if one thing is not there.

Right? Either that row exists or that row doesn’t exist and you don’t need to find all the other matching rows if you’ve already acknowledged that a row is there or not. Right?
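The slide examples aren’t visible in the transcript, so here’s a rough reconstruction of the three cases, with placeholder table and column names:

```sql
/* A row limit: at most 100 rows in the results. */
SELECT TOP (100)
    t.SomeColumn
FROM dbo.SomeTable AS t
ORDER BY t.SomeColumn;

/* A row goal without a limit: every row comes back, but the
   plan is built as if one row were enough. */
SELECT t.SomeColumn
FROM dbo.SomeTable AS t
OPTION (FAST 1);

/* A row goal the optimizer may set on its own: the semi-join
   only needs one match per outer row. */
SELECT t.SomeColumn
FROM dbo.SomeTable AS t
WHERE EXISTS
(
    SELECT 1
    FROM dbo.OtherTable AS o
    WHERE o.SomeId = t.Id
);
```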

So that is an example of a row goal that the optimizer may set. Query optimization by the optimizer. Right? The query optimizer’s query optimization is largely driven by how many rows are expected to come from tables, survive, where and join clauses, get past group by and having clauses. And setting a row goal is a way to influence the optimization choices by the optimizer without having to use query or table hints and all that.

Because the number of estimated rows that SQL Server gins up during query plan exploration and all that stuff, or cardinality estimation and all that stuff, has a pretty huge impact on how things get costed. And of course, as we discussed during the Lost In Cost videos from last week — if you have not watched that playlist, you can if you want. It is not required material for this.

Completely separate material. Except for some of the things I’m going to say here. Query costs are estimates and nothing but estimates. You’ll hear a lot of very smart and knowledgeable people prove that they’re in the cool kids club by calling them query bucks. Right?

But the important thing to understand (again, if you watched Lost in Cost, this will sound terribly derivative) is that cost does not equal time. Cost does not measure speed or efficiency or anything else useful. Costs are all unitless metrics.

They are not durable performance tuning metrics. All right? Even in an actual execution plan, costs are all estimates. There are no actual equivalents for costs, derived after queries execute, like there are for other metrics.

Make sure query plans are turned on here because wouldn’t it be nice if there were a setting or a button you could press to just keep query plans turned on? Like maybe if in presenter mode, there were an option for like enable actual execution plans and just have them on for all your SSMS tabs. Sure would be great.

If only someone opened an Azure feedback issue on SSMS about that. That’d be cool. Anyway, if we look at this query plan (and technically this is on topic, because there’s a TOP (10) in here, so we do set a row goal and a row limit), if we run this query and look at this index scan, we will see all sorts of things that have the word actual next to them.

And we will see one thing that should have the word actual next to it, but does not. And then we will see some things that have the word estimate next to them. These things that have estimated next to them do not have an actual equivalent, right?

There is no counterpart to these where the actual costs are shown anywhere because they don’t exist. They are, they are not a thing. But these ones down here, well, you’ll, you will see actual stuff for those.

Now, what is important to understand is that costs are only how we got the plan that we’re looking at, right? All the costing stuff is how SQL Server figured out what plan it wanted to execute for our query. The optimizer did all its costing and compared choices, and you got the cheapest combination of choices, or the optimizer timed out and said, you get this combination of choices because I’m sick of thinking about it.

Not an uncommon thing. I’ve seen your queries. So you tell SQL Server which tables you want via the FROM and JOIN clauses.

You tell it which rows you want via things like the WHERE and ON clauses. I guess even HAVING would make sense in there too. I gotta fix that in post, I guess.

You tell SQL Server which columns you want via the SELECT, you tell it which columns you want to summarize in the GROUP BY, and you tell it what order you want your results in via the ORDER BY clause. So when I go to conferences, and then I talk to people, and then I go home, I usually tell the conference organizers, hey, I’d like to talk to some people who wanted to talk to me. Maybe you could give me this information, right?

Because my rates are reasonable, I just say, hey, just give me anyone whose consulting budget is over zero, right? If you have zero or negative consulting budget, we’re probably not a good fit. But if you have more than zero consulting budget, you could maybe work something out.

Could do a little shuffle for you. And of course, I would want to get this information. I don’t want to get all the information because too many, too many is too crazy.

But I just want like the top hundred people and I want them by consulting budget descending so that I make sure I maximize my input. But queries are just descriptions of what we want to see. And the optimizer goes and figures out how to best do it.
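The conference analogy above, written as a query; the table and column names are made up for illustration:

```sql
SELECT TOP (100)       /* I don’t want all the information; too many is too crazy */
    a.AttendeeName,
    a.ConsultingBudget
FROM dbo.Attendees AS a
WHERE a.ConsultingBudget > 0      /* zero or negative budget: probably not a good fit */
ORDER BY a.ConsultingBudget DESC; /* biggest budgets first */
```

The query doesn’t say how to find those rows, which index to use, or which join algorithm to pick. That’s the optimizer’s job, which is the declarative-language point coming up next.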

It’s impossible to escape people teaching you about SQL or databases without being told that SQL is a declarative language, whereas other programming languages are largely imperative or procedural languages, meaning that you tell the computer exactly how, when, where, and why you want things done.

And it goes in and follows your instructions. Whereas queries are a completely different world. Queries, we have to describe all these things to the optimizer.

And the optimizer has to do a good job of going and getting it for us. There are some corollaries in databases that sort of match that a bit. Indexes contain data.

Statistics describe data. Databases contain data. And we describe what data we want to see from that using our queries. Costs are really just a bunch of internal algorithms that SQL Server uses to shape and choose execution plans for us. And the hope, the grand hope, the big hope that we have every day is that those costing mechanisms are correct enough.

And that we have provided our optimizer with good enough information about the data contained in our databases in order for it to come up with a good plan to answer our question. Costing will consider, and of course, the important one is first. Costing considers rows, right?

Cardinality estimation. How many rows are going to come out of this table? How many rows are going to qualify after this where and join clause? What’s the data distribution that we have from our statistics?

Costing will also consider the type of I/O, because, you know, different types of I/O cost different things to the optimizer: random I/O versus sequential I/O.

It will consider how much CPU effort it thinks will go into something. Of course, parallelism may reduce the cost of CPU effort because you’ll have multiple CPUs working on something. And, of course, memory requirements will also figure into things.

Costing is a highly generalized computation. Again, it’s based on one very old, very specific piece of hardware. And the point of that hardware was, well, I mean, the point of that hardware was for someone to do their job, obviously.

But the point of the costing algorithms was to come up with a good execution plan on any piece of hardware. So if you have two cores and 128 gigs of memory, or if you have 128 cores and two gigs of memory, the optimizer will come up with the same plan for either one because it doesn’t think about much of that.

If you ever want to see the old piece of hardware that costing was done on, it is here. It is this thing, right? It is this cantankerous contraption.

There might be like a Where in the World Is Carmen Sandiego? CD stuck in there. There’s probably like an Encarta thing in here. Like it’s an ancient, ancient thing. Your servers at home, I mean, not your servers at home, but your servers wherever they live.

And I would say at work, but they’re probably in some lousy cloud somewhere. But they do not have a lot computationally in common with this old hunk of junk, right? This old plastic thing here.

Thank you for showing that. That was very helpful. So now you know all the questions that I’ve answered. Lucky you. Anyway, all plan decisions are based on these costs.

Some costs are fixed per unit, like CPU and I/O. And other costs are based on things like statistics and metadata: things like table size and histograms, either system-created or attached to indexes. Or I guess user-created statistics would be another one.

And of course, uniqueness. Describing uniqueness to the optimizer can be a very helpful thing. Some things can really help the optimizer make better-enough plan choices: useful indexes, targetable predicates, unique constraints, enforced foreign keys, value constraints, up-to-date statistics, limited query complexity, a goal we should all strive for.
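To make that list concrete, here’s a sketch of a few of those items as DDL; all object names are hypothetical:

```sql
CREATE TABLE dbo.Customers
(
    CustomerId int IDENTITY PRIMARY KEY,
    Email      varchar(320) NOT NULL,
    CONSTRAINT uq_Customers_Email
        UNIQUE (Email) /* describes uniqueness to the optimizer */
);

CREATE TABLE dbo.Orders
(
    OrderId    int IDENTITY PRIMARY KEY,
    CustomerId int NOT NULL
        CONSTRAINT fk_Orders_Customers
        REFERENCES dbo.Customers (CustomerId), /* enforced foreign key */
    Quantity   int NOT NULL
        CONSTRAINT ck_Orders_Quantity
        CHECK (Quantity > 0) /* value constraint */
);

/* a useful index with a targetable leading key column */
CREATE INDEX ix_Orders_CustomerId
    ON dbo.Orders (CustomerId);
```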

We yearn for non-complex queries, don’t we? And avoiding things that don’t have terribly good costing support. You know, XML, JSON, string splitting, built-in functions.

That means like LEFT, RIGHT, SUBSTRING, REPLACE, UPPER, LOWER, REVERSE, that stuff. User-defined functions, local variables, table variables. And of course, the list could go on.

But we do have to go home someday. Okay. Row goals. Oh, come on back here. There you go.

Zoomit. Row goals are just another thing that affects costing and the query optimization process generally in SQL Server. So we’re going to stop here because we’re about at the 20-minute mark. And the next section gets into the live demos and stuff.

So we have talked for long enough, right? We have gone through enough green text. Our eyes are now burning green and we need to go look at some execution plans. So I’m going to get this one uploaded and then I’m going to carry on from right here.

And I’m going to highlight that so I don’t forget because I’m probably going to go do something and then come back. All right. Thank you for watching.

Hope you enjoyed yourselves. Hope you learned something. And I will see you in the next video, which will be Row goals part two. Electric Goaloo.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 37




* Does the order of INNER JOINs in a view impact how the query optimizer chooses how to build the execution plan?

* Is there any performance gain when inserting into temp tables from using the WITH (TABLOCK) hint? Thanks!

* Hey Erik! Is it data or data? I’ve heard some people saying data but I say data like you, so I think I’m right.

* 1. In our high-volume OLTP fintech environment on SQL 2017 EE, we have poor DB design leading to increasing deadlocks and blocking (readers blocking writers). As a short-term fix, we’ve set low deadlock priority on readers and high on critical writers, to avoid even more NOLOCK. We want to enable RCSI but management worries about safety, especially inaccurate results from in-flight version store data that might roll back. Previous DBA enabled snapshot isolation on a few DBs, but devs likely aren’t using it correctly or understand isolation levels. We have ample tempdb space (rarely used, as devs prefer table variables). How can I prove RCSI is safe? Dev env is limited—only some DBs match prod, with slower HDDs/CPU.

* 2. About 80% of our tables (millions to hundreds of millions of rows) have outdated indexes (5+ years old), despite daily dev work. Queries mostly from EF Core; servers reboot weekly (working to stop), losing index stats—but workload is consistent week-to-week. Biggest issue: slow upserts/deletes on tables with 50+ child tables (no cascading). I propose starting with narrow compressed indexes on FK relationships to avoid table scans. Tried sp_indexcleanup, but low uptime limits results beyond compression. Thoughts on this approach and overall indexing strategy?

To ask your questions, head over here.

Video Summary

In this video, I dive into some pressing SQL Server questions from viewers, covering topics like join order in views and performance considerations for parallel inserts. I also tackle the age-old debate over “data” pronunciation and delve into issues of deadlocks, NOLOCK usage, and snapshot isolation. Additionally, we explore strategies for optimizing upserts and deletes on large tables with many child relationships, discussing indexing foreign keys to avoid table scans. Whether you’re a seasoned SQL Server pro or just starting out, there’s plenty here to help improve your database management skills. If you found this content valuable, please like the video, subscribe to my channel, and share it with colleagues who might benefit from these insights.

Full Transcript

Erik Darling here with Darling Data, and you know as well as I do, it’s Monday, and on Monday we do office hours, where I answer five of your most pressing, I don’t know, impressive, depressive questions, whatever you have going on. And everyone’s happy, and everyone lived forever holding hands, right? Anyway, if you look down at this page, the video description, there are all sorts of useful links for you to click on, and many of them will result in you giving me money. You can hire me for consulting, you can buy my training, you can become a supporting, and again, supporting with money, member of the channel. You can ask me office hours questions, which of course are free. You know, maybe I should charge like a dollar a question or something. Some of you ask a lot of questions and type in ways where I know it’s the same person. I’m like, ah, you owe me a buck. And of course, if you enjoy this content, please do like, subscribe and tell a friend.

I have some exciting conference announcements coming up, but of course I cannot announce them yet as they have not been announced publicly by the presenters. So you will just have to hold on to the very seat of your pants for those. But the one that I can talk about is past data community summit coming up in Seattle, November 17th to 21st, where Kendra Little and I have two days of the best T-SQL pre-cons you will ever see in your life. And with that out of the way, let’s go answer some questions. We will go to the magical Excel file.

And we will do our best to answer these. And the first one is: does the order of inner joins in a view impact how the query optimizer chooses how to build the execution plan? So theoretically not, right? Because the optimizer is a cost-based situation, and the costing model will look at your joins and, assuming that you get into the right search phase of things, it will start reordering your joins to figure out what the best way to do it is. Where that kind of falls apart sometimes is that SQL Server will set a budget for the number of steps that it will take to optimize your query, which includes join reordering.

And sometimes you’ll see what’s called an optimizer timeout. That timeout is not in time. It is in steps, budgeted based on the sort of heuristic cost of your query, based on complexity. And so SQL Server might not have time to reorder all of your joins. If you know the best way for your joins to be ordered, well, it’s in a view, so you can’t add a FORCE ORDER hint to that. But if you know the best way that you want your joins to be ordered, I don’t know, you could go ahead and write them in there.

And then if the optimizer does time out, maybe it’ll just listen to you, but probably not. So in general, no, but there are situations where, if you write them in some kind of order, the optimizer might time out before it has time to reorder everything. But you’re still probably not going to see joins executed in the order you write them in unless you have a FORCE ORDER hint on the query. I didn’t highlight that as I was reading it. I do apologize.
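A sketch of what that hint looks like from a query that touches the view; the view name is hypothetical, and FORCE ORDER can only go on the outer query, not inside the view definition:

```sql
SELECT
    v.*
FROM dbo.SomeJoinHeavyView AS v
WHERE v.SomeColumn = 1
OPTION (FORCE ORDER); /* preserves the written join order, including joins inside the view */
```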

I’ve changed the width of my Zoomit thing for various reasons, and it’s a bit intrusive at times. Is there any performance gain when inserting into temp tables from using the WITH (TABLOCK) hint? Thanks. Yes. Depending on local factors, you might find that using a TABLOCK hint will result in a fully parallel insert, in which the insert operator is within the parallel zone of the query plan.

If you don’t have a parallel query plan, then that won’t happen anyway. There are also things that will mess that up, like having an identity column, having a clustered index, having a primary key, having any nonclustered indexes, or referencing the table that you’re inserting into in something like a NOT EXISTS in the query. There are lots of things that will screw that up. If you need something that behaves like an identity column and it’s important to you to have a fully parallel insert, just use ROW_NUMBER instead.

No inhibitions on that one. So, in general, yes. But also, the degree of parallelism can have a big impact on how fast things are. There are some queries where a higher degree of parallelism will result in faster inserts, and sometimes in slower inserts. Of course, the sweet spot is generally somewhere around eight, but you’re free to experiment with the MAXDOP hint to see where that best aligns for you.
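A sketch of the kind of insert being described; dbo.Posts is a hypothetical source table, and the heap temp table plus TABLOCK plus ROW_NUMBER combination is what keeps the insert eligible for parallelism:

```sql
CREATE TABLE #posts
(
    n  bigint NOT NULL, /* identity-like value, without an identity column */
    Id int    NOT NULL
);

INSERT #posts WITH (TABLOCK) /* lets the insert operator into the parallel zone */
    (n, Id)
SELECT
    ROW_NUMBER() OVER (ORDER BY p.Id), /* stands in for an identity column */
    p.Id
FROM dbo.Posts AS p
OPTION (MAXDOP 8); /* experiment; around eight is often the sweet spot */
```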

All right. This looks like a very important question here. Hey Erik, is it data or data? I’ve heard some people saying data, but I say data like you. So I think I’m right. Yeah. Sounds good to me. Good job. I don’t know any other way to pronounce it. So there’s some alternative pronunciation of data out there. Well, this used to be a proper country, as they say. All right. “In our high-volume OLTP fintech environment…” You work for fintech and you’re asking me questions for free.

Oh, my God. The poor, starving consultant. Golly. I’m going to have to take a walk after this one. This is a paragraph.

My Lord. Why don’t you just hire a professional? Uh, we have poor DB design. Oh, gee. Is there a consultant around who could help with that? Leading to increasing deadlocks and blocking. Gosh, do you know anyone?

Is there a short-term fix? Oh my Lord. We’ve set deadlock priority high on readers. Sorry, I’m beside myself: we’ve set low deadlock priority on readers and high on critical writers. Why? Readers are always going to lose. They don’t take up any transaction log.

Like, the reader is always going to be the victim. My goodness. “To avoid even more NOLOCK.” Oh, gee. Is anyone out there in the world good at talking people out of NOLOCK? I don’t know.

The mind boggles. We’re truly puzzled. Is there a doctor in the house? Previous DBA enabled snapshot isolation on a few DBs, but devs likely aren’t… No kidding, they’re not. It’s getting real hard not to curse here.

They likely aren’t using it correctly or don’t understand isolation levels. We have ample tempdb space. Rarely used. Devs prefer table…

You! You are also wrong. Of course table variables use tempdb. What do you think they are? Magic? I have a million videos where I prove this point a zillion times. My goodness. Okay.

How can I prove RCSI is safe? Well, you’ve got snapshot isolation enabled for a few DBs. As you noted. Why not start having some queries there? Use snapshot isolation. Hmm. That’s a good one.

Only some… Oh, wait. Hold on. Is there… Is there more to this? I feel like there’s maybe more to this and I’m missing it.

Is there more? I feel like there’s something else going on here. No. All right. How do I prove RCSI? No, that’s the end of it. Yeah. Just… You could start by having some queries ask for snapshot isolation and use that.

That would be a good way. You know, that’s probably it there. Anyway. Okay. Yeah. Let’s…
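A sketch of that test, run on one of the databases where snapshot isolation is already enabled; the table name is hypothetical:

```sql
/* requires ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON,
   which the question says is already in place on a few databases */
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

    SELECT
        COUNT_BIG(*) AS records
    FROM dbo.Trades AS t
    WHERE t.TradeDate >= '20250101';

COMMIT TRANSACTION;
```

If reads like that return sensible results under your real concurrent write workload, that’s evidence row versioning behaves safely before flipping on RCSI database-wide.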

Oh, you are… You… You… You numbered your questions. You… You… Oh, Lord. If I cry on camera…

This’ll be why. About 80% of our tables. Millions to hundreds of millions of rows. Gosh. Does anyone have experience dealing with data of that size?

I don’t know. Of outdated indexes. Well, how does an index get outdated? Does it have an expiration date? Does it go bad?

Despite daily dev work? Well, your developers, if they’re the ones from question number one, you should probably just drag them into an alley and shoot them. Oh, queries mostly from entity framework core. God bless.

Servers reboot weekly. Mother. Okay. Biggest issue. Slow upserts and deletes on tables with 50-plus child tables. No cascading.

I propose starting with narrow compressed indexes on FK relationships to avoid table scans. Tried SP index cleanup, but low uptime limits results beyond compression. Yeah.

Indexing foreign keys is one of the least crazy things that you can do, at least in a database. Some databases are nice, and when you create a foreign key, they create an index for you. Indexing foreign key columns is not terribly controversial.

The thing to be explicit about here is that if you’ve already got an index that leads with the foreign key column or columns (there are some multi-key foreign keys; you must include all of those columns),

then you don’t need another index that leads with that column or those columns. You don’t need a specific index just for the foreign key, and naming an index FK-something doesn’t give it magic powers.

The thing to be aware of is that the optimizer is still free to choose the execution plan for your foreign keys. Right? Like, for foreign key validation.

You could have all the indexes in the world, and SQL Server could still choose a merge or hash join with an index scan on the inner side of the join for the foreign key relationship. Sometimes you actually have to add a loop join hint.

I’ve got a blog post about this, and also about how sometimes triggers are a more useful approach, because you can control the execution plan of triggers in a somewhat easier way. But since you’ve already got all those foreign keys, as long as you don’t already have indexes that perfectly express those relationships, you might as well try indexing them, sure, at your high-volume OLTP fintech.
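A sketch of the kind of index being proposed; the parent and child table names are hypothetical:

```sql
CREATE TABLE dbo.Parent
(
    ParentId int PRIMARY KEY
);

CREATE TABLE dbo.Child
(
    ChildId  int PRIMARY KEY,
    ParentId int NOT NULL
        CONSTRAINT fk_Child_Parent
        REFERENCES dbo.Parent (ParentId)
);

/* leads with the foreign key column, so deletes and updates against
   the parent can seek into the child table instead of scanning it */
CREATE INDEX ix_Child_ParentId
    ON dbo.Child (ParentId)
    WITH (DATA_COMPRESSION = PAGE); /* the narrow, compressed index from the question */
```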

All right. Well, I’ve got a bottle of four-year-old Willett Rye that is screaming my name right now. So we’re gonna go do that.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I’ll see you in the next video. Have a good one.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.