Batch Mode Sorts and Row Mode Indexes in SQL Server

Batch Mode Sorts and Row Mode Indexes in SQL Server


Video Summary

In this video, I dive into the intricacies of batch mode sorts and rowstore indexes in SQL Server, sharing insights from a recent client-facing issue that highlighted the challenges and nuances involved. We explore how forcing parallel plans can sometimes lead to unexpected results, such as the need for additional sorting even when an index perfectly aligns with query requirements. I also discuss the trade-offs between optimizing for speed versus memory usage, providing practical advice on when to use the `MAXDOP` hint to achieve better performance in specific scenarios.

Full Transcript

Erik dishwaskeling here with Darling Data. And today’s video, we’re going to talk about batch mode sorts and rowstore indexes. And hopefully I said that right as I’m reading it from the teleprompter in front of me. Great. So this was an issue that a client of mine had trying to like, like various parts of their workload, you know, row mode execution just doesn’t cut it. You need to get batch mode stuff involved. But sometimes, especially batch mode on rowstore is a tricky thing to sort of get happening in the way that you want it to happen. So we’re going to talk about a little bit of that. And we’re also going to talk about, of course, the little caveat with batch mode sorts and rowstore indexes. So we have much on our plate. Now, I did mention that this was a client facing problem. And, you know, I do have clients. I do have nice people who pay me for my time to do things for them and have a client-facing problem. Help them with things. If you would like to be one of those lucky people, there is a link down in the video description where you can hire me to do SQL Server stuff. It’s a crazy scheme, I know, but it works. Trust me. You can also, by training, become a channel member, ask me office hours questions. And of course, if you find this channel content just so groovy and ghouly that you want to make sure that it spreads far and wide like the herpes simplex virus, you can, of course, like, subscribe and tell a friend. Get me in that sweet YouTube algorithm or whatever.

Past Dita Community Summit. Oh, boy. Oh, it’s so close. Two days of T-SQL pre-cons. Me and Kendra Little. The entire event is in Seattle, November 17th to 21st. So you should be there. Hopefully you bought tickets and all that stuff at this point because we’re only a few weeks away and sometimes planning this sort of travel takes some time. You got to start early, right? My wife tell you when she books travel, it’s like a million years in advance. So I had chat. So like by the time this video publishes right now, it’s still October. By the time this video publishes, it will no longer be October. So I had said chat GPT. My birthday is in November. Can you give me a birthday themed drawing? Just like iterate on the one that we’ve been using and it left in some Halloween this right? We still have a ghost and a vampire thing, but it is a festive birthday and I am kind of a spooky, spooky person. So we’ve got, we’ve got a ghost and a vampire showing up to my party. So anyway, let’s go talk about the stuff that we wanted to talk about. See, I’m so spooky. I have a black background sometimes. All right. So I think I already did this, but let’s just make sure. Yeah. Wonderful. All right.

We’ve got an index. So what I want to show you first is, and I don’t want to run this like actually like at real time run this because it takes a long time. So just really what I just want to show you is the estimated execution plan for this. Now, this is a query where I am forcing a parallel plan. And since if you watched my video about what Microsoft is doing to create their vector disc and indexes, you should know that the enable parallel plan preference hint is used in that code.

So it’s safe for production now, right? Because Microsoft uses it in production, so you can use it in production. So now it’s safe. They’ve blessed it. So what I want to show you specifically here is that when this query executes in row mode, right? And if we, let’s buy ourselves a little bit more query plan real estate up here, we can generally visually infer that this plan ran in row mode.

Because like, like we have a repartition streams and repartition streams there, they don’t like none of the parallel exchanges support batch mode segment and sequence project aren’t used in batch mode. We get a window aggregate operator for windowing functions when we execute in batch mode.

The filter, of course, could be in batch mode. Actually, I mean, it probably would be if I didn’t disallow it. But and then the gather streams is, of course, another parallel exchange operator. So that can’t be batch mode either. So this is a fully row mode plan. But the thing that I want to point out here is that because we have an index, right? And if we scroll back up here, I should probably sell a little bit of my query plan real estate. We have a non-clustered rowstore index that not only fully covers our query, but supports the window function specification exactly right.

So user ID and sort descending, which is what we’re asking for in here, right? So because we have this, we don’t have a sort operator in here where SQL Server has to like, like, like basically rearrange data from the way it’s stored somewhere to the way that the window function needs it to create its row number, right? So we don’t have to do that there. Now, next is me levels like, you know, try like, hey, like, get that out of the way. Like that’s, that’s gonna, that’s gonna come in handy later. The thing about the sort. So keep that in mind. Now I’m like, okay, well, we want batch mode, right? We’re like, like, we’re gonna use our auxiliary columnstore helper table, we’re gonna do this left join to it. It’s got a clustered columnstore index, it’s got no rows in it, we’re gonna do this thing so that we get a batch mode thing. And we’re gonna say no, but we’re gonna use this now safer production query hint in there. The thing is, though, if we get the estimated plan for this, it’s the same as last time, right? And like, we zoom around a little bit, we’ve got row and like, you know, the like, none of this stuff can be in batch mode, like, like repartition streams, segment sequence project gather streams, but in our filter operator is also still in row mode.

Now this was kind of an interesting one. And what it comes down to for batch mode on rowstore in this query plan specifically, is batch mode on rowstore takes one look at the text column in the comments table and says, hell no. The text column is in VARCAR 700, right? So if we remove that from the query select list, right, we’re gonna specifically name our columns in here. And we’re gonna say no text column, right, we’ve removed it from existence. And we run this, right, we all let’s just get an estimated plan for this. Now we see a much more batch mode plan. Right? But we now we have a sort, we’re using that same nonclustered index, right? So the data from the index is in order. But when we read from this, we’re reading from it in batch mode. Okay. So let’s run the query. And let’s look at how this thing fares. So let’s start over here on the left. And we get a do do do do memory grant of 1418 megabytes. So that’s about 1.4 gigs, I think, depending on how you like divide if it’s by 1000 or 1024, you might, you might have a slightly different take on exactly what that is. But let’s just call it 1.4 gigs for the sake of 1400 megs. Yeah, it’s about 1.4 gigs there. All right, all right, cool, we did the math. But this query finishes in about 1.3 seconds. But the thing is that, why is this sort here? And it’s kind of an interesting thing, because this only happens in parallel plans, right? So if we come down here, where we’re going to do almost the same thing, except we’re going to force this query to run at max.1, right, we’re going to say, hey, and just keep in mind for this query, we didn’t even need this to get batch mode on rowstore.

If we check the estimated plan here, right, we’ll see that we do, we have a very similar plan, but without the sort, right? So reading from the row mode index, with a max.1 plan, we don’t have to sort data here, we can actually rely on the order that this data came from.

So, but if we run this, at max.1, okay, so this whole thing, you know, like it’s a fully batch mode plan. So each operator only has the operator time for itself. So we have 5.2, and then another 137 milliseconds, and then another 35 milliseconds. So let’s just say it was about five and a half seconds total. But the memory grant for this is 120, 1024 KB, which is like one meg. So this is an interesting thing. And this is, this is specifically just a product limitation for the SQL servers implement implementation of batch stuff. It can’t like use the, like in a parallel plan, where batch mode is used like this. It cannot trust the order that things come out of the index. And it can, the index is like the, it’s not the, not the, not that the index is out of order, but the order that stuff gets read in, in batches might not be true to like the, the sort thing. So like it might not maintain things correctly. So we still have to sort data here. So the question for you becomes, what do you want to optimize for? If you want to optimize purely for query speed, well, it’s probably fine to have the, the, this thing run in parallel and have a sort in it and use like 1.4 gigs of memory.

If you want to optimize for memory usage and you don’t, and you’re willing to sacrifice some time for that, you can of course force the plan to run at max.1, not sort data and use less memory. But this was a very interesting thing. And I had to do a little bit of research on the batch mode sorting to make sure that I was correct in this. So, so I talked to some nice people who, who worked with the product and they said, yeah, that’s true. And I said, yeah, that’s great. So yeah, batch mode on rowstore, using windowing functions, when you get a parallel plan, right? Even if you have an index that perfectly puts your data in order for the windowing specification, the parallel plan sort of negates all that and you will still have to sort things. When a serial plan, right? You do not, like you see the, the can trust the order coming from getting stuff from the index and you do not have to sort things.

Reading from big tables with a single thread, even in batch mode does take more time, right? 5.2 seconds versus 829 milliseconds. So really depending on what you want to optimize for, you might choose to hint max.1, get rid of the memory grant, take a little bit longer, or say, screw it, use the memory, use the, use the extra threads, finish as fast as you can.

Anyway, that was about it there. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I’ll see you in tomorrow’s video where we will talk about, I don’t know what kit. Well, maybe I’ll just make it up as I go as usual. All right. Thanks for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

The Geometry of Performance Issues

The Geometry of Performance Issues


Video Summary

In this video, I delve into an experiment I’ve been working on to visualize performance issues in SQL Server’s query store using spatial data types. I explore how adding shapes based on metrics like CPU and duration could potentially help users better understand the relative impact of different queries. This is very much a work-in-progress—far from being production-ready, it’s more of a test to see if this approach holds any value for others. I’d love to hear your thoughts: do you think visualizing performance issues in this way would be useful? Or does it just add unnecessary complexity? Your feedback will help me decide whether to continue pursuing this idea or focus on other areas.

Full Transcript

Erik dishwaskeldarling here with Darlene Data. And in today’s video, so I’ve been tinkering with some big ideas lately, right? And one idea that I had was to try to use, so like SQL Server for years has had this spatial data stuff in it. And one idea that kind of occurred to me was no one ever uses this. I wonder what it’s all about. So what I wanted to do was try to attempt to add some shapes to query store. So that when you’re looking at the results, you might be able to infer a little bit of the sort of sense of proportion of performance problems that one query might be having or causing. So we’re going to talk about the geometry of performance issues in this video. And this video, honestly, is a little bit of market research because I would love to hear if you think this is something that would be useful to you and something that I should pursue working on. Right now is just sort of something that I’ve been testing out a bit and marinating on. It is nowhere near complete or what I want it to be. But I think you might be able to see a little bit of kind of where I’m going with it. So I did want to talk about it to sort of figure out if anyone would care that it’s there.

Or if you would just stare at it, you would just see a new tab show up called like spatial results and scratch your head and walk away. So we’re going to do that in this video. Down in the video description, if you would like to hire me for consulting, buy my training or support the content on this channel, you can do all three of those things there. Those do require giving me money. You can also do things for free like ask me office hours questions. And of course, like subscribe and tell all your friends. Hopefully your friends were not affected by the recent AWS outage. So we do, we do, we do pray for their AI souls if they were. Past Data Community Summit coming up in Seattle, November 17th to the 21st.

Kendra Little and I banging out two days of the most magnificent T-SQL pre-cons that have ever been endured on this planet. So I do hope to see you there. I will have all sorts of neat gifts to give you, to give those who show up. If you don’t show up, beat it. I don’t have no use for you. Anyway, I still haven’t, I still haven’t changed this. I still haven’t quite thought of a good November theme. My birthday is in November, so maybe it’ll be my birthday month theme. Maybe I’ll do that. That sounds nice, right? Cool. Anyway, let’s talk about what I’ve been doing over here. So apparently I just left that up.

So what I’ve been trying to do, and I just stapled this on to the very end of the last Quiky and Quickie store. This is by no means production ready code or what things will look like when they’re done. I’ve got a thoughtfully named column here called G, which attempts to draw a polygon using geometry in the ST Geom from text. And just to start off, I decided to use total CPU and total duration.

I really, in order to get the shapes to be visible, I really had to jack up. I had to do some math on the count of executions for this, because otherwise the squares were like this big, which wasn’t very helpful. And then I thought that, you know, maybe I would look at some slightly different metrics if what I was troubleshooting were a parameter sensitivity issue.

So I’ve got another column down here called P. I didn’t format P quite as nicely as G. Let’s fix that on the fly. There we go. And this one looks at some slightly different metrics, but I still have to really jack up the count of executions locally in order to get things to sort of behave the way I want them to. But anyway, when we run a query store now, and I’m going to just do this from scratch so you can see what happens.

We get some results back, and I haven’t set up to look at like a specific parameter sniffing thing or anything. I just want to show you kind of like what the general layout and stuff is, as well as some of the sort of limitations on spatial results that I can’t do much about. So we get this new tab back up here, right? It’s called spatial results. It shows up between results and messages.

And by default, it will go to the first spatial column. You can see over here, that is the column that I helpfully alias G. And if we want to change the label, we have to do that manually. I cannot currently change. I cannot currently set up a default label. Now, what’s annoying is when you first choose the label column, it doesn’t actually label anything.

But if you just click on the zoom thing, then it shows up. So that’s kind of interesting. But then now we can at least sort of see the queries that are in here. Another thing that I haven’t quite figured out yet is the stacking, right?

So like all these shapes sort of like blow together and they don’t maybe help differentiate things as well as they could. There’s also some that are clearly missing labels, right? I think the labels just get hidden if the shapes have a shape over them or something.

I’m not really sure how that works, but I don’t know. Like some or maybe I don’t even know where this one is supposed to be. And maybe this one was supposed to be here. I honestly don’t know what’s going on in spatial world.

But it is kind of cool that if you hover over, well, of course, it’s going to disappear. If you hover over one of the shapes, you get some information back. One reason why I might need to separate this out into a different result set is so that I can sort of choose the columns that I want here.

By default, it just lists out the columns and the ordinal position of the table, which is not the most helpful thing in the world. So I’m probably going to need a second result set if I want to show more pertinent information in here and not have like the beginning of the query plan XML showing up. Because why on earth would you need that?

Right. But not very helpful to anyone. So the other thing that I can’t do or maybe maybe I can, but I haven’t quite figured it out yet is a way to label the accesses on this thing. Right.

Because we have these two lines of numbers going up and down and across. But what are they? Right. Well, for this, it’s total CPU and total duration. Right. Which, you know, I know because I did it. But if you ran this and got this back, you would say, what are these numbers?

What do they mean? I don’t get it. Right. So maybe figuring out a way to label things and there would be helpful as well. But if we change the spatial column from G to P, we will get some slightly different squares back.

We still maintain the query text label, which is very helpful. And we still get back, I guess, depending on precisely where we click some, you know, the same sort of pop up with the columns that are in there. This block is apparently forgotten to be labeled as his.

Well, I mean, probably the one hiding behind this one has forgotten to be labeled as well. So obviously some limitations, obviously some stuff in here that needs to be like worked out probably mathematically. I’m not quite sure how to do that yet.

I’m also not quite sure how far to sort of pursue this. Given some of the limitations and some of the little annoyances, it might just be more confusing than it is helpful to a lot of people. It might have to be like an optional thing that gets sent back.

Like if you are like cool with however this works. And then, of course, there’s also this sort of dynamic bit of, you know, figuring out, you know, what metrics to draw the squares by. If the count of executions is sufficient to draw a square where the size, you know, is visible on the screen.

So there’s like some stuff that really has to like, you know, get, you know, figured out at first. But I do want to sort of get a general sense of if you, the community, the people who watch these videos, hopefully the people who use my store procedures like SP Quickie Store would find any of this stuff at all useful when troubleshooting performance things.

Like the query store GUI itself is a godawful disaster. And whoever designed it has clearly just a lot of disdain for humanity and user experience. But they do have colorful graphs, right?

They do have charts and graphs. There are some visuals there that people find helpful. The one thing that Quickie Store doesn’t have going for it, pretty pictures. So I figured I’ll try to draw some pretty pictures.

But there we are. Some things that I don’t want to do are leave people with needing to export data somewhere else, right? I don’t want to have to copy and paste data.

I don’t want people to have to like have like Grafana or Tableau or Power BI to go do this stuff because it’s an extra step, right? Most of the time when I’m working with clients, the last thing I need is an extra step with a dependency on it that might not exist. So what I want to do is just have something that works in SSMS, gives you some feedback on the proportion of a query’s performance impact, and gives you a way to sort of like judge based on whatever metrics you care about, like what stuff you should be dealing with.

So if you like it, let me know. If you don’t like it, think it’s stupid, say, Eric, stop wasting your time on these fanciful feats. Go back to work.

Go spend time with your wife and kids or something. I don’t know. But feedback is requested. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I hope that you will give the feedback that I have requested. And I don’t know. We’ll figure it out from here, won’t we? All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Introducing sp_QueryReproBuilder: Quickly Get Executable Query Examples from Query Store

Introducing sp_QueryReproBuilder: Quickly Get Executable Query Examples from Query Store


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 40

SQL Server Performance Office Hours Episode 40



Questions:

  •  I know that you praise indexed views as the best alternative to columnstore in Standard Edition. If their tragically limited syntax supports my query, how do you feel about using them for performance improvement on Enterprise Edition?
  •  You mentioned you like select into #tmp and creating the indexes after. I find not having control over the nullability of the column is a bit problematic if you want a primary key after. Is alter table #tmp add unique cluster sufficient? Or do you have any tips on controlling the nullability, even isnulling it doesn’t seem to help sometimes.
  •  What is your choice of background music when working away on T-SQL?
  •  Had any surprises, good or bad, with Accelerated Database Recovery?
  •  Hi, Erik! Can scalar UDFs be sniffed the same way as stored procedures?

To ask your questions, head over here.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

PASS Data Community Summit SWAG-A-THON!

PASS Data Community Summit SWAG-A-THON!


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Follow Up On Fixing Parallel Plan Row Skew In SQL Server

A Follow Up On Fixing Parallel Plan Row Skew In SQL Server


Video Summary

In this video, I delve into the issue of parallel skew and share a practical solution to address it in SQL Server queries. Parallel skew can be a tricky problem, especially when dealing with highly skewed data distributions that lead to inefficient thread utilization. I demonstrate how creating an artificial distribution by using a `VALUES` clause and joining it back to the main table can help achieve better thread distribution while also making the query more batch-friendly. This approach not only improves performance but also showcases the importance of considering parallelism in query optimization, particularly when natural solutions are insufficient. Join me for the next video where we’ll explore other aspects of SQL Server query optimization!

Full Transcript

Erik Darling here with Darling Data, and in today’s video, we are going to talk about how to fix parallel skew, or rather how I fix parallel skew for a particular query. All right. I can’t pretend this is going to be the thing that you do every single time you have this problem, because this is hopefully a rare problem, at least in today’s day and age, but you can still run into it plenty of times. Lord knows I do. And I find it interesting, and I’m going to talk about this. I’m going to be a quick-enough to talk to you about, and I’m going to be able to tell you a bit about it. And I’ll talk to you about it. So, if you find this interesting, well, stay tuned. Down in the video description, all sorts of helpful links. Perhaps you have lots of parallel skew, and you would like a young, handsome consultant with reasonable rates to come fix it for you. You can find all sorts of helpful links to do that. You can also buy my trainings. You can also support this channel with money, not just with various forms of internet applause. You can ask me office hours questions. That is a totally free endeavor for you. And I will answer five of those every Monday. And of course, if you enjoy this content, please do like, subscribe, tell a friend or two, tell your AI girlfriend who hopefully wasn’t too terribly affected by the recent AWS outage. We pray for her well-being.

I will be at Pass Data Community Summit in Seattle, November 17th to the 21st, giving two days of the best T-SQL pre-connery that has ever existed with Ms. Kendra Little. And then, I don’t know, well, hopefully I can talk about my other upcoming stuff soon. It’s just, people are very slow to announce things. Apparently, they just like to wait for excitement to boil over. Anyway, let’s talk about this parallel skew stuff. I’m going to make sure I go to the right copy of Management Studio here. So, let’s start off on this tab, because this tab is going to show us why we have such skewy problems with these.

So, what I’ve done with the votes table is something kind of mean. And so, to give a little background on this, when you have very skewed data, that can lead to very skewed parallelism. And in the public version of the stack overflow database, swallowed a bug. Most of the voting history has been anonymized, of course, because voting is a private endeavor. It should not be made public.

And so, what I’ve done is I’ve made a copy of the votes table. And I’ve set the user ID column equal to whatever the vote type ID was. So, vote type ID is a number from like 1 to 8 or something that has various types of voting that people can do on questions, comments, and answers, and whatnot. Well, I guess not comments in this table. But, so what I’ve done is I’ve effectively made a very skewed user ID column.

And I’ve created an index and all this other good stuff. But, this is the way that the user ID column breaks out. So, for vote type IDs between 1 and 4, these are the most common vote types. Well, actually, three of them are very common. One of them is not terribly common.

But, vote type of 2 is an upvote. Vote type of 1 is accepting an answer as the answer, the little green checkbox. And a vote type of 3 is a downvote. So, sorry, upvotes, accepted answer, downvote if I have my voting history correct here.

And 4 is like offensive or something. So, not too many offensive flags. Because Stack Overflow is a place for friends. It’s also going the way of MySpace, a place for friends at this point.

But, at some point, I’m going to be giving demos on Stack Overflow. And someone’s going to be like, what is Stack Overflow? I’ve never heard of it. It’s going to be funny.

Anyway. Anyway. Yeah. So, this is the way that these integers spread out. And these are the numbers for those. Because we have a group buying account and whatnot.

So, 37 million, 3.7 million, 3.5 million, and 733. All right. So, when we run a query that does, like this, that does this outer apply and whatnot. And we look at the query plan.

You may remember this from the original video that I did. And we come over and we look. I mean, not there. The tooltip is utterly useless for discerning what our problems are here. If we look at the way that rows were distributed across our DOP 8 plus 1 threads, you’ll see that they all ended up on thread 3.

Right? That number there would be the total count of the 37 million plus the two around 3.5 millions plus the 733. They all ended up.

So, the parallel page supplier runs this hashing function. And depending on how things hash out, that’s how the rows get distributed to threads. It’s dealing out all these rows. And unfortunately, our hashing function kept hashing these things out this way.

Now, sometimes when you do this, you can set DOP to different numbers, like especially odd numbers, in order to get the different thread distributions. But I tried that for every DOP between 2, which got me a serial plan all over again. So, theoretically, in reality, it was really between like 3 and 16 in order to get a different spread here.

But they all ended up, I mean, not always on thread 3, but always all on a single thread. And you can sort of visualize how this happens a little bit differently if you rewrite the query to turn this from one seek into, I guess, four different seeks. So, if you do one select for vote type ID 1, one select for vote type ID 2, and one for 3, and one for 4, and so on.

And then you get the dense rank over those. Then you get a slightly, well, you get a very different execution plan, but it helps you to sort of visualize things a little bit, right? So, if you look at these, we get one, two, right?

You can tell that’s 37 million, or sorry, I think that’s, yeah, that’s 3.7 million. That’s 37 million, probably. That’s 3.5 million.

And there’s our 733, right? So, we can see all four of the individual seeks when we do this. And if we go look, again, not at the tooltip, which keeps really making its presence felt here. We’ll see all of the rows end up on thread 1 here, right?

Which is, you know, not great still. And if we look down here, we’ll see all 37 million rows, well, they all ended up on thread 1. And then down here, there’s 3.5 million rows all on thread 1.

And over here, there’s 733 rows all on thread 1. So, all of these ended up hashing out to the same thread. Right?

Just in different seeks. And then if we come over about here and look, we’ll see thread 1 gets all 44 million some odd threads, rows on one thread. So, things do not end up better when we do this.

SQL Server does not like, unfortunately, SQL Server does not like spread those out better or do anything else smarter. It all just sort of ends up a complete washout. Now, the one way that I was able to rewrite this query in order to get better thread distribution was to do this.

So, rather than correlate on the four vote type IDs that I cared about, remember, because in the original query, what I had done was written where vote type ID is between 1 and 4 here. Right? So, we kind of narrowed it down.

We kind of forced ourselves to have a crappy parallel distribution on this one. But it’s a lot easier to do that intentionally than it is to get it to happen by accident. So, we kind of forced the issue on that one.

But what I was able to do to get this one worked out was to sort of nest things a little bit and put the numbers 1 through 4 into a values clause. Right? And call that vote type.

Alias that column that comes from these as vote type ID. And then left join the select here and join that on vote type ID there and say where user ID equals user ID out here. Right?

So, oh, it’s still logically a correct query. And when we look at the query plan for this one, this is kind of at the intersection of fixing parallel skew and being batch mode friendly. Because what you’ll notice in this plan, which is going to be different.

So, like, just to come back to these a little bit. You see we have a parallel nested loops here. And for this one, we have a parallel nested loops here as well, if I’m remembering correctly. Ah, I was.

Wonderful. Yeah. So, parallel nested loops. Good, good, good. Everything correct there. Now, we get something different. We get hash joins. Right? We have two hash joins instead. And if we look, we have this constant scan.

So, this constant scan is going to be the four rows, the numbers one through four that I put into that values clause. And if we look, I mean, there’s obvious signs that we got batch mode, like the window aggregate here. The important thing is this query finishes in about four seconds.

And the other queries over here, this one finished in 29 seconds. And this one finished in 59 seconds. So, I guess writing the four seeks out as multi-seeks was at least somewhat helpful by about 20 seconds.

But probably not where we want things to be anyway. But this query, if we come over and we look at the index scan over here, again, not the tooltip. We’ll see that the rows ended up hashing out very nicely on this one.

Right? I mean, there’s a little outlier here. Like, this is maybe not some, maybe not perfect, but way better than all 44 million rows ending up on a single thread. So, this is sort of at the intersection of fixing parallel skew and writing batch mode friendly queries, right?

Somewhere in there. But, yeah. So, you know, obviously, this is happening in batch mode. If you look up there, this sort will be happening in batch mode.

And since this sort is a child aggregate of this batch mode window aggregate, this sort operator is a child operator of the batch mode window aggregate. There we go. Batch mode sort, blah, blah, blah, blah, blah.

Then the window aggregate can read from the batch mode sort using parallel threads. Usually, the producing side of a batch mode sort will be single threaded, but not when there’s a window aggregate involved. And so, we get batch mode pretty much across this whole thing.

And this is exactly what we want. It’s just batch, batch, batch all the way down. And we get a much faster query with much better thread distribution. So, when you’re dealing with parallel skew, particularly like the kind that you cannot resolve naturally, maybe by forcing a different dop.

Again, particularly odd number dops tend to fix stuff like this. But it didn’t happen for me here. So, I was forced to rewrite the query in sort of a weird way.

But, you know, I’ve used the values trick in other contexts before in order to get different query transformations from things. Usually, you would take values and cross-apply to them. But doing cross-apply here, or it would be outer-apply because it’s a left join.

Using outer-apply here produced the same skewed parallel nested loops plan as before, which didn’t buy me anything. But doing the join, that means you can’t correlate inside of this. You have to do it outside of this.

So, you actually force SQL Server to do all this stuff and then join onto it here and then correlate to the user ID that gets produced here after that. So, it was a much more efficient use of threads and times. And, of course, getting batch mode involved was a very big help in making this thing go much, much faster.

Because we are processing a lot of rows, right? 44 million rows. You want batch mode for that many rows. That’s a good number of rows to batch up and not deal with on a row-by-row basis.

Because internally, that’s what row mode is. One row at a time. And batch mode is, of course, multiple rows, SIMD instructions, CPUs, and all that. So, this was just a fun way to rewrite this query to, A, fix the parallel skew, and as a happy side effect.

A very happy side effect. There are no fiascos, just happy screw-ups. I’m going to skip the alliteration on that one.

But this is just a happy accident that batch mode also got involved when I changed the format of the query. Anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. And I will see you over in tomorrow’s video where we will abandon all parallel skewness. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

How Many Plan Variants Can You Get With The Parameter Sensitive Plan Optimization In SQL Server?

How Many Plan Variants Can You Get With The Parameter Sensitive Plan Optimization In SQL Server?


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Faster Better Wronger

Faster Better Wronger


Video Summary

In this video, I delve into an often-overlooked aspect of query tuning: ensuring that the results are correct and sensible. Erik Darling from Darling Data highlights how optimizing a query for speed can sometimes lead to incorrect or nonsensical results if the underlying logic is flawed. He demonstrates this with a real-world example, showing how a poorly written query can produce absurdly large numbers due to incorrect joins, leading to misleading data. The video walks through fixing these issues by rewriting the query and emphasizes the importance of validating both the correctness and sensibility of the results, especially when working on complex queries that affect multiple tables.

Full Transcript

Erik Darling here with Darling Data. And we’re going to talk today about an important aspect of query tuning that does not get enough attention. And that is making sure that the, not even necessarily, I mean, of course, necessarily the results of the query that we made faster, like making sure that those are correct. Like maybe comparing those to the results of the original query, but also making sure that the results of the original query that doesn’t really make any sense whatsoever. Because despite all of the good intentions that exist in the world, people may write queries with incorrect logic. They might write queries in a way that does not return sensible data. We all need sensible data in our lives. I’m not talking, of course, about no-lock hints, which only dangerous and irresponsible people use in their production queries. We’re talking about just general logic and correctness. So we’re going to talk about that a little bit today.

Down in the video description, you will see all sorts of helpful links. You can hire me for consulting by my training, become a supporting member of the channel, be an official Darling Data supporter. And you can also find links to Ask Me Office Hours questions, which are, of course, free. Yay! And then, the usual, if you enjoy this content in any way, shape, or form, however minor, please do like, subscribe, tell a friend so I can get a bigger YouTube or something.

Okay. Seattle. I’ll be in it. November. With Kendra Little doing two days of the most fantastical T-SQL pre-cons that the world has ever known. I’ll have all sorts of the swaggerific new stuff there, stickers and t-shirts and whatnot.

And I will also be giving all attendees access to all of the T-SQL training. So you’ll have that going for you, too. So think of all the wonderful things you’ll get for free just by showing, by having your employer pay for you to show up. Right? It’s a good exchange. This is absolutely bribery.

So let’s go talk about query correctness stuff. Now, that’s the wrong thing. Cool. All right. Let’s pretend that didn’t happen. And let’s go over here and let’s look at this query. This query is slow. Right? If we look at the execution plan for this query, it ran for one minute and 23 seconds.

Not a good time that right. Took a long time to run. We can see all sorts of strange things happen in this query that we don’t like. Right? All sorts of things that we would never want to see. Right? We have bad estimates and we have long running nested loops joins because of those bad estimates.

And everything just goes to hell in a handbasket. But what really goes to hell in a handbasket are the results. Someone wrote this query here. Now, we could do a bigger top here, but we would be here for a long time.

This took a minute and 23 seconds for one user. If we had more users, well, by God, by the time this thing finished, I’d probably be packing it up to go to Seattle, which is in about a month now. Actually, exactly one month now. I should probably book that travel.

But someone wrote this then, you know, we get the top one user and we select from that CTE and we join to posts and then we join to comments. But the way that these joins occur, we have this sort of incorrect multiplicative effect on things. Right? Look at the numbers we get back. John Skeet, despite being an absolutely prolific human being, does not have 236 million total posts and comments.

And these posts and comments scores are also very large numbers. These are like almost foreign phone number long. Right? They are crazy. So how did this happen? Well, let’s say that we wanted to rewrite this query in a way to make it faster. Right?

That would also sort of show us why we have this bad multiplicative effect on things. Right? And why we got such large numbers. So if we rewrite the query like this, right, we’re going to do the same thing here and we’re going to, but we’re going to separate the join to posts out. We’re going to pre-aggregate that.

And then we’re going to cross join to comments. Right? And we’re going to correlate the cross join to the, the user’s CTE down here. So let’s run this query and let’s, let’s, well, first we’re going to note that it is much faster. Right? That finishes instantly, not in a minute and 23 seconds. Good for us. And we get the same results back.

But the only way that we can get the same results back here is by giving like, again, the multiplicative effect. We would have to multiply post score by the total number of comments. And we would have to multiply the comment score by the total number of posts.

And when we would have to multiply the comment count times the post count in order to get the numbers to match. And that’s probably not right. Right? Because we know John Skeet does not have 236 million posts and comments.

And we can validate that. Right? If we go back to the source data and we say, hey, John Skeet, how many posts and comments do you have? Well, these are, these are the numbers that live in the actual tables. Right?

We have about 331,000 post score. We have about 14 and a half thousand posts. We have the comment score of 46,000 and we have a comment count of about 16,000. So like when you look at all this, you might wonder, well, how did we end up with those crazy numbers? Right?

And it’s really because of the way that the two inner joins work. Right? They’re both one to many relationships. And when we join the one to many relationships together, we get many more relationship. Right? Like if we come back to the original query, we joined, where was it? Oh, how far? Oh, those are indexes.

There we are. We joined that CTE to posts. Right? And on owner user ID to ID, that’s a one to, that’s a, well, I guess then backwards, that’s a many to one relationship. But then joining to comments, that’s also a many to one relationship. So we really get sort of this, I don’t, I don’t want to use the word Cartesian product, but I do want to say it sort of explodes the results in an incorrect way.

Where joining posts and comments on these things in the same sort of set of joins, it blows the results up and gives us incorrect numbers. Right? That like these sums and counts here, we’re just way, way, way out of proportion. So if we wanted to rewrite the query in a way to give us just what we wanted back, you know, we could, we could rewrite it like this. Right?

And we could just stick the select top one and join to the pre-aggregation on posts here and then join to the aggregation on comments here. And this would not explode the results. We would still have to add post count and comment count together in order to get the total like posts and comments. But that’s not so big a deal. And this number is correct. And these numbers match what we had in the source table for that one user.

We could also just have taken our original query and not blown things up with the multiplicative effect that we saw from the, the two jo- the two joins that we did that exploded our results earlier. And we would get, of course, correct results back from this. So when you’re writing queries like this, it’s so for me, as a consultant, it’s hard.

And this is why I do like prefer to like do things with like on zoom with like people who have more domain knowledge. So like when we’re reviewing results of things, we can sort of like, you know, like sanity check stuff on our own and be like, like, I can say, hey, like, I don’t know this data very well. You probably know it better than I do. Is there any way that like these numbers are accurate?

Cause this looks funny to me. Like, like, like, you know, like I usually like on a sales call, someone would be like, we’re a blah, blah dollars a year company. And I’m like, cool. Want to pay me more? Like, oh, you, so you have money. Great.

So like, but you know, like then like you might see some reports come back and like, you know, you’ll see like one customer has spent like $10 billion in the last month. And you’re like, well, you sure? Cause that sounds, sounds a little wrong, but these are important things for you to spot check and validate as you are making queries faster. Is that the, not only are the, are the results correct in that, like, maybe like they match whatever previous thing, but also that the results make sense.

Right. Cause you don’t want to give people data that is not sensible back. Right. Doesn’t do you any good to write a faster query that still gives you crappy results. Anyway, that was all I had to say today.

All right. Mind your data. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I think this is dropping on a Friday. So I will see you next week with the office hours, Monday video.

All right. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 39

SQL Server Performance Office Hours Episode 39



Questions:

  • What are your favourite alternatives to alerts for Page Life Expectancy? I know that I have memory issues, but alerting on high I/O waits seems silly.
  • You mentioned you often prefer select into and adding the index later for parallel inserts. Not having real control over the nullability is causing me issues with creating primary keys after select into. Is a unique clustered basically the same thing?
  • You look like you’re bulking up again. I thought you were dying. Are you on steroids? What is your routine?
  • What are your lest favorite things to see in a query plan?
  • I forgot to ask how much protein you ate with steroids

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five user-submitted questions for free during a live office hours session. We tackle topics ranging from alternative methods to monitor page life expectancy when dealing with memory issues, to the pros and cons of using `SELECT INTO` for parallel inserts and adding indexes later. Additionally, we explore what I consider less favorable elements in query plans, such as eager index spools, unnecessary scans, and parameter values embedded directly into the query text without being reflected in the plan properties. The session was packed with valuable insights and practical advice, making it a great watch for anyone looking to improve their SQL Server skills. If you have any questions or topics you’d like me to cover next, feel free to drop them in the comments below!

Full Transcript

Erik Darling here with Darling Data and today we’re going to do office hours in which I answer five, 1, 2, 3, 4, 5, that’s this many fingers, user submitted questions for free, right, that’s a whole point I guess. Down in the video description if you would like to ask a question there’s a link down below where the fingers are pointing down below just look, look down, right, my links are down here pal. And you can also find links there if you if you would like to hire me for consulting, buy my training, become a member of the channel in a paid way to support my endeavors here. And of course, the usual, if you like this blah, blah, blah, like, subscribe, tell a friend, you know, that, that. This Seattle, I’ll be in November with Kendra Little teaching two days of the finest T-SQL pre-cons you have ever seen. I will have all sorts of new swag-er-ific items for you, T-shirts, stickers, training materials. So there is no reason for you not to show up, is there? No, you can’t, you can’t give a good re-, you can’t give me a good reason for that. So, let’s answer these questions, right, let’s do, let’s do this thing, cuz I guess I promised you I would. Uh, what are your favorite alternatives to alerts for page life expectancy? I know I have memory issues, but alerting on high IO weights seems silly. Well, if you have memory issues, there are certainly going to have high IO weights, so why is that silly. Um, alerts? I don’t know, like, I, I don’t, I am not in a situation where I receive alerts from servers, so I don’t think a lot about what I, what I would, uh, what, what, like, alerts I would send myself, uh, in, in the course of the day. Um, you know, for me, it’s, it’s more general monitoring. Um, you could look for spikes in page IO latch SH or EX weights. Those would be worth, uh, worth monitoring.

I think for me, uh, if I was sure that I had memory issues. Um, I also may want to alert on, um, when a significant amount of memory goes towards query memory grants, perhaps even when resource semaphore weights happen. Uh, those are things that I would rather alert on than PLE, cuz PLE is a stupid dump. All right. Uh, oh, I should have, I should have highlighted that question. There we go. Yes. Um, anyway. Um, these are usually how memory issues manifest themselves. So, uh, that’s, that is what I would keep an eye on. Uh, uh, uh, uh, here we go. You mentioned you often prefer select into and adding the index later for parallel inserts. Not having real control over the nullability is causing me issues with creating primary keys after select into.

Uh, is unique clustered index basically the same thing. Uh, so to, to hopefully remedy your first situation, if you wrap whatever column you wish to not be null, uh, in your select list in the is null function and give it an appropriate replacement, uh, your, the resulting column will, uh, have a not null, uh, thing to it. So use is null. You can’t use coalesce for it. Coalesce doesn’t work, but is null. Uh, if you wrap your column in that, the result will be a not null, uh, not nullable column. Um, is unique clustered index basically the same thing? Yes. Aside from the fact that a unique clustered index will allow one null value in there.

So, uh, depending on how you prefer to approach this, uh, you could add a unique clustered index. If you’re sure there are no nulls in there, or you could also wrap the column in, uh, is null and have a better time. Okay. Uh, come on, cursor. Where are you? Oh, you’re over there. Ah, there we are. All right. Uh, you look like you’re bulking up again. I thought you were dying. Are you on steroids? What is your routine?

Uh, all right. Well, um, I guess thanks for noticing. Uh, I, I, I, I have, I have been, uh, a little bit better about, well, actually I’ve always been good about the gym, but, um, um, I, I have been good about, um, I guess more, uh, progressive training. Um, I am not on steroids. Uh, I, if, if I were on steroids and I still looked like this, I would get my money back. Uh, my routine is generally, uh, basic barbell exercises, squats, deadlifts, bench press, overhead press, uh, depending on, uh, what I’ve done.

Uh, you know, deadlifts typically one set of five squats, either a three sets of three, three sets of five, uh, somewhere in there. Um, if, if I’m feeling real wiped out, I might do a lighter weight on squats with three sets of eight, uh, or three sets of 10, just to get some extra work in. If, if, uh, my legs are feeling particularly burned out, uh, overhead press, I do one day of, uh, 10 singles at a pretty heavy weight.

The last, the last one I had was, uh, 200. So, uh, for the 10 singles. And then, uh, I’ll also do a slightly lighter day on overhead press for, with three sets of five. Uh, also in the mix, I’ll, uh, usually make room for some rows, some, uh, chin-ups, uh, in there.

Uh, occasionally, uh, I guess RDLs. Those are, those are nice hamstring work. Uh, if you get them heavy enough. And, uh, yeah, that’s about it. I don’t, I don’t, I don’t do anything. Uh, I don’t have like a, a bodybuilding routine with lots of hypertrophy and whatnot mixed in.

Uh, to do. What are your less favorite things to see in a query plan? All right. Lest, lest we forget. Uh, I don’t know. Um, okay. Let’s think of some stuff in here. Uh, I hate when I see spools in a query plan, uh, especially eager index spools.

Um, uh, lazy table spools are also not, not something that I usually love seeing. Uh, I don’t like when I see really big lines going into a sort operator. That usually tells me we have, we have an issue.

Um, uh, let’s see. Uh, I dislike seeing a top above a scan. That’s another one that I don’t like to see. Uh, and I don’t like when I see, uh, the sort of constant scan, um, concatenate stuff, sort merge interval, and then a nested loops join.

Because that usually tells me that, um, we either have a join with an or clause or that we have, um, a mismatched data type. That quite frequently, if you have, like, let’s say, a date time, uh, uh, date time to column, and you have a date or a date time parameter, SQL Server has to do some extra work to, to work things out. So those are things that I typically dislike seeing.

Um, of course, you know, the usual stuff, um, unnecessary scans, right? Like you have a, you have like a scan where there’s a predicate on it that’s not a bitmap. You’re like, oh, maybe I should create an index there.

Sometimes you’ll get the helpful little green text, sometimes not. But, um, you know, those are things that I typically dislike seeing. If I have an actual execution plan, uh, that’s a little bit different because then I have operator times to sort of guide my tuning efforts in a, in a smarter way. Uh, estimated plans, you know, you can spot some red flags in those, but typically, um, you know, you’re like, eh, uh, it takes a little bit more effort to figure things out.

Um, table variables, uh, dislikes, uh, table variables and plans. One thing that I, uh, really dislike in, uh, seeing an execution plans is when like there are parameter values, uh, like as part of like the query text. But then you go and you look in the, the, like you get the properties of the root operator in the plan and there are no parameter values in there.

Cause that usually tells me, uh, unless there’s a recompile hint where the literal values get, where the parameter embedding optimization kicks in and you get literal values in the query plan. That usually tells me that someone is using local variables and I think, hmm, I shake the fist at you. Um, not necessarily because the local variables are causing a problem, but just because I can’t get the values for local variables.

Cause SQL Server does not sniff and use them, right? Like it does with a parameter. So those are things I dislike seeing. There’s probably others if I thought, uh, I forgot to ask how much protein you ate with steroids.

Uh, all right. So once again, I’m, I’m clearly not on steroids. Um, I, I, I, I, it’s not, not my thing.

Um, you know, life is hard enough without trend making me emotional. Uh, but as far as protein goes, um, I drink two protein shakes a day, uh, that are 50 grams of protein a piece, uh, unflavored Isopure, uh, powdered Pedialyte, some, uh, psyllium husk to, to keep things. Uh, orderly.

And, uh, I, I have the scoopable creatine. So I’ll do 10 grams of creatine a day and across the two shakes. Um, and then of course you must, you must eat real food. What lest you become a skeleton.

Uh, and I usually try to get around another hundred or so grams of protein from actual, uh, meat sources. So chicken, steak, stuff like that. Uh, not a big fan of pork.

Um, not a religious thing. Just, you know, it’s not, it’s not that it’s bad. It’s just, I, I like other stuff better. Uh, so I think, I think that about does it.

Interesting mix today. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video, uh, whenever that may be. Uh, I guess tomorrow.

All right. Thank you. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Why DISKANN Indexes Are So Slow To Create In SQL Server

Why DISKANN Indexes Are So Slow To Create In SQL Server


Video Summary

In this video, I delve into the intricate process of creating vector indexes in SQL Server, specifically focusing on the background tasks and query hints involved. By examining the detailed query plan generated during index creation, I uncover a series of unique trace flags, parallel plan preferences, and other hints that are not typically available to users. This exploration reveals how these elements contribute to the complexity—and current slowness—of vector index creation, while also hinting at potential optimizations for future versions of SQL Server. Through this analysis, viewers gain insight into the inner workings of SQL Server’s query optimizer and the advanced techniques used during index builds that could impact performance and usability in various scenarios.

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we are going to endeavor to answer an important question. Hopefully a temporarily important question. But why are disk ANN or disk ANN indexes so slow to build? These are, of course, the new vector search indexes that Microsoft has given us in SQL Server 2025. And of course, the usual spiel does apply here. This is, this is all pertinent to RC1. So release candidate one of SQL Server 2025. I believe this will be the last one until we get an RTM since in about a month, there will both Microsoft Ignite and past data summit will be going on. And usually that’s about when an RTM build drops. So things could change by then, but at least for the next three or so weeks, this is what happens in the background. Before we get into that, down in the video description, all sorts of helpful links. You can hire me for consulting by my training, support this channel. You can also ask me office hours questions. That is totally free. I answer five of those every Monday. And if you enjoy this content, you should probably like subscribe and tell a friend. May not want to pass this one on to anyone who works at Microsoft for now. Speaking of past data community summit, like I said, yeah, that’s got, that’s happening in beautiful Seattle, lovely this time of year. November 17th to 21st, I will be delivering two days of T-SQL pre-cons with Ms. Kendra Little. They are going to be the best T-SQL pre-cons that have ever taken place in all of human existence. So you have that to look forward to. Anyway, let’s go.

Look at this. Now I am, I am officially in two SSMS spawns territory with how busy I am with things, but that’s a, that’s a good thing, right? Busy is good. So I want to go over a couple of things and then show you the real, the real spicy stuff. So I am using, so start things way back. My, my dear friend, Joe Sack, my bosom buddy, Joe Sack has been doing a lot of work. And the, the AI, the AI, the AI space with things. He’s a, he’s a, he’s a bit more keen on it than I am. But, but he’s been messing around with, with the vector stuff in SQL Server. And so all credit to this database existing goes to Joe Sack. I’ll, I’ll have a link both to his, uh, substack where he, where he’s talking about all this interesting stuff. And, um, the GitHub repo where he keeps this database. If you want to play along, play a lot, play around or play along or whatever you want to do, whatever form of play you’re into. I’m not sure.

Uh, but Joe had a post, uh, actually yesterday, uh, was it? Well, I mean, yes, yesterday is, as I’m recording this, but you know, temporal elements are all, all screwy. When we start publishing things online, uh, about a query that, about like a top something query with vector search. It wasn’t slow. It was just, you had to ask for a much bigger top than you thought, um, in order to get back results.

Uh, but anyway, uh, I saw the query and I was like, wow, that’s messed up. Like there’s gotta be a way to fix that. And, um, it turns out there, there, there wasn’t cause of the way the vector search thing works, but, uh, it all sort of led to this. So, um, what I’m going to do is I’m going to, uh, so Joe has a table in his database called properties.

Uh, and I forgot an S there, didn’t I? There we go. All good. Brand new script, you know, not even saved with the file name yet. So we’re all, you know, a little, a little dicey there, but, uh, I’m just going to create a 20,000 row table based on Joe’s table that has stuff in it.

Right. So let’s drop and do that. Right. So 20,000 rows. And, um, now what I’m going to do is create the vector index. And one thing I want to point out here that’s sort of interesting is I have, uh, actual execution plans enabled.

So when I run these two queries, I of course get it in a, uh, a query plan for the, uh, creating the table and selecting data into the empty table, whatever, whatever select into really does. And then, uh, adding the, um, primary key to it. Right. So the first two things that I do there, cause you need a primary key if you want to create a vector index.

So now I’m going to create the vector index and this, keep in mind, this is 20,000 rows. And while this is not like the biggest beefiest VM that’s ever existed, it’s about 10 gigs of memory, the four cores, something like that. This is 20,000 rows. Right.

And then we’re waiting and we’re at 10. I mean, if you, if you look, uh, sorry, this elbow, you can see like 14, 15, 16, 20, 20,000 rows is pretty slow. So, and I did get to wondering why 20,000 rows was so slow and I’m going to show you in a moment what’s going on while a vector index builds.

But, uh, what I wanted to show you before that are maybe some other things that, um, you, you, you maybe won’t be so thrilled about and maybe something you will be so thrilled about when with vector indexes. So this all took about 30 seconds for 20,000 rows, which is not, not a good time, uh, to be honest with you. But of course, RC1, all this stuff gets better changes, you know, just currently things not so hot.

Um, so just like columnstore indexes, uh, when they first came out, adding a vector index to your table, uh, does not just make the, the column that’s indexed read only. It makes the entire table read only. So if like, if I try to update a column in the table, it’s called bedrooms, right? And I remember I created the index on this description vector column and, uh, I guess SSMS still hasn’t quite worked out all the like new stuff going on in there.

But, um, at least SSMS 21 has it, maybe SSMS 22 has, I should give that a shot someday. It makes the entire table read only, right? So data modification statement failed because tiny properties has a vector index on it. So you like the entire table is read only like you can, it’s not just like the vector index column, which is, I don’t know, maybe not great if you have a transactional system.

Uh, but two nice things that I do want to say about vector indexes is that they do not support rebuild or reorganize. So you cannot rebuild or reorganize a vector index. So I suppose Ola will have to get to work making sure to exclude vector indexes, at least as they currently live.

Um, but anyway, what I really wanted to show you today, what the, the real meat and potatoes stuff is what’s going on in the background while a vector index builds. Okay. So we’re going to start creating this and we’re going to run a little query over here that shows us what’s happening while a vector index creates.

Um, SP who is active strangely did not pick up on this, but, um, you know, whatever. So, uh, we get a query plan and some query text back. The query plan actually shows us a query plan for what’s going on.

Like when I created the thing over there, there was no query plan when I hit, when I said, Hey, like make this index. And all of a sudden the query plan was missing. So this is like the query plan.

And if we zoom around, we’ll see, uh, you know, uh, there’s tiny properties and here’s the vector index seek. We’re doing something over there and well, there’s tiny properties again. And then if we scroll over this way a little bit, we will have a clustered index seek into this graph edge table, 120, something or other.

And apparently we’re, uh, we’re missing some statistics on this table. I don’t know, Microsoft, you ought to create some statistics on that one. I think, but, uh, anyway, the query that’s running while all this happens is not a typical create index query.

No, no, no. It is this whole wacky wild adventure. So while that index creates, right.

I’m just going to format this stuff a little bit. So it fits on screen a little bit better. Um, these are some, I guess, parameters that get passed into whatever we’re doing. Um, we declare dop, right, which is an integer and equals four.

Um, I think that might be, um, I think that might be related to my system, particularly since I have four CPUs in there. And then we start running this query. So we have a, if random start ID equals zero, we do this, uh, and we get some stuff in here.

And this, this, to me, this is interesting. So the, the query that’s running in the background uses the enable parallel plan preference hint. This, this hint, uh, has up to now been completely unsupported, almost disavowed by Microsoft.

This does not show up in the valid use hints DMV. So maybe this is a sign that this hint is now safe for all of us to use in production. Right.

That’s exciting stuff, right? That’s big news there. Uh, and then we, uh, well, let’s see, we select a top one thing here that Microsoft might need to get itself a copy of SQL prompt because some of this formatting is an atrocity. Uh, I can, I can, I’m willing to share my license with you, Microsoft, if you’re, if you’re really feeling, I know there’s been a lot of layoffs lately.

So if you’re really feeling a budgetary pinch, we can, we can talk later. You just let me know. Uh, so we get the property ID and we’ve got a table sample here with repeatable one.

Hmm. All right. And then, uh, this is where I guess things start getting interesting. This is that graph edge table. Uh, I guess at least one of them that we saw in the query plan.

And we, uh, insert some stuff into there, right? Uh, we select some properties and, ooh, cast that as binary 192. Very exciting stuff.

And then we do this row per page and oh boy, we are, we are using the fizz lock in here. My goodness. Huh? There we are. We are fizz locking things. That’s, that’s a fun one.

All right. Uh, and then, uh, apparently down here. So like the, the query plan we saw was most likely the stuff that’s coming up down here because there is a bit of a loop or something that goes on, I think, or at least, uh, you know, in RC zero, there was a loop.

I forget what they do in this one. Maybe. Oh yeah. There’s a batch start and batch end. Anyway, uh, there is some interesting stuff that goes on in here. Um, uh, what was it? Uh, did it, did, uh, maybe it’s down here.

Uh, there’s a, uh, there’s a function that we can’t use is disc and prune, right? So this has got a whole within group thing. So this is some kind of like probably internal windowing function that, uh, that, that Microsoft has access to from like a system thing that we don’t have access to.

I haven’t tried any of that yet. Uh, and then down here we have a query with a bunch of hints on it. We go to merge union, very exciting stuff. We have our parallel plan preference hint again.

I don’t know. This is feeling very production ready to me now. And then we’ve got some trace flags. Now, some of these trace flags I am familiar with. So there’s stuff for like, uh, nested loops batch sort, disabling nested loops batch sort. There’s stuff for disabling spools.

There’s stuff for disabling nested loops pre fetch. Yeah. But this one’s 7454. I don’t, I don’t know what that one does. Be honest with you. I just don’t. Um, I don’t know every, every single trace flag, but then we exec sp vector index build update stats.

We update stats, uh, after doing that. And then, uh, we have a batch start where we select the max property ID and some batch size stuff in here, right? All good things.

Uh, and then, uh, I don’t know the sort of some of this stuff, uh, more batch size stuff. And then where was, there was something else. There was something else in here that was very interesting to me.

Um, uh, where was it? It’s not this thing. I mean, this is sort of the index creation stuff that goes on in the background, right? So like when you create a vector index, you tell SQL Server, whether you want it to be, uh, cosine or one of the other mathy things that they have, but then there are some options that you, at least currently, I don’t see a way to specify them, but, uh, like stuff like this, that shows up in the sys dot, uh, whatever vector indexes DMV there is, but it doesn’t show up as like a creation element right now.

But then actually this, this one was, uh, was interesting as well. So use hint, uh, disallow batch mode. Again, we have enable parallel plan preference, but then there were, there were some other funny ones and I don’t, I, I doubt these are going to be, um, available to us mere mortals, uh, out in SQL Server world, but parallel DML, so we can disable batch mode, enable parallel plan, enable parallel plan preference, parallel, enable parallel DML for vector index build and parallel DML using independent transactions.

So this is the, these ones I have not seen before. I don’t, I doubt these are valid use sense for us, but, uh, I might go and try those. And then we have, uh, the same query trace on stuff that we saw before. So I’m not going to explain that one again, but then we are optimizing for, uh, certain values for the batch start and end.

And, uh, then there was some other interesting stuff in here, uh, like this pack int function. Do we get to use that? I don’t know.

I, I, I, I, I’m, I’m just seeing all this. So, uh, I, I, I want to figure that out too. And, uh, good Lord. Come on, put cross apply in a new line. All right.

So then, uh, well, there’s unpack, there’s unpack int. So we get to pack int, pack an int and then unpack an int. That’s a fun, that’s a fun pairing of things. And down here there was a nut. Well, there was disk and prune, but then there was, oh, where was it?

Uh, I don’t know. This, this stuff, this stuff all gets rather difficult to read. But then, uh, you know, when, when people tell you to be careful with query hints, my Lord, Microsoft. My Lord.

Uh, we have some of the same things that we saw before with the, um, uh, disabling batch mode, enabling parallel plan preference. Uh, I apparently deleted something there that I shouldn’t have. There we go.

Line that back up. Uh, and then the parallel DML stuff that we saw before. And, and here’s a, here’s a fun one. Uh, Microsoft turns off eager index spools in this query. They’re like no eager index pools for you.

Select. So with sell to index on the fly, uh, that is the, that is the optimizer rule that would build an eager index pool in your, in your query plan. So Microsoft apparently anti eager index pool as well. So high five Microsoft.

We share, we share, we share a mutual loathing for the eager index pool. And then at the end, of course, we, uh, we, uh, we, uh, we sp vector index build update stats again.

So right now I would imagine that like, since this, this is all like very loopy stuff going on or like, um, I imagine that this, this kind of thing is why, um, building vector or disc and indexes is so slow currently.

Um, I would also imagine that this is why they are also currently, this also currently makes your whole table read only. Cause if you were to like run a command that changed stuff in the table and you had to run all this stuff over again to like figure out, like, like, like rebuild your vector index, um, or like update your vector index, like during like regular transactional processing, that would be kind of a nightmare.

So I understand why we are not, we do not currently have, um, writeable, uh, disc and indexes. Cause that wouldn’t be fun, but, uh, I really do hope that this stuff gets sorted out. Um, if not by RTM in three weeks, no pressure guys, uh, then hopefully in the near future, because, um, you know, uh, it would, it would be a terribly uncompetitive thing for a SQL Server to not have, uh, writeable, uh, vector indexes.

Um, and like, again, trade-offs and stuff, right? Like Oracle has writeable vector indexes. Postgres has them.

Elastic has them. I forget Mungo, but like lots of other database engines have writeable indexes. I don’t know what trade-offs they made in order to accomplish that. Perhaps Microsoft, there were, there were certain trade-offs Microsoft was not willing to make, uh, in doing that, or perhaps there are other limitations involved that, um, we, we, we, who do not have the source code are unaware of.

So anyway, if you, if you, if you’ve heard people talk about why, or talk about, uh, disc and indexes being slow to create, uh, hopefully this answers your question as to why they are slow to create.

Um, I think it’s, it might be worth, uh, looking at this. So this is the full properties table or a full or properties table. I, I, uh, there were some problems earlier.

Uh, I don’t want to talk about that, but, um, there are 400,000 rows in the pro, like I just reinserted some extra ones. The one that you download from Joel only have 100,000 in it.

So you have to beef the table up a little bit. But, um, when you create, when I tried to create a vector index on the full properties table, the, um, the query looked a little bit different. Um, I think it, I think it was doing more to batch it, but, um, it might not be terribly interesting to go find more of that in here at this point.

But I don’t know that, that looks kind of the same to me. But anyway, uh, at least now we know we have some answers as to, uh, why building the disc and index is not terribly fast as of RC1.

Okay. As of RC1 disclaimers and all that. Uh, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video where we will, um, I don’t know yet, but actually I don’t, I don’t have anything else planned.

So it’ll, it’ll be a surprise to you and me. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.