A Little About Parallel Insert Restrictions In SQL Server


Video Summary

In this video, I delve into the intricacies of fully parallel insert restrictions in SQL Server, particularly focusing on a lesser-known issue that can impede performance. I highlight common pitfalls such as non-inlineable scalar UDFs, triggers, indexed views, and the use of `SET ROWCOUNT`. I also discuss how columnstore indexes can affect parallel inserts due to row group compression requirements, and suggest strategies like using `ROW_NUMBER()` instead of identity or sequence columns for better performance. Throughout the video, I provide practical examples and tips on optimizing your insert operations, ensuring you get the most out of SQL Server’s powerful features.

Full Transcript

Erik Darling here with Darling Data. It just never gets better. Never gets worse either. In today’s video, we’re going to talk about fully parallel insert restrictions. There’s pretty good documentation in general around the stuff that will prevent you from getting a fully parallel insert plan.

But in this case, I’m going to talk about one that I haven’t seen written up anywhere, which I’ve seen bite a lot of people in the butt. Now, fully parallel inserts are very, very useful if you need to move a lot of data from one place to another, especially into temp tables because a lot of people use fairly big amounts of data in temp tables.

And it’s usually a good thing if you can do that insert across multiple threads because you can often reduce the total runtime of your insert query when it does that. So it’s a good thing to have.

It’s a little trickier when you start dealing with inserts into tables that have columnstore indexes on them, especially clustered columnstore indexes, because of the compression stuff. Now, for columnstore indexes, you need to have a certain number of rows get inserted for it to qualify for compression on insert.

Otherwise, it ends up in these uncompressed row groups. And so if your fully parallel insert splits the rows up too finely, if there’s too few rows per thread, each one of those threads will end up inserting into a new uncompressed row group.

And that can slow things down, actually. But for inserting into either a rowstore heap table, I’m not sure if there’s such a thing as a columnstore heap table. Probably not.

Or inserting into a rowstore clustered index. That will certainly be beneficial even with multiple threads because you don’t necessarily have to worry about any of the columnstore stuff. But before we do that, of course, we need to do my usual song and dance about me and you and our lives together moving forward.

If you like this content, because it’s free, you can continue to appreciate this free content by liking and commenting and subscribing. If you would like to support this channel and my endeavors to bring you high quality, in-depth SQL Server information, you can sign up for a very low cost membership.

There’s a link below in the video description. And if you can spare four bucks a month, you know, that’s cool too. You will pay for… Well, I don’t really go out for coffee.

Most of my coffee consumption is via my espresso machine. But if you want to help me pay off my espresso machine, you can give me four bucks a month. If you need help with your SQL Server, if you are having performance or health issues or anything else really, let me know.

My rates are reasonable. If you need some high quality, low cost SQL Server training, because you maybe don’t want to hire someone, you just want to learn and do it on your own. Maybe you just want to get better at SQL Server generally.

You can get all of my training, 24 hours of it. Beginner, intermediate, and expert level training for about $150 US dollars for life with that there discount code. There is a link, of course, in the show description, video description.

I’m not sure if this qualifies as a show, to be honest. If you would like to see me live and in person, you can do that in Seattle, Washington this November 4th and 5th. I will be there co-hosting two days, not one, but two full days of performance tuning madness with Kendra Little.

If there’s an event nearby you that you think I would be a good fit for, let me know what it is. If I can get a pre-con submission accepted, I’ll show up and do my song and dance just for you. I mean, if one person buys a ticket, I’m not sure that that would qualify.

Not just for you, but for hopefully a room full of people eager to learn more about SQL Server performance. And with that out of the way, let’s do some stuff. Let’s be festive. Let’s enjoy our, let’s enjoy ourselves what we can.

Now, a while back, there was a blog post about this published at that URL. That will also be in the video description. It’s not there yet because the video doesn’t exist.

But once it does, that link will be there. And there are some good things, there’s some good information in that blog post about what’s not allowed. So if you’re on 2016 or Azure SQL DB or Managed Instance, you need to be in compat level 130 or better in order for this to take effect.

Well, that’s partially true. You can also use the USE HINT ('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_130') query hint for this to work. There’s kind of a funny note after that that says: “In addition, it is recommended to use a suitable SKU from the premium service tier to ensure that the IO and CPU requirements of parallel insert are satisfied.”

Well, you know, if, you know, Microsoft, if I have to buy a premium service tier to make sure that a parallel insert is, you know, properly satisfied from a CPU and IO perspective, I think we need to talk a little bit about how you’re doing these service tiers. I think you might be taking us for a ride a little bit.

Hmm? I think maybe this is kind of crappy rent seeking behavior on your part. You know? Just saying.

So there’s some other more common things that will prevent a parallel insert, like the usage of any non-inlineable scalar UDFs. So that would be a problem generally. If there are triggers on the table, or indexed views reference the table, then the parallel insert goes away, because that would really make life difficult for both triggers and indexed views.

I can only imagine the insanity and deadlocking that might occur. SET ROWCOUNT. If you’re still using SET ROWCOUNT, God bless.

It’s time to join the future. And by the future, I mean like 2012 or something. Don’t use SET ROWCOUNT.
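For reference, here’s a sketch of the old pattern next to its modern replacement; the table name is hypothetical:

```sql
-- Old style: SET ROWCOUNT caps rows affected, but it has been
-- deprecated for INSERT/UPDATE/DELETE and gets in the way of
-- parallel plans.
SET ROWCOUNT 1000;
DELETE dbo.SomeStagingTable; -- stops after 1000 rows
SET ROWCOUNT 0; -- always reset it, or later statements get capped too

-- Modern equivalent: TOP on the modification statement itself.
DELETE TOP (1000)
FROM dbo.SomeStagingTable;
```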

Goofy. The OUTPUT clause restriction only affects the table that is being output to. Now, this is, I mean, well, it’s not exactly a departure. So with OUTPUT, you do have to be careful, because if you output to the client, right?

So like if you just put an OUTPUT clause in your modification query, INSERT, UPDATE, DELETE, or MERGE, and you just return that data out to SQL Server Management Studio, then that will force the entire plan to be single threaded. Ditto if you output to a table variable, because outputting to a table variable has the same restrictions that modifying a table variable does in any other way: modifications to a table variable force the whole query plan to run on a single thread.

They disallow parallel plans. In this case, though, outputting to another table, which is obviously not SSMS because it’s a table, and not a table variable, just means that the insert from the OUTPUT portion will be single threaded, but the insert select into the normal table will be parallel.
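As a sketch of the three OUTPUT situations described above, with table names hypothetical:

```sql
-- OUTPUT straight to the client (SSMS): forces the whole plan serial.
INSERT dbo.Target WITH (TABLOCK) (Id, Thing)
OUTPUT Inserted.Id -- returned to the client
SELECT s.Id, s.Thing
FROM dbo.Source AS s;

-- OUTPUT INTO a table variable: any table variable modification
-- forces the entire plan to run single threaded.

-- OUTPUT INTO a real table: only the output insert is serial;
-- the main insert select can still run in a parallel zone.
INSERT dbo.Target WITH (TABLOCK) (Id, Thing)
OUTPUT Inserted.Id INTO dbo.Audit (Id)
SELECT s.Id, s.Thing
FROM dbo.Source AS s;
```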

Parallel insert is only used when inserting into a heap without any additional nonclustered indexes. It is also used when inserting into a columnstore index. So that’s fun there.

You know, I could have sworn that I’ve seen it for empty rowstore clustered indexes, but, you know, I could be goofy on that. In any case, if you are the type of person who thinks about indexing temp tables, and I like that type of person, it might be a good idea to do your insert first and then create your clustered index later.

Sometimes the timing difference is worth it. So, often, if you do the insert with an empty clustered index on there and you don’t get the fully parallel insert, you might want to do the insert, then create the clustered index, because depending on how long the clustered index takes to create, you might see an improvement there.
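A sketch of that pattern, with the temp table shape assumed:

```sql
-- Heap first: the insert can go fully parallel.
CREATE TABLE #t (Id int NOT NULL, Score int NOT NULL);

INSERT #t WITH (TABLOCK) (Id, Score)
SELECT u.Id, u.Reputation
FROM dbo.Users AS u;

-- Index afterwards. Index creation can itself run in parallel,
-- so insert-then-index is often faster overall than a serial
-- insert into a pre-indexed table.
CREATE CLUSTERED INDEX c ON #t (Id);
```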

And also, watch out when identity or sequence is present, because either one of those will mess things up. What I greatly prefer for this, rather than using identity or sequence, and this is a neat trick that I picked up sort of along the way, is to use ROW_NUMBER instead. Right?

So, if you are inserting into a table that has an identity or a sequence on it and you want to keep counting up from where it left off, you do need to sort out what the current max value is before you do the ROW_NUMBER thing, because you have to add that value to the row number.

If you just want to dump a whole bunch of data into a table and you want to have just a surrogate primary key on it, like you’re going to loop over stuff in the table, or you just want it there for whatever weird, safe reason you find helps you sleep better at night, use ROW_NUMBER instead of an identity or sequence on there, because you’ll get better performance on the insert.
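A sketch of the trick, with table and column names assumed:

```sql
-- Instead of an identity column (which disqualifies the parallel
-- insert), generate the surrogate key at insert time:
SELECT
    n = ROW_NUMBER() OVER (ORDER BY u.Id),
    u.Id,
    u.DisplayName
INTO #u
FROM dbo.Users AS u;

-- Topping up a table that already has rows: grab the current
-- max first, and add it to the row number.
DECLARE @maxn bigint = (SELECT ISNULL(MAX(n), 0) FROM #u);

INSERT #u WITH (TABLOCK) (n, Id, DisplayName)
SELECT
    @maxn + ROW_NUMBER() OVER (ORDER BY u.Id),
    u.Id,
    u.DisplayName
FROM dbo.Users AS u;
```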

So, let’s talk about one thing that’s not in this list, because this is an important thing. So, let’s pretend that we have a really important process that wants to do a parallel insert and we create a table called parallel insert. Again, it is an empty heap table.

There is nothing, no indexing going on here. And if we look at the query plan for this, now I just have the ENABLE_PARALLEL_PLAN_PREFERENCE hint on here to make things easy. We’re going to insert into that table with a TABLOCK hint because, you know, that’s always a good idea to make sure you get that fully parallel insert.
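A minimal sketch of the setup being described; the ENABLE_PARALLEL_PLAN_PREFERENCE hint is undocumented, and the table shape here is assumed:

```sql
-- An empty heap table, no indexes at all.
CREATE TABLE dbo.parallel_insert (Id int NOT NULL);

-- TABLOCK on the target, plus a hint to encourage parallelism.
INSERT dbo.parallel_insert WITH (TABLOCK) (Id)
SELECT u.Id
FROM dbo.Users AS u
OPTION (USE HINT ('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```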

SELECT INTO will almost always get you that in compat level 130 or better, as long as the plan qualifies. But the TABLOCK hint certainly helps urge things along. If we look at the estimated plan for this, we’ll see that the insert is before the gather streams, right?

So, we have our insert right here and then our gather streams right here. So, the insert happens within the parallel zone. The thing that’s not documented is if you do something like this.

So, let’s say you do various passes on a table or something, and you reference your target table somewhere else in the query. Like, you say AND NOT EXISTS, because you don’t want to put stuff in there that already exists. Even though I haven’t put any data in the parallel insert table, so there should be nothing.
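A sketch of the self-referencing pattern being described, with the shape assumed:

```sql
-- Referencing the insert target elsewhere in the same query
-- pushes the insert outside of the parallel zone:
INSERT dbo.parallel_insert WITH (TABLOCK) (Id)
SELECT u.Id
FROM dbo.Users AS u
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.parallel_insert AS p -- same table as the target
    WHERE p.Id = u.Id
)
OPTION (USE HINT ('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```

The workaround described later is to stage the existing keys in a separate table and point the NOT EXISTS at that instead.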

There’s really no not exists check to happen. But if you look at this, you’ll see that now the insert is outside of the parallel zone.

Now there’s a gather streams here and a table insert here. So, this would certainly prevent that from happening. So, a lot of the times when I see processes like this running, and I want to see if the parallel insert would be useful, I have to put data into yet another table and reference that table in the NOT EXISTS, in order to make sure that we aren’t getting screwed up there.

So, just a couple quick notes on parallel insert stuff there. You know, not a whole lot to say beyond that. I hope you enjoyed yourselves. I hope you learned something.

I hope that you will continue to watch this channel and keep me and the bats company over here, because I don’t know what we’d do without you. We would get very lonely just talking to each other all day, wouldn’t we, bats? Yes, we would!

Mwah! Love you, bats! Anyway, thank you for watching. Goodbye. Bye!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Another Trick For Working Around Scalar UDF Inlining Restrictions In SQL Server


Video Summary

In this video, I dive into another fascinating UDF inlining restriction trick, this time focusing on scalar UDFs and their limitations when it comes to string aggregation functions like `STRING_AGG`. I explain how Microsoft introduced the SQL Server 2019 feature of UDF inlining, which aims to improve performance by treating certain scalar UDFs as inline code. However, there are restrictions—specifically with XML methods and the `STRING_AGG` function—that can prevent this optimization from occurring. To address these limitations, I demonstrate how you can replicate `STRING_AGG` behavior using XML without breaking scalar UDF inlining. The video also covers practical examples of rewriting scalar UDFs as inline table-valued functions to maintain performance benefits while avoiding the restrictions. Whether you’re looking for tips on optimizing your SQL Server queries or just curious about the intricacies of UDF inlining, this video has something valuable to offer.

Full Transcript

Erik Darling here with Darling Data, doing some Darling Data stuff. Boy, oh boy. My arm’s tired. In this video, we’re going to be talking about another UDF inlining restriction trick. I think I have a few other videos about how Microsoft, around SQL Server 2019, introduced this new query processor feature called scalar UDF inlining, where it would attempt to take some of your scalar UDFs and basically treat them as inline versions of the code, reducing some of the performance problems that you’d see there. Again, typical performance problems with scalar UDFs: they disallow parallel execution plans for the query that calls them, and they don’t run once per query. They run once per row that the function has to process, which could be way more or way less than you’re thinking, based on where the function is in the query. If it’s in the select list, it’s typically the number of rows returned by the query. If it’s somewhere weird, like in a join or where clause, you could have to process way, way more rows to produce a result and then compare it to either another column in the join or a predicate in the where clause. So before we get into all that, though, let’s talk about you and me. Let’s talk about getting serious here. No, not actually serious. Well, the things we talk about are serious. We just don’t talk about them

in an overly serious manner because you would get sick of me real quick if I was, if I was a bore. So if you would like to say thank you in some small way for producing all of this content, you can subscribe to the channel with a membership for $4. There is a link to do so in the video description. If you can’t do that for whatever reason, because everyone has reasons, you can like, you can comment, you can subscribe, and, you know, you can keep me company in between, in between recording takes. You can, you can, make me a happy fella. If you need help with SQL Server, I’m really good at all this stuff and my rates are reasonable. If you need help with something else involving SQL Server, let me know what it is. My rates will remain reasonable. If you would like some very high quality, very low cost SQL Server training to the tune of 24 hours of it that will last you for the rest of your life without the need to renew a subscription, you can get all of that stuff from me for about $150.

Again, there is a link for that in the video description. It’s a good, good way to spend your money. If you would like to see me live and in person, right now the only thing on my calendar is PASS Data Summit in Seattle, Washington, where I will be co-hosting two days of performance tuning excellence with her excellency, Ms. Kendra Little. So you should come to that. If you can’t make it to that, but you go to other events and you think, boy, it’d be great if Erik Darling came to these other events, you can tell me what they are and I’ll show up. This is assuming they need a pre-con speaker, because that’s how I pay for some of my flight and hotel to get there. Independent consulting means I don’t have a company paying for it; well, the company would be me. So we make the trade off.

Anyway, let’s talk about scalar UDF inlining and all the many restrictions that it has. So the two things that I want to highlight from the scalar UDF inlining docs are here, “UDF does not reference XML methods,” and here, “UDF does not reference the STRING_AGG function.”

There’s another XML thing down here about namespaces, but we don’t care so much about that one. I am always going to laugh at this one though, because CTEs, such garbage. They’re so bad they had to get disallowed from UDF inlining.

Who would have thunk it? Anyway, what we have here is an example of a scalar UDF that references the STRING_AGG function. Right. And the only reason that I have this one is because I need to show you that you can replicate the STRING_AGG behavior using XML without breaking scalar UDF inlining.
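As a sketch, assuming the Stack Overflow demo database schema that usually features in these demos, the non-inlineable function might look something like this:

```sql
-- STRING_AGG inside a scalar UDF disqualifies it from inlining.
CREATE OR ALTER FUNCTION dbo.BadgeList_StringAgg (@UserId int)
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    RETURN
    (
        SELECT STRING_AGG(b.Name, N', ')
        FROM dbo.Badges AS b
        WHERE b.UserId = @UserId
    );
END;
```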

Now, usually when I’m writing XML, like string concatenation stuff, I’ll do something like this. Because this gives me a lot of type safety and, you know, just gives me pretty much the simplest version of an XML query plan that you can get. The problem is that using this is the type of XML method thing that scalar UDF inlining gets mad about.

So this .value() clause, right? The .value() thing in there messes that up. So since we can’t use that, we have to change our XML query a little bit. We can’t use FOR XML PATH with the TYPE directive, and we can’t use the .value() thing to produce a scalar value from the XML.

So we have to do these two things like this. I’m just going to make sure that both of these functions are created the way that I intended them to be, since I am the god of the SQL Server. That’s just what we have to deal with.
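A sketch of the XML-based replacement, avoiding both the TYPE directive and .value(); names assumed again:

```sql
-- FOR XML PATH without TYPE or .value() keeps the UDF inlineable.
CREATE OR ALTER FUNCTION dbo.BadgeList_Xml (@UserId int)
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    RETURN
        STUFF
        (
            (
                SELECT N', ' + b.Name
                FROM dbo.Badges AS b
                WHERE b.UserId = @UserId
                FOR XML PATH(N'') -- no TYPE directive here
            ),
            1, 2, N'' -- strip the leading ", "
        );
END;
```

One caveat with this shape: without the TYPE directive, characters like & and < in the data stay XML-escaped, so it is only a drop-in replacement when the strings do not contain them.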

Now, the first thing that I’m going to show you is something that I feel like I have to caution everyone about when they first start rewriting scalar UDFs as inline table valued functions. And that is that you really have to mind your indexes when you do this. If you haven’t minded your indexes when you do this, you will end up with awful query plans.

So what I’m going to do is I already have compat level set to 150. So if I run these two queries, we’re going to see some slight differences in here. The one down the bottom is the scalar UDF version, where we see the query that calls the scalar UDF, right, which is this thing right here.

And this thing is going to have all of the standard scalar UDF problems. If you look at the properties, we’ll see a non-parallel plan reason. And because I’m on SQL Server 2022, I have the TSQLUserDefinedFunctionsNotParallelizable reason.

If you’re on an earlier version or a non-cloud version of SQL Server, you’ll just see CouldNotGenerateValidParallelPlan. We’ll also have this compute scalar. And the compute scalar will be responsible for calling the scalar UDF over here.

And this will, again, like run once per row that the query returns, right? So good, good, good. We have all that figured out.

Since we’re selecting 5,000 rows out, that function will have run 5,000 times. For the first query plan, we see this big parallel view, right? Look at all this stuff it’s doing with all this parallelism.

Isn’t that wonderful? Isn’t that great? We just parallelize the whole thing. It’s fast. It’s wonderful. Ooh, blah, blah. Really put those CPUs to work. The problem is that we don’t have a good index on the badges table.

So we’re going to end up with this eager index spool over here. And this eager index spool is going to slow things down. If you are replacing T-SQL scalar UDFs with inline table valued functions, T-SQL inline table valued functions, you really need to pay attention to the query plans. Because if your query slows down even more so than it did with the scalar UDF in there, this is the kind of query plan pattern that you have to watch out for.

In our case, it’s pretty simple to fix. All we need to do is add an index to the badges table, and both of those will go away. Or rather, that eager index spool will go away.
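The fix being described might look like this; the index name and shape are assumed:

```sql
-- A narrow index keyed on the function's predicate column removes
-- the eager index spool the optimizer was building on the fly.
CREATE INDEX b
ON dbo.Badges (UserId)
INCLUDE (Name);
```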

So if we look at these two plans now, just the estimated ones, we’ll see the eager index spool is gone from here. All right. So now, how does this work?

How much better is one than the other? Well, we’re going to let these run. And we’re going to hang out and do some stuff. And we’re going to have, I mean, the results are correct for both.

I’ve, you know, I’ve validated that the results are correct for both of these. But if we look at the query plans, our query plan that calls the inline table valued function is at about 1.6 seconds. And our query that calls the non-inlineable STRING_AGG version of the function takes almost 7 seconds.

And if you pay careful attention to some of the operator times in here, you’ll see about 3.6 seconds here. And then a little time here and a little time here and a little time here. And then the rest of the time.

So 6.7 minus 3.6 is about 3 seconds even. So we spend about 3 seconds just in the scalar UDF right here. Because remember, this compute scalar is where the scalar UDF gets invoked.

All right. So this is where, oops, I forgot to expand that. This is where STRING_AGG gets called in here. And this is where things get messy for our query.

This is what causes the majority of our query problems. So if you are writing or rewriting T-SQL scalar UDFs, and the purpose of one was to do some string aggregation stuff and come up with some sort of delimited list of text, and you’re really excited about using maybe the new STRING_AGG function and maybe some of the new enhancements to it, like ordering within groups and stuff, it’s going to suck for you.

You can’t do it in a function and still get UDF inlining. You could, of course, rewrite it as an inline table-valued function, if you felt like it.

It really depends on what you want to do. But if you have just a really big, complicated function, and you don’t want to, or rather you’re unable to, rewrite it as an inline table-valued function, having it remain a scalar UDF can allow you to use the XML stuff without the .value() method or anything like that, and still get good results.

So STRING_AGG in a scalar UDF prevents inlining, the XML stuff does not, and that can really mean the difference between you having good performance and bad performance. So if you asked me what I would do, I would probably just rewrite it as an inline table-valued function, where there are really no restrictions on anything, but you just might not be in a position where you can do that easily, or do that without changing a whole bunch of code.

Part of the beauty of the scalar UDF inlining feature is that you don’t have to change the way that your query gets called in the select list like you would if you rewrote it as an inline table-valued function.

So there are some upsides to remaining on a T-SQL scalar UDF, but having it be eligible for inlining, rather than just doing the full rewrite to an inline table-valued function.

So anyway, with that out of the way, me and Bats are signing off for the night. Thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that you will continue to watch these videos and all that good stuff. So anyway, I’m out of here. Really, I’m leaving.

Get, what is that, Ferris Bueller? Get, go. Go.


A Little About Function Rewrites In SQL Server


Video Summary

In this video, I delve into the fascinating world of function rewrites in SQL Server, specifically focusing on scalar UDFs and inline table-valued functions. I share my experience from years of performance tuning work, highlighting common mistakes and providing practical advice to avoid them. By comparing the use of `CROSS APPLY` versus `OUTER APPLY`, and ultimately demonstrating why placing the function call directly in the select list can yield better performance, I aim to help you optimize your queries effectively. Whether you’re a seasoned SQL Server professional or just starting out, this video offers valuable insights that could save you time and headaches down the line.

Full Transcript

Erik Darling here with Darling Data, and in this video we are going to be talking about one of my favorite subjects, because it’s one that people run into a lot of weird issues with, and that is function rewrites. If you’ve been working with SQL Server long enough, especially doing performance tuning work, you have no doubt taken a scalar UDF and rewritten it as an inline table valued function. What you did next, though, may have been either great or a disaster. It all depends on how smart you are. You might be smarter than me and not make this mistake. I learned this mistake by making it a long time ago. You know. You gotta learn somehow. So, before we talk about that, if you like this channel and you would like to say thank you to me for spending countless hours producing this content for free, you can give me four bucks a month. There’s a link to do that in the video description. If four bucks a month is just a bridge too far for your financial situation, liking and commenting and subscribing is equally as heartwarming to me, because I don’t get out much and talk to people. So, talking to people in YouTube comments is, you know, well, not a full-fledged substitute for a social life. It’s better than nothing. Eh?

Better than nothing. Darling Data. Better than nothing. If you need help with SQL Server, and you would like something better than the nothing you’re getting right now, my rates are reasonable, and I do all this stuff really well. So, you should hire me to do that. Buy my time. If you would like to purchase some time that I’ve already spent, you can get 24 hours of high-quality, low-cost SQL Server content for the rest of your life. You know, I don’t know how long your life is going to be, but you can figure that out with me, together. You can get it for about $150 US. If you would like to see me live and in person, doing my song and dance, you can catch Kendra Little and me, November 4th and 5th in Seattle, Washington at PASS Data Summit, delivering two wild stallion days of query tuning.

So, that’ll be fun. And aside from that, we are ready to begin our query tuning fiesta for this video. So, all right, let’s get down to it. Now, I’ve got some indexes in place. They’re good enough. Now, let’s pretend that this function used to be a scalar UDF, right? It’ll grab a start date thing, it’ll grab some information from the users table for a specific user, it’ll find their total posts, and then it will calculate their average posts per day for the span of time that we care about here. Right? So, cool. That’s what the function does. You can think what you want of its utility. That’s not really the point. The point is that when it used to be a scalar UDF, we could call it right in the select list like this.
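As a sketch, assuming the Stack Overflow demo database schema, the original scalar UDF and its select list call might have looked something like this:

```sql
-- Hypothetical reconstruction of the function being described.
CREATE OR ALTER FUNCTION dbo.AveragePostsPerDay (@UserId int)
RETURNS decimal(18, 2)
AS
BEGIN
    DECLARE @StartDate datetime =
        (SELECT u.CreationDate FROM dbo.Users AS u WHERE u.Id = @UserId);

    DECLARE @TotalPosts bigint =
        (SELECT COUNT_BIG(*) FROM dbo.Posts AS p WHERE p.OwnerUserId = @UserId);

    RETURN @TotalPosts * 1.0 /
           NULLIF(DATEDIFF(DAY, @StartDate, GETDATE()), 0);
END;
GO

-- Called right in the select list, once per row returned:
SELECT u.DisplayName, dbo.AveragePostsPerDay(u.Id)
FROM dbo.Users AS u
WHERE u.Reputation > 100000;
```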

Right? Average posts per day, stick a user ID in there, and we were good. Right? It would just, I mean, we weren’t good. We were slow. You know, that scalar UDF would force our query to not be able to use a parallel execution plan. Parallelism would be disallowed because of it. And that function would be called once per row, not once per query. So that would put us in a bad way. The thing is, whenever you rewrite a scalar UDF as an inline table valued function, the first thing that occurs to a lot of people to do with it is to CROSS APPLY it to their data.

The thing is, that’s not logically correct. CROSS APPLY is immediately out, because CROSS APPLY will restrict rows if we don’t find matches in there. Right? And we don’t want that. That’s not going to be the case for this query based on the logic of it, because we’re looking for high reputation people, and we’re not going to find anyone who has a high reputation that doesn’t have the average posts per day thing that we’re looking at.

So that’s kind of out here, but in general, you can’t safely do that, because you might restrict rows using CROSS APPLY that you’re not allowed to. So logically, if you want to bake this into a query, the only thing that you can do is use OUTER APPLY. The thing with using OUTER APPLY is that SQL Server is free to take that outer application and make it part of join reordering and all that other kind of crazy kooky stuff that it does.

And it could put that OUTER APPLY in a really bad, bad spot in the query plan. That doesn’t happen here, because my indexes are very good. SQL Server pretty much has one choice, but that’s okay.

I’m not going to show you any tremendous plan differences, but even without any tremendous plan differences, there is a bit of a performance difference. Where you want to put things is in the select list. So even though it’s not quite as intuitive as sticking just the name of the scalar UDF in the select list, like we were able to up here, you can put the inline table valued function in the select list.

Just like that, as a kind of funny looking little subquery type thing. And this will get you the logical correctness of the query, by not restricting any rows. And the nice thing about queries in the select list is that SQL Server does tend to save them for later.

And by save them for later, I mean, after you’ve done all your joins and your where clause filtering and all the stuff that you relationally do with your data to restrict rows and return a result, SQL Server usually saves little scalar subqueries like this in the select list for the minimum number of rows that will be projected out of the query through the select list. So you can actually end up with things being a little bit quicker.
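As a sketch, assuming the function has been rewritten as an inline table valued function returning a single column, the two placements compare like this (names assumed):

```sql
-- OUTER APPLY version: logically correct, but the optimizer is
-- free to reorder the apply into a bad spot in the plan.
SELECT u.DisplayName, a.avg_posts_per_day
FROM dbo.Users AS u
OUTER APPLY dbo.AveragePostsPerDay(u.Id) AS a
WHERE u.Reputation > 100000;

-- Select list version: also logically correct, and the scalar
-- subquery is typically evaluated after joins and filtering,
-- against the minimum number of rows projected out.
SELECT
    u.DisplayName,
    avg_posts_per_day =
        (SELECT a.avg_posts_per_day
         FROM dbo.AveragePostsPerDay(u.Id) AS a)
FROM dbo.Users AS u
WHERE u.Reputation > 100000;
```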

And that does end up being the case here, not by a ton, but you know, by just enough for me to say, Hey, it’s an okay demo that shows how this thing can be better. So in the outer apply version of the query, this thing takes 6.2 seconds. And in the, uh, sub query in the select list version of the query, it takes about five seconds.

So we save about 1.2 seconds, just by putting the function in the right part of the query. Now, I don’t understand why that’s not terribly intuitive to everyone. I think some people may be biased against subqueries in the select list.

Probably they got some bad advice from someone on LinkedIn about that. Like, never use subqueries, always use CTEs. Everyone on LinkedIn who is a data engineer and posts stuff like that should probably quit LinkedIn, quit data engineering.

There are plenty of good trade jobs out there that you could get where you could actually... no, I don’t think I want them in the trades. They’d probably get someone killed. I don’t know.

I haven’t quite figured that out. Maybe join your local pickleball team. Maybe that’s a safer outcome for you, because the amount of just hilariously bad advice that these people dish out on LinkedIn is something I sometimes can’t comprehend. Sometimes the LinkedIn tab closes as quickly as it opens.

But anyway, for some reason it’s not terribly intuitive, or some people have a bias against putting subqueries in a select list. But even though using apply in this case would have been logically correct, because the outer apply won’t restrict any rows where cross apply would have, SQL Server still comes up with a slightly better, faster query plan with the subquery in the select list rather than outer applied to the query.
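For concreteness, here’s a rough sketch of the two shapes he’s comparing, using a hypothetical inline table valued function over the Stack Overflow database (the function, table, and column names are my stand-ins, not from the demo):

```sql
/* Hypothetical inline TVF, standing in for a rewritten scalar UDF */
CREATE OR ALTER FUNCTION dbo.TotalScore (@UserId integer)
RETURNS TABLE
AS
RETURN
SELECT TotalScore = SUM(p.Score)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @UserId;
GO

/* Outer apply: logically correct, but the apply can land
   in an awkward spot in the query plan */
SELECT u.DisplayName, ts.TotalScore
FROM dbo.Users AS u
OUTER APPLY dbo.TotalScore(u.Id) AS ts;

/* Subquery in the select list: the function call stays right
   where the scalar UDF used to be, and tends to get evaluated
   after joins and filtering have restricted the rows */
SELECT u.DisplayName,
       TotalScore = (SELECT ts.TotalScore FROM dbo.TotalScore(u.Id) AS ts)
FROM dbo.Users AS u;
```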

So whenever you’re rewriting scalar UDFs as inline table valued functions, which you should be doing, do try to keep the function call in the same place that you found it. You just have to do some slight manipulations to the query syntax in order to get it working. So in this case, it was just putting the select in here like this, which is exactly where it got called as a scalar UDF.

It was just a little simpler to do because we don’t need to treat it like a select; this inline table valued function just has to be called a little bit differently. So anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope you see that I only have one tab left after this and then I get to go the hell to bed. Perhaps my words earlier were just because I’m a little bit cranky, but that’s okay. You’ll love me when I’m cranky.

You can’t handle me at my well-rested self. You don’t deserve me at my unrested self. I think that’s, that’s what, that’s what they say, right?

Okay. Anyway, thank you for watching. Thank you, thank you, thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Join Me And @Kendra_Little At @PASSDataSummit For 2 Days Of SQL Server Performance Tuning Precons!

Last Year


Kendra and I both taught solo precons, and got to talking about how much easier it is to manage large crowds when you have a little helper with you, and decided to submit two precons this year that we’d co-present.

Amazingly, they both got accepted. Cheers and applause. So this year, we’ll be double-teaming Monday and Tuesday with a couple pretty cool precons.

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

Here are the details!

Day One: A Practical Guide to Performance Tuning Internals


Whether you’re aiming to be the next great query tuning wizard or you simply need to tackle tough business problems at work, you need to understand what makes a workload run fast, and especially what makes it run slowly.

Erik Darling and Kendra Little will show you the practical way forward, and will introduce you to the internal subsystems of SQL Server with a practical guide to their capabilities, weaknesses, and most importantly what you need to know to troubleshoot them as a developer or DBA.

They’ll teach you how to use your understanding of the database engine, the storage engine, and the query optimizer to analyze problems and identify what is a nothingburger best practice and what changes will pay off with measurable improvements.

With a blend of bad jokes, expertise, and proven strategies, Erik and Kendra will set you up with practical skills and a clear understanding of how to apply these lessons to see immediate improvements in your own environments.

Day Two: Query Quest: Conquer SQL Server Performance Monsters


Picture this: a day crammed with fun, fascinating demonstrations for SQL Server and Azure SQL.

This isn’t your typical training day; this session follows the mantra of “learning by doing,” with a good dose of the unexpected. Think of this as a SQL Server video game, where Erik Darling and Kendra Little guide you through levels of weird query monsters and performance tuning obstacles.

By the time we reach the final boss, you’ll have developed an appetite for exploring the unknown and leveled up your confidence to tackle even the most daunting of database dilemmas.

It’s SQL Server, but not as you know it—more fun, more fascinating, and more scalable than you thought possible.

Going Further


We’re both really excited to deliver these, and have BIG PLANS to have these sessions build on each other so folks who attend both days have a real sense of continuity.

Of course, you’re welcome to pick and choose, but who’d wanna miss out on either of these with accolades like this?

pretty, pretty, pretty, pretty good

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

See you there!


Diagnosing and Fixing tempdb Contention from Spills in SQL Server

Diagnosing and Fixing tempdb Contention from Spills in SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the fascinating world of TempDB contention caused by spills in SQL Server. Joining forces with my buddy Erik Darling from Darling Data, we explore how these spills can significantly impact performance and cause TempDB to become a bottleneck. We demonstrate this through an engaging example using SQL Server Management Studio and SQL Query Stress, revealing the ins and outs of how spills affect query execution times and CPU usage. By the end of the video, you’ll understand the importance of indexing and sort order alignment to mitigate these issues, ensuring smoother database performance in highly concurrent environments.

Full Transcript

Your best friend in the world, Erik Darling here with Darling Data. And in today’s video, we are going to talk about how spills can cause tempdb contention. Isn’t that just miraculous? We can just keep finding new ways to abuse tempdb. If table variables and temp tables and, you know, all the other stuff that hits tempdb wasn’t enough, now we have to worry about spills causing tempdb contention. And, boy, does that suck. But first, let’s talk about you and me. If you have already done this stuff, if you have already liked, if you have already commented, if you have already subscribed, and I do thank you from the deep, deep, dark bottom of my heart for doing that, and you want to go steady with me, you want to take things to the next level, you can click the link in the video description just below to sign up for a low-cost $4 a month membership to the channel, which will just say thank you for doing a good job. That would be cool. If you need help with your SQL Server, and you think, boy, Erik Darling sure would be useful doing any of these things, hit me up. My rates are reasonable.

If you need me to do anything else with SQL Server, the same thing applies there. We can talk. My rates will remain reasonable just for you. Just make sure that you said, make sure when you hit me up, you say, hey, I’m coming from your YouTube channel where you said your rates are reasonable. Otherwise, I’ll have no idea who you are, and I might say something unreasonable. If you would like to get some high-quality, low-cost SQL Server training that is good for the rest of your life, so the longer you live, the more it’s worth, you can get about 24 hours of it for about $150 at my site there.

You can use the discount code SPRINGCLEANING for that. If you want to catch me live and in person, I will be in Seattle, Washington with Kendra Little doing two days of SQL Server performance tuning witchcraft, wizardry, warlockery. And it would be cool to see you there because it’s my birthday and I’m making goodie bags, so you have that to look forward to.

With that out of the way, let us begin our SQL Server partying. Now, let’s go over to SQL Server Management Studio and let’s make sure we have no indexes here. The first thing I want to point out is that if we run this stored procedure on its own, it’ll run for about 1.3 seconds, right?

We spend about 200 milliseconds scanning the clustered index and then another second or so over here in the sort, right? Okay, so this runs pretty quick right now by itself with nothing else going on. Let’s come over to SQL Query Stress and let’s run this.

And we’re going to give this a few seconds to warm up. What we’re going to see while this starts warming up and doing stuff is that the queries start running for longer and longer over here, right? No longer are we at just the 1.3 second mark.

Some of these have been going for almost 4.5 seconds. What you’re going to see over in this column is a lot of stuff being runnable, right? So runnable is going to be SOS_SCHEDULER_YIELD related.

And if we keep looking over here and running this, all of a sudden we’re going to start seeing queries taking a lot longer over here. Now, classic tempdb contention signs would be seeing stuff in the wait info column like PAGELATCH_UP or PAGELATCH_EX. And sometimes you might see that in here if you have enough of the spill contention going on.
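As an aside, if you do suspect the classic allocation-page flavor of tempdb contention, one rough way to check (assuming you have VIEW SERVER STATE permission) is to look for PAGELATCH waits on resources in database id 2:

```sql
/* PFS, GAM, and SGAM pages in tempdb show up as resources like
   2:1:1, 2:1:2, and 2:1:3 in resource_description */
SELECT owt.session_id,
       owt.wait_type,
       owt.wait_duration_ms,
       owt.resource_description
FROM sys.dm_os_waiting_tasks AS owt
WHERE owt.wait_type LIKE N'PAGELATCH%'
AND   owt.resource_description LIKE N'2:%';
```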

That could still totally happen. But we’re not seeing that in there. What we’re seeing are queries running longer and longer because they’re getting bogged down in tempdb all trying to do the CPU work to deal with the spills. Now, keep in mind, this is only 50 threads, right?

It’s 100 iterations, but it’s only 50 active threads at a time, right? The results from sp_WhoIsActive will show about 50 rows down here in the armpit zone for anything that’s actively running. So I’m not exhausting worker threads as part of this.

If I ran a lot more threads, things would get a lot worse. But you can see in this column that now stuff is taking, like, 9, 10, 11 seconds in here. And, indeed, if I run my stored procedure with the rest of the workload going, it’s going to take a while, right?

That was, like, 1.3 seconds before. Now this thing took, well, just about 10 seconds. Or, sorry, just about 9 seconds.

8.675 seconds now here. 7.26 seconds here. If we look at the waits for this query, right? If we come over here and look, the wait stats for IO completion won’t be that bad, right? This is the typical sort-spill thing in here.

And I think that there’s been some improvement in the row mode algorithm for that. Really, what we get bogged down in is the SOS_SCHEDULER_YIELD waits over here. There was about 6 seconds of those for the 8-second query.

So, about 6 of the 8 seconds that we spent waiting for this thing to finish were just waiting on cooperative CPU scheduling. And, again, this isn’t a ton of stuff going at once, right? I’m not throttling the server with 1,000 active requests.

This is just 50 queries running. Granted, they’re all doing the same thing and they’re all spilling. But, you know, that’s just kind of funny to see, right? So, yeah, it’s like running this over and over again, things get kind of wonky.

So, this ran for almost three minutes, and we barely completed a little over 900 total copies of this query in that time, right? So, like, on average, about five and a half seconds of CPU. The iteration average in there was something like 17 seconds.

So, things were getting really slow on the server itself. Now, with nothing running on here, if I come back and run this again, tempdb goes back to normal. We’re at about 1.3 seconds for this.

Cool. Let’s look at how we might fix that. Now, in-memory tempdb stuff might help, right? Depending on what kind of contention you’re seeing.

If we were seeing the page latch contention, then in-memory tempdb stuff might be good. But I think, for a lot of the queries where I see something like this, where there’s a little spill from a little sort and we could fix it pretty easily with an index, we should do that. Now, one thing that might surprise you a little bit is that creating this index on reputation does not fix the spill.

Or, rather, does not get rid of the sort. It does make the sort smaller, right? Or, it does make the sort faster because we’re reading from a much smaller index.

But, it doesn’t actually get rid of the sort. To do that, we need to actually match the sort order of the reputation column here in the index to the sort order of the query. Now, a lot of people might write stuff like, does index sort direction ever matter?

No, it’s stupid. Leave it alone. You’re an idiot. Leave it. That’s not true. There are lots of times when it does matter. This can be one of them.

Another time when it matters quite a bit is when you’re writing windowing functions. Not in the partitioning clause, because the partitioning clause doesn’t have ascending or descending, but in the order by clause of a windowing function, you might specify some column descending.

And, then, your index sort direction matters quite a bit. Okay. So, with this index created, at least I’m pretty sure it’s created, if we run this, we’ll see that this query finishes just about instantly. And, now we have a top here but without the sort involved.
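A minimal sketch of the fix, with hypothetical index and column names. One caveat worth knowing: on a single descending key, SQL Server can sometimes satisfy the order with a backward scan of an ascending index, but backward scans can’t run in parallel, and mixed sort directions rule them out entirely:

```sql
/* The query orders by Reputation descending */
SELECT TOP (100) u.Id, u.DisplayName, u.Reputation
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;

/* Ascending key: a smaller read, but a sort can still show up */
CREATE INDEX u_rep_asc ON dbo.Users (Reputation);

/* Key sorted to match the query: the top reads the index
   in order, with no sort operator needed */
CREATE INDEX u_rep_desc ON dbo.Users (Reputation DESC);
```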

So, now, if we come over here and we run SQL query stress for, again, 100 iterations on 50 threads, we won’t even have time to get back to SQL Server management studio because this will have completed in about 15 milliseconds. So, that’s a pretty good deal there. I like it quite a bit.

And, I think you should, too. So, anyway. With that out of the way, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that you are now able to go out into the world and resolve tempdb contention issues from spills, because it’s a good thing to do. You’ll probably be a hero to someone if you start fixing these sorts of problems.

Now, granted, these aren’t the sort of, like, big query problems that you might see from reports where you go from, you know, 30 minutes to 3 seconds or something. But this is the kind of highly concurrent workload stuff that you have to start thinking about and analyzing and addressing when you’re dealing with big, highly concurrent OLTP workloads, because this kind of stuff can really sneak up and bite you. So, as usual, sp_WhoIsActive is a great tool to start tracking this stuff down.

And, you know, the more you can do to, you know, monitor and get an idea of what’s going wrong on servers when things are in a highly concurrent state, the better job you can do of starting to tune things so that you don’t have to worry about the server falling over. If I were to, if I really wanted to make things awful, I could have thrown way more worker threads at this and things would have been just terrible in here, right? 50 active users, right, obviously doesn’t take much to bog down a SQL Server.

So, you know, you can see some really profound effects even from small amounts of really highly concurrent spilling activity. So, anyway, that’s about it for this one. I got a couple more videos about rewriting functions and function inlining and stuff.

So, we’re going to get to those. And then, I don’t know, I think we’re going to, I don’t know what we’re going to do, to be honest. We’re just going to close these tabs out and I’m going to get on an airplane tomorrow.

Then I’ll figure out what to do when I get home. All right. Thank you for watching.


More Annoyances With Local Variables And Optimize For Unknown Hints In SQL Server

More Annoyances With Local Variables And Optimize For Unknown Hints In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the frustrations and challenges that local variables and the `optimize for unknown` hint can bring to SQL Server performance tuning. While these techniques are often touted as best practices, they can significantly complicate the process of diagnosing and resolving performance issues, especially when trying to reproduce them in a testing environment or during code reviews. I demonstrate how using local variables and `optimize for unknown` hints can obscure crucial information from execution plans, making it nearly impossible to pinpoint the exact conditions that led to poor query performance. By highlighting these annoyances, I hope to encourage database professionals to reconsider their use of these practices in favor of more transparent and maintainable alternatives.

Full Transcript

Erik Darling here, with Darling Data. Don’t you just love it? Don’t you just love every minute of it? I sure do. Except the odd-numbered minutes, and some of the even-numbered minutes. A few of the seconds are great, though. Milliseconds really stand out. In today’s video, I’m going to talk about some annoyances that I have with local variables and optimize for unknown hints. Now, you’re probably used to hearing me complain about these from the problems that they cause with cardinality estimation, not using the smart part of the histogram to figure out how many rows might qualify for a WHERE clause. We’re going to look at something a little bit different in this video, because it’s something that I run into quite a bit when I’m trying to help clients with things, and it makes it really, really difficult to figure out... well, if I say it here, it’ll give away the whole video. So, I’m going to wait and say it. Let’s just say I’m sufficiently annoyed with local variables and unknown hints that we need to talk about them from a completely different angle this time around.

But first, let’s talk a little bit about money. Like, if you want to give me $4 a month, you totally can. There’s a link in the video description where you can join the channel with a membership subscription thing, and it’s cheap and it’s worth it, because you get a lot of free content out of it. If $4 is beyond your means, maybe your rates are a little too reasonable, and you can’t afford $4. That’s totally cool. I like likes, I like comments, and I like new subscribers. I like new faces showing up in the comments saying, Hey Eric, how you doing? It’s about all it takes to make me happy. All the other stuff just, you know, buys me wine, I guess.

If you need help with your SQL Server, if something that I talk about in one of these videos strikes you as something that you think that I would be particularly valuable to assist you with, any of these things, my rates are reasonable. If you want some high quality, low cost training, for life, forever, you can get about 24 hours of it for about $150 US with the discount code that’s also in the video description.

So the more things you click on there, the better off we all are, right? In reality, just clicking on the links in the description, and it will improve all of our lives greatly. If you would like to see me live and in person, you can do that. November 4th and 5th in Seattle, Washington. I don’t know if there are any other Seattles. I’m unaware of them. But I’ll be there for my birthday.

And I’ll be hanging out with Kendra Little for two full days of performance tuning magic and wonder. It’s going to be like you were a kid again, right? Watching someone do like one of those finger tricks where they, ah, where’d it go? I wasn’t flipping anyone off, I promise. I can’t blur this later, so screw it.

Anyway, if there’s an event near you that you think I would make a good speaker at, a good addition to the lineup, let me know what it is. If they’re accepting pre-cons and I can, you know, cover the cost of travel to get there, I’ll happily show up and teach people live and in person. That’s my favorite. So with that out of the way, let us begin our festivities. Let us start the party.

We will commence with the fun. So let’s go over to Management Studio. And I’ve got a stored procedure here, just to make sure that we are thorough in our investigation.

I’ve got a stored procedure here that does three things. It’s got one query that just takes some regular old parameters right there, right? That’s just the normal parameter way of doing things.

And then I have a second query where I declare a couple variables and I use those instead of the parameters here. And then I’ve got a third query where I use the variables, but I add this optimize for unknown hint at the very end. Now what I’m going to do is I’m going to run each of these.
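A hypothetical sketch of a procedure shaped like that, against the Stack Overflow Users table (the column picks are mine, not necessarily the demo’s):

```sql
CREATE OR ALTER PROCEDURE dbo.ThreeWays
    @Reputation integer,
    @UpVotes integer
AS
BEGIN
    /* 1: plain parameters; compile values land in the plan */
    SELECT COUNT_BIG(*) AS records
    FROM dbo.Users AS u
    WHERE u.Reputation = @Reputation
    AND   u.UpVotes = @UpVotes;

    /* 2: local variables; density vector guess, no compile values */
    DECLARE @r integer = @Reputation,
            @u integer = @UpVotes;

    SELECT COUNT_BIG(*) AS records
    FROM dbo.Users AS u
    WHERE u.Reputation = @r
    AND   u.UpVotes = @u;

    /* 3: the same variables, plus the hint */
    SELECT COUNT_BIG(*) AS records
    FROM dbo.Users AS u
    WHERE u.Reputation = @r
    AND   u.UpVotes = @u
    OPTION (OPTIMIZE FOR UNKNOWN);
END;
```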

And when I run each of these, we’re going to get results back, you know, fairly quickly. This doesn’t have to be like miraculously fast. What I want to show you here is really not really the cardinality estimation thing.

Like we can see the cardinality estimation thing here where, for the first set of compiled parameters, we got a good estimate and everything was cool, right? Well, really, up here is where it matters, where we seek into the index. And where it falls off is when we look down here, and we’re at, oh, there’s only 123 rows. And that’s the same for both of these.

And that becomes a thing for the other queries where, you know, the second time we compile with the actual parameters, we get the cardinality estimate from the first one. And that also is true of... oh, stop moving. Gosh, darn it.

You silly thing. That’s also true of the two optimize for unknown things, where we get the actual number of rows, but the much worse guess, right? So this guess is just stupid for everyone.

It doesn’t make any sense. But, you know, the parameter sensitivity aspect of the first one is why a lot of people end up doing the local variables or the optimize for unknown thing. I’ve had a number of people tell me that they thought it was a best practice in SQL Server, at which point I feel this hand get real slappy.

Not in a way that would gratify anyone. What really bothers me with this stuff, though (just to clear things out a little bit so I don’t have six plans staring at me), what really annoys me, is when you run a stored procedure like this one and you get the actual execution plans, what you’ll see if you go to the properties of this.

Now, let’s say you’re in a situation like me, where you’re a young, handsome consultant and someone has hired you to help them tune their queries and their stored procedures and all that other good stuff. And you’re trying to reproduce a performance problem with a stored procedure. If you use actual parameters in your stored procedure, what you’ll end up with over here are the compile and runtime values for both of these.

Right. So there’s the parameter compile value. There’s a parameter runtime value. We see that repeat for both of the parameters that were passed in. So if I wanted to, I could very easily take the compile values out, run the procedure, see if it’s slow, and see if we can make any improvements. Where that changes a bit, and I’m going to highlight this so it doesn’t disappear on us.

Where that changes a bit is if we look at the local variable version of this: some things start to disappear on us. All of a sudden, all we have are the runtime values. Now, if you’re able to execute a stored procedure, the runtime values should be obvious to you, because you just put them in.

You just told SQL Server what to do there. Right. And that’s going to be the same for down here as well, where all of a sudden all we have is, well, the runtime value. Right. Just zero and two there. So that’s OK.

But this isn’t really my problem. My problem is if we look in Query Store or the plan cache and we’re like, wow, that stored procedure sure was slow. We should probably do something about it. If we look at what happened in there, I’m going to show you two things. Actually, I’m going to show you all three things. The first query up top should be the optimize for unknown hint.

And if we look at the query plan for this and we say, well, this was the slowest thing in there. Let’s see what we passed in there. And let’s try to figure out how we can reproduce this performance issue so we can start fixing it. Well, we don’t have a parameter list over here.

Right. For a little bit of, what do you call it, contrast on that, let’s look at the query plan for the third one, which has nothing on it.

If we look at the query plan for the third one, we have a parameter list here. And we can see what the compile time values were. Right. So we can get the compile values for both of these parameters.

And we could plug these into the stored procedure and we could say, oh, this is why you are slow. We can fix that. But optimize for unknown and the local variables strip that stuff out.
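When you’re stuck doing that kind of archaeology, one rough place to dig (the procedure name here is hypothetical) is the cached plan XML, where the parameter list lives when it survives:

```sql
/* Fish the plan for a suspect procedure out of the plan cache */
SELECT dest.text,
       deqp.query_plan
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
WHERE dest.text LIKE N'%SomeSlowProcedure%';

/* In the plan XML, look for the <ParameterList> element and the
   ParameterCompiledValue attribute on each ColumnReference --
   which is exactly what goes missing with local variables */
```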

They completely hide that information from you. So you have absolutely no idea what happened in there. This is the one with the local variables.

And we don’t have that parameter piece in here so that we can reproduce this. Now, on the one hand, I think this is partially Microsoft’s fault. Even though these are local variables and optimize for unknown hints, you still know what the values of the parameters they were compiled with were, even if those parameters didn’t really come into play for cardinality estimation.

Remember, when you use local variables or the optimize for unknown hint, SQL Server looks at a different part of the statistics, the density vector, which tells it how unique it thinks the column you’re looking at is, and how many rows are in the table. And it multiplies those together.

So even though the values weren’t used for cardinality estimation, the compile value should still be in the query plan because they might be useful. Amazing. I know. Right. Why like why wouldn’t they be there?
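The guess he’s describing can be sketched out by hand; the statistics name and numbers below are made up for illustration:

```sql
/* Pull the density vector for a column's statistics */
DBCC SHOW_STATISTICS ('dbo.Users', 'u_rep_asc') WITH DENSITY_VECTOR;

/* If "All density" comes back as 0.0001 and the table holds
   2,465,713 rows, the equality estimate is roughly:
   0.0001 * 2465713 = ~247 rows, no matter what value you pass in */
```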

But this is just another crappy side effect of people following bad advice, or people following awful, non-existent advice, that optimize for unknown and using local variables to substitute for parameters is a best practice. You end up in situations where it’s impossible to reproduce anything, to pull it out.

And then you’re stuck trying to figure out, like, okay, is there application logging for what people did? What are common values for things? For me, with the Stack Overflow database, it’s very easy to look at the Reputation column and the UpVotes column and just come up with a couple values.

In real life, if you’re trying to find customer IDs and order IDs and dates and all sorts of other things that could really make or break you being able to reproduce a performance issue, it really screws you badly. So please, please, please stop doing this. You know, recompile hints are almost the opposite, because recompile hints will cache the actual values, but not in the parameter section.

You have to look at where you touch the index to see what actual values are applied in there. So recompile has its own sort of issues. But typically, once something has a recompile hint on it, it’s usually fast anyway.

I’m kidding. Recompile doesn’t fix everything. It just fixes crap like this. So that’s nice, too.

Yeah. Anyway, please stop using local variables as parameter substitutes. And please stop using optimize for unknown hints because they make my job harder. My job’s already hard.

I have to fix your code, your indexes, and your databases on your servers. It’s not easy, but it’s okay. My rates are reasonable.

All right. Thank you for watching.


A Little About FORCESEEK Hints In SQL Server

A Little About FORCESEEK Hints In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the world of tuning queries that use OR clauses in joins, specifically focusing on a scenario where SQL Server’s natural query plan choice is far from optimal. I explore why SQL Server might choose such a suboptimal plan and demonstrate how forcing index seeks can significantly improve performance. By walking through detailed examples and explaining the reasoning behind each step, I aim to provide practical insights into when and how to use hints effectively in your queries. This video is part of my ongoing series on query optimization and performance tuning, designed to help you navigate the complexities of SQL Server and ensure that your database operations run as smoothly as possible.

Full Transcript

Erik Darling here with Darling Data. In case you’re wondering why I am recording at night, well, it is because it is Wednesday, September the 4th. And tomorrow morning I fly out for Data Saturday Dallas, because I have to be there to teach a pre-con and stuff. And I just want to make sure that I have sort of a clean slate of stuff when I come back from that, because I’ve just got a few videos that I want to get nailed down before I leave, both to make sure that I have the blog queue and the video queue amply done, because whenever I’m gone for a few days and I don’t have time to record and do stuff, I need to make sure that my fiduciary duty to you is complete. And in tonight’s video, I think this is the 10,000th video I’ve recorded tonight, we’re going to be talking about tuning OR clause joins.

And the reason why we’re going to do that is because I have a few videos about joins with OR clauses in them, and some of the pain they cause, and some different ways of approaching them. But this one’s a little bit different, because in this one, rather than doing what I would consider the Darling Data standard rewrite, where I do a union all of things and work off of that rather than doing much else, we’re going to look at things in a slightly different way.
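For reference, the standard rewrite he’s talking about looks roughly like this, with hypothetical tables and join columns. Note that the second branch needs an extra predicate so rows matching both conditions aren’t counted twice:

```sql
/* Original: a join with an OR, which the optimizer handles badly */
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Votes AS v
    ON v.PostId = p.Id
    OR v.UserId = p.OwnerUserId;

/* Rewrite: union all the two join conditions */
SELECT COUNT_BIG(*) AS records
FROM
(
    SELECT v.Id
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON v.PostId = p.Id

    UNION ALL

    SELECT v.Id
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON  v.UserId = p.OwnerUserId
        AND v.PostId <> p.Id /* avoid double counting */
) AS x;
```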

The slightly different way is going to assume that we have pretty good indexes in place, because without them, using what we’re going to use in this video is going to be pretty fruitless. But before we do that, as usual, if you like this video, if you like this channel, or this video, or whatever enough to start a membership, you can do that for the low, low cost of $4 a month. Absent $4 a month, you can contribute other parts of your body and soul and mind by liking, commenting, and subscribing. All very noble things to do.

If you need help with SQL Server in any of these ways, or I don’t know, maybe any other way, just don’t call me about like a problem with replication. My rates are reasonable. If you would like some high quality, low cost SQL Server training that can take you from your obvious beginner level to your next intermediate level, and then to expert level and beyond, you can get all 24 or so hours of mine for about $150 for your entire life. Or until I stop paying the bills, because no one gets a membership to the channel. We’ll see what happens.

But I’m kidding. It’s prepaid for like a decade or something. And in 10 years, if you still haven’t watched them, I’m not the one who’s messing up there. That’s you. So, yeah, there’s that.

And of course, this November 4th and 5th, I will be at PASS Data Summit with the amazing Kendra Little. And we will be tag teaming two days of performance tuning. What’s a good T word? I don’t have one off the top of my head.

Performance tuning toughies. That didn’t go well. And of course, if there is an event near you where you would like to see me live and in person, tell me what it is so that I can submit to it.

Because if they choose me to do a pre-con, I will probably show up relatively sober. With that out of the way, let’s begin our voyage into performance tuning these ridiculous queries. Now, the reason why I care about this stuff is because these queries are notoriously slow on their own.

This is a join with an OR clause where I have no hints in this query. And SQL Server will every single time naturally choose the most god-awful query plan. And I’m going to break this down a little bit, but not this one, because this one only runs for about 30 seconds.

So, like, depending on the size of the tables, like, this is joining users to posts. The users table is fairly small. So nothing too, too awful happens here.

30 seconds is, you know… Oh gosh, that’s a slow query. I mean, it does just about reach the, like, usual application timeout threshold of 30 seconds. So, you know, it has that going for it.

It could put that in its dating profile. But the one that I would much rather focus on is down here a touch. And this is where I have joined… I don’t know why you decided to refocus, Management Studio.

This is where I have joined comments to posts. And these are two much larger tables. And things get much, much worse here. All right.

So what I want to point out a little bit about this query plan pattern that I find so noxious is… Look at… We do, like, what I would consider one too many joins.

All right. Not that I have, like, a number of joins in my head where I’m like, Oh, there’s too many now. It’s more that, like, for the particular query that we wrote, There’s a stupid join in here that we just shouldn’t have.

Right? Because we have a nested loops join here that goes to another nested loops join here that goes to the comments table. Okay.

So how did we get from… Oh, sorry. Oh, over here, where I can’t quite reach with my arm because of screen limitations, all the way down here to a seek into the comments table that took six minutes? Good question.

Well, we started by taking all 17 million rows from the posts table. Right? That’s this number over here: 17,142,200. Seventeen million plus some.

And we took all 17 million of those rows and we broke them up into two parts. There’s a constant scan here with 17 million rows and there’s a constant scan here with 17 million rows.

And each of these constant scans represents the… What I’m going to say is the different join criteria that came out of that. So if we look at the query that got written, it’s going to be this one.

So it’s where p.OwnerUserId equals the UserId in the comments table, or the LastEditorUserId equals the UserId in the comments table. Okay?

So if either one of those columns from the posts table matches that one column from the comments table, we need to figure that out. So one constant scan is all of the OwnerUserIds. The other constant scan is all of the LastEditorUserIds.
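For reference, the query being discussed looks roughly like this. This is a sketch against the StackOverflow demo database; the actual select list isn’t shown in the video, so the COUNT_BIG is an assumption:

```sql
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Comments AS c
    ON c.UserId = p.OwnerUserId
    OR c.UserId = p.LastEditorUserId; /* the OR in the join predicate is the troublemaker */
```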

All right? So coming back over to the plan, you will see 17 million rows come out of this one. And 17 million rows come out of this one. And SQL Server slaps them all together.

So we have 17 million times two right here. And then SQL Server decides to spend 20… Let’s just say about 20 seconds sorting.

Both of those inputs. Why did it spend 20 seconds sorting all of those inputs? Well, because it was trying to remove some rows from those inputs.

So we start out with 34,284,338. That is an 8-digit number of rows. And we end up with 30,766,429.

That is still an 8-digit number of rows. Granted, we got from an 8-digit number that started with 3, 4 to an 8-digit number that started with 3, 0. But I don’t know if that was quite worth the now about 25 seconds of time that we spent doing that.

And, you know, perhaps it was, right? Because that would have an impact on this, right? We spend 8 minutes here.

So this goes from 25 seconds to 8 minutes. And we know we spent 6 of those minutes seeking into the comments table down here. Right?

So all of that nested loops join, that’s an 8-digit number of nested loops. That’s 30,766, blah, blah, blah. And if we look at this thing over here, and we look at the number of rows that ended up per thread, that is just a ghastly amount of work.

In this case, the parallel nested loops join was a pretty rough trick on us. So in all, this whole query, once we finish, you know, with this other nested loops join, we add about another 2 minutes on there.

And we end up taking almost 11 minutes to complete this whole query. The reason why this is so absolutely frustrating is because SQL Server could naturally choose a much better plan, but it just doesn’t.

So one thing that we can always try, assuming that we have adequate indexes in place, and by adequate indexes I mean the tables and columns that we’re joining on are all properly indexed.

We can run these two queries. And the only difference between these two queries and the two that I ran up front is that I stuck a FORCESEEK hint on the posts table here and on the posts table here.
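For clarity, a table-level FORCESEEK hint goes on like this (a sketch against the StackOverflow demo database; the select list is assumed):

```sql
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p WITH (FORCESEEK) /* push the optimizer toward index seeks on Posts */
JOIN dbo.Comments AS c
    ON c.UserId = p.OwnerUserId
    OR c.UserId = p.LastEditorUserId;
```

One thing to keep in mind: if no index exists that can satisfy a seek, FORCESEEK makes the query fail to produce a plan rather than quietly fall back to a scan, which is why good indexes have to be in place before you reach for it.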

So remember that first query took about 30 seconds and the second query took almost 11 minutes. But just throwing a FORCESEEK hint on these two queries, you know, that does the Lord’s work. By the Lord’s work, I mean makes them faster.

By the Lord, I mean me. I’m the Lord of SQL. I suppose that involves some dancing. So for the first query, again, this is a plan shape that SQL Server could have completely validly chosen. Right.

But it just didn’t. Actually, something I’m kind of interested in doing here, something that I forgot to look at, is the costing. Right.

So let’s look at the costs. The cost of this 30-second query was, oh, why did you go away, tooltip? Uh, 2883.55. And the 11-minute query was 28-hundred-something. So just around 2800 query bucks a pop.

If we come look at these plans, look at the estimated subtree cost of this one: 30,000. Right. Look at the estimated subtree cost of this one: 11,000.

So this is where things get really annoying. SQL Server basically costed these two plans out of existence. Even though the first query was 30 seconds, and we got that down to four seconds with a query plan that cost 30,000 query bucks. And then we got this one from 10 minutes down to four seconds with a query plan that cost 11,000 query bucks.

But SQL Server won’t ever choose those plans naturally. SQL Server, just the costing for this thing sucks. Right.

And this is why I spend a lot of time either applying FORCESEEK hints or rewriting queries with the UNION ALL technique that I’ve described in other videos to make these queries faster. Right. So just throwing a FORCESEEK hint on this saves…
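The UNION ALL technique mentioned here breaks the OR into two separately-optimizable queries. A rough sketch, assuming we only need UserIds back; the second branch excludes rows the first branch already matched so nothing gets double-counted:

```sql
SELECT c.UserId
FROM dbo.Comments AS c
JOIN dbo.Posts AS p
    ON c.UserId = p.OwnerUserId

UNION ALL

SELECT c.UserId
FROM dbo.Comments AS c
JOIN dbo.Posts AS p
    ON  c.UserId = p.LastEditorUserId
    AND p.OwnerUserId <> c.UserId; /* skip rows the first branch already found */
```

If OwnerUserId can be NULL, the exclusion predicate in the second branch also needs an IS NULL check, since `<>` against NULL filters those rows out.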

Oh, I don’t know. Uh, let’s just call it about 10 minutes and 38 seconds here. Right. So that’s, that’s pretty good.

And, uh, it, you know, saves about 25 seconds here. So don’t be afraid to play with hints when you’re tuning queries. If a query plan looks stupid to you, ask SQL Server why it didn’t do that.

That’s the point of these hints. Especially if you have queries that are joining with an OR clause and you have good indexes in place, stick a FORCESEEK hint on one of those tables and see where it gets you.

A lot of the time you will get a much, much better plan than SQL Server is willing to come up with naturally. Again, it’s the costing mechanism behind this, which is so totally boned that you really have to intervene here. The other thing that I want to bring up with this is something that I’ve tried to get across to people who watch this channel, and to anyone who will listen to me or read what I say.

Never look at query costs to figure out if a query is fast or slow, or if one query is better than another. Never look at operator costs to figure out what the slowest part of your plan is. Always get the actual execution plans and validate what you see.

Because remember, for all of those costing mechanisms, there’s no actual counterpart. All the costs are pre-execution estimates, and some of those estimates might make no sense. They can be the entire plan cost.

Like for these two query plans, they cost way, way, way more than the crappy slow plans that we got that were costed much lower. Right? Granted, 2800 query bucks is perhaps not low to most people, but 30,000 and 11,000 are definitely not low to most people. But it doesn’t matter here, because those much higher costed plans are much faster.

SQL Server just estimated the work wrong. Right? And this isn’t, this isn’t necessarily an intelligent query processing thing. And beyond that, this isn’t necessarily something that would have been helped with like batch mode or anything.

Because nested loops doesn’t support batch mode. Even in an adaptive join plan, nested loops happens in row mode. Only the adaptive hash join could happen in batch mode.

So this is really something that I think Microsoft needs to work on. But until then, until they stop stapling clown nose features like DOP feedback and stuff onto SQL Server, it’s up to you and me to make sure that we have good indexes, and that we try appropriate hints in our queries when we detect BS from the query optimizer. Remember, query costs are useless.

Query costs are how we got to these plans. Right? Operator costs are how we got to these plans. And us saying, no, those costs are wrong. I just watched what happened.

That’s how we get good at tuning queries. That’s how we make real impacts on our SQL Server workloads. So anyway, hoo-wee.

Sometimes you just get talking and you can’t stop. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video. I haven’t decided yet; I think I might get some of this function stuff out of the way first.

I don’t know yet. We’ll see what happens. It’s going to be fun. I think I’m going to save the sort spills one for last because I need SQLQueryStress for that. And I’m a little lazy right now.

So I’m going to do these other ones and then we’ll get there. But anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Performance Tuning Semi and Anti-Semi Joins In SQL Server




Thanks for watching!

Video Summary

In this video, I delve into tuning semi-joins in SQL Server, specifically focusing on `EXISTS` and `NOT EXISTS` queries. Semi-joins are powerful tools that can simplify complex queries but often introduce performance challenges due to row goals and unexpected query plans. I discuss the importance of proper indexing and demonstrate how adding a hash join hint or creating targeted indexes can significantly improve query performance, sometimes reducing execution time from over a minute down to just eight seconds. The video also touches on the nuances of sorting and the use of batch mode for large datasets, providing practical solutions for optimizing these types of queries in your own work.

Full Transcript

Erik Darling here with Darling Data. Getting a little bit more night recording in. It’s making my green screen lighting a little bit weird, but we’re not going to let that stop us. Apparently the daylight that normally comes in through my office window over there that you can’t see makes a pretty big difference in not casting strange shadows behind me. Who would have thunk it? More light is better when it comes to green screens. But anyway, in this video, this is a video, right? Camera’s on, microphone’s on. Remember to hit record this time. We’re good. We’re going to talk about tuning semi-joins. Semi-joins, of course, generally crop up when you use exists or not exists. And you can have either a semi-join or an anti-semi-join. A semi-join is, of course, for exist queries, and an anti-semi-join is for not exists queries. And both of them are wonderful, spectacular things when used correctly and appropriately. And I’m a big proponent of using exists and not exists because they are wonderful natural extensions of the SQL language that you should be taking advantage of. All the things that you can do to attempt to replace them.

You know, like writing a derived join with a DISTINCT in it, or writing a LEFT JOIN and then filtering where some primary key or non-nullable column from the left-joined table is null, to find rows that don’t exist. Ah, boy. It’s just a lot more work and effort than it’s usually worth. So before we get into that, we of course need to talk about you and me, our future together. If you would like to say thank you for all of the video content and get in on the ground floor of many great and wonderful Darling Data things that are set to come up in the new year, you can sign up for a membership to this channel for the low, low cost of $4 a month. Not bad.
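To make those alternatives concrete, here is a sketch of the EXISTS-style pattern next to the clunkier LEFT JOIN rewrite it replaces (StackOverflow demo schema assumed):

```sql
/* The natural way: a semi-join via EXISTS */
SELECT c.Id
FROM dbo.Comments AS c
WHERE EXISTS
(
    SELECT 1/0 /* never evaluated; EXISTS only checks for the presence of a row */
    FROM dbo.Votes AS v
    WHERE v.PostId = c.PostId
);

/* The workaround way: LEFT JOIN plus IS NULL to find rows that *don't* exist,
   i.e. what NOT EXISTS expresses directly */
SELECT c.Id
FROM dbo.Comments AS c
LEFT JOIN dbo.Votes AS v
    ON v.PostId = c.PostId
WHERE v.Id IS NULL; /* Id is non-nullable, so NULL here means no match was found */
```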

Not bad. Even with inflation. If you are unable to scrounge $4 out of the couch cushions this month or any month, you can of course interact with my channel in many meaningful, ingratiating ways. You can throw me some likes on the videos. You can throw me some comments on the videos. I do love interacting with each and every one of you almost on a daily basis. And of course, if you subscribe to the channel, you will get these wonderful, wonderful, near instantaneous notifications every time I publish a video. If you find yourself needing help with SQL Server and you need any of this type of stuff done, I am very good at these things. I’m quite good at these things at this point in my career.

So if you are looking for SQL Server help for any of these things, my rates are reasonable. If you need help with something else dealing with SQL Server, I can guarantee you my rates will remain reasonable. If you would like to get some high quality, low cost training, you can do that. You can get mine for that. You can’t get anyone else’s for $150 for life for 24 hours of performance tuning content. Good luck with that.

I don’t even think Pluralsight can beat those numbers, but I don’t think anyone cares about them anymore. Right? A little dicey. But if you use the links in the show description, either to sign up for a membership or to buy my training, those are other good things you can do, both for me and you. I will be speaking live with a whole lot of gumption in Seattle, Washington, November 4th and 5th at PASS Data Summit, co-hosting two fantastic, glorious days of performance tuning pre-cons with Kendra Little.

And of course, if there is an event nearby you where you would like to single white female me, you can do that by telling me which one might need a pre-con speaker. Because doing the pre-cons means a payday and covering some of the travel stuff. I do not make an extraordinary amount of money from pre-cons.

Just covering the travel stuff is a good way to get me to show up somewhere and, you know, I don’t know, whatever you want to do. Smell my hair. Follow me around.

Steal my silverware after I use it. Whatever it is, I don’t know. I don’t know what you’re up to. I don’t know what goes on in that deviant brain of yours. You can do that.

But until then, let’s talk about these here semi-joins. Now, one very, very interesting thing about semi-joins is that they often introduce a row goal into your query. So you might notice, if you look at the query that I’ve written here, we have an outer TOP (1) for the comments table.

But when we look at the query plan, we’re going to see a rather mysterious top operator in our query plan. Perhaps one that we did not plan on seeing in our plan. An unplanned plan.

So if we look in here, we’re going to see we have our outer top out here, right? That’s our presentation top. We have an order by to make that correct. We have an order by on creation date descending, which is a non-unique column.

And then we continue the order by, extend the order by with the ID column from the comments table, which is a unique column. So you have that tiebreaker in there to make sure that we get consistent results back. If we did not have that in there, especially with parallel execution plans, we could see all sorts of strange things happen.

Now, something that we’re going to wrestle with in all of these plans is sorts. But before we wrestle with sorts, we’re going to wrestle with one of my least favorite plan shapes of all time. And that is when you have a top above a scan.

Right there. We spend 55 seconds in the top above the scan. Now, remember when I said that exists and not exists introduce row goals.

That row goal is there because with exists and not exists, we care not about duplicates. We either find something or we don’t find something. And quite often, SQL Server will use sort of an injected top into the query plan to do that.

We just keep finding the top one over and over and over again. The problem is that every time we run this top one for a row that comes out of here, we have to scan and scan and scan in here. So, let’s get this right.

Even though there are only this many rows in the votes table, this is how many rows we end up reading over all of those top scans. Right. That is a fairly brutally large number.

Right. And that is absolutely no fun whatsoever. If we go and we look at the properties of this and we look at this, we will see some really big numbers on all these threads adding up to this really big number on this thread. So SQL Server did not have a good time in here.

Now, part of the reason for this is that we don’t have a good index to support any part of this. We’re going to get to that in a second. But before we do, what you should know is that oftentimes, when you end up with a plan of that wretched nature, you can fix it pretty quickly and easily, without doing any further optimizations.

Now, let’s remember, this thing ran for about a minute without any intervention. If we run this with just a hash join hint, SQL Server will no longer have that nested loops with the top above the scan. We’ll just do a big scan of comments and a big scan of votes, and we will end up in much better shape here.

So big scan over here, big scan over here, hash join in here. The whole thing takes just about eight seconds. So going from one minute to eight seconds is a pretty good improvement right off the bat.
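Sketched out, the hinted query looks something like this. The select list, TOP value, and ORDER BY are assumptions reconstructed from what the video describes:

```sql
SELECT TOP (1)
    c.*
FROM dbo.Comments AS c
WHERE EXISTS
(
    SELECT 1/0
    FROM dbo.Votes AS v
    WHERE v.PostId = c.PostId
)
ORDER BY
    c.CreationDate DESC,
    c.Id /* unique tiebreaker for deterministic results */
OPTION (HASH JOIN); /* rules out the nested loops top-above-a-scan shape */
```

Note that OPTION (HASH JOIN) applies to every join in the query, which is fine here because there is only the one semi-join.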

Let’s experiment with indexes a little bit, though. So the first thing that I want to index, because it’s the simplest index to create, it’s the only column that we care about on the vote side of things, is the post ID column.

It’s not a unique column. Obviously, because people can, you know, cast multiple votes on a post. So we can’t make a unique index here, but we can at least create an index to make this part of the query, you know, give that inner side of the query a little something to work with.
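That index is about as simple as they come (the index name is made up for the sketch):

```sql
CREATE INDEX v_postid
    ON dbo.Votes (PostId);
```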

The trouble that you run into a lot, though, is once you add a good index, SQL Server starts thinking a little too much about things, and they aren’t good thoughts. So let’s run, well, actually, let’s run this query.

I don’t know why I got the estimated plan there. I was thinking about something else for a minute. So let’s run, get this going. And this runs for about five seconds. We shaved another, like, three seconds off the initial query, but it’s kind of stupid.

The reason it’s kind of stupid is because we fully scanned that nonclustered index over here, and then we fully scanned this clustered index over here. And even though it’s a little bit more efficient, you have to kind of wonder why SQL Server wouldn’t seek into the index that we just created, because that, right, if we create an index, we now have a seekable thing for that exist clause.

So let’s see what that query plan might look like. Let’s get the estimated plan here. All right, so now we have this one here with a top above a seek.

Let’s see how this goes. All right. I am mostly happy with this. We put this down to 3.5 seconds, right?

But now we have this sort of interesting thing over here. Most of our problem in this query is SQL Server needing to order by creation date descending and comment the ID column in the comments table ascending.

Okay. Well, well, you know, it’s kind of weird, right? Isn’t it? Put this stuff in order from the comments table.

Okay. Well, I mean, I guess we can do that over here. We could do that over here after we, like, found stuff. But, you know, SQL Server doing this over here. Sometimes that’s a good plan. All right. So let’s create an index on the comments table.

And what this index is on is it’s on post ID, right? Because post ID in the comments table is what we’re correlating on to the post ID in the votes table. And then we’re going to put creation date second.

And the hope here is that, if you’ve watched any of my other videos about how indexes hold data, for every PostId we find, which is within an equality predicate, the CreationDate column will be in order. And since this table has a clustered primary key on the ID column, and this nonclustered index is not unique, the ID column from the comments table will be a hidden third key column here.

So we would technically have PostId and CreationDate and Id in this index, in the order we want. All right.
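The index described here would look like this (name made up for the sketch):

```sql
CREATE INDEX c_postid_creationdate
    ON dbo.Comments (PostId, CreationDate);
/* Id tags along as a hidden key column, because the index is not unique
   and Id is the table's clustered primary key */
```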

So let’s, let’s create that. Let’s see what happens here. Let’s go with this. Let’s roll with all this fun stuff. I’m going to create this index and see what happens.

It might be a, might be a real fun time, right? Okay. Here we go. You got it.

We got our new index in. All right. And let’s run this. Let’s see here. This appears to have gotten worse with an index, doesn’t it?

Sure did. 14 seconds. What happened? Well, you’re going to see what happens.

SQL Server chose a serial merge join plan. Look at this garbage monstrosity that SQL Server has chosen here. Because we now have both of our join keys in order, SQL Server was like, oh, I don’t need to sort anything.

I’ll just do a merge join. But even SQL Server sometimes knows that a parallel merge join is the worst thing in the world. And so it gives us this serial plan.

Even worse, what do we get over here? The votes table is still on the outer side of the query with an index scan on it. And the comments table is in here with an index scan on it.

We didn’t even seek to any of the stuff that we cared about. We just used the ordered nature of things to give us that serial merge join plan. Now, this is where things get kind of annoying: if we add a FORCESEEK hint to the comments table and we just try to get an estimated plan, SQL Server says no bueno.

We are out of buenos. You can have no buenos. No buenos for you.

The query processor could not deal with this, which, you know, is kind of weird, because without the FORCESEEK hint we had the comments table on the inner side of the join. And you would think that SQL Server could just take stuff from here and seek into it with the PostId column over here. But apparently not.

Now, if we add a FORCESEEK hint on the inner side and we run this, this is what our plan turns into. But we still have this sort here on the comments table. All right.

So this is okay, because we’re down to about three seconds now. We still have about 1.5 seconds reading from the comments table and now about 1.5 seconds on the sort. In the past, the sort had spilled a bit and things were a little bit weird.

This is about as good as we’re going to get, honestly, as far as query tuning this thing goes. And this took kind of considerable work: to get from, you know, a minute to eight seconds, which is so far the biggest jump. But to get from eight seconds to like three seconds, we had to add two indexes, and now we need a FORCESEEK hint.

And things are just, things are just a little rocky. If I’m going to be honest with you, if I’m really tuning a big query that does all this stuff, what I want is batch mode. All right.

Just get batch mode involved somehow. That’s really what I want to go after. But in this one, we’re going to focus just a little bit on some of the indexing pitfalls that can happen in here. So, because we need to sort data a little bit differently, and I accidentally deleted that from my index definition. Sorry about that.

Sorry about that. All right. We’re going to change our index a little bit to be on creation ID descending, ID ascending, and then post ID as the last column in the index.

All right. And we’re going to drop our existing index with the wonderful, underused DROP_EXISTING = ON index option. And now we’re going to see how things go with this plan.
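The reworked definition, swapped in with DROP_EXISTING, would look like this (DROP_EXISTING requires the same index name as the original, which is made up here):

```sql
CREATE INDEX c_postid_creationdate
    ON dbo.Comments (CreationDate DESC, Id ASC, PostId)
    WITH (DROP_EXISTING = ON); /* replaces the old definition in one statement */
```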

Let’s get the estimated plan here and see what happens. Now we have a serial nested loops join plan with no sorting. And if we run that, we finish things up just about instantly.

So if you’re tuning queries that use EXISTS and NOT EXISTS, right, they’re going to have a semi-join of some type. EXISTS will be a plain semi-join. NOT EXISTS will be an anti-semi-join, because you are finding stuff that isn’t there.

They sometimes need some extra help to get them to be reliably nice and fast. Sometimes that extra help is just throwing a hash join hint on there. Sometimes that extra help is just getting batch mode involved.

If you have two very, very large tables, batch mode is going to do you a hell of a lot more good than all of the indexing in the world. These tables aren’t quite big enough to qualify for that. Batch mode does do really well on them.

But, you know, I’d rather show you a little bit of the indexing stuff. The other thing that you can do with the indexing is pay really close attention to the query plans. If you end up with these types of plans and you have all sorts of sorting going on in them, you might need to adjust your indexes in a way that is a little bit counterintuitive.

And what I mean by that is you might need to sort of index towards the sort with these columns rather than index towards the join with this column. Remember that when we added this column in, things got a bit better.

Not like spectacularly faster. We saved like two and a half or so seconds on things. But it wasn’t really spectacular. And then, when we added a FORCESEEK hint in, we ended up with that awful serial merge join plan.

So sometimes it depends on what the problem is. Really, if this video were to have a great title, it would be “how to index for what the problem and the plan is.” The issue is that that video would be months long.

So this is just one example. So this is just kind of indexing to help you tune exists and not exist queries. And some of the things that you might see in query plans along the way.

In my case, for the query that I’m running, my best set of indexes was, of course, one on the votes table on the post ID column, right? Because that gives us a good, clean, ordered path into the data that we’re trying to figure out if it exists or not.

And then it was the final index that we created down here, where we geared our index towards sorting the data the way that we needed it, and then having the column that we care about for correlation in here. Because remember, when we tried to put a FORCESEEK hint on the comments table, we got that optimizer error anyway.

So there was no way SQL Server was seeking into the comments table and doing anything helpful with an index that led on PostId. There was just no use in that. So with the index that we have down here, once again, we get a fairly simple, easy query plan, just a plain serial nested loops plan.

And we end up finishing this very quickly because now we don’t need to sort data. And now we have a really efficient way to locate the data that we care about on the inner side here. So no sort, top above a seek, simple top for the presentation, and boom, we have a fully tuned not exist query.

So with that out of the way, once again, from me and bats, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I hope that if we ever meet, you bring me Pez because my Pez dispenser is empty. I’m just kidding. I don’t actually eat candy. I have no sweet tooth whatsoever. If you brought me a salt lick, I would be far more appreciative.

That’s my problem in life. The salt. Savory and the salty.

Anyway, I’m going to get going because I obviously have some other tabs to close out. So we’re going to get those recorded, and I will see you over in the next video. Thank you for watching.


Join Me And @Kendra_Little At @PASSDataSummit For 2 Days Of SQL Server Performance Tuning Precons!

Last Year


Kendra and I both taught solo precons, and got to talking about how much easier it is to manage large crowds when you have a little helper with you, and decided to submit two precons this year that we’d co-present.

Amazingly, they both got accepted. Cheers and applause. So this year, we’ll be double-teaming Monday and Tuesday with a couple pretty cool precons.

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

Here are the details!

Day One: A Practical Guide to Performance Tuning Internals


Whether you’re aiming to be the next great query tuning wizard or you simply need to tackle tough business problems at work, you need to understand what makes a workload run fast, and especially what makes it run slowly.

Erik Darling and Kendra Little will show you the practical way forward, and will introduce you to the internal subsystems of SQL Server with a practical guide to their capabilities, weaknesses, and most importantly what you need to know to troubleshoot them as a developer or DBA.

They’ll teach you how to use your understanding of the database engine, the storage engine, and the query optimizer to analyze problems and identify what is a nothingburger best practice and what changes will pay off with measurable improvements.

With a blend of bad jokes, expertise, and proven strategies, Erik and Kendra will set you up with practical skills and a clear understanding of how to apply these lessons to see immediate improvements in your own environments.

Day Two: Query Quest: Conquer SQL Server Performance Monsters


Picture this: a day crammed with fun, fascinating demonstrations for SQL Server and Azure SQL.

This isn’t your typical training day; this session follows the mantra of “learning by doing,” with a good dose of the unexpected. Think of this as a SQL Server video game, where Erik Darling and Kendra Little guide you through levels of weird query monsters and performance tuning obstacles.

By the time we reach the final boss, you’ll have developed an appetite for exploring the unknown and leveled up your confidence to tackle even the most daunting of database dilemmas.

It’s SQL Server, but not as you know it—more fun, more fascinating, and more scalable than you thought possible.

Going Further


We’re both really excited to deliver these, and have BIG PLANS to have these sessions build on each other so folks who attend both days have a real sense of continuity.

Of course, you’re welcome to pick and choose, but who’d wanna miss out on either of these with accolades like this?

pretty, pretty, pretty, pretty good

You can register for PASS Summit here, taking place live and in-person November 4-8 in Seattle.

See you there!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Problems When Joins Have No Equality Predicate




Thanks for watching!

Video Summary

In this video, I delve into the world of non-equality predicate joins in SQL Server, specifically focusing on nested loops joins without equality predicates. You’ll see how these can lead to significant performance issues, even when indexes are present. I explore why parallel nested loops queries with guaranteed one-row outer parts perform much better and discuss the downsides when this guarantee is broken. Additionally, I share practical solutions, such as introducing computed columns or using hash-based joins, to mitigate these problems and achieve more efficient query plans. If you’re dealing with similar performance challenges in your own SQL Server environment, be sure to check out my training resources for expert guidance on tuning and optimizing queries.

Full Transcript

Your friend, Erik Darling here with Darling Data. And today, actually I should say tonight’s video. We’re doing some night recording for reasons that I don’t have to explain to you. We’re going to be talking about da-da-da-da joins with no equality predicates because, gosh, have I seen these just cause a disgusting amount of performance problems in my time. And what we’ll find is that even with great indexes in place, there is just absolutely no helping some queries. The other thing we’re going to find is that sometimes we need to introduce some sort of hacky, hashy, equality-predicate type thing so that we can get reasonable performance out of our queries.

But before we dig into all that, you know what time it is. It’s time for me to tell you how to give me money.

You can sign up for very low-cost memberships to say thank you for all the videos, and all the time and stuff that goes into these miraculous things. They’re like $4 a month, and there’s a link in the description that you can click on to join the channel.

If you’re all out of $4, you can always like these videos. You can engage and interact with me in other ways that I find appealing. You can comment on the videos.

The comment sections have been great lately, so thank you to everyone who continues to do that. And you can also subscribe to the channel for the absolutely low, low price of $0, and you can get notified, also for the low price of $0, every time one of these things goes live.

If you are the type of person who says, gosh, we could use Erik Darling’s help with our SQL servers, these are the types of things that I excel at.

I’ve done over 700 of them now just with Darling Data as an independent consultant. Well, I mean combined, not like each. That would be nuts.

And that’s not even my entire consulting career. Imagine that. So you can hire me for all those things. If you would like some low-cost, incredibly high-quality SQL Server training, all the performance tuning in the world, really, I have all of it.

I mean, what else is there other than beginner, intermediate, and expert? I can’t think of anything. And you can get it all for 75% off, which means the cost is about $150 USD.

And what do you call it there? That’s for life. Yeah, you don’t have to resubscribe to any of the stuff you get in this bundle. It is everything for all eternity.

So that’s nice for you. As far as upcoming events go, where I will be live and in person, shaking you upside down by your ankles, emptying out your pockets like a schoolyard bully, I will be at PASS Data Summit, November 4th and 5th, co-hosting two days of performance tuning grandiosity with Kendra Little.

If there’s an event near you that could use a pre-con speaker, let me know because I got pre-cons and I will travel. So with that out of the way, let us engage in festivities.

Let us have fun with our SQL Server selves. For some reason, the click wouldn’t work, so that all went black. That was fun.

So let’s make sure we have no indexes. And the first thing I want to show you is actually probably a little bit more than you may have bargained for when you started watching this.

But there’s something really interesting that can happen to parallel nested loops queries or parallel nested loop query plans when you are guaranteed to have one single row on the outer part of the nested loops join.

And this is not something that happens when you don’t. If you have more than one row, I mean, if you have zero rows, I guess you don’t have to worry too much about it.

But if you have more than one row, things can get a little weird. But first, let’s take a look at this query plan. So this finishes relatively quickly. And the reason why this finishes relatively quickly is because there’s exactly one row in the users table with an ID of 22656.
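The demo query itself isn’t visible in the transcript, but from the join described later in the video it likely has this shape. This is a sketch assuming the Stack Overflow schema; the select list is my own stand-in:

```sql
/* Sketch of the demo query: exactly one row comes out of Users,
   so the outer side of the parallel nested loops join is
   guaranteed a single row. */
SELECT
    u.DisplayName,
    p.Id
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.LastActivityDate BETWEEN u.CreationDate AND u.LastAccessDate
WHERE u.Id = 22656; /* equality on the clustered PK: one row, guaranteed */
```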

Now, granted, user ID 22656 is Mr. Jon Skeet. He has a million-plus reputation points on Stack Overflow. Even in the 2013 copy, he has that many.

So I think, anyway, it’s a big number. But if we look at the query plan, something really cool happens. At least I think it’s cool.

You might not think it’s cool. But if we look through this, we have sort of a typical guaranteed-one-row parallel nested loops plan, where we get our one row from the users table.

And then we distribute that row out to our streams. And this is, of course, broadcast partitioning, which means that the one value from that row ends up on multiple threads. Well, on DOP threads, really.

And what happens on the inner side of the parallel nested loops join is unique because our row is unique. When we look at the way that the rows get distributed on parallel threads, they do not all land on every thread. Normally you would see 17142169 on every thread, because that’s what the inner side of a parallel nested loops join does.

It runs DOP copies of whatever you do on the inner side of the nested loops join, which can be awful sometimes. Other times, when you’re guaranteed one row from the outer portion, you can end up with a much more efficient query plan, where those rows do get spread out.

So if you added up all those numbers that I just stuck my hand into, like a weirdo, you would get 17124169. So that’s cool.

What happens when you don’t? When you don’t guarantee one row? So the first thing I want to show you is that if we look at all the users in the users table, whose IDs are between 22656 and 22666, there’s only one other of them, right?

There’s only one other ID in there. But if we look for this, where ID is between 22656 and 22657, SQL Server is still going to expect two rows to come out of there.

Because of that, our query plan is going to change drastically and dramatically. It’s going to be awful.
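Switching the single-value equality for a narrow range is probably all it took to break the guarantee. A hypothetical version of the changed query, with the same assumed schema and select list as before:

```sql
/* Same join, but a range predicate on Users.Id: SQL Server now
   estimates two rows from the outer side, so the guaranteed-one-row
   optimization no longer applies and the plan shape changes. */
SELECT
    u.DisplayName,
    p.Id
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.LastActivityDate BETWEEN u.CreationDate AND u.LastAccessDate
WHERE u.Id BETWEEN 22656 AND 22657;
```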

And I don’t know, something about night recording is making the shadows over here a little weird. I don’t know how to get that to be any less weird right now. I can just close my arms so you can’t really see behind me. And then I can just do weird little robot T-Rex type arm things.

And I don’t know, that won’t look awkward at all, will it? No, not one bit. But if we look at the query plan, it changed quite a bit.

It took quite a bit longer. So this took about 22 and a half seconds. Yeesh. Yeesh. That didn’t do good. And just about all of that time is spent in between these two things right here. Scanning the clustered index to get new rows out and spooling those rows into a lazy table spool.

I’ve got lots of videos about how lazy table spools work. The short of it is that SQL Server takes the values that come out of here and it sends them over here. And then the first time this executes, SQL Server runs the clustered index scan over there.

Let me zoom in so my pointing is a little bit more effective. SQL Server runs the clustered index scan and gets whatever rows it needs for the predicate that gets passed over here. When it’s done with that, it truncates the spool and the next row that comes in, it repopulates the spool by scanning the table or usually scanning.

Generally, if you have a seek over there, you won’t see a table spool. Sometimes you will, but in general, hitting this object to get the next set of rows out to populate the table spool is what happens.

This just takes a very long time. And part of why it takes a long time is because we end up with very, very lopsided parallelism. If you look at this, all of the rows end up on a single thread.

This is quite abnormal a lot of the time, but it can be very normal when you have these sorts of plan shapes. The reason why this happens here is because we’re looking for a single thing at a time. And we just get some really unfortunate data distribution stuff happening in the post table.

So we end up just putting all of our work on one thread for the two iterations of the nested loops join. Well, really the one iteration, because only one row comes out, even though we expect two rows. But SQL Server getting ready to defend itself against two rows means that we lose that guaranteed one row on the outer side of the nested loops join.

And that sort of messes us up a bit. You’ll see that the partitioning type in here changes from broadcast to round robin. Meaning that this thing just, you know, puts things on threads as they come out.

And that’s, you know, kind of not good for this situation. Of course, having an index on the post table for last activity date is pretty helpful in both scenarios. One thing that is kind of a downer, though, is that with that good index in place, the original query slows down a bit.
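The helpful index described here would look something like this. The exact key and include columns in the demo may differ; this is a sketch:

```sql
/* An index on the join key, so the inner side of the nested loops
   join can seek instead of scanning and spooling. */
CREATE INDEX ix_posts_lastactivitydate
    ON dbo.Posts (LastActivityDate);
```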

This goes from finishing just about instantly with a parallel plan to taking a bit longer with a serial plan. Now, if you remember, the parallel plan spread out work across those DOP threads pretty nicely. And we did all this work just about instantly.

Now we lost a little bit of efficiency here, right? This is about 1.7 seconds now, which isn’t great. But I think, you know, generally the efficiency that you gain with this query, where this thing no longer takes like 22 seconds to run, this thing takes another 1.7 seconds to run.

It makes it worthwhile. What do you have to watch for when you are writing queries like this that do not have a direct equality predicate? Our only predicate in the join clause is where the last activity date on the post table is between the creation date and the last access date on the users table.

You have to be really careful that your join keys are well indexed for this sort of thing. And, of course, the bigger and more involved your queries are, the harder that gets to really index for. So let’s get rid of these indexes.

And let’s talk about a slightly different kind of range query. Now, I can’t actually run this one to completion. This thing basically never finishes.

I’ve never, never spent too, too long trying to get this to run. I think the longest I let it run for was about 20 minutes. And it really just, you know, really just made the room hot. The laptop was sizzling.

I could have used it like a griddle. But this will basically never finish. It’s a real unfortunate sort of situation. And it’s made even more unfortunate by the fact that, you know, SQL Server asks for some kind of silly indexes.

So if we ran this multiple times, SQL Server would eventually suggest an index on display name. The original index it wanted was on post type ID, comma, last editor display name. I’m just going to put this in the where clause here to shortcut having to do all that stuff.

Because it effectively gives you the same thing. But what’s really rough is that unless we write our query with an equality predicate like this. Right?

Like let’s say that we, let’s actually give you a slightly better example. Let’s do this first. And let’s look at what SQL Server comes up with for a query plan. Seeks into the post table.

And then does all of this sort of weird work. And then nested loops join to hit the users table. This will never finish either.

If we change, if we stick a force order hint on here, and we look at what happens, we’re going to see a query plan that looks a lot like the one that we saw up above before we had an index. The problem is this one basically never finishes either. This one, you know, just does really poorly.

And yeah, it basically just never finishes. A lot of the problem is that, like I said early on in the video, when you have a join without an equality predicate, you are basically at the mercy of nested loops. So a parallel nested loops join here, where you are not guaranteed one row on the outside, means that you’re going to spool a whole lot of rows in here.

Right. And that’s, that’s pretty painful. So you have 2.4 million something rows here.

That means 2.4 million rows are going to end up on each thread. And if you look at this number, you can see why this is going to take a very long time. Right.

SQL Server, like the estimate here is actually kind of close to reality because it’s that 2.4 something million number times eight. Right. So this actually is how many rows end up on each thread every time you go into this side of the join. And that table spool just does not buy you anything there.

The only thing that I’ve ever found that helps at all with this is to give SQL Server some kind of equality predicate. Now, if I were doing this with a client, to really help them get a query running faster, I would probably make computed columns that do something like this so that I can have a sargable equality predicate for this join.
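As a sketch of that computed-column idea, assuming the join involves Users.DisplayName and Posts.LastEditorDisplayName, you could persist and index a deterministic hash on both sides. CHECKSUM is just one option, and since it can collide, the original join logic has to stay in place as a residual predicate:

```sql
/* Hypothetical computed columns to manufacture a sargable
   equality predicate. CHECKSUM is deterministic, so it can be
   persisted and indexed, but it can collide, so keep the
   original predicate in the join as well. */
ALTER TABLE dbo.Users
    ADD DisplayNameHash AS CHECKSUM(DisplayName) PERSISTED;

ALTER TABLE dbo.Posts
    ADD LastEditorDisplayNameHash AS CHECKSUM(LastEditorDisplayName) PERSISTED;

CREATE INDEX ix_users_displaynamehash
    ON dbo.Users (DisplayNameHash);

CREATE INDEX ix_posts_lasteditorhash
    ON dbo.Posts (LastEditorDisplayNameHash);
```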

But in this case, this actually gets us good enough performance. The problem is that if we don’t add that force order hint in, if we just do this, we end up with this awful execution plan again. With the force order hint on there, we end up with a much more favorable execution plan.

Now, because we have that equality predicate, SQL Server can at least get a hash join here. The logic of this may not be 100% correct, because I’m saying where the display name equals a display name and the display names are like this.

There are all sorts of ways that you might need to logically look at this to figure out if what we’re doing here is exactly correct. One other thing that you could do is find like the min length of display names in the users table and just match on the minimum length of one being equal to the minimum length of another, which is like, I think it’s like three characters over in the users table.

There are all sorts of ways that you can do this or look at this or approach this. You might even try hashing some of it or you might even try hashing the display names or something like that. But the goal is to just get something that’s an equality predicate into the join clause so that when you run this, SQL Server has other options for join types.
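Putting that together, the rewritten join probably looks something like this sketch. The LIKE pattern and columns are stand-ins for whatever the demo actually used, not the real query:

```sql
/* Adding an equality predicate alongside the original
   non-equality logic gives the optimizer other join types
   to choose from, like hash join. */
SELECT
    u.DisplayName,
    p.Id
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON  u.DisplayName = p.LastEditorDisplayName              /* equality: enables hash join */
    AND p.LastEditorDisplayName LIKE u.DisplayName + N'%'    /* original-style non-equality logic */
OPTION (FORCE ORDER); /* the hint discussed above */
```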

And when you run this, it actually finishes pretty quickly with those indexes in place. Is this result correct? I don’t know.

You know why I don’t know? Because none of the other queries finish. So that’s fun there. But it was my attempt to kind of give you some advice on how to start tuning these queries and how to start getting at least some semblance of a reasonable query plan back. Like I said at the beginning of the video, or whenever I actually said it, joins without equality predicates have really caused a lot of performance problems that I’ve seen over the years, especially when there are really large tables.

And almost without fail, there’s no good indexes to support what you’re actually doing. And almost without fail, it takes a lot of effort to get these queries to do anything reasonably fast. So I’m going to wrap this one up because I’ve got a few other videos that I need to record this evening.

So I’m going to get this one on the books. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I realize that the outcome of this video is a little disappointing because there really is a lot of thought that has to go into what makes sense for any sort of equality predicate in here. And that can take a lot of tinkering and tweaking. And that’s just a lot of time and effort that you’re going to have to spend, you know, reasoning with the data that you have in your tables.

For me, this is probably a close enough approximation that actually gets some results back and gets the query to finish. But, you know, you might have a completely different set of requirements that make this absolutely useless. So, from me and Badtz-Maru, my only friend in the world, thank you for watching and I will see you in the next video.

Have a good night.
