A Little About Windowing Functions And Joins In SQL Server

Do Best

Video Summary

In this video, I delve into some common issues in query writing and indexing that can significantly impact performance. Starting off with a few personal updates, I recently got a new haircut and am experimenting with different socks—though you won’t see them in the video, I’m trying to figure out which level of subscription is best for revealing my sock choices! Moving on from the fun stuff, I address a query scenario where using `ROW_NUMBER()` in conjunction with joins can lead to performance bottlenecks. By examining how SQL Server handles this situation without indexes and then creating an appropriate index, we uncover some interesting insights into parallel execution plans and the importance of proper indexing for windowing functions. This video is inspired by a real-world issue I faced recently where a query was running extremely slow, taking up to five minutes before optimization. Through better syntax and strategic indexing, we see significant improvements in both performance and execution plan efficiency.

Full Transcript

Erik Darling here with Darling Data. And I just want to give you a couple personal updates on me. One is that I got a haircut. Feeling good about the haircut. Two is I’m wearing a new pair of socks. You can’t see that. I haven’t quite decided which subscriber level I want to start revealing my socks at, but we’ll figure that out eventually. From a technical perspective, well, a few things. Got a rather brusk comment the other day. Someone saying that they could hear me breathing and they didn’t like that I cleared my throat and that I was full of myself. So, I’ve taken some steps to remedy all three of those things. One, as a fellow misophonia sufferer, I understand the… as well.

So. it, as sortwa- not because I believe in juice cleanses but just because I am trying to be less full of myself and so I am replacing myself with various cold-pressed juices that’s going fairly well I am very very much not as full of myself after a few days of that but in this video what I’d like to talk about is a common I think well it’s a mix of it’s a mix of problems all right it’s sort of a query writing problem and then when you once you write the query correctly you might even uncover an indexing problem and the problem we’re going to talk about is related to when you need to do a join that involves a row number or a windowing function that you’re filtering on you can see in my query here I have row number generating some some numbers over rows and then I’m filtering on that row number outside of the the join so if we have no indexes on the table and I’m pretty sure that I already dropped indexes but we can validate that there if we get rid of all the indexes on the table now this is the index I’m gonna create and I’ll talk to you about why of course because I care about you and I care about you learning for free what we have here when we do when we run this query I’m gonna try not to scroll down any further and give away too much about the the rewrite but we’ll stick with this for now when we run this query it’s gonna take about two and a half seconds at least it did before it might be might be a little bit longer oh I was so close I was so very close 2.553 I was almost on the nose there and I mean that’s not horrible but I’ve written I’ve engineered this query a bit to not be so horrible so that I can record a short video and not sit here for a half hour waiting for a query to run this video was inspired by the fact that I actually just fixed a query like this today it was running for four and a half five minutes so it can get kind of gnarly out there anyway so what this query does is it goes parallel which is understandable because this query has quite a bit of work to do when multiple CPU threads and cores working on this thing all at all at once it’s probably a good idea and SQL Server asks for an index to support this query it says an index would be very useful here I want an index on post type ID and score and it wants us to include owner user ID which is I guess like a an okay junior DBA index it doesn’t take into account several things I mean well it is focused on on the where clause that we have inside of the join where we have an equality predicate on post type ID and an inequality predicate range predicate on score it doesn’t really take into account the fact that we need to partition by owner user ID order by score here order by score here that we need to join on owner user ID here so well it would help the where clause wouldn’t be very useful for any other part of the query this is another another big reason why over the years I’ve grown to really just trust missing indexes don’t think I don’t think they’re often worth prescribing to people outside of rather narrow circumstances where there’s sort of just blood everywhere and you need to you need to like there’s just no nonclustered indexes on any tables and you need a starting point and you can you can tune queries more individually from from there on out but uh so that’s not it’s not a very good missing index request but it will at least get you somewhere right wouldn’t wouldn’t get you all you know across the finish line but it would get you at least it would at least get your sneakers on so you could you could make you could make the run a better way of writing this query is usually to use uh the apply syntax in this case we have a left join up top so we’re going to use outer apply here to make sure that our query is semantically correct which is a I hear it’s important people want accurate results at least sometimes mountain no lock hence I see makes me wonder why people are like oh well I don’t think the results are right like well were they ever good question but I’m not going to run this as is because if we run this as is it’s going to take a long time and the reason it’s going to take a long time is well if you’ve been paying attention to my youtube channel or my blog for any uh any period of time you may have seen me post or talk about eager index spools and we’re going to hit one of those here the reason why we’re going to hit one of those here is because when we use the outer apply syntax sql server is going to choose a nested loops join to uh execute this query uh where is above we used a hash join and nested loops join because we take a row and we go do some work well if we don’t have a good index we’re going to end up scanning that post table a whole lot so sql server I want you to pay close attention up here because there is no green text along here like there was in the query above saying hey we could use an index sql server just creates one for you and if we look at what happens we talk about this a little bit now I know I have another video on eager index pools somewhere somewhere along my channel but you know I’m just going to talk through it a little bit because I don’t want to make you go jumping around from video to video searching for things I don’t know it doesn’t seem like fun to me so uh there there are a few fundamental issues that I have with eager index pools one is that in the context of a parallel plan they are absolute liars uh they are built on a single thread you can read from them uh multi-threaded but uh reading all of the data out of the post table will happen on a single thread that’s like a 17 million row table building a 17 million row index on the fly is not fun especially because uh spools in general which are built over in temp db don’t really have any of the optimizations that creating temp table selecting data into inserting data into creating indexes in temp db have uh they are uh built a row at a time which is also another rather unfortunate scenario uh for the eager index pool um if you find that a weight on your server called exec sync e-x-e-c-s-y-n-c all one word is pretty high you should start going through query plans looking for large eager index pools small ones probably won’t make that big of a difference if the table is like you know 10 000 100 000 500 000 rows you probably won’t have too much of an issue building uh an eager index pool in a plan and i mean if you have to build it multiple times that’s a different story but um usually it’s when tables tend to get on the larger side i think 17 million rows qualifies at least in this case for being on the larger side uh the bigger the table is the more painful building the spool gets um and of course sql server doesn’t offer you a missing index request to say hey i’d like to stop building this index every time the query runs i’m just going to build it every time it runs and i’m going to make fun of you behind your back so this is how uh using better syntax can uh sort of uh unveil indexing issues uh in in the uh in the in the database and uh if we hover over the eager index pool we can of course see the definition of the eager index pool that’s equal the the definition of the index rather the sql server creates uh the output list is irrelevant in this case because the score column is represented here already so we don’t need to care too much about that uh what we do need to do is take into uh mentally take into account that we are seeking on three columns owner user id post type id and score all right so for most eager index pools you want to take the seek predicates and create an index with a key on the seek predicate columns and then if there are any columns that are in the output list that are not already represented in the key of the index and we’d want to add those to the included columns cool we got that we’re good there so again i’m not going to run that because it would take a while to run but i want to i want you to keep in mind that when we run this query i’m just going to take again about two and a half seconds 2.6 this time shocking uh when we run this query we get a fully parallel execution plan we scan the clustered index and we filter to where the row number equals one here then we come out here and we sort our data here we also have another sort inside the inner part of the nested loops join and because we don’t have an index that helps us with the uh with the row number function again that’s owner user id partition by and score descending ordered by uh we have to we have to sort that data for all of the results that come out of the post table here right that that that sort is going to ask for 490 megs of memory to to run and do that so let’s create our index and let’s see how the plans for these change now the index i’m creating leads with owner user id and it’s going to lead with owner user id because i’m tailoring this index to the apply syntax if i were trying to tailor this syntax this index to the left join syntax i would probably put post type id score first and then owner user id because we can’t do the filtering for this query until we hit the join with the apply syntax and we when we get the nested loops join we can get uh the apply nested loops version of a nested loops join and push the owner user id column into the inner side of the nested loops where it acts as a predicate which i’ll of course show you when we get there eventually so let’s run these two queries now and we’re going to see how the execution plans for these change now i know we only looked at the estimated plan for the apply syntax before but again i don’t i didn’t want to sit there for you know a couple minutes waiting for that to run first thing i want you to notice is that sql server uh asks for a different index on the well actually that’s for the same index on the post table uh because it still wants the filtering in here first because it has to join on this later but again you know owner user id is a join column i don’t understand quite understand why we wouldn’t want that to be um we wouldn’t want that to be uh in the key of the index so this plan is single threaded now uh and this is what happens a lot when you’re indexing for queries especially that use windowing functions in the real world is something that i see happen all the time you create a better index and even though this query is has a cost of sorry an estimated cost of 108.731 query bucks a sql server does not choose a parallel execution plan for it perhaps sql server is catching on to the fact that parallel merge joins were a mistake and because it chooses a merge join here we don’t need we would we would need to uh we we sorry we run the risk of uh getting all sorts of uh intra-parallel query uh uh thread dependencies and deadlocks because of potential ordering but um one thing that is kind of interesting about this query plan is most of so uh most of the time um you know you like the merge join of course expects sorted input and since we have owner user id sorted we can do the merge join without a uh without without a sort operator in the plan and double that with the fact that you know uh the row number function is kind of putting owner user keeping owner user id in order we don’t need to sort the data any any further to support the merge join it’s a little interesting that uh sql server did not join a did not choose a hash join here or it did not and that it did not choose a nested loops join here considering we have a pretty prime index to support a nested loops join and we would have uh at least a pretty good scenario for a merge join we could if this plan went parallel we used a bitmap we could probably do a little bit more work there or a little bit less work there especially because we end up you know it’s a merge join we have to scan this whole index anyway all right we have to do an index scan of the nonclustered index i’m going to frame that up because i’m kind of standing in the way of it all right and this is why sql server wants the missing index request on these two columns again here because it has to do this filtering first and then way over at the merge join is when we can finally evaluate if all right a little bit of screen lag there’s where we can finally evaluate if owner user id equals the id column in the users table so the the second query the outer apply query is in a slightly different shape uh rather than taking 4.2 seconds this takes about a little under 500 milliseconds you zoom in on that hello screen lag 483 milliseconds this does get a parallel plan now i could have nerfed this to get a max stop one plan or a single threaded plan but i chose not to because i want you to see what i encounter a lot of the times in query tuning and that is that with a like you know better indexing in place oftentimes the the query pattern above that i showed you with the left join to the the select from the post table then joining to the user’s table outside and also filtering to the row number function outside uh is slower right a lot of the times because sql server will choose a serial execution plan for that now part of the beauty of using the apply is that we don’t do one big scan of the post table we take a pretty small set of rows from the users table right we take about 13 000 rows from the users table we go into our nested loops join and for each iteration of the nested loops join we seek into uh the index that we created on the post table which you know that’s not bad for the amount of rows that we have to go and do go and like find over there and go do stuff with uh we generate our window function and of course over here we filter out our window function at the end which is good enough i think bring the query from you know uh the the original form with no indexes to support it to uh the the form where we do have an index to support the query and then uh writing the correct syntax we made it we made enough of a dent in this query i think to to say hey i think i think i think we’re in good enough shape here there’s not really a whole lot else you could do now keep another reason why um over time i’ve grown to distrust the misinformation that are that is missing index requests is now sql server for this query is saying oh we need an index on the users table on reputation and display name well i guess we guess we could shave 76 milliseconds off this thing maybe i don’t think that’s really i don’t think that’s really what i would call a victory at least i wouldn’t i wouldn’t i wouldn’t i would never go to someone who hired me and say hey we’re down 76 milliseconds here’s the invoice doesn’t seem doesn’t seem like a good plan to me anyway uh to kind of recap what we talked about here because we are getting up around the 20 minute mark and i must my mustache is a little bit itchy uh when you’re writing queries where you’re going to use a windowing function to uh determine the top you know one thing for something we could you know again if i was working on this for a long time i might experiment with writing the query as a top one uh or something like that it all kind of depends a little bit might even might even try like a max query here right get the see how that works so it kind of depends on the data and the indexes a little bit might also depend on you know if i’m even able to create indexes maybe i maybe i’d want to do just like a big aggregate where i get batch mode or something but when you’re writing queries that need to find the this specifically where you’re looking for the row number you’re filtering to the row number and uh you need it to be fast generally the better i the better prescription for writing this is to use the apply syntax to encourage the optimizer to use a nested loops join and also to have a decent supporting index for what the query is uh well in this case the join columns the filtering columns and of course the the windowing function columns are all very important things to consider um one pretty big deal that uh that i find is uh if you don’t have an index that perfectly supports the windowing function and you end up like we saw originally with with a sort on the inner side of the nested loops to put the data in order for the windowing function uh that one big sort on the inner side is typically a whole lot more painful than doing one sort per owner user id on the inside right because you you you would you would you would only need to sort data that you found for each owner user id rather than sorting the entire table that you’ve that you found after filtering out these predicates by owner user id so i forgot what i was saying must be the juice cleanse cleaned out my brains apparently i don’t know what that says about where my brains are anyway uh thank you for watching if you liked this video even in the tiniest bit like an iota if you like this video uh please also hit the thumbs up button to like it uh if you like me my haircut you want to see my new socks uh or you appreciate my uh advanced noise gate technology or you no longer have to know that i am a biologically viable uh entity you can subscribe to my channel for more noise-free sql server content uh we hit 21 minutes on this which i did not intend to do so i’m going to uh i don’t know maybe i’ll go faint anyway uh thank you for watching

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Do Best

Video Summary

Full Transcript

Going Further

Related Posts