Video Summary
In this video, I delve into how indexes can significantly enhance query performance by giving SQL Server better join choices. We start with a simple query that joins two tables, `posts` and `users`, counting the number of posts made by users with a reputation over 100,000. The execution plan shows a quick scan of the `users` table: SQL Server reads about 2.4 million rows to find the 613 users that qualify, but that step takes only about 47 milliseconds. The overall query takes around 1.86 seconds, and most of that time is spent scanning the `posts` table, which has no index on the join column. I explain why indexing the already-fast `users` scan wouldn't drastically improve this particular query, and why understanding how indexes shape join choices in execution plans is crucial for tackling more complex scenarios.
Full Transcript
There we go — we are unmuted. Live and unmuted. I had to wait to unmute myself because there was this really, really terrible honking going on outside. As soon as anyone was allowed to do anything ever again in their life, everyone started honking again, everywhere. It's terrible. It's disgraceful. Absolutely disgraceful. Alright, here we go — some people are starting to file in. Thank you, far too kind, for hanging out with me this afternoon. I hope it's not too late for anyone in Europe; I know you have very strict schedules about when you need to start drinking, and I appreciate that about you, I really do. No fun wasting time sitting here listening to me blathering and bantering on. Hello Kapil, how are you? I'm always happy when there's a chat message — that way I know there are actual live people out there and not just bots. Because if I'm just getting watched by a bunch of bots, I think the bots would finally figure out a way to quit, and that'll be it. So how's everyone doing today? What's different today? Yesterday's stream was me working through the deck and demos for a future presentation — just trying to figure out what I want to do, rehearsing a little bit. Today is a presentation that I've had in the books for a while and that I really like. It's about indexes, and it's a little more advanced than yesterday's stuff: it's going to dig into some more interesting things with query tuning, not just what indexes are. I still have to figure out
how to follow people on Twitch — whenever I search for people, I can't find them. It's craziness out there. Alright, let's see here. Yes, it will be an adventure of sorts; we'll see what happens. Hopefully an adventure free of failing demos — that one yesterday is still bothering me. I want to know when I made that change that messed everything up. It's like: huh, what happened to you? You went all cuckoo on me. It's just me and Brent on Twitch? Yeah, that figures. Trying to boldly go where no DBA has gone before. If this doesn't work out, I'll stream me being terrible at video games on Twitch, which, I don't know, maybe would be more interesting to people. Other people who are supposedly good at SQL stream things and do things — I don't know if they stream on Twitch or anywhere else, but apparently things happen. It's crazy, it's insane. People keep wanting to learn things — good for you. Alright, we'll give it one more minute here and then we'll get started; I think there are just about enough people for me to say thumbs up, let's go. I wish I was good at more things. Whenever I talk to people who are good at SQL Server, they always have all these other interests that they're very much into, very good at — just profoundly talented people in many ways — and I feel like a one-trick pony a lot of the time, because I know kind of a bit about databases, and then it's "yeah, but what do you do for fun?" and I'm like, I don't know, I forget what fun is. It's this, and lifting weights, back when gyms were allowed — I was pretty good at that too, but I don't know if that's really a talent. That's
just insanity. All right, it's ten past; we'll get going. We'll start talking about one of my favorite subjects in the world: indexes. The reason I am so profoundly attracted to indexes in SQL Server is because I love being able to get better performance without having to change code — or, more importantly, without having to change the logic of code. Whenever I have to change the logic of code, I get very nervous about getting incorrect results back, because, despite the fact that every query in the world I see has NOLOCK on it already, I'm worried about results being wrong. So I'm like: look, we're going to try this, we'll see what happens. It's very tough to write logically equivalent queries sometimes. When people say "we need to rewrite the query," I'm like: okay, to what? What's your big plan for a rewrite? It's a query with a few joins — are you going to reorganize the joins? What's your solution here for rewriting the query? Now, there's all sorts of domain-knowledge stuff that comes in there too. You might find a query looking at someone's server and go, "this query is a piece of crap, who wrote this, we've got to fix this immediately," and they're like, "oh, that should have gone away, we don't even use that module anymore" — and they'll spend a few minutes looking through source code, delete it, and it's gone. To me, a perfectly tuned query is one that just never executes. It's gone; it's never going to run. That is the fastest possible query you can ask for. I hate making big structural rewrite changes to queries, because I don't know if I'm going to get it right, and it's always very difficult and confusing when you're doing that, because if you get different results back you're just like: what happened? It's
very frustrating. So I love indexes, because usually adding indexes — or not even adding a net-new index, sometimes just changing the way an existing index is arranged — can have such a profound impact on performance and make things a whole lot better. And that's the kind of thing I love, because I'm a consultant, and the less time I have to spend actually doing something, the more my time is worth, which is awesome. If I can make quick index changes, or identify indexes that'll help a bunch of queries, then sweet — I have to do far less work. A lot of the time when I come in and start working with people, the situation ends up being that no one has ever really looked at SQL Server's index DMVs for anything, and I can identify a couple, few indexes from there. They may not be perfect — the missing-index requests aren't perfect — but they might be good enough to get a whole bunch of queries to a better place. So if I can add somewhere between five and ten indexes and get an entire workload to be better off, I'm psyched, because I don't have to go in and actually start tuning code. I'm very happy when I can just adjust indexes a little bit and have the workload overall be better. And people love that — that's the kind of stuff that end users notice; it has a good effect on people. Working with people who have no interest in databases, no interest in SQL Server, indexes, anything like that — they're just sitting there pressing a button, and they just want to know why every time they press this button they waste 30 seconds of their life. If I have to press this button ten times in a day, those 30 seconds start adding up; if I have to press this button a hundred times in a day, forget it, I'm never going to leave work. All I want
to do is press this button and not waste 30 seconds of my life — and sometimes indexes really can do that. There are all those consultant glamour stories: sit down, everyone, let me tell you about this one time. There was this report, and it ran for 48 hours; it would take all weekend. I came in, I added an index, and it finished so fast everyone thought the process was broken. And you're like, wow, cool, what was the index? And it's just the most boring index in the world. But that's how good they are; that's how much indexes really do help things; that's how important indexes are to a workload. Now, when I talk about indexes and databases, you do have to spend some time getting the right amount in there, because indexes are to databases a lot like salt is to food: you want enough salt in there that you can taste it, but not so much that you end up pre-hypertensive, because that's not a good look either. Getting sweaty and out of breath constantly is not a good look for databases. And in the same way, it's very easy to tell when you look at a database whether there is too much or too little salt in there — just like when you take a bite of food and it doesn't taste like anything, or you take a bite and you just had a heart attack thinking about it. Wes says: "I'm so glad that I don't know the data model for the databases I manage. I get to look at the indexes and not the bad queries that the reporting team writes." Yeah, absolutely. If you know nothing and you can still have a big impact, that's awesome, because you can spend the rest of your brain space on more important things — people's birthdays, non-SQL Jeopardy facts.
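The index DMVs mentioned above can be queried along these lines. The DMV and column names here are the real ones; treat the output as raw suggestions to sanity-check against your workload, not a to-do list:

```sql
/* Starting-point query against the missing-index DMVs.
   Ranks suggestions by seeks times estimated impact. */
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```

As noted in the talk, these requests aren't perfect — they don't consider column order carefully and will happily suggest near-duplicate indexes — but they're often good enough to get a workload to a better place.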
It's great to have. So anyway, let's go. The way this talk is set up, we're going to start on the easy side and work in; things are going to get a little more challenging as we go along, and we're going to solve tougher problems as we move through this. We'll start off with some basic stuff to make sure everyone understands where we're at, and then we'll get on to harder stuff. So don't worry if you feel like the first demo is not the most advanced thing you've ever seen in the world — you're right, it's not. It's just a warm-up, just to make sure we're all cool here. Let's get going, and let's talk about how indexes can give SQL Server better choices for how we join tables together. Let me actually go back up here and make sure that we are starting fresh, because if I start from not-fresh, demos are going to get confusing when I run them. What we have here is a pretty simple query: just one join, and just one predicate in the WHERE clause. We're going to join a table called posts to a table called users. Users make posts, and based on how good the posts are that users make, they get higher reputations. If you know anything about Stack Overflow or Stack Exchange — if you post questions and answers on there — you'll know that it's really, really hard to get a high reputation: you have to post a lot, or have really good posts, to get a lot of upvotes. So it's not an easy task to have a reputation over 100,000. (If I stick a comma in there — 100,000 — that is the right number; that very rarely happens to me, and I'm very proud of myself.) What we're going to do is get a count from this join: how many posts do users with a reputation over 100,000 have? And that's not a whole lot of people.
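A reconstructed sketch of the demo query, assuming the usual Stack Overflow demo-database schema (`dbo.Users`, `dbo.Posts`); the exact text wasn't shown in the transcript, so treat the names as assumptions:

```sql
/* Count of posts made by very-high-reputation users.
   Joins on Posts.OwnerUserId, which has no index yet. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000;
```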
But this query is a little weird — it takes a little longer than I would want. If we go look at the execution plan and zoom in on the details: well, we started off with a scan of the users table, and this didn't take long at all; this was pretty quick. We scanned the users table and got the 613 rows that we care about — again, not a lot of people have a reputation over 100,000. If we highlight this arrow and zoom in — actually, do I have to kill that? This is crazy; I don't remember if I restarted my computer yesterday, that's how much of a goldfish I am, and I didn't know if I still had ZoomIt running, which it doesn't look like I do, so I should be able to just zoom inside of this VM for free. If we look at this, SQL Server did a pretty good chunk of reads here: we didn't pass on all the rows, but we had to read about 2.4 million rows just to get 613 out. But this was fast — this was okay at 47 milliseconds. Even if I created a great index there, it wouldn't fix the overall performance of the query, because if we look at how long the query took, it was about 1.86 seconds. If I cut 47 milliseconds out of that, I'm no hero. This is not where the index kicks in and it's "you did it!" — no, where it is, is down over here. And I love newer versions of SQL Server and Management Studio, where we get this kind of feedback in execution plans. Otherwise people get all caught up in costs and weird metrics and go "we gotta fix this, we gotta fix this," but the real root problems become much, much more obvious when you have these operator times in there. Operator times, if you're new to this stuff, are just the numbers that come up under the operators in newer versions of SQL Server when you go ahead and get the actual execution plan.
Now, a couple of little differences with all these things. If it's a query that only has row-mode operators in it — remember, there's this batch-mode thing that happens with columnstore, and sometimes on SQL Server 2019 — the times are going to accumulate going from right to left. So if we look across here: this was 47 milliseconds, this didn't take any time, this didn't take any time, and then if you look down here, that was 1.78. And we didn't spend another 1.8 seconds here — it was the 1.8 minus the 1.778 down here. The times accumulate going from right to left; that's the way this thing works. There is some weirdness with it, and it's not totally perfect, but for the most part that's the behavior you can count on. So this takes 1.8 seconds, and we can see that a lot of the time was spent in this clustered index scan on posts. We have that here because we don't have an index on the column in the posts table that we're joining to. This, I know, is rudimentary stuff, but let's think a little more about why there's a hash join here — because the root of it is not exactly that there's a scan over here; the root of it is: why a hash join? Why, SQL Server, would you choose to hash this? Well, think about SQL Server's two alternatives. One of the other join types we could have is a merge join. So if I run this and we look at what happens when we tell SQL Server to give us a merge join — oops, I got a little excited there — we're telling SQL Server: give me a merge join here. Remember, the other query finished in 1.8 seconds; with a merge join we finish in 5.7 seconds. That's not an improvement — SQL Server was right about that merge join. A key difference in the execution plans, if SQL Server chooses a merge join, is that it needs sorted input going into the
join. I kind of messed up on the zoom there — there we go, that's what I wanted. This merge join over here is going to expect sorted input from both sides. The users table is already in a useful order for us, because the Id column that we're joining from in the users table is indexed — it's the primary key, it's the clustered index — so we have that data in order. We don't have the OwnerUserId column in order, so SQL Server has to sort it. And prior to sorting it, SQL Server says: well, if I have to sort all these duplicate values, the first thing I'm going to do is aggregate them down to the smallest possible list I can. This adds a bunch of overhead, and we actually end up spilling out to disk a little bit. I'm not saying the spills here are the cause of the performance problem, but you can see that SQL Server had to do extra work to implement a merge join. The other option would be a loop join, and loop joins really make the most sense when you have both sides of a join indexed on the join columns — with the join columns in the index somewhere; it kind of depends on what your WHERE clause might be up to, but let's just say your join columns should be indexed at some point. If we look at what a loop join does, it's not really any prettier than the merge join or the hash join. This is kind of bad news — this is taking a while too; this took nine seconds. (What's up, Lee? How's it going?) So this took a while too, and if we look at what happened here: this is about what we expected, but then SQL Server starts picking up a whole bunch of steam and things just don't go so well. So SQL Server, in order to do a merge join or a loop join, would have to do a lot of extra work — a lot more work than it would have to do with the hash join. The hash join actually turns out pretty well comparatively, but the hash join does require a scan on both sides.
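For reference, the merge and loop experiments above can be reproduced with T-SQL join hints — demo-only, since hints like these rarely belong in production code, and the query text is a reconstruction with schema names assumed:

```sql
/* Force a merge join: ~5.7 seconds in the demo, due to the sort. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
INNER MERGE JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000;

/* Force a nested loops join: ~9 seconds in the demo. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
INNER LOOP JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000;
```

Both are worse than the hash join SQL Server picked on its own, which is the point: without an index on the join column, hash was the right call.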
And you know, it's a lot of work — because, going back to this a little bit, it's a lot of work for a join that we know is only going to produce 613 rows. So if we want to give SQL Server more choices — easier choices — about using different kinds of joins, then we need to make sure that the columns we use to relate one table to another have indexes on them. Again, I know this is pretty rudimentary stuff, and there are very smart people out there, but everyone has to start somewhere. So if there's anyone here who maybe hasn't been tuning queries and indexes for a while and just needs a little basic guidance on this stuff: usually you want to have join columns indexed. Once we add an index — again, a very simple index, because our query is very simple; this isn't the kind of index I would go create in real life, because a single-column nonclustered index is not going to be helpful to a ton of queries — but for this demo, with this index on OwnerUserId in place, SQL Server all of a sudden has lots of choices, and lots of cheaper choices, for coming up with the join that it wants. Now not only do we have a much faster query, with this thing taking 343 milliseconds rather than 1.8 seconds — or six seconds, or ten seconds, with the other join types and without the index — this ends up much better. We still scan the clustered index of the users table, but now we're able to use a nested loops join and seek into the posts table for just the 613 rows that we care about coming out of the users table. Remember: even though the users table is pretty big at around 2.4 million rows, there are only 613 rows coming out of it that we care about.
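The single-column index being described is, roughly, the following — the index name is made up:

```sql
/* Demo-only: a single-column index on the join column.
   Real-world indexes would usually carry more columns. */
CREATE NONCLUSTERED INDEX ix_Posts_OwnerUserId
    ON dbo.Posts (OwnerUserId);
```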
That makes it very easy for SQL Server to say: I have 613 of these; yes, I am going to seek 613 times, but boy is it going to be a lot faster than 1.8 seconds. Are there any questions on the joins part before we move on? It's totally fine if there aren't. I'll give you a couple of seconds, just because I know there's a little bit of lag here, and I don't want to get started and have a great question come up that makes me go back — boy, would that be embarrassing. I would be sad if I had to do that. All right. So, indexing join columns: a pretty normal thing to want to do. Without the index, hash was really the only sensible join strategy; merge would have to sort and aggregate, and nested loops would also have to do some pretty ugly stuff with the sorting and aggregating too, because there's really no other way to get those joins to be fast without doing that work — especially for merge, where you need that sorted input; not having an index there just really makes it suck. Actually, I should show you the difference when I force the merge join with an index on OwnerUserId. This here will be a much better choice too — it's not going to be the fastest thing in the world, but if you remember the other query plan, notice that we no longer have to sort any data coming out of the posts table. We have the data in the order that we want it in, so we don't have to hash that data and we don't have to sort that data — but we do still choose to aggregate that data, just to make the join a little more efficient. All right, cool. Let's move on to the next bit. The next thing is one of my absolute favorite demos in the world, because it deals with one of my least favorite query plan operators in the world, and that is an eager index spool. And rather than try to talk to you about it in the abstract first, I'd much rather just walk through exactly what an eager index
spool is and what it does. So let's clear out our old indexes, because we don't need them anymore, and what I want to do is run this query with a TOP (38). I'm going to get the top 38 rows from users, and I'm going to CROSS APPLY to badges to get everyone's most recent badge — I'm getting the TOP (1) Name from badges, correlated on UserId and ordered by Date descending. So I'm going to run this query with a TOP (38), and this isn't going to be too bad. It's not spectacular, but it is not too bad, comparatively. Just memorize the shape of the plan a little bit — we don't need to memorize the entire plan or make a big fuss out of exactly what happened — but notice that we have the users table, we have a sort, we have a loop join, we have a sort down here, and we have a clustered index scan down here too. Just the general shape: we scan users, we sort it, we go into a loop join, we sort stuff coming out of the badges table. Pretty nice, pretty simple. Now let's look at what happens when we bump that up to a TOP (39). When we bump this up to a TOP (39), things are going to slow down pretty considerably, and while things are slow, let's chat a little. Stefano says: nice ASCII art. Thank you! You know what, I'm going to put a link in chat to where I make all my ASCII art, so that if any of you want to ASCII up your presentations or whatever, you can do that. Lee says: my nemesis of late. You have a very easy job, because those are problems you can solve real quick. Hi David, nice to have you here — great to catch you live too; it beats the alternatives, right? Anyway: this took 22 seconds. 22 full seconds — all right, 21.884 seconds, but this is horseshoes, this is hand grenades; that's close enough to 22 for me. I'm not going to split too many hairs here.
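The TOP (38) / TOP (39) query being demoed likely looks something like this — a reconstruction, with schema and column names assumed from the usual Stack Overflow demo database:

```sql
/* Most recent badge per user, for the top N users.
   TOP (38) got a plain loop-join plan; TOP (39) tipped the
   optimizer into building an eager index spool. */
SELECT TOP (39)
       u.DisplayName,
       recent.Name
FROM dbo.Users AS u
CROSS APPLY
(
    SELECT TOP (1) b.Name
    FROM dbo.Badges AS b
    WHERE b.UserId = u.Id       -- correlated on UserId
    ORDER BY b.Date DESC        -- most recent badge
) AS recent
ORDER BY u.Reputation DESC;
```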
Now we're going to go back to our old friend, operator times, and remember that, reading the query from right to left, operator times accumulate from operator to operator. So we can see that this part here was quick — this was not slow — and we can even see that this clustered index scan on badges was not slow; that took about a second. What took a long time — 21.829 seconds — was building this eager index spool. Now, eager index spools are interesting for a couple of reasons. One: they represent SQL Server calling you a very lazy person. They're a very passive-aggressive operator, because SQL Server has chosen to make an index for us while the query ran. SQL Server has chosen to take the entire badges table and spool the columns we care about into tempdb, into what's effectively a temp table with a clustered index on it. But it gets a little more interesting, because if we get the properties — and one of the most important things in the world, one thing that I cannot stress enough, is how important it is to always get the properties of these different operators — if we get the properties of the clustered index scan, even though SQL Server is telling us this went parallel (we have our little parallelism badge, the little racing stripes there), all eight million rows are going to end up on a single thread. Which thread they end up on might change, but the fact that the entire table gets read single-threaded to create the index is always true. This is eight million rows, and it takes 21 seconds. So really the big thing here is that, even in a parallel plan, this scan feeding an eager index spool is going to be single-threaded, and that's just kind of bad news. Now, I used to think this was Microsoft enforcing a weird limit, because when you create indexes in Enterprise Edition or Developer Edition, what do you get? You can
get parallel index builds, right? Good stuff. Standard Edition, you don't get that. So I thought this was Microsoft saying: uh-uh, Standard Edition people, you get nothing. But it actually turns out — I talked to a very, very smart person at Microsoft who told me that there was a very high likelihood of deadlocks if indexes were built in parallel on the inner side, within an eager index spool. So apparently there's a good reason for it; who am I to judge? Now, we can read from the index spool in parallel. If we look at what happens here, it's maybe not the prettiest, most even spread in the world, but it's not terrible; this isn't the end of the world. So we can read from the index spool in parallel, but when we bring data into the index spool, that's going to be single-threaded. The good news, sort of, is that there is a wait type associated with this if you have a parallel plan like this and an eager index spool gets built. Again, this is why it's always important to go looking at the properties: in the properties, on newer versions of SQL Server, we can see the wait stats that a query generated while it ran — well, not all of them, not all of the good and important ones, but a pretty good chunk of them. And if we look at the wait stats generated for this query, we can see some pretty normal ones that we might usually see on a SQL Server — memory allocation, CXPACKET — but then we have this sort of oddball one down here, called EXECSYNC. Now, there are other things in SQL Server that can cause EXECSYNC waits to happen, but normally, when I go look at a server — and it doesn't matter if it's Paul Randal's script, or sp_BlitzFirst, or whatever wait-stats script you like — if I see EXECSYNC waits on a server that are fairly high, and especially if they have a long average duration, that gets my attention.
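One hedged way to check a server for the EXECSYNC waits being discussed, straight from `sys.dm_os_wait_stats` (cumulative since the last restart or stats clear, so interpret accordingly):

```sql
/* Server-wide EXECSYNC totals, with a rough average per wait. */
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'EXECSYNC';
```

A high total with a long average is the red flag described here: go hunting through poorly performing parallel plans for eager index spools.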
Look at this one: this wait actually generated 65 seconds of wait time, because remember, it's generated in parallel — there are multiple threads waiting on that eager index spool to get built. This is a DOP 4 query, so there are four threads waiting on it. So we have this EXECSYNC wait coming up, and this is a pretty long wait on average. If I had this query running a lot, it would always be waiting 20 seconds on the index spool to get built — because remember, spools get thrown away as soon as the query is done; SQL Server doesn't keep them around for anything. So if we see EXECSYNC waits with a pretty long average duration, I might be paying special attention, looking through poorly performing queries for eager index spools. Now, if this plan was serial — a single-threaded plan — we wouldn't have a wait that was helpful to us, but in parallel plans that thing shows up and is a pretty good red flag about what might be going on. Now, if we want to fix eager index spools, we can't count on help from SQL Server. There's no missing index request here; SQL Server is not sitting there saying "green text, green text, we need help, danger, Will Robinson." There's nothing there, and it's not just SSMS being broken — there's no missing index in the XML either. I'm not going to spend a lot of time scrolling through that, but you get the point: SQL Server isn't telling us that it needs an index. SQL Server is just hauling off and building this index, and it'll do it every time the query runs, and toss out the index data afterward. It's up to us to look at what's going on and figure out a good index. Now, the easiest way to do it, if you don't want to spend any time on it: look at the eager index spool — I'll zoom in here — and there are going to
be a couple of different parts to the eager index spool. One part is optional; one part is always going to be there. The part that's always there is a seek predicate, and you can think of the seek predicate as sort of like what the key of the index would be (I'll cover the other part in a second). The seek predicate here is what SQL Server built the key of the index on. Of course, when SQL Server creates an eager index spool, the object up in tempdb is pretty much a temp table with a clustered index on it; on the real table we already have a clustered index, so we would have to build a nonclustered index to satisfy this query. Right now, on the badges table, we don't have an index on this UserId column, and the UserId column is what SQL Server wants the key of the index on. We also have two other columns involved here: the Name column and the Date column, because remember, we're selecting the name of the badge for a user, ordered by date descending — we're getting their most recent badge. So we have three columns that we'd need in an index to make this thing go away. If you wanted to do the easiest, most basic thing in the world, you would look at this tooltip and create an index with a key column on UserId, and then add Name and Date as included columns. That would be very easy. But remember what the query looks like: we're ordering by Date descending. So what we might want to do — and this is only something you can do if you look at the query and look at the execution plan — is also have Date as a key column, so that we get the Date column in order for free when this runs, and I'll show you what I mean. We don't need the Date column first in the index; we need UserId first in the index, because that's what we're going to seek to, and then what
we're going to have next is the date column — this doesn't even have to be descending for what I'm going to show you — and then we'll have name included here. So I'm going to create this index, and the nice thing about creating it is that it doesn't take 21 seconds, does it? It doesn't take 21.8-something seconds. No — creating this index, nice and fast, in parallel, takes 1.7 seconds. What the hell was SQL Server doing for another 20 seconds when it created it on its own? I don't know. Apparently there's a whole bunch of stuff about spools that just didn't get the message about certain optimizations that the rest of SQL Server did. I'm not going to get into it here, because we only have so much time, but they kind of fell behind a little bit. We have the No Child Left Behind thing in America; we should have a No Operator Left Behind thing, because some of these operators were left very far behind. So now that we have this index on user id, date, and include name, if we run this query asking for the top 39, we're not going to have that index spool anymore. SQL Server already has the data it wants in the right order, so when we seek into the Badges table, this takes nothing. And when we look at the top — we don't need to sort; this isn't a top N sort. The only sort that's happening is actually on the Users side, which is kind of shocking. I don't know why it's shocking, but it is a little shocking. Up here we have to order by reputation descending, because that's what we're doing there, right? No funny tricks — that's ordering by reputation. But now this finishes a lot faster, and this is actually something that I was trying very hard to write a demo for yesterday, but now I have one magically today. SQL Server now gives us a
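The index described here could look something like this — a sketch against the Stack Overflow 2013 database, where the exact object and column names (dbo.Badges, UserId, Date, Name) and the index name are assumptions from context:

```sql
-- UserId first, because that's what the nested loops join seeks to;
-- Date second, so the most recent badge comes back in order for free
-- (descending isn't strictly required, as mentioned above);
-- Name rides along as an included column so the query is covered.
CREATE NONCLUSTERED INDEX ix_Badges_UserId_Date
    ON dbo.Badges (UserId, Date)
    INCLUDE (Name);
```

With this in place, the optimizer no longer needs to build its own index in tempdb at runtime, which is all an eager index spool really is.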
missing index request. Right? When this query took 22 seconds and we were building an index while the query ran: no missing index request. Now that we have this query down to under 200 milliseconds, SQL Server is like: oh, I got it, I got the solution, I know how to fix it, I can do it. It asks for an index on the reputation column on the Users table. It says: give me an index on reputation and include display name. What is that going to fix? Like, cool, it'll take off — I don't know — 100 milliseconds? 85 milliseconds? We already did all the work. Screw you, SQL Server. Why are you chiming in now with this nonsense? So whenever you see eager index spools in execution plans, they are a giant red flag that there is some very obvious index for this query that does not currently exist on the table. And remember: always use as much feedback as you can before you go and make these changes. For this query it was very, very obvious that we needed an index, because it was taking 22 seconds and most of those 22 seconds were spent building the eager index spool. If you see eager index spools in plans and you're able to get the actual operator times, but the eager index spools aren't the thing taking a long time to build — they're not really the reason the execution plan is slow — then it might not be worth it to fix them. But in general, when you see eager index spools in plans, that is something you should focus on at some point in your investigation, because I guarantee you that there's an index in there that could help — especially if you're dealing with a long-running query, and especially if the table that the eager index spool is being built from is a rather large table. I think that's probably the best rule of thumb: think about the size of the table that's being fed into the eager index spool, and if that's a big table, then that's probably
something that you're going to want to fix, because the bigger the table that feeds into an eager index spool is, the longer building that eager index spool is going to take. It doesn't look like there are any questions, so I'm going to move on, and we're going to talk about a slightly different kind of spool. Of course, if you have any questions, just chime in in chat — I'll backtrack and answer them. But for now, we're going to talk about table spools. Table spools are sort of like index spools, except that they don't have any index on them. Where an index spool is sort of like a temp table with an index on it, table spools are like a temp table with no index on it. They get used for different reasons — implementation details are kind of boring to get into, but table spools happen for different reasons and they sort of have a different usage pattern. Now I'm going to create these two indexes, and while they create, I'm going to talk a little bit about the query, and about why the last query, for eager index spools, and this query, for table spools, have some things in common. One of my absolute heroes in SQL Server is a guy named Adam Machanic, and I used to always see him writing CROSS APPLY queries, and I said: man, those CROSS APPLY queries are awesome — I get to write CROSS APPLY queries. And when I started doing that, I started running into very Adam Machanic problems. I started understanding the things he would talk about more and more, and whenever I would see him talk about it, I'd go: oh, that makes total sense now. Thank you, Adam. But this is a cool one. The reason that CROSS APPLY tends to work well for demos that have spools in them is because CROSS APPLY is just about always optimized as a
nested loops join. Whenever you have a nested loops join, you have a loop, and whenever you have a loop, you have something that happens repetitively — that is what a loop is. You have what's called the outer side of the loop, which is up here; that's kind of the thing that happens once and feeds into the loop. Then you have the inner side of the loop, and the inner side is the thing that needs to happen over and over again. Spools usually happen because SQL Server wants to do something less repetitively. Think about the last query that we ran, where the tipping point was between 38 and 39 rows in the top: at 39 rows SQL Server said, I don't feel like scanning the Badges table a 39th time; I'm going to build an index once and I'm going to use that index 39 times. For this query you're going to see something kind of similar. We're going to have a nested loops join because of the CROSS APPLY, and SQL Server is going to try to cut down on repetitive work by using a spool to only get new data sometimes — and I'll explain exactly what that means when we look at the execution plan. Ronald says: do you ever use live execution plans? Occasionally. I'll use live execution plans only if I am completely dissatisfied with how long I've been waiting for a query — I might kill it and re-run it with a live execution plan to see where things are getting gummed up. But a lot of the time, if you turn on actual execution plans and you use sp_BlitzWho or sp_WhoIsActive, they'll give you a snapshot of the live execution plan, so you can kind of see where things are at anyway. Live execution plans are totally fun to watch, sort of as trivia, but they're not something that I rely on for query tuning regularly. So what we're going to do here is stick some data into
this temp table, and this temp table doesn't have any indexes on it — it's just a single column. What we're going to do is stick in user ids for anyone who has this badge called Popular Question. The thing about this badge is that you can get it many times, and we'll look at that in a second. But now let's run a query that takes data from the temp table, and we do some work to figure out someone's most popular question by score — for people who had a Popular Question badge, right? So it's kind of a useful query: we take people who have this Popular Question badge, and then we do some work to figure out what their most popular questions are. We say: okay, you have a lot of this badge — what have you got? What's in there, sport? What's so great about you? This will run for about 11 seconds. All right — 10.8, close enough to 11 for me. Horseshoes and hand grenades; I'm going to round up a little bit. Again, reading execution plans from right to left — or at least looking at the operator times from right to left — we can see that we build up a lot here, and we spend a lot of time in here. Remember, this is 1.126 seconds and this is 7.429 seconds, so the majority of the query time is spent here. Now, we talked about eager index spools last time; lazy table spools are a tad different. Where eager spools in general differ from lazy spools is that eager index spools will read the entire result set into the spool at the get-go. Lazy spools are different because they only go and get data when something changes, and I'll show you how you can see that when you look at the properties. And again, always be looking at properties — A-B-C, always be looking at
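The query being described probably has roughly this shape — a hedged sketch only, since the exact statement isn't shown; the names #users, dbo.Posts, OwnerUserId, and Score are assumptions based on the Stack Overflow schema and the description above:

```sql
-- The CROSS APPLY runs once per row coming off the outer (temp table)
-- side of the nested loops join, which is exactly the repetitive work
-- a lazy table spool tries to cut down on.
SELECT u.UserId,
       ca.Score
FROM #users AS u
CROSS APPLY
(
    SELECT TOP (1) p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.UserId
    ORDER BY p.Score DESC
) AS ca;
```

When the outer side feeds in many duplicate user ids, the spool on the inner side gets rewound for the repeats and rebound for each new value, which is the pattern examined next.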
properties. If you look at the properties of a table spool — which is especially helpful in actual execution plans — we'll have these two things: rebinds and rewinds. The names themselves aren't really the most important thing; what's important is understanding what they mean. When we have a rewind, that means that we reused data inside of the spool — you can think of that like a cache hit. When we have a rebind, that means that we went and got new data and put it in the spool — you can think of that sort of like a cache miss. Lee says: I noticed the estimates are out of whack — that's connected to the table spool too, right? You know, I haven't really looked at that so much, but yeah, generally — since the table spool is an operator that executes multiple times, I think when you get into situations like that, it becomes more difficult to forecast exactly how many rows are going to flow from one operator to the other. SQL Server might make a best-effort estimate based on the distribution of values it comes up with when it looks at the histogram for the temp table. So yeah, generally it's going to be connected to the spool, but it's a very difficult thing for the optimizer to figure out, and I wouldn't necessarily focus on the estimates being way off for it — I would focus more on why the spool is happening in the first place. So we can see that this spool is doing a fair amount of work: we had to rebind 109,000 times. Every time we rebind, we need to take the data that was in the spool, truncate it,
and then rerun all of these operators to fill it back up. This happens because there are a bunch of duplicate values in the temp table. Let's just say the temp table had the numbers one through ten, and each number happened ten times — ten ones, ten twos, right on down to ten tens. Let's say the first id that comes out of here is a number one. SQL Server says: I'm going to scan you; I have the numbers one through ten, and I have them ten times; I'm going to put them in order, so I have my ten ones in a row, my ten twos in a row, and so on. Then we go into this nested loops join. SQL Server says: I have a one — table spool, what's in you? The table spool says: I'm empty. So SQL Server says: go get all the values for id one. The table spool executes, goes and does all this stuff, and brings back all the data for id one. Then, for the next nine times the nested loops join takes the number one from the sort, it'll just reuse the data inside of the spool — so it'll hit this spool ten times in total. Then the number two comes along. The nested loops join says: I have a two — table spool, what do you have? The table spool says: I have all the values for one. SQL Server says: nope, go back and get me all the twos. So SQL Server comes over here, gets all the twos, repopulates the spool with them, and then the data in the spool for the twos gets reused nine more times. That's why spools like this exist. The thing is, there is almost no reason for us to have duplicates coming in here, because the purpose of the query is just to get a sum of scores for people based on their high scores. We don't need to look at a single user's high scores over and over again. Now, if we look at what's actually in the temp table — we're going to look
at what got in there: user ids and how many of each exist, and then a count of distinct records overall. We'll see that, for the number of records in there versus the number of distinct user ids, a lot of the reason we have 368,000 rows is that we have hundreds of entries for some user ids. We don't need to know the sum for user id 4653 623 times — you just don't need that; it's not going to do anything for us. So rather than put all that additional data in there, we can stick a primary key on our temp table and only select distinct user ids into it, and we'll end up with far fewer rows in the table overall. So I've dropped that old temp table, I've put new data in it, and now when we run this query, it's maybe not going to be the fastest thing in the world — there's probably still some index tuning that we could do — but I think three seconds is a lot better than 11 seconds. That's 3.3 versus 10.8. And if we drag that over there and bring that plan back, if you go look: there is no lazy table spool on the inner side of the nested loops. SQL Server just goes over here, scans the clustered index, and we do a very simple loop and get the rows that we need. While the spool does help us with not doing repetitive work, between the theoretical here and here we'd have to touch the data inside of the spool over and over and over again, which is kind of a bummer. So without even sticking a different index on the Posts table — just by sticking the right index on the temp table — we can have a much better query. We don't have to use that spool, we don't have all those duplicate values in there, and we can solve a lot of problems. If
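The fix being described could be sketched like this — assuming a temp table name of #users and the Stack Overflow Badges schema; the exact names aren't shown in the demo:

```sql
-- One row per user: the primary key enforces uniqueness, and
-- SELECT DISTINCT keeps the duplicate badge rows from ever landing
-- in the temp table. With no duplicates on the outer side of the
-- join, the lazy table spool has nothing to cache and disappears.
CREATE TABLE #users (UserId int PRIMARY KEY);

INSERT INTO #users (UserId)
SELECT DISTINCT b.UserId
FROM dbo.Badges AS b
WHERE b.Name = N'Popular Question';
```

The point is that the spool was a symptom of the duplicates, not the disease: de-duplicating the input removes the repetitive work the spool was compensating for.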
there are any questions on this, go ahead and stick them into chat. I'm going to move on, but at the same time, if you have questions, I don't mind backtracking a little bit — it's always a pleasure to answer questions. So now let's talk a little bit about how indexes can help solve blocking problems. All right — we again clear out our indexes here, and what I'm going to do is run this update. Now, I'm going to use BEGIN TRAN to exacerbate things. I know that BEGIN TRAN and waiting a while isn't the most realistic thing you might see in production — maybe it is, and if it is, I'd love to come help you out; there's a contact form on my website, we could chat about that. But let's just say this was a long-running update for other reasons. So we're going to run this update with the BEGIN TRAN, and we're going to come over here and try to run two queries — we're already using the right database. If I run this query, it finishes instantly. All right, good stuff — we have this data back; this query was not blocked. If we run this query, though, it will start taking a suspiciously long time. This query will not finish instantly; this thing is still down here spinning away. If we go look at sp_WhoIsActive, we can get some interesting information about what locks got taken here. First off, we can see that this query has been blocked for about nine seconds, and it's still going — because that BEGIN TRAN is open, this thing is just going to sit around waiting all day long. And I know this is exactly why people use NOLOCK: who the hell has time for this? But if we go and examine things a little bit — come over here — the query plans really aren't all that important here, but what is sort of important are the locks that
get taken. Right now, SQL Server has taken an intent exclusive lock on the object, which — if you were here yesterday, you know — does not mean that the object is locked. This just means that SQL Server has taken an intent exclusive lock at the object level so that it can lock things at a more granular level. David says: is there anywhere we can grab the SQL that you use in your streams? Sometimes. This stuff, not really — I mean, if you have the Stack Overflow 2013 database it might be interesting, but a lot of the time people don't have that and don't feel like getting it. If you really want it, I can publish it somewhere, but I have bitly links to this stuff, and I look at the downloads on them before and after I give presentations, and they hardly go anywhere. So if you really want them, I can give you a link. Anyway — what we see here is that SQL Server did take some page locks, and those are exclusive locks, and we can see that there are nine of them. We just got very unlucky that the id we wanted, 1317729, happened to be on a page that was blocked by that update. If we go and run sp_WhoIsActive again, we can see — I don't want to figure out what that number is, but it's a long number. So let's kill this off and think a little bit about what's happening. We should probably roll this back before we go and do anything else. What we have is an update against the Badges table. If I run this and just get the query plan for it, we can see pretty simply that this thing has scanned the entire Badges table, and then we have to update the Badges table based on that. So we have to scan the Badges table — and now, something kind of quirky. I want you to
understand that this is not an always-and-forever, pinky-swear sort of rule in SQL Server, but generally: if you have a modification query and you start with a scan, you're going to start with page locks. If you have a modification query and you start with a seek, you are most likely going to start taking row or key locks — kind of the same thing, but some people call them row locks and some people call them key locks; there's not a lot of other terminology out there, unlike for pages. That's a general rule — it's not a perfect rule, but it's something you can keep in mind: if you see a scan where the data is coming from in a modification query, before you come over to the right side, you're most likely going to start taking page locks. Either one of them might escalate to an object lock: you might go from the row/key locks to an object lock, or from page locks to an object lock, but you don't go from row to page — keep in mind that doesn't happen. In this case, we didn't escalate locks; the entire object wasn't locked. That row just happened to be on a page that got locked, because SQL Server said: I don't really have a good way to find this data. Even though I'm searching by date, SQL Server is like: I don't know where to find these dates; these dates could be anywhere. But with a good index on the date column — what we'll do is create that — now SQL Server has a very good way to find dates, and since we're updating user id, not the date column, we can BEGIN TRAN over here, and when we run these two queries now, they both finish instantly. The difference, of course, is the type of locks SQL Server took while this thing was rolling along. Now if we go look at the locks column, things will be a little bit different. We'll see intent —
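A sketch of the idea — the index and update shapes here are assumptions, since the exact statements aren't shown; what matters is the scan-versus-seek pattern described above:

```sql
-- With a narrow index keyed on the column in the WHERE clause, the
-- update plan can seek straight to the rows it has to modify and
-- start out taking row/key locks instead of page locks on every
-- page it would otherwise have scanned past.
CREATE NONCLUSTERED INDEX ix_Badges_Date
    ON dbo.Badges (Date);

-- The update filters on Date but modifies UserId, so the new index
-- itself doesn't have to be updated (a hypothetical predicate):
UPDATE b
SET    b.UserId = 0
FROM   dbo.Badges AS b
WHERE  b.Date >= '2013-12-01';
```

Because the modified column isn't a key of the new index, the index makes the rows findable without itself becoming part of the write.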
we'll see the intent exclusive locks on pages, and we'll see the intent exclusive lock on the object, but the only thing that actually gets exclusive — X — locks are the keys of the primary key. So only key locks. There were 1,447 of them, which is more than the page locks we got, but that's okay, because it finished really quickly, and if we weren't being knuckleheads and doing a BEGIN TRAN, we wouldn't have a problem. And if we go look at the execution plan now, we'll see that we have done a seek into that index to find the right dates, and — again, general rule of thumb; there's a lot of sort of nice cloudy stuff around this — because we started with the seek, we most likely started taking row/key locks rather than page locks. So that's a nice thing. Whenever you're trying to solve locking problems, one thing I always want people to look at is their modification queries. If those modification queries have a join or a where clause, there is a very, very good chance that we are just missing an index to help those queries find data. David says: I always use a Stack Overflow database when following your streams. Yeah — just to make sure you understand, this is Stack Overflow 2013; this is not the full-size Stack Overflow database, so these might not translate totally over for you, but if you have 2013, then they should work pretty well. Let's see — Mr. Shah says: so is proper indexing the correct answer to what I should use instead of NOLOCK? It really depends a bit on the situation. Proper indexing is generally a very good place to start for these things. Another thing to consider is that you might just have modification queries that are doing too much at once, and what I mean by that is: if you have a modification query that, even if
it has a perfect index so it can find the data it needs, is updating tens of thousands or hundreds of thousands or millions of rows, that thing is always going to try to escalate, and there's always going to be blocking. The more stuff you have to change in one go, the longer that query is going to take. A friend of mine — he's Canadian, don't hold it against him too much, he's nice — wrote a wonderful blog post about scripting batches: to reduce the amount of locking, be nicer to transaction logs, and not have that specter of lock escalation popping at you. Let me get the URL — this is a very good blog post about that. I would say, generally, when I'm trying to solve blocking problems: first, I'm going to figure out if the database is a good fit for an optimistic isolation level, like read committed snapshot isolation — or, if I just need to target specific queries, maybe snapshot isolation. Second might be trying to figure out if my modification queries could use an index to help them find data. Third might be: okay, how many rows are we typically modifying — should we rewrite this process to batch the modification query, so that we're not taking excessive locks for as long? And then, after that, I would be totally cool with you using NOLOCK hints, because you have exhausted all your sane and rational options. So there is that. Chris says: I'd also love to have the database. Well, if you want the database, you can go — let's see, I know I have that somewhere, give me a minute. Why are you asking me to log in? Oh, you knucklehead. I forget how I log in here. There we go. All right, give me a second — I have to warn you, it's not going to be a fast download, because it is a decent-sized database. So let me copy this link and come back over to chat.
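A minimal batching sketch of the idea (not the blog post's actual script — the table, filter, and batch size here are placeholders): keep each transaction small enough that it finishes fast and stays under the lock escalation threshold, which kicks in at roughly 5,000 locks on a single statement.

```sql
-- Chew through a big update a few thousand rows at a time.
-- Each loop iteration is its own small transaction, so locks are
-- taken briefly and released instead of piling up in one giant write.
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (4500) b
    SET    b.UserId = 0
    FROM   dbo.Badges AS b
    WHERE  b.Date >= '2013-12-01'
      AND  b.UserId <> 0;       -- skip rows already done

    SET @rows = @@ROWCOUNT;    -- stop when a batch touches nothing
END;
```

The `b.UserId <> 0` predicate is what makes the loop converge: each pass only picks up rows the previous passes haven't modified yet.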
So if you want the database, you can go there. And I think I have — oh, that's an old version, though. Yeah, I have a link to an older version of this in bitly. Let me see if there's a newer one, because then I might be able to just throw you the scripts real easy. Oh, you know what — I can't find it quickly, so I'm going to move on. But yeah, if you want to grab the database, that's right there. If you want the scripts, I'll put them out on Twitter or something when I'm done — or I don't know if I can add a link to the description of this thing, but I'll do it there. Anyway, let's move on from this and talk about something else that indexes can help with. This is where problems start getting a little bit more difficult. This is where problems admittedly start getting into the oh-crap-the-server's-broken type of problems. These aren't just "oh, I have a locking problem" or "oh, the query's slow, we could fix this" — no, this is where things really start to hit the fan. This is when you start getting red alerts from your monitoring tool, users start calling you, your big important line-of-business application is down, people can't connect, nothing will run. These are the kinds of problems you can sometimes run into. So what I'm going to do — let's see here. Oh, there's a question: "my environment's application takes row locks, and another update query needs that row, so there's always blocking — is that solvable by indexes?" If you need the row, then no. If you have an update query that's locking, let's say, id one, and another query comes along and needs id one, then no, that's not solvable by indexes. That's only correctly solvable by using an optimistic isolation level: either read committed snapshot isolation or snapshot isolation.
Those are two big, big areas — they're big topics, so I couldn't do them right by talking about them here — but that would be what you would want to do. They're better than NOLOCK, because what you get back is the previously committed version of the values in that row, so those values were correct at some point. You don't subject your application to the — "potentiality," if it's not a word, it's a word now — the potentiality of dirty data, and you skip over the phantom-reads stuff you might see if a transaction is currently in flight. That's kind of the better way to go. If you start falling into using READ UNCOMMITTED or NOLOCK, which effectively do the same thing — NOLOCK is a terrible name for that hint. NOLOCK really should be called NO RESPECT, because NOLOCK doesn't mean that you don't take locks; NOLOCK means that you don't respect locks taken by other queries. Whenever you have a write query that takes out locks and says, "I need to modify this data, please don't read it, because I don't know what it's going to look like," the read query is just like: whatever, pal — I want it, I'm going to take it. Someone says: I think dropping indexes is the very fastest procedure — it sets a record. Yeah. So if you have a problem where you truly have modification queries that are touching the rows they need, but another query needs those rows, your only really good, valid option is going to be an optimistic isolation level. All right — let's look at how indexes can affect memory. We don't need these windows open anymore, so I'll clean those up a little bit. Let's see — John asks: wouldn't it be possible that the problem row-lock queries are tunable if they have poor plans? No, because the
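For reference, enabling those optimistic isolation levels is a database-level setting; the database name below is a placeholder:

```sql
-- Read committed snapshot isolation: plain READ COMMITTED readers get
-- the last committed version of a row instead of blocking behind
-- writers. ROLLBACK IMMEDIATE kicks out open transactions so the
-- ALTER doesn't wait forever.
ALTER DATABASE StackOverflow2013
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;

-- Snapshot isolation: available per session, for targeting specific
-- queries, via SET TRANSACTION ISOLATION LEVEL SNAPSHOT.
ALTER DATABASE StackOverflow2013
    SET ALLOW_SNAPSHOT_ISOLATION ON;
```

Both rely on row versioning in tempdb, which is part of why they're big topics in their own right: there's a version-store cost to weigh, not just a setting to flip.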
problem that they are having is that the row another query needs is locked, specifically. If you have an update that's taking row locks and another query needs to read those rows, it doesn't matter if your other query has a good plan or a bad plan — what matters is that the rows you actually need are locked, and if you need rows that are locked, your only really valid option is to use an optimistic isolation level. Sure, if you have a very, very bad plan for those modification queries and they're taking longer than they should, you could tune those queries to be faster. But that infers that the queries are taking longer than they should, and that the modification queries going in and taking those locks aren't taking them and holding onto them for a reason. There are a lot of processes people have written into databases that rely on locks like that to set up a proper queuing of things, and it's not always possible to say, "oh, if we just tune this query to be faster, or release this lock earlier, we can resolve the blocking problems" — because almost every time I've seen that, it's been some weird queue that someone has written into the database. Yes — I understand, if it's locking them for too long, then yes, you could tune a query to help. But that doesn't sound like the situation here. If they had asked a different question — if they had asked about tuning the queries — but it sounds very much like this is a pretty simple situation where the query is not going to get any faster; we're taking that lock and holding it for other reasons.
So I agree — there might be a time when, if the lock was being held for too long because the query was long-running, that's one thing, but it doesn't sound like that's it. Yeah, exactly — some type of queuing. All right, so let's talk a little bit about memory. We've got indexes cleared out here, and I've got this stored procedure. The guts of it — this stored procedure does not take any parameters, and I don't want it to take any parameters, because that would ruin the fun. People get all caught up talking about parameters, and parameters have nothing to do with this. Now, we're going to do a couple things. Let me crack open — yeah, every time I do this, Windows is like: I don't know where RML Utilities are, I can't find RML Utilities. But meanwhile, if I go right here, I can find the RML command prompt and everything is fine. It's like I'm being messed with constantly. If I search "RML," it's like, oh, I can't find it — but it's right there. Stop being stupid, Windows. Why not Linux? Windows is such a dummy some days; I can't take it. But I've got this stored procedure, and the guts of it run this kind of big, goofy-looking query. What I want to do is stick this into a new window. This is my stored procedure, sp_pressure_detector — I believe everything's spelled right, and it should run. Good stuff. What this does is two very specific things. It is not meant to compete with sp_WhoIsActive or any of the Blitz scripts — it is sort of its own thing — and you can choose to run one of two queries, or both. What you can choose to do is look either at memory pressure — or, like, things that might cause memory pressure — or at things that might cause CPU pressure. We'll do both here, but first we're going to look at memory. What this gives you back when you look at memory is:
When you look at memory, if we had any queries running that were asking for memory, they would show up in this top section, and we would see some of the numbers down here start to change a little. Looking across these numbers, SQL Server says: right now, I am willing to give out just about 38 GB of the 50 GB max server memory I have — this is a VM with 64 GB of memory — to queries that might need it for memory grants. Okay, fair enough, SQL Server. Let's see, just to keep up with chat: if it's already RCSI and you have queries taking row locks that are getting held on to, I would also expect to see other locking hints in there — maybe HOLDLOCK or something — so that things are held on to. But I'm not sure. Fun stuff. We can see right now that we haven't given any out — SQL Server is like, "I would give out about 38 GB; I haven't given any; nothing's going on." That's totally cool. If I go and run this query — and all I need is the guts of the stored procedure, I don't need to execute the entire thing — and I look at what SQL Server does (I don't need this to keep running; we'll let it die gracefully), we can see that this query has asked for about 9.4 GB of memory. That's 9,474 MB, which is just about 9.5 GB. That's a lot. And what we see down here is that available memory is now down to 28, because we have granted out 9.4 GB. The total memory is still the same.
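A simplified sketch of the kind of DMV queries behind this memory view (my own reconstruction — not the actual sp_pressure_detector source):

```sql
/* Per-query picture: who has asked for a grant, who is still waiting.
   A NULL grant_time means the query is queued, not running. */
SELECT deqmg.session_id,
       deqmg.request_time,
       deqmg.grant_time,
       deqmg.requested_memory_kb / 1024. AS requested_mb,
       deqmg.granted_memory_kb / 1024.   AS granted_mb,
       deqmg.wait_time_ms,
       deqmg.dop
FROM sys.dm_exec_query_memory_grants AS deqmg
ORDER BY deqmg.requested_memory_kb DESC;

/* Server-level picture: how much grant memory exists, how much is left */
SELECT deqrs.resource_semaphore_id,
       deqrs.total_memory_kb / 1024. / 1024.     AS total_gb,
       deqrs.available_memory_kb / 1024. / 1024. AS available_gb,
       deqrs.grantee_count,
       deqrs.waiter_count
FROM sys.dm_exec_query_resource_semaphores AS deqrs;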
SQL Server will still give out about 38 GB total, but we have much less of it available, because we gave some out. Now, I don't know if any of you are good at math — I'm not — but I'm going to show you what happens if we try to run a bunch of copies of this query all at once. I'm going to use the ostress utility from RML Utilities; if you want RML Utilities, you can go over here and get them. It's a pretty neat, spiffy, fun tool. I also like SQLQueryStress, but SQLQueryStress kind of crashes a lot sometimes, especially if you ask it to use a lot of threads, so just for safety, I like to use ostress. Narav says, "Let me guess — bad things happen." Yes indeedy. So what I'm going to do is run four copies of this stored procedure all at once. We'll get that going over here — no whammies — excellent. Come over here, run this, and while that happens I'm going to kill this off, because I don't want my laptop to get too angry with me; we have the information we need. So what we have is three copies of this query that run, and run pretty well — SQL Server gives them the memory they ask for. Whether they need it or not is completely beside the point; SQL Server is just like, "Yeah, for these three, go ahead, you can have it, I don't mind." But you, session ID 57? You get nothing. The grant time down here is NULL; the granted memory is NULL. It still asked for about 9.5 GB of memory, but it hasn't gotten it. Right now this query is off waiting in a queue to get memory so that it can start running, and while it waits in that queue, it generates a very, very silly wait up here. Remember that all the queries that have gotten their memory grants are off doing things.
Those running queries have CXPACKET waits, but our poor query down here that has not gotten memory — that is waiting for memory — is getting these RESOURCE_SEMAPHORE waits. Notice it doesn't have any workers; it isn't doing anything. We do have a query plan for it, but that's not really the point — what we don't have is enough memory for this query to run. Which is kind of funny, because if we look at how much memory this thing has asked for, that's exactly how much memory we have left to give: 9,474.75, lots of zeros. We have this query asking for it, but it's not getting it, because our query would need 9.5 plus half of 9.5 in order to run. When a query asks for a memory grant, the available memory not only has to cover that grant but also another half on top, so that if anything else comes along needing a memory grant, SQL Server can still say, "Yes, you can have that memory." So this thing waits, because it is terrible. This is not a good situation, and you can run into it either like this, where a few queries ask for very big memory grants, or where lots of queries ask for smaller grants and they all stack up. This is three queries that have asked for about 28.5 GB of memory; you could totally write a query that wanted one or two gigs of memory, run 14 or 15 copies of it, and have the same hell break loose. It doesn't matter — all those memory grants come from the same place. Query memory grants typically come from two places — of course there are a couple of outliers, but nothing so common that you need to be concerned about them.
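The admission rule as I understood it from the stream — a paraphrase, not an official formula — works out like this:

```sql
/* Paraphrase of the "grant plus half again" rule described above:
   a 9.47 GB request needs about 14.2 GB available before it is granted. */
DECLARE @requested_gb decimal(10, 2) = 9.47,
        @available_gb decimal(10, 2) = 9.47;

SELECT @requested_gb * 1.5 AS needed_available_gb,     /* 14.205 */
       CASE WHEN @available_gb >= @requested_gb * 1.5
            THEN 'grant it'
            ELSE 'queue it (RESOURCE_SEMAPHORE)'
       END AS outcome;                                 /* queue it */
```

With exactly 9.47 GB left, the fourth query's request equals the remaining memory but falls short of the request-plus-half bar, so it queues.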
In a query plan, grants come from two specific operators: sorts, and hashes. This query has a little of both — we have a hash join up here, and we have a sort right about here. The thing is, hashes typically build a hash table based on whatever input goes in, and the hash table over here isn't all that gigantic; it's only for a couple million rows. If you look at the memory fractions (I forget if this is going to be completely accurate), the closer the fraction gets to one, the higher the share of memory that operator got. This hash join actually got a very, very small amount of the roughly 9.5 GB grant that SQL Server gave to this query. The majority of the memory will, in most cases, go to sorts. Sorts can ask for crazy amounts of memory relative to the size of the data. If you look at the memory fractions for this sort, they're very close to one, so the sort got almost all of the memory. Now, operators in query plans can reuse memory — when they finish, they can pass memory along — and that's why memory fractions are often kind of weird and don't always line up perfectly; memory might get reused and shuffled along. It's a little more complicated than I want to cover here, because this is, after all, about indexes, not about memory. Stuff to keep in mind. Before I close out, let's look at what was sorting. We have an ORDER BY on UserId ascending and Score descending, and we have an output list with just about every other column in the Comments table. Most of these columns are harmless — by harmless, I mean they are numbers and dates, and dates and numbers don't have a big memory footprint. Now, you might have a very, very long data set — and we do have a very, very long data set.
Sure, with more rows you might have a bigger memory grant, because you have more things to sort. But what really starts to chew into memory grants — and gets SQL Server to really inflate them — is string columns. This column is called Text. It's not a text data type or an ntext data type; I want to say it's an nvarchar(700), so it's not even gigantic. But think about the way the optimizer guesses how much memory it will need. I'm going to show you something important in a second, so don't get too carried away, but whenever you need to sort text columns — remember, we're ordering by UserId and Score; we're not ordering by the Text column, but the Text column is part of the result set. We need to write UserId and Score down in the order we asked for, and all the columns that accompany them — Id, CreationDate, PostId, and Text — have to be written down in that same sorted order. SQL Server estimates, for string columns, that every row is half full. To make it simple: say you have a varchar(100) column. SQL Server will estimate that every single row in that column has 50 bytes in it, meaning it's half full. That gives it some fudge factor — if some rows are very full and some are not, you meet in the middle and probably get a proper memory grant. But the wider your string columns are, regardless of how full they actually are — all the way up to max data types, which ask for a big honking memory grant — the more memory SQL Server assumes it needs. If this were an nvarchar(1400), or 2800, or whatever, SQL Server would still assume the column is half full and ask for more and more memory. That sucks; it's not a good situation.
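The "half full" estimate above, as back-of-envelope arithmetic (my own illustration of the rule as described, not the optimizer's actual internals):

```sql
/* varchar(n)  -> estimated n / 2 bytes per row
   nvarchar(n) -> estimated n / 2 characters * 2 bytes = n bytes per row */
SELECT 100 / 2       AS varchar_100_bytes_per_row,   /* 50  */
       (700 / 2) * 2 AS nvarchar_700_bytes_per_row;  /* 700 */

/* Multiply by row count to see how fast a sort's grant can inflate:
   ~3 million rows of an nvarchar(700) Text column alone is ~2 GB. */
SELECT CAST(3000000 AS bigint) * (700 / 2) * 2 / 1048576 AS text_column_sort_mb;
```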
So whenever people give those big, long presentations about right-sizing data types — "don't use datetime if you only need a date, don't use a bigint if you only need an int" — sure, knock your socks off. What I care most about are the string columns, because people always get carried away: "I have no idea how much data could end up in there, I'd better make it a max." And whenever they do that, the next thing I see is queries like this one asking for 9.4 GB of memory to sort that Text column alongside a couple of other columns. Now, look at what the query itself is doing over here. We don't have an ORDER BY here, and we don't have an ORDER BY here — what we have is a windowing function, and this windowing function is partitioning by UserId. Remember, when we looked at the execution plan, it was ordered by UserId, then by Score descending. We're partitioning by UserId, and you can almost think of PARTITION BY like GROUP BY without the grouping: SQL Server looks at the results, and for every bunch of duplicates it says "this is partition one"; it finds another group of duplicates, "that's partition two"; it finds a single new row, "that's partition three"; and so on. Let's see — Narav says it's worse when people use char versus varchar. For char, it assumes complete fullness — well, char is different, because it is completely full anyway. If you have a char(100) column with only three or four characters in it, the other 96 or 97 characters are just padded out — it's like that ANSI_PADDING setting or whatever — so yeah, it can get weird in there. And nvarchar is of course worse than varchar, because it's Unicode, stored at two bytes per character.
So if you had an nvarchar(100), SQL Server would of course multiply the half-full guess by two, and you'd get a 100-byte guess there. "Would an index help here?" Yes, but you are jumping the gun, Lee — damn it, you're banned from all my future webcasts. So, what we have here is a ROW_NUMBER function over UserId and Score descending, and since we're selecting a bunch of other columns, SQL Server needs to write down all of those other columns in order. I'm not a huge fan of Excel, and also not a huge fan of comparing SQL Server to Excel, but I think explaining it like an Excel file is a lot more relatable. Say you crack open an Excel file full of data, click that little button in the top left corner that highlights all the rows and columns, and then sort the data by one column: the entire spreadsheet flips to match the order of that column. That's a lot like what ORDER BY does. In a query, you don't just order the one column you're saying ORDER BY on — you have to order the entire result set along with it. So you're not just sorting that one column in memory; you're sorting that one column plus all the columns you're selecting. If there are bandwidth issues — I don't know, it happens sometimes, sorry about that. There is relatively little I can do except wait for my streaming service to catch up. I apologize; these things happen; going live is crazy. You know what I'll do, though — I'll wait a second until people say things have calmed down, and then I'll pick back up. This happened a little bit yesterday too, back at the beginning of the stream; today it's happening at the end, for some reason. I guess I'm just lucky.
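The shape of the query we've been describing, reconstructed from the narration — the Stack Overflow Comments table, with the column names mentioned above; this is not the exact demo script:

```sql
/* PARTITION BY UserId / ORDER BY Score DESC is what drives the sort,
   and the wide select list (especially Text) is what drives the grant. */
SELECT c.Id,
       c.CreationDate,
       c.PostId,
       c.Score,
       c.Text,
       c.UserId,
       ROW_NUMBER() OVER ( PARTITION BY c.UserId
                           ORDER BY     c.Score DESC ) AS n
FROM dbo.Comments AS c;
```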
The perils of live broadcasting. So we've got this query here, and really, what's asking for the memory is the part where we're generating a row number — the PARTITION BY and ORDER BY. Looks good now? Sweet, let's get back to business. What we're going to do is create this index, and this index is going to help our windowing function do its work. We're going to have UserId as the first column, because when we looked at the sort operator, UserId was sorted first, ascending; then we'll put Score descending as the second key column. So our index will be ordered by UserId, then Score descending, which is a great index for this query. And since I care relatively little about what goes on on this thing — it's all just a funny dev server to me — I'm going to include all of the other columns we need, in order to cover the query entirely, because remember, we're selecting basically everything in the table and then generating this row number over those two columns. Is this an index I would create in production, if this query were important enough? Sure — but I might also be a little nervous about it. What's cool now is that if I go back to RML Utilities, clear the screen, and rerun this, then come back over here and look at sp_pressure_detector: nothing's happening. We haven't given out any memory; we have granted out a big whopping goose egg, and no queries are showing up here, because no queries are asking for memory grants — this specifically checks for queries that are asking for memory grants. One way we can see what's going on is to rerun that and go look at sp_WhoIsActive, which will show us what's running.
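The index described a moment ago might look something like this — keys and includes reconstructed from the narration, and the index name is mine:

```sql
/* Keys match the sort order the plan wanted: UserId ASC, Score DESC.
   Includes cover the rest of the select list so no lookups are needed. */
CREATE INDEX ix_comments_userid_score
    ON dbo.Comments ( UserId, Score DESC )
    INCLUDE ( CreationDate, PostId, Text );
```

With the data already ordered by the index keys, the plan no longer needs a sort operator, and no sort means no big memory grant.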
Oh — I should probably not have query plans turned on for sp_WhoIsActive, should I? That would be scary. But if we look over here, we'll see four queries running — those are the four queries we care about — and if we look across, we can see that they have used very, very little memory. That's, like, a four. That's nothing at all. And did I get the execution plan? I didn't — let's go get the execution plan from over here; that sounds like a good idea to me. If we run this query and get the full execution plan, it should finish in about six or seven seconds; if it doesn't, then I owe you all a drink the next time I see you — whatever you drink, it doesn't matter, I hope you like water. Okay, I lied, it's probably about double that. Okay, I lied, it might be triple that. Okay — it returned a lot of rows. It didn't actually take 19 seconds; it took about 7.5 seconds up here. Good for us — 7.4 seconds, close enough. But notice that we don't have a sort operator down here. We no longer have to sort data, and we no longer have a hash join; we have a merge join, since everything is in the order we wanted it in, which is lovely. I like that. Is this a perfectly tuned, great query? Well, it runs in 7.5 seconds and returns a crapload of rows — we might want to do something about that eventually, because it shows as 19 seconds, but most of that time is spent returning three million rows to stupid SSMS; the query itself finishes in 7.5. What's important is that if we look at the properties — the memory grant info — this query isn't asking for any additional memory. Every query that runs in the world needs some memory — a little bit for operator state and other things — but this one is not going to ask for an additional memory grant.
That other query plan — this query without the index — went and asked for nine and a half gigs of memory. This one's not asking for that extra grant, but you still always need a little bit of something for the query to run, because memory is pretty essential for everything. So we have this query asking for a little memory, but not the big grant. Now, sorts are one of those things that can ask for crazy, crazy amounts of memory. I'm not saying you always have to index to fix them, but I am saying: if you have queries asking for large memory grants, sorts are probably the first thing I would look at, before hash joins or hash aggregates. I'm not saying to rule those out entirely; I would just usually focus on the sorts first. Look at the columns you're selecting; look at what you're ordering by. And sorts don't only happen with windowing functions or an explicit ORDER BY. Sometimes you might put a SELECT DISTINCT in there and SQL Server will choose to sort. Sometimes SQL Server might choose to implement a merge join or a stream aggregate when the data isn't in the order it wants, and it'll sort the data going into those operators. Even for key lookups, it'll sometimes reorder data going in. So there are all sorts of reasons why you might see a sort in an execution plan that have nothing to do with you writing an ORDER BY into the query — there are all sorts of things in databases that ordered data is helpful for. All right — I don't think there are any questions on that. Hopefully everyone is still alive. It looks like a few people left when things got choppy; thanks for braving the storm with me. That's not nice — making me feel bad. Anyway, let's go look at one last fun, interesting thing: another awful, crappy wait.
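Before leaving sorts — two illustrative sketches of plans that can sort with no ORDER BY anywhere in the query text (table and column names follow the Stack Overflow schema mentioned above):

```sql
/* 1. DISTINCT may be implemented as a sort feeding a stream aggregate */
SELECT DISTINCT c.UserId
FROM dbo.Comments AS c;

/* 2. Forcing a merge join on unordered inputs makes SQL Server
      sort one or both sides to satisfy the join */
SELECT c.Id, u.DisplayName
FROM dbo.Comments AS c
JOIN dbo.Users AS u
  ON u.Id = c.UserId
OPTION ( MERGE JOIN );
```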
This one is a crazy kind of wait that has to do with threads. "Would you ever create an index with a descending key?" Yeah, sure — I created one up here; it was Score descending. It's a funny trick. Generally, though, there is a very funny thing in SQL Server where backwards scans of indexes can't be parallelized. Sometimes, if something looks kind of funny — say you have a query plan that's 99% parallel, but one big index scan is serial for some reason, and right afterwards there's a distribute streams or repartition streams — you might want to look at the properties of that operator. You can get at the properties by right-clicking, or by hitting F4 (I don't think I have a query plan open right now, so let's just grab this and get an estimated plan). You can see the scan direction in the operator's properties — it doesn't show up in the tooltip, only in the properties. If the scan direction says "Backward," and you have a mostly parallel plan except for this one serial thing, then you're scanning the index in backwards order, and that cannot be parallelized in SQL Server right now. That was supposed to have been fixed for, I don't know, ten years or something, but no one's ever done it. So yes, it's something I would do in certain circumstances, but a lot of the time it has sort of limited usage. Let's see — SQL Server licensing? Oh god, I don't want to talk about that. Do you have any idea what that's like? Anything I said would only be valid for about six months; a year from now no one would care about it, and it would be awful.
And I would get sued — so much, constantly. "I heard Erik Darling say that we only need a license for blah blah blah," and then I would say, "Oh boy, sorry about your audit." All right, so let's move on a little and look at a crazy situation with a different kind of crappy wait. This one's going to be all about threads. Now, if I run this query, it actually finishes relatively quickly — it's a relatively quick query, thankfully, and I'm happy with how fast it is; we like fast queries. But regardless of how fast it is, this query does something kind of crappy. Let's go look at the properties over here, and look at ThreadStat. This is something that I'm going to call new — but by new, I mean it came out in SQL Server 2012. Let's see — Stefano says, "I reckon one can scan through the execution plan"? Yes. Actually, there's a check in sp_BlitzCache under expert mode, which I wrote, that will find backwards scans, so that is something you can look for. But it's under expert mode because the number of times I've found it being the root cause of a performance problem has been pretty slim. I'm not saying it's something you never want to look at, and I'm not saying it's not something Microsoft should fix — but at the same time, very rarely have I gone, "Aha, it was the backward scan that did it, we fixed the problem." It's always been more like: there are a thousand things wrong here, and the backward scan was just one of many. Usually, if you create an index that's most of the way to being helpful, the backward scan just becomes less of an issue. That's just my experience.
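A rough way to sweep the plan cache for backward scans yourself — a blunt string match on the showplan XML, much cruder than the sp_BlitzCache check mentioned above, but fine for poking around:

```sql
/* ScanDirection="BACKWARD" appears on index scan/seek operators in
   showplan XML when the index is read in reverse key order. */
SELECT TOP (20)
       qs.execution_count,
       qs.total_worker_time,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE N'%ScanDirection="BACKWARD"%'
ORDER BY qs.total_worker_time DESC;
```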
I'm sure someone out there can show you a case where a backward scan was the most awful, red-handed culprit for a query's performance, but it hasn't been my experience that it's often the root cause. But yeah, BlitzCache will find it if you run expert mode and it happens to pick up a query where that happens. So, parallel queries are kind of funny. Parallel queries — or rather, queries in general — if you join tables together, or if there's a key lookup, you'll have these branches. By branches, I mean this is one branch, and down here is another branch; if we scroll over a little, you'll see another branch that diverges off the main path. So when people talk about parallelism settings — MAXDOP and cost threshold for parallelism — cost threshold for parallelism is pretty easy to explain. If you hover over the select operator and zoom in a little, we can see that the estimated subtree cost for this query is 860 query bucks, so this was a fairly expensive query — at least expensive enough for SQL Server to consider parallelism. And this is where things get even funnier: this parallel query cost 860 query bucks, and SQL Server chose it because it was cheaper than the serial query. Just for fun and giggles, let's stick an OPTION (MAXDOP 1) hint on here and get an estimated plan real quick. The serial query costs 991 query bucks. A lot of people look at parallel plans and say, "I don't get it — why is the parallel plan less than cost threshold for parallelism?" Because the parallel plan doesn't have to be higher than cost threshold for parallelism. It's the serial version of the execution plan — the one the optimizer comes up with first — that has to be higher. And at 991 query bucks, that's way higher than my cost threshold.
On my server, I have it set to 50 query bucks. Now, if we go look at the parallel version of the plan — we'll run this real quick again, get rid of that hint, and rerun — the parallel version costs 860 query bucks, about 130 query bucks cheaper, and that's why SQL Server chose a parallel plan here. Totally fine; a pretty easy thing to explain to people. MAXDOP is where things get a little trickier. A lot of people will say it limits the number of cores a parallel query can use, which is mostly true — some of my friends might yell at you that it actually limits the number of schedulers a query can use, which is apparently different from CPUs in a way I have never been able to wring out of them; as much as I try to wring these smart people for details, they'll never budge. But what MAXDOP also controls, aside from the number of schedulers you can use, is the number of threads that each concurrent parallel branch can use. I know that sounds tricky, and it is. In this query, there are indeed three branches. There's a bit more technical detail behind what defines a branch — it's really the space between any two parallel exchanges. Over here we have a repartition streams, down here we have a repartition streams, and over here another repartition streams. So technically, the operators between two repartition streams are one branch; ditto over here — past that repartition streams, we might call all the operators between the gather streams and the repartition streams another branch. There's a repartition streams down here too, so things get all sorts of weird, but this query has three branches that could run concurrently.
You might see parallel queries with many more branches than show up down here. What "branches" measures here is not the total number of branches, but the total number of branches that could possibly be running at the same time — concurrently. Going back to memory-consuming operators: hashes and sorts are also what we call stop-and-go operators, meaning all the rows they need to process have to show up before they can start producing output. That's one thing that can separate which branches are able to run concurrently. In this query plan, aside from the repartition streams operators that define branches, we have hash joins, like this one right here. All the rows from this part of the branch have to show up here so we can build the hash table; once that's built and things start kicking off, this part can start running while that's running. Rows start coming out of here, passing through the hash join, and then all sorts of stuff can happen in here, up to the point where we need to start building another hash table. Once that hash table is built and we start probing it down here, all the rows passing out of it can start coming out up here. So at any given time, we can have three concurrent parallel branches, and that gets borne out when we look at the ThreadStat properties of this plan. I have MAXDOP set to four, so my degree of parallelism for this query is four, and the number of threads this plan reserves is 12.
I want to say the reservation matches here — it does. So we've reserved and used 12 threads. We have DOP 4 — MAXDOP set to four — so we can use four schedulers to run this query, but we can use four threads per branch, meaning altogether this query will reserve, and hopefully use, all 12 threads. Three times four — you got it. I just like to explain it the long way, because if I just said "three times four," a lot of people would say, "huh?" But yeah, three times four is 12, so we have 12 used threads, 12 reserved threads, and three parallel branches for a DOP 4 query. It all multiplies up. And this is where things get tough. If we look at the number of worker threads my server has — I just like saying schedulers — it's 576 total. This is a Microsoft-supplied formula: take 512, plus the count of schedulers visible online minus four, times 16, and that's how many worker threads you have on your server, unless someone has messed with the max worker threads setting. I want to say things change a little from 64 cores and up, but this is what we have. And what's terrible is that it's shockingly easy to run out of worker threads. So what I'm going to do — and I hope this works, because it worked last time — is change the connection here and use the admin connection. It's going to say that it failed, which is okay, but I'm using the remote DAC right now; you can tell, because I have this admin connection down here. The reason I'm using it is that when I do this CPU demo and start to run out of worker threads, it becomes very, very difficult to connect.
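The default worker-thread formula quoted above, worked through — on this VM the visible scheduler count works out to 8, since 512 + (8 − 4) × 16 = 576:

```sql
/* Default max worker threads for 64-bit SQL Server up to 64 visible
   schedulers: 512 + (schedulers - 4) * 16. Above 64, the formula changes. */
DECLARE @visible_schedulers int = 8;

SELECT 512 + (@visible_schedulers - 4) * 16 AS default_max_worker_threads;  /* 576 */

/* Compare with what the server actually reports: */
SELECT max_workers_count
FROM sys.dm_os_sys_info;
```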
It gets hard even to get to the server and run simple DMV queries. So, you can see I have 576 threads, and right now SQL Server in the background is using 28 threads for other things — what things, I don't know; they might be bad, they might be good. Now what I'm going to do is go back to ostress and run a whole crapload of copies of this query, each of which takes 12 worker threads. I'm going to exacerbate the situation. Sure, I could try to play with it so we hit the exact number, but I just really like to kick the crap out of it, so people understand what things will look like when everything goes south on them. So I'm going to run this — and we immediately have exactly what I want. I'll run it one more time for good luck, and then kill this off. So: sp_pressure_detector, changed from memory to look at CPU, connected via the remote DAC so that we don't get thread-pooled out of running these queries. And this is what happens. We had a whole bunch of copies of this query running, taking 12 worker threads each — well, I want to say 12 for each copy; that's not going to be totally true, and I'll show you what I mean in a minute. But look at what happened: we have 576 total threads, we've used 612, which gives us a grand total of negative 36 threads. We have 221 threads waiting for a CPU, and 96 requests waiting for threads. Those 96 requests are all going to be in this section — if we look down here, we'll see 96 rows for it. Requests waiting for threads is where we have THREADPOOL waits; if we look in here, this is where we'll see those 96 rows. Now, here's where things get very tricky and very ugly.
that goes to show you what’s currently running on your server. What’s the first thing it’s going to do? It’s going to join, in some order, dm_exec_requests to dm_exec_sessions, and it’s going to join them together on session_id. The problem with that is that if it’s an inner join, you’re not going to see any of this, because this session_id column is completely NULL. There is no data, there is no session_id in here, to join those tables together on.
When SQL Server runs out of worker threads, we can’t even give queries a spid to run on. That’s how completely out of CPU resources we are.
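A minimal sketch of a check that does catch those invisible requests, going straight to the waiting tasks DMV instead of joining on a session_id that doesn’t exist yet:

```sql
/* Requests stuck waiting for a worker thread surface in
   sys.dm_os_waiting_tasks with a THREADPOOL wait and a NULL session_id,
   which is why an INNER JOIN on session_id makes them disappear. */
SELECT
    wt.wait_type,
    wt.wait_duration_ms,
    wt.session_id /* NULL while the request has no thread, and no spid */
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type = N'THREADPOOL'
ORDER BY wt.wait_duration_ms DESC;
```

You’d generally need the DAC to run even this when the server is starved, which is the whole point of connecting that way.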
We are completely shot. When I ran this the second time, queries had been waiting nearly six and a half seconds to get worker threads to execute.
All right, and if we scroll down this list, we’re going to see all 96 queries. This one was waiting almost three seconds, but from almost three seconds up to about six and a half seconds just to get threads to run.
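As an aside, the 576 number from earlier falls straight out of that default max worker threads formula. A quick sanity check, assuming a 64-bit instance with more than 4 and fewer than 64 visible schedulers, where this default applies:

```sql
/* Default max worker threads: 512 + ((visible online schedulers - 4) * 16).
   With 8 schedulers visible online: 512 + (8 - 4) * 16 = 576. */
DECLARE @visible_schedulers integer = 8; /* assumed for this demo server */
SELECT 512 + (@visible_schedulers - 4) * 16 AS default_max_worker_threads;
```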
If we weren’t connected via the DAC and I tried to run this query, it would time out, SSMS would throw an error message, and we would say, I don’t know, server’s down, restart it.
We would restart it, and things would probably go back to normal. But that’s no fun, right? That’s no fun. I like to give you the information that you would get if you had listened to me about setting your server up right.
And you would enable the remote DAC. That’s all I want. Now, the third set of results from sp_pressure_detector is going to give you CPU-intensive queries.
Now, these are all queries that at least, I mean, do they all have session IDs? I forget. Yeah, these queries all have session IDs. So this is 243 rows.
So these 243 queries all got spids, and were all able to start running. But there is a big but here, and this is the kind of big but that Sir Mix-a-Lot does not care for.
If we go over here and look at what happened to some of these queries, we’ll see that very many of them are running at DOP4 and have reserved those 12 worker threads that we talked about.
But if we start scrolling down a little bit, we will eventually see that change. What happened?
DOP is two here, and we have six worker threads. DOP is one here. And this is going to be null for the worker thread count, because we can generally infer that a DOP1 query is only going to need one worker thread.
We can do that. But what’s super sucky, crappy, just plain mean is that SQL Server won’t tell us that these queries were actually running at DOP1.
So what we have here, I’m going to bring only the DOP1 queries into the screen. This is all DOP1. And if we go all the way across to where the query plans are, did I go too far or not far enough?
There we go. So these queries were all running at DOP1, but the query plan that we see from them is going to tell us that it’s running in parallel. So SQL Server, when you start running out of worker threads, one thing that SQL Server will do is start downgrading DOP for your parallel queries.
Remember, we have this one query up here that was running at DOP2, and then all of a sudden it drops off to DOP1. So SQL Server, the first thing it will do is say, parallel queries, sorry, I don’t have room for you.
We’re going to run you at DOP1. If you don’t like it, talk to the person who bought the CPUs, or talk to the person who tunes the queries. You know, I can’t do anything for you.
No room at the inn. But we’ll see that a whole lot of these were actually running at DOP1. So SQL Server actually did a lot, a lot, to try and avoid running out of worker threads. But at the end of the day, we just completely overwhelmed it with queries, and we had 96 of them that just couldn’t even get a worker thread.
When you can’t get a worker thread, you end up showing this THREADPOOL wait. We’ve looked at two pretty terrible waits today. We looked at RESOURCE_SEMAPHORE, and we’ve looked at worker threads, or THREADPOOL rather.
If you see a lot of these waits on your server, your server is in tough, tough shape. He says, is there a way to set that threshold where it starts limiting threads? No, I don’t think that’s something you can set, even in Resource Governor.
I think that’s just something that SQL Server fires off internally at some threshold. I’m not sure what it is or where. There might be a weird trace flag for it. I don’t know.
I wish I knew. I wish I was that smart. But yeah, I don’t know. But no, I don’t think there is a way to do that. I think that’s like a judgment call from SQL Server. It’s like, at what point does the pain here mean you call 911?
I don’t know. I’m sure that threshold is different for different circumstances and different people. But pretty cool there. But of course, the main thing with this query, if we think about what that query is doing, this query runs pretty quickly.
It’s parallel. But if we throw enough copies of it out there, it’ll suck. But if you go look at the query, we’re joining stuff together. We have a where clause on reputation.
But at the same time, we don’t really have indexes that support this query very well. He says, I would be more conservative than SQL. Well, you know, I don’t know if that’s easy.
In some regards, it’s easy. In some regards, that’s hard. It’s a hard decision to make. But really, the problem here is that we just don’t have very good indexes to support this query that we’re running.
So let’s create some indexes. Because remember, the whole point of this session is about problems you can fix with indexes. And this is another problem that you can pretty reliably fix with indexes.
Because with the right indexes in place to help queries find data and join things together and sort of get things done in the right way, we can end up with a query that runs pretty quickly and doesn’t need to go parallel.
Right? So this thing finishes in 849 milliseconds. There’s no parallelism going on here, which means that we are running at DOP1, which means that we could run 500 copies of this without SQL Server really, you know, crapping things out.
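The supporting indexes would look something like this; the table and column names below are assumptions based on the Stack Overflow demo database, not the exact script from the session:

```sql
/* Hypothetical indexes to support a Users-to-Posts join filtered on
   reputation: one to seek to high-reputation users, one to seek into
   Posts by the joining column. */
CREATE INDEX ix_users_reputation
    ON dbo.Users (Reputation);

CREATE INDEX ix_posts_owneruserid
    ON dbo.Posts (OwnerUserId);
```

With seeks available on both sides, the optimizer tends to cost the serial plan cheaply enough that it doesn’t bother going parallel at all.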
Let’s see. Ronald says, can a sleeping session still use a lot of CPU? So the way that the DMVs work is they just tally things up. Right?
They just tally things up as things run. So a really awesome thing to do, if you have queries that are sleeping with a high amount of CPU, is to run EXEC sp_WhoIsActive with, I want to say it’s @delta_interval = 5.
I might be, oops, I might be wrong about the delta interval. No, I’m right. And what this will do is it will look at how much progress, I’ll put that into chat, it’ll look at how much progress or how much resources your query has used over a span of time.
Right? So for this one, I ran it for 5 seconds. So what sp_WhoIsActive will do is go through and do the initial run, then wait 5 seconds. If you’ve ever used sp_BlitzFirst and done the second sample, it’s almost the same thing, except with sp_WhoIsActive.
And what it will do is it will go through and it will add these columns in called delta. And these delta columns will tell you how much resources that query has used over a span of time.
So if you have a sleeping session and that sleeping session has a high amount of CPU, run sp_WhoIsActive with a delta interval and see if it’s still accumulating CPU.
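The call pattern, for reference (sp_WhoIsActive is Adam Machanic’s procedure, and the five-second window matches the example above):

```sql
/* Samples twice, five seconds apart; the *_delta columns then show
   resources accrued during that window rather than lifetime totals. */
EXEC dbo.sp_WhoIsActive
    @delta_interval = 5;
```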
If a session is asleep, that usually means that the query has finished and either the application has forgotten to close it or some weird connection pooling thing is happening.
Maybe there’s a leak, but the connection hasn’t closed. So typically a sleeping session isn’t going to keep accumulating resource usage if it’s been like, you know, sleeping for a long time. You know, usually something has to be like running or runnable in order to get more stuff.
Let’s see. Wes says, can’t I just tell my director that I need more resources and ignore all this? Yes, up to a point. But, you know, it depends on where the budget for more resources comes from.
So if the budget for more resources comes from, like, your bonus or your paycheck, I might want to, I might want to fix a problem. MAXDOP 1 is the ultimate solution.
Yes, if you want everything to be terrible. Right? If you want everything to be terrible, MAXDOP 1 is a good solution. I like parallelism. I love parallelism.
It’s one of my favorite things. Okay, let’s see here. What else do I have left? I think that’s it. Is that the end? That is the end. Wow, that’s the end. Good for us. So, what did we learn?
All sorts of ways to fix problems with indexes. We looked at how to solve performance problems, how to fix spools, how to fix blocking, and how to fix queries that just use way too many resources.
I think it’s a pretty good run. It’s been almost two hours since I started. So I’m going to stick around for another like five or so minutes right until the top of the hour if you have any more questions, comments, anything like that.
Stick around for that. But other than that, I could use, I’m starting to get this, so I could use some liquid, and I could use, I don’t know, I would like to sit down again, to be honest with you.
But if you need, if you have any other questions, anything else you want to ask about, I know there was someone on earlier who asked for a copy of the script. I’ll put that up in the resources either on Twitch or on YouTube.
I just have to, I have to make sure that I have the most recent version of everything all zipped up, and then I’ll get that out there. Let’s see. Tell your director that, then fix the issue.
You’ll be swimming in resources. Yeah, yeah. But, and then you could, you know what you could do after that too, is you could tell the director that you fixed the problem and give a bunch of resources back and say, look at all the money I saved you.
Give me a raise. This is something that people love doing during pandemics. Someone says, this is great. Your streams are top notch. Thank you.
I hope I can keep, I hope that I can keep up the top notchness. I’m thinking about bringing back the office hoursy type thing because that seems, I don’t, I don’t think anyone or really anyone’s doing that either on Twitch or YouTube or anywhere right now.
So, you know, maybe I’ll start doing something like that once a week again. I don’t know. Yeah, I’ll get the scripts out as soon as I can.
I’m gonna, I’m gonna go faint for like 20 minutes when I’m done and then, then I’ll come, I’ll get the scripts ready. All right. I’ll start doing them again.
It was tough. It was tough last time around. I don’t know, it was only on YouTube, and the setup was rather unsophisticated, and the schedule got tough because once client work really picked up, I couldn’t always do the Friday thing.
So, you know, ah, well, Mr. P. Shaw, I will, I will pick it back up. If you come to me day one for a 64 core box, you buy it.
On day two, you say, by the way, I tuned the code, we can run on four cores. I’m not gonna be smiling. Yeah, I guess, but just think of all the room to grow you’ll have. Think about the upward expansion that you have there.
Think about that. You have 60 cores that you can grow into. Oh, yeah, I know. It’s so tough. Sometimes one service gets blocked, and another one doesn’t.
If I, if I do it, I’m gonna do it just like these and it’s gonna be YouTube and Twitch. I don’t really have any other places to stream things out to. So if anyone has like suggestions for other services they can stream, that I can stream to, that would be, that would be it.
Gina R. Fria says, I have simple queries running 10,000 times an hour. How do I fix this? That is something you can really only fix with developers. Mixer.
I don’t know. I’ve never heard of Mixer. I’m old, but I’ll look into it. But that is really something that only developers can fix, because things that are running that frequently, they’re not really running that...
Well, okay, I’m going to take that back. I’m going to walk that back a little bit in a minute, but if you have queries running that frequently from the application, then that’s something that developers, developers would have to fix either by, you know, not, I’ll say something obvious, not running those queries so much or having those queries hit like some caching layer and only refreshing the cache for those queries like less frequently.
The other times that I see simple queries running constantly like that is when you have scalar-valued functions or multi-statement table-valued functions that are called in queries that return lots and lots of rows, because one kind of hidden issue with functions like that is that they don’t run once per query.
They run once per row returned by the query. So they can really run quite frequently if you call them in queries that return or process lots of rows.
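As a hedged illustration of that per-row behavior (the function, table, and column names here are made up for the example, loosely following the Stack Overflow demo schema):

```sql
/* A scalar UDF referenced in a SELECT runs once per row the query
   returns, so 10,000 rows means 10,000 executions of this function. */
CREATE FUNCTION dbo.TotalScore (@UserId integer)
RETURNS integer
AS
BEGIN
    RETURN
    (
        SELECT SUM(p.Score)
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = @UserId
    );
END;
GO

SELECT
    u.DisplayName,
    dbo.TotalScore(u.Id) AS total_score /* fires once per returned row */
FROM dbo.Users AS u
WHERE u.Reputation > 100000;
```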
It could also be cursors or something like that, which you couldn’t fix either, but there’s no button you can push in SQL Server that says, you’re only allowed to run a hundred times an hour.
I wish there was, but that might be dangerous. If only SQL Server had a caching layer, I guess, I don’t know. I don’t, I don’t want that.
If anyone from Microsoft is listening or watching, I don’t, I don’t actually want that. But yeah, that is, that is typically a developer problem. That’s typically not something that a DBA can, can fix or limit on their own unless it’s happening from functions.
What I would say is if you want to, if you want like sort of a decent way to, oops, I did that all wonky. You want a decent way to track that down. Oh my God.
Hmm. I can’t type today. This is why I type everything ahead of time: because I can’t type, especially when people are watching. If we run sp_BlitzCache with @SortOrder = 'xpm'.
XPM stands for executions per minute. And what we can catch with executions per minute is stuff that executes quite frequently. And it just might be a sort of easy way to catch high, high frequency execution things.
So unfortunately, I don’t have anything all that interesting in here now. We could also just do plain executions and run that.
And this would sort the plan cache by executions. And this is another way to kind of find things that execute frequently that, but, you know, just may not use a ton of resources.
So like this has executed 243 times. Executions per minute isn’t something that the plan cache directly tracks. It’s something that sp_BlitzCache tracks by looking at the number of minutes between when the plan was created and last executed, and then doing some division to see how many times it was executed in those minutes.
So it’s coming up as zero here because of how fast those 243 executions happen. But generally, using either XPM or executions, you can catch what’s happening.
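The two sort orders mentioned, for reference (sp_BlitzCache is part of the First Responder Kit):

```sql
/* Sort the plan cache by executions per minute to surface chatty,
   high-frequency queries, even cheap ones... */
EXEC dbo.sp_BlitzCache
    @SortOrder = 'xpm';

/* ...or by raw execution counts from the plan cache. */
EXEC dbo.sp_BlitzCache
    @SortOrder = 'executions';
```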
If they are functions, then they might be things that you could rewrite and fix. But if they are, you know, just pure Entity Framework ORM queries, something like that coming in, then there’s not much you can do.
Kryron says, when we use MAXDOP 8 on an index, how is the index structured? I only use it so the index creates a little bit faster.
There’s really nothing about the way the index is structured that it changes or helps, at least for rowstore indexes. For columnstore indexes, DOP can affect compression: typically, the lower DOP is, the better compression you get, or rather, the fewer open row groups you can have.
But for rowstore indexes, I really only do it here during demos so that the indexes create a little bit faster. It’s really not any special magic, like some god mode, idclip, you know, Contra-code thing to get extra index lives or anything like that.
All right. I’ll give it another minute. Any other questions? Anyone else want to talk about anything? This went on longer than I thought it would.
I didn’t realize I had so much to say until I started talking. And then I, and I just kind of went off the rails. And now I have nothing.
Now I’m just brain dead and have nothing to say, which is probably nice. Like some people probably are grateful that I’m brain dead and have nothing to say. All right. Anyway, uh, thanks for showing up. Thanks for hanging out.
Thanks for watching. If you have not already done it, and you are the type of person who enjoys SQL Server content, follow me on Twitch or, as the kids say, like and subscribe over on YouTube, and you’ll get notified whenever I start doing one of these things.
If you follow me on Twitter, you’ll also get that. But you know, it’s always nice to have a good crowd in here to do things. I like having lots of people, lots of questions, lots of different levels of experience.
So, you know, come on back, tell a friend, invite people, whatever it is. I forget how all this stuff works, but please follow along. I’m going to be doing a lot more of these now that I’m kind of set up to do it, and I have all sorts of fun things on my mind. I hope that you find it useful and entertaining and enjoyable. I don’t know, maybe I’ll even come back tomorrow and do something. But thanks for showing up, thanks for hanging out, and I will see you next time.
Adios. Adios.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
Are you still planning on doing live streaming or taking a break?
I do! I’m away with family right now though, and then moving. Regular streaming probably won’t resume until September.
Great session. Entertaining. Really enjoy the deep dives into query plans, and massively appreciate you giving your time and knowledge so generously.
My pleasure! Can’t wait to do more!
Thank You