A Little About Select Queries That Cause Writes In SQL Server
Video Summary
In this video, I revisit an old topic with a fresh perspective and improved production quality. Erik Darling from Darling Data is back to discuss why select queries that cause writes in SQL Server are worth tuning, even though they appear to be simple read operations. I walk through a specific example using the Casey Kasem query, highlighting how eager index spools can lead to significant write operations despite the query being primarily a `SELECT`. By analyzing the query plan and leveraging tooltips for index pool information, I demonstrate how to identify and optimize such queries effectively. This video aims to provide valuable insights for database administrators looking to improve query performance by focusing on often-overlooked modification activities within select statements.
Full Transcript
Erik Darling here, apparently dying. I don’t know where that old man came from. Erik Darling here with Darling Data, recording another video with Camtasia, apparently, because the last one worked really well. I got to work on the camera focus a little bit. I noticed that I was a little blurry when I got in close and, well, you know, I didn’t feel great about being a little bit blurry. I’ll work ever diligently to be less blurry for you in the future. Since that went well, what I’m going to do is go back in time a little bit. I’m going to re-watch all my old videos where people were like, hey, the audio dropped out. And I’m going to try to re-record those. I’m not going to try to. I’m definitely going to re-record those. And, I don’t know, get twice the bang for the buck, I guess. And hopefully become less blurry. In the future. So, because this one was more recent. And, when I re-watched the video, quite frankly, I was appalled at how much of the audio dropped out. You know, you spot check this stuff, but you can’t catch everything.
So, I’m going to re-record these and start with this one. So, this is a redux, a redo, redo-dux, redo-dux, of why I care about select queries that cause writes in SQL Server. Now, when you insert, update, or delete, or select into, or merge data, you at least stand the chance of doing some writes. Like, you might affect zero rows and not do any. And, you know, I guess modification queries that don’t cause writes would be something interesting to talk about. But, no.
What I’m much more interested in, when I am tuning queries, when I am looking for queries to tune, is looking for queries that have written a lot of data that are only select queries. And, the reason why is because that often means that the queries had to spill a fair amount of data. There’s either a sort or a hash that spilled a good amount of data. Or, there was a spool in the query plan. Either a table spool or an index spool. And, you know, spools will show up in modification queries, oftentimes for Halloween protection.
But, you know, seeing them in select queries often, at least to me, says, there’s something we could fix here. There’s some room for improvement. Especially when there is an eager index spool. So, I have this query here. This select top 40, Casey Kasem query, where I have thoroughly untuned, detuned the query to the point where SQL Server says, You need an index so badly, I’m just going to create it for you. I’m not going to tell you about it because I don’t want to upset you.
But, here’s an index. And, when I run this query, and this query is just a simple thing that I put together to look at a few limited portions of the plan cache on my SQL Server. I don’t suggest you try to rewrite this query, use this specific query. You probably have your favorite plan cache query in the world. It might be, yes, people, it’s cash. I don’t know. Who knows?
I’ve never written my own plan cache query from scratch because I’ve grown to hate the plan cache over the years. I find it very unreliable. I often find that the plans that I need aren’t in there, either because the plan, the cache got cleared out, or there was a recompile, or, you know, what was the other good one that I had?
Oh, yeah. Maybe some knucklehead turned on an optimized for ad hoc workloads and ended up, we just only had a plan stub in the plan cache in place of a query that we could actually fix and tune, which is great, too. I love when that happens.
Good job, optimized for ad hoc workloaders. You really, really did a solid there. Thanks for that best practice. I wish there was a setting called optimized for no-lock workloads, and we could really make a good video. But anyway, pretty simple plan cache query.
It gives me just a little bit of information here, like the database name and the last execution time, which was today, recently, how many times the query executed, the total worker time, the total elapsed time, but more importantly, this column right here, the total logical rights.
Now, I included the memory grant stuff in here, because originally I was like, oh, I’ll show them what, like, spills, too, but I got a little bit lazy. And also, cached query plans don’t show spill operators, so I don’t know.
I just kind of passed on. I said I’ll save that for another video so I can make even more content. Just kidding. I’m not going to do that.
But we get the query text back, which we can see is the Casey Kasem query that I wrote before to grab the top 40 rows from the user’s table, and then we have the query plan. Now, in this query plan is one of my least favorite things to see in a query plan.
Actually, kind of like if you watch the NOLOC video, eager index pools are a lot like NOLOC hints. I love when I see them. I love when I see lots of them, and I love when I see them coming off of big tables, because I know that I can have an immediate profound impact on some queries without having to do any thinking whatsoever.
It’s just automatic at this point, which is great, because it’s one of those things that once you learn it, once you start applying it, you start to see what a huge difference tracking these things down and looking at them makes. Sometimes if I’m looking at weight stats on a server, and I see a lot of exec sync weights, E-X-E-C-S-Y-N-C, exec sync, if I see a lot of those, especially if there’s a long average duration for those weights, hoo-wee!
I know that I’m in for a lot of fun, because I know that there are going to be a lot of parallel query plans with eager index pools in them, and I’m going to be able to tune and tweet those, and act like I’m doing a whole lot of work when I’m really just goofing off making videos.
Shh! Don’t tell anyone! I don’t think… I’ve never been sued for theft of services yet, and hopefully I can avoid that for the rest of my life.
Just kidding. One of my mentors, Kendra Little, used to have a joke about how she would look busy while she was drawing cartoons and just tell her boss that code was compiling or something, and it reminded me of that.
So this query caused a lot of writes, because we had to write all… let’s see how many rows this is. Let’s do a little count here. I want to say that 8042010…
So we had to write 8 million rows into an index up in tempdb, an indexed object up in tempdb. You can think of the eager index pool like a temp table with a clustered index on it. Close enough. And so we had to index this query on the fly, and that’s why we did so many writes for this query, this simple select query.
Now, of course, if you watched other videos on my channel, and hopefully ones where the audio worked, I don’t want to say any names here, then you’ll know how to fix these. You look at the tooltip for the index pool, the eager index pool.
You can consider these seek predicates to be the key of the index you need to create. And you can think of the output list as the included columns that you need to create. In general, going back to our query, which is a little…
Oops, I went to the wrong demo window. Oh, I’m failing you all miserably. If we go back to the query that we ran, in the query plan, if we went by that, we only would have had the date column as an included column. But since we’re ordering by date, it makes a lot more sense to have the key of the index on the badges table beyond user ID, and then date, and then only include name.
The reason for that is because this cross-apply will most likely get applied as a nested loops join. And since we will have an equality predicate for each user ID, it’s more helpful for the order by part to have the date column in the key of the index, so we have the date in order for each user ID that we seek to, each equality predicate that we seek to.
So, if you’re like me, and you do a whole lot of digging through SQL Server, and you want a new thing to look for, when you are trying to find queries to tune to impress people, make money, get dates, whatever it is, whatever fame, fortune, and glory you seek in the world, look for select queries that cause a bunch of rights.
Now, there might be reasons, if it’s a small number of rights, it might be like a stats update, sticking a plan in query store, it might be, there are a couple other reasons out there. But if there’s a small number of rights, it might not be worth your time.
But when you see that number of rights start to really creep up to, you know, be like a larger number, like a six, seven digit number, maybe not a five, probably like a six, seven digit number, then that’s probably when I’d start paying attention and when I’d start want to looking at, looking deeply at that query plan to see whether it’s spills or spools, because you could be, you could really save a lot of performance problems by fixing these sorts of issues.
So anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Again, because it’s the second time I recorded this because Streamlabs stabbed me in the back on the first one bad. And hopefully Camtasia doesn’t do the same thing. If you like this sort of content, especially with full audio, and hopefully crisp video, I tuned things a little bit. It’s a little hard to tell because my video preview is tiny here and I can’t tell exactly how bright and shiny I am.
Let’s see, what did I cover? Like the video, subscribe to the channel. Thank you for watching. And then I will see you in another video relatively soon. I may have even spoiled the contents of another video when I clicked on the wrong tab in SSMS. So, oh, don’t tell anyone.
I don’t want to get sued for stealing my own demo. I would do that too. I’m litigious. Anyway.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.