SQL Server Performance Office Hours Episode 30


Questions:

  • Hi Erik, Thank you for the great content. Why is it better to have a higher sample rate for update statistics?
  • Hi Erik! I had a procedure with a chain of CTEs and UNIONs. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues. I went in desperation and popped an OPTION (RECOMPILE, HASH JOIN) in the end of the CTEs where they were inserting into a #Table. This only took it to less than 2 seconds! How on Earth can this have such a performance effect, and why could SQL Server not find it out by itself? Thanx
  • I recently purchased the Everything Bundle and am enjoying each module. Is there a recommended viewing order? For example, the Index Tuning module seems to reference the Hardware modules.
  • If you were curating a Paul White’s Greatest Hits compilation what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL Servers. Remember: All of Paul’s posts are at https://www.sql.kiwi/

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five of your most pressing SQL Server questions during one of our live Office Hours sessions. Whether it’s about the benefits of a higher sample rate for update statistics, optimizing performance in complex queries using OPTION (RECOMPILE, HASH JOIN), navigating through the Everything Bundle modules, or delving into Paul White’s invaluable blog posts, I cover it all. I also share some upcoming events where you can interact with me in person, including the PASS On Tour events in Dallas and Utrecht, as well as the PASS Data Community Summit in Seattle. If you have any questions or topics you’d like to discuss, submit them through the links in the video description or reach out directly; your feedback helps shape future content!

Full Transcript

Erik Darling here with Darling Data, and it’s time to do one of those there Office Hours episodes where I answer five of your most burning SQL Server questions, or whatever you ask me about. Anyway, five. If you want to interact with me in any way, you know, virtually, physically, whatever, look down in the video description. All sorts of helpful links. You can hire me for consulting, buy my training, become a member of this channel, and support my efforts to bring you high-quality SQL Server content. Ask me Office Hours questions on episodes just like this. You could be one of the lucky five people who gets a question answered. And of course, if you enjoy the things that I do here and you think someone else might enjoy them as well, you know, like, subscribe, tell a friend, all that good stuff. I will be live and in person a few more places through the end of the year: Dallas, September 15th and 16th, for the PASS On Tour event there; Utrecht, same event, different place, October 1st and 2nd; and of course, PASS Data Community Summit in Seattle from November 17th to 21st, where Kendra Little and I have two days of T-SQL pre-cons. You should go to both so that you can maximize your time with us. And with that out of the way, let’s go answer these questions. These have been sitting around for a little bit because I pre-recorded a lot of stuff while I was on vacation.

And getting back into things. And boy, did you ask a lot of questions. So first up, we’re going to do this one right here. Let’s make sure ZoomIt’s working. So hi, Erik. Thank you for the great content. Hello, you. You’re welcome. Why is it better to have a higher sample rate for update statistics? Well, the good news is it’s not always better. Sometimes the default sample rate is good enough to pick up on a good general description of what data lives in your indexes and describe that in the histogram. Other times you do need to up the sample rate in order to catch more granular things. If you are using a higher sample rate for some statistics and getting better representation of your data in the histogram, there’s no reason to fear or think that you need to do that for all of them. But that is just sort of how that works. If you are in a situation where even a full scan update of statistics is not doing a good job of describing your data, you might try filtered indexes or filtered statistics so that you have a smaller range of data.

You might create a filtered index or statistic on a particularly interesting data point for you and have a more accurate description of just that interesting data point. So again, not always better. Sometimes you do have to increase that. You know, I’ve worked on a few different systems where a regular default sampled statistics update did not do a good job.

I think we hit a sweet spot for some of those between about 10 and 20 percent. But I have a few demos that I do in my performance tuning stuff where I have to do 25 or 30 percent. And then there’s even one where I have to do a full scan in order for SQL Server to get things right.

So it is not a sort of across-the-board thing. It’s a very specific thing. And I wouldn’t recommend doing it for every index or every statistic, because that may be counterproductive to your maintenance time. But if you find that there are some statistics where it does work better, you did a good job and you should keep doing it.
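For reference, here’s a sketch of what those options look like in T-SQL. The table and statistics names are hypothetical, not from any particular demo:

```sql
/* Default sampled update: often good enough. */
UPDATE STATISTICS dbo.Posts;

/* A higher sample rate, for when the default misses important details. */
UPDATE STATISTICS dbo.Posts WITH SAMPLE 25 PERCENT;

/* A full scan, for the rare statistic that needs it. */
UPDATE STATISTICS dbo.Posts WITH FULLSCAN;

/* Keep your chosen rate across automatic updates, too
   (available since SQL Server 2016 SP1 CU4). */
UPDATE STATISTICS dbo.Posts
    WITH SAMPLE 25 PERCENT, PERSIST_SAMPLE_PERCENT = ON;

/* A filtered statistic that describes just an interesting slice of data. */
CREATE STATISTICS s_Posts_Recent
    ON dbo.Posts (CreationDate)
    WHERE CreationDate >= '20240101'
    WITH FULLSCAN;
```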

All right. Next up here. Hi, Erik. Hi, you. How are you doing? Oh, there’s even a thanks on this one with an X. That’s a spicy thanks. Is this thing blurry? Is it just my eyeballs?

Let’s see. Can I get it? Oh, there we go. I’m a little bit more clear now. Hi, Erik. I had a procedure with a chain of CTEs and unions. You’re off to a good start. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues.

Oh, you have a high pressure job, my friend. I went in desperation and popped an OPTION (RECOMPILE, HASH JOIN) at the end of the CTEs where they were inserting into a temp table. This only took it to less than two seconds.

How on earth can this have such a performance effect, and why couldn’t SQL Server figure it out by itself? Well, you left a few details out of here. You know, I mean, obviously, I’m not asking to see the full definition of the query, because a chain of CTEs and unions?

That’s pay-me-money territory. And the fact that you have a recompile on there makes me wonder if there are any parameters or local variables involved. It could perhaps be influencing the optimizer’s estimation process.

But there are lots of cases where SQL Server may misunderstand the number of rows that might come out of certain operations, you know, how many rows get filtered by your where and join clauses, survive group by and having type stuff, and make progress through the plan. And you might end up with, well, in general, you might end up with a lot of nested loops joins that execute way, way more times than the optimizer anticipated.

So without seeing the query plan, my guess would be that you had a lot of nested loops joins in there in that plan and that SQL Server was getting stuck in them, perhaps even serial nested loops joins. I guess there’s also the possibility that you had some untimely merge joins, especially if you had to sort data before going into those merge joins. But the reason why the optimizer didn’t find it?

Well, I mean, you know, it could be a lot of things. It could be that you have inaccurate statistics in your database. It could be that you had a parameter sensitivity issue.

It could be that you were using local variables and SQL Server was not estimating things well. You could be using a table variable in there, too, and getting a bad estimation. But also SQL Server could just be doing a poor job of cardinality estimation without any of those things.

And you ended up with some nested loops joins that were completely inappropriate for the number of rows that you were managing. As for why SQL Server couldn’t find out by itself, well, you know, that’s a query complexity thing. The optimizer only has so much time that it’s willing to spend on a query figuring stuff out.

And sometimes it does not have a chance to explore, or correctly cost, all of the alternatives. So, you know, this is where stuff like query hints really does come into play. You could try simplifying the query a bit more, maybe removing any elements from the query that would harm estimation: local variables, table variables.

Maybe, you know, if there’s any sort of parameter sensitivity, you could find ways of dealing with that. But if you’ve got things solved with recompile hash join, I don’t see a reason for you to keep tinkering with things. Sounds like you’ve got it under control.
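To make the shape concrete, here’s a rough sketch of where those hints land, on made-up tables; the questioner’s real query is much longer, and this only shows hint placement:

```sql
WITH
    u AS (SELECT Id, Reputation FROM dbo.Users WHERE Reputation > 10000),
    b AS (SELECT UserId FROM dbo.Badges WHERE Name = N'Populist')
SELECT u.Id, u.Reputation
INTO #Results
FROM u
JOIN b
    ON b.UserId = u.Id
/* RECOMPILE gets a fresh plan with current runtime values;
   HASH JOIN takes nested loops (and merge) joins off the table
   for the whole statement. */
OPTION (RECOMPILE, HASH JOIN);
```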

So, you know, the optimizer is a big piece of software, and it’s made by humans, just like most software. I’m going to draw the line a little bit at LLM-generated software, because that’s not made by humans, and that’s full of mistakes too. But, you know, the optimizer can’t do everything.

It can only make a good enough guess most of the time at a good enough plan. All right. Next question here. I recently purchased the Everything Bundle.

Woohoo. Thank you for your support. And I’m enjoying each module. Is there a recommended viewing order? For example, the index tuning module seems to reference the hardware modules. So I think as long as you do the starting modules first, the Essentials modules, you should be in pretty good shape.

If there’s anything in the other modules that references stuff back, you can always go watch those other modules afterwards and sort of figure out where things maybe connect. But I don’t really have a recommended viewing order aside from starting in Essentials and then kind of whatever you find interesting and appealing, or sort of whatever applies to problems that you’re trying to solve. So that’s about all the advice I have there.

All right. Last question. Gonna finish on a strong one. Oh, my dear friend, Paul. All right.

If you were curating a Paul White’s greatest hits compilation, what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL Servers? All right.

So I’m going to answer the second question first. Paul White would be an excellent sandwich maker. Paul White would make the best sandwiches you’ve ever tasted. They would leave you with a sense of peace and well-being, just like his blog posts do, that you didn’t know was humanly achievable. So that answers that question.

As far as curating Paul White’s greatest hits, Paul has a number of absolutely great posts on the SQL.Kiwi site, where all of his posts live now. This is actually something that a lot of people miss. A lot of people will Google Paul White and a topic and they’ll go to the old SQL Performance site, or they might even find a dead SQLblog link.

But all of Paul’s posts these days are on SQL.Kiwi. And I’m going to put this in the video description because it’s that important. All of Paul’s posts are there.

You can find lots of one-off posts that are amazingly long, detailed, full of, you know, great information. But I think if you were to, if I were to curate Paul White’s greatest hits, it would be any post that is part of a series of posts. Because I know when Paul White writes a series of posts, that man has got his head around something.

And that’s where I think the best stuff comes in, because you have sort of this learning pathway: a series of things about a particular topic that wouldn’t all fit into one post. So I think whenever Paul White writes a series, that’s the stuff that I would put in the greatest hits. If I were going to make a best-of-Paul book in tribute to Paul White’s writing, I would start with all of the posts that are part of a series.

You know, as far as stuff that has, I think the series of posts that has influenced me the most from him is certainly the Isolation Level series. I have spent a lot of time with that one, especially as I’ve been presenting more and writing more about Isolation Levels. So that one has had the, I think, the most recent big impact on me.

But there are many posts of his that, you know, I have read at different points in my career. Paul White has, of course, been blogging for quite a while. He was actually the first blogger I ever read when I was trying to figure out how to write fast performing paging queries.

He had a couple of posts on SQL Server Central. But yeah, like there are posts that I’ve had to read at like various stages in my career because there were things in each of them that I didn’t fully understand until I understood a lot of other stuff first. And so like it took a lot of puzzle pieces for me to actually get some of the points that he was making, even from many years ago.

So I don’t know. Of course, as always, thank you to Paul for everything you’ve done over the years. You are just an amazing human being and teacher.

So thank you for that. But yeah, you know, I would start with anything that’s a series. That’s the really crazy, sit-down-and-concentrate stuff. You’ve got to spend some time with those in order to fully get them.

But anyway, that’s five questions. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in another video another time. Oh, okay.

Let’s just, we know it’s tomorrow at noon Eastern. So let’s just stop pretending. Anyway, cool. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Using Windows Performance Recorder


Video Summary

In this video, I delve into using Windows Performance Recorder to uncover deeper performance issues in SQL Server queries. After running a set of demos where certain queries took significantly longer than expected, I couldn’t pinpoint the exact cause based on traditional query plan analysis and wait stats. To get to the bottom of it, I turned to Windows Performance Recorder, which provided valuable call stack information that led us to discover the impact of read ahead reads, and to trace flag 652 as the fix in this particular scenario. This video is a great example of how sometimes you need to go beyond the query plan and use additional tools like performance recording to truly understand what’s happening under the hood.

Full Transcript

Erik Darling here with Darling Data. And today’s video is an interesting one. So this video is going to talk about using Windows Performance Recorder to track down a deeper level of performance problem in SQL Server. This is actually based on a demo that I had recorded a little before I went on family vacation, and it bothered me the whole time. And when I got back, I started looking at it again, and I really couldn’t figure it out. And so I showed a few people and they were like, I don’t get it either. So I used Windows Performance Recorder and found some stuff. And then it was actually my good dear friend, Forrest McDaniel, who reminded me about a trace flag that we could use to fix this one. But this is an interesting one. So I hope everyone’s buckled up. Anyway, if you look down at the video description, there you’ll see all sorts of helpful links where you can hire me for consulting, buy my training, become a member of the channel, and support the stuff that I do here to bring you this content. Ask me office hours questions for my office hours episodes. Those usually come out Monday, where I answer five of your questions. And of course, if you enjoy this content but you don’t want to pay me to do anything, you can always like, subscribe, tell a friend and help this channel get bigger and better in other ways. Of course, I still have some speaking events coming up. Just finished PASS On Tour New York City, but there are still events in Dallas coming up in September, where I have a half day pre-con and a regular session. And then there is the one in Utrecht, where I have just about the same thing. Then of course, there is PASS Data Community Summit coming up in Seattle, November 17th to 21st. I have two days of T-SQL pre-cons with the wonderful and talented Kendra Little. And I look forward to seeing you at both of those, not just one of those.
You should go to both of those because that’s how you maximize your knowledge and learning. So with that out of the way, let us get to the database party.

Now, just to bring things back to where they were, right? I had this set of demos where there were a couple of queries that I ran. And in one of the queries, I used this OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL')). And what ended up happening in these query plans was, the first query, with row goals enabled, ran for about five and a half seconds, 5.6 seconds. Good enough there. But the other query ran for 40 seconds. And all 40 of those seconds were sort of spent in a place that didn’t make any sense to me. So I’m just going to shuffle these query plans around a little bit. And the biggest time suck in these query plans was right in this branch. So this one took 3.386 seconds, which isn’t bad, right? This is the fast query plan. This one took 37 seconds, which is bad, right? That’s like a 10x slowdown for this one operator. And at first, when I watched this happen, I couldn’t for the life of me figure out what was going on with it.

Because when I looked at the wait stats for the queries, so if we get the properties over here and we open up wait stats, there was not a whole lot of wait stat evidence about what might be going on in here. Right? The waits for the quick query were, of course, not really anything interesting, but then the waits for the slow query, let me just make sure that’s the right one there. The waits for the slow query really didn’t add up to anything either. Just looking through these, nothing in here said, this is why I took 30 seconds, right? None of these waits are doing very much.

There’s very, very little going on in here wait-wise. So what I did was I fired up my friend, the Windows Performance Recorder, which, since this is a VM, I had to do over here. So what I’m going to do is show you sort of what I do to get Windows Performance Recorder stuff. Now, of course, the first thing you have to do is actually run it to collect things, but I’ve already done that because I don’t want you to sit there and wait for a long time for stuff.
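If you’ve never collected one of these, the command line version looks something like this, run from an elevated prompt on the server (the trace file name is whatever you like):

```
wpr -start CPU

REM ...run the slow query while the recording is going...

wpr -stop SlowQuery.etl
wpa SlowQuery.etl
```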

So once I had things recorded for the two queries, what I usually do is go into the CPU stacks, and then I’ll go into SQL Server, and I’m going to make this bigger and do some zooming so you can all see what I get up to. So the first thing you want to do is select everything in here, and then you want to say, look up symbols.

So this is going to go out to Microsoft, and this is going to look up symbols and resolve the names of any public symbols that it can. This is going to take a couple of seconds to run. Maybe I should have done this before the video, but, you know, I’ve got pretty fast internet, so I’m not too, too worried about it.

While this is running, what I’m going to do is change a few things in here that I like to change: the group paths and the fold percentage. So I usually just delete that and hit enter. And then for the fold percentage, I put in zero here and hit enter.

And this will sort of expand everything to what we care about. Now, uh, looking at the stuff in here, I’m just going to slide this over. So it’s a little bit more clear.

Oh, that didn’t help much at all, did it? So if we look at what’s going on in here, we have, you know, this locate column internal thing, but there’s not really a lot of time spent in this one where we care about stuff. Where this sort of becomes more useful is when you start comparing one set of call stacks to another, but we need to open up the other one and do stuff there first.

So the CPU stacks here just show some stuff like locate column internal. I think I actually have to squish that a little bit so that I can use ZoomIt to zoom in on things. Otherwise it doesn’t really cooperate with me. But for some reason, ZoomIt doesn’t want to cooperate when this thing is up anyway.

So we’re just going to skip that part, but we don’t really have anything too, too interesting going on in here. All right. This is the, this is the fast plan.

So let’s go look at the slow one and crack open this bad boy. We’re going to go back into CPU stacks, back into SQL Server, and make this as big as we can. And what we have in here is some interesting stuff.

Once we get rid of the group paths and once we do the fold percent zero, what we’re going to see is that we have a new thing in here. The couple of new things that we have in here are a read ahead helper and read ahead.

I think this one, too, was interesting. So if we were to compare one set of call stacks to another, this is really where we had way more stuff going on.

Right. This is where like we spent way more CPU time. And this is the only way that I can figure out what happened.

Cause again, just looking at the wait stats and other stuff for the query, there was no sign at all that this stuff was coming up to the forefront. But then if we bring these percents over, excuse me, we’ll see that these things were present in this call stack trace where they weren’t really in the other one. And they have fairly high percentages, as far as stuff that executed.

There’s all sorts of other stuff that you could look at in here, like flame graphs of things, but I’m not enough of a developer to understand those fully. So this is how I was like, huh. And then, as I was sharing this stuff with people, I was just like, what? This is a weird thing.

Again, my friend Forrest McDaniel was like, hey, there’s a trace flag to disable read ahead reads: 652. I think Paul White’s blogged about it, but Forrest remembered it. So Forrest gets full credit on that one.

But if we run this query, we turn on this trace flag 652, and also 7418, which just changes how operator times show their time in the query plan. In a row mode plan, the operator times are cumulative going from right to left.

But in batch mode plans, they are per operator. So 7418 makes row mode plans act like batch mode plans and just has the operator times show as per operator. So this one isn’t part of what solves the problem.

652, which disables read ahead reads for the query, this is what solves the problem. So if we rerun this query with those trace flags enabled, we can look at the execution plan without read ahead turned on. And what I don’t want you to think is that read ahead is a bad thing.
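As a sketch, you can scope this to your session so nothing else on the server is affected; and to be clear, this is for demos and troubleshooting, not production:

```sql
/* 652 disables read ahead reads for this session;
   7418 only changes how row mode operator times display. */
DBCC TRACEON(652, 7418);

/* ...run the query and grab the actual execution plan here... */

DBCC TRACEOFF(652, 7418);
```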

Uh, read ahead is generally a very, very good thing. It turned out to be a negative thing in this plan because of how many of them we had to do. Like there’s two kinds of killer latencies.

Typically in computer systems, there’s the stuff that of course has a long latency, right? You might see that for lock waits and other things that, when they drag on for too long, cause other things to start having problems. But then there’s the type of latency where it’s sort of the death-by-a-thousand-cuts, swarm-of-mosquitoes latency, where you have to do one thing.

And even though that one thing is really fast, if you have to do a lot of it, well, you can end up in a pretty tough spot. So if we look at the execution plan for this with read ahead disabled, this actually ends up running, uh, pretty quickly even compared to the fast plan before. It’s about 500 milliseconds faster.

Not that that’s the meaningful difference; really, the difference between this one and the row-goal-disabled query was where it got really interesting. So when you’re troubleshooting query performance, you might find all sorts of good things to look at contextually for the query.

The query execution plan is always going to be your first source of truth for where time was spent, so get the actual execution plan. Then you might look at wait stats for the query plan and see interesting things in there. Or you might even look at sys.dm_exec_session_wait_stats for your session and see if there’s anything in there.

Because, as I’ve talked about a few times before, Microsoft screens certain waits out of actual execution plans. It doesn’t report on them. So you might look in the sys.dm_exec_session_wait_stats view to see if there’s anything additional that would be useful for troubleshooting.
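Querying that view is straightforward; one common approach is to take a snapshot before and after your query runs and diff the numbers:

```sql
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_exec_session_wait_stats
WHERE session_id = @@SPID  /* just my session */
ORDER BY wait_time_ms DESC;
```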

Other times you have to go deeper and figure out stuff that a query plan would never report on. And call stacks are one of them. And Windows Performance Recorder is a much easier way to just record general call stacks for when a query is executing.

Than, say, something like using the debugger and having to attach to a thread, set breakpoints, and click through and wait. Because that’s where it gets really difficult. And of course, the nice thing about Windows Performance Recorder is that if you find interesting call stacks in here, you can actually use something like WinDbg to set breakpoints on the interesting things you find.

So you’re just not hunting through this haystack of things, hitting enter and letting the debugger make progress after each new step. So this was an interesting one. Again, read ahead is generally not bad.

Don’t go disabling read ahead. I don’t want you to take that away from this video. I don’t want you to think that read ahead reads are awful. Generally, they’re very good things and they help performance quite a bit. But in this case, they ended up being the problem for this one particular query, and tracking down what the problem was, was to me very interesting and very fun.

And of course, I do always have to thank my dear friends for helping me remember the things that I cannot possibly keep all contained in my head at once. So anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you over in the next video, which should be an office hours episode, because the next video will be on a Monday, today being Friday. All right.

Cool. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Skewed Data and Skewed Parallelism


Video Summary

In this video, I delve into a fascinating SQL Server performance issue: how skewed data can lead to skewed parallelism, which in turn can significantly impact query performance by sacrificing CPU resources to reduce wall clock time. I walk through an example using the Stack Overflow database, where I artificially skewed the `votes` table and ran a query that highlighted this problem. By examining the query plan and analyzing various metrics such as CPU time, elapsed time, and memory grants, we can identify when parallelism is ineffective due to skewed data distribution. This video serves as an excellent introduction to recognizing and understanding this common performance bottleneck, setting the stage for future content where I will explore solutions to address these issues.

Full Transcript

Erik Darling here with Darling Data. And in response to many miserable, wretched people complaining about the fact that I need to make a living, and owing to the fact that I needed to edit some slides anyway, I’ve redone my intro deck. So we’re gonna experiment a little bit with that today. In this video, we’re going to talk about how skewed data can lead to skewed parallelism. And skewed parallelism, of course, means where many rows end up on either one or a small number of threads, and kind of undo all of the potential speed improvements of sacrificing CPU in order to reduce wall clock time. I’ll talk about why that can happen, and how you can find it. I’m not gonna talk about how to fix it in this one, because I’ve got something else coming up for that. But this is a good way to introduce the problem and to help you identify it in a few different ways. So if you look at the video description, you will see all sorts of helpful links. You can hire me for consulting, buy my training, of course at a discount, because I do care about the people who buy my training, and I’m not going broke when they do so. Become a channel member to support my efforts to bring you this high quality SQL Server content. Ask me questions on office hours. And of course, if you like this stuff, please do like, subscribe, tell a friend and all that good stuff. I just finished PASS On Tour New York City, so that one is off the list. I still have PASS On Tour Dallas and PASS On Tour in Utrecht, where I have half day pre-cons and regular sessions at both. And of course, the PASS Data Community Summit coming up in Seattle November 17th to 21st, where Kendra Little and I are tag teaming two days of T-SQL pre-cons. But with all that out of the way, let’s have ourselves a little database party here. Now, let’s go over to SQL Server Management Studio.
What I’m going to do is explain to you a little bit about how I created this data, what the query plan with the skewed parallelism looks like, and some ways that you can start figuring out if you’re having this problem. We’ll get to the solving-it part later, in a different video. So, the votes table in the Stack Overflow database has the user ID column, and the user ID column in the votes table is scrubbed, not skewed; I made it skewed. It’s scrubbed in the public data releases because, of course, you’re supposed to be able to vote in anonymity.

On Stack Overflow, no one knows who you upvoted or downvoted. You’re free to divulge that information if you’d like, but the nice folks at Stack Overflow don’t want to just go publicly tell everyone what you did. Now, after I selected everything from the Votes table into a new table, I updated that table to replace the NULL UserIds. A couple of vote types are for bounties — bounties are when you want to give people additional reputation, coming out of your own reputation, for having a really good answer or something — so some of the UserIds are populated. What I did was set all of the NULL UserIds, which are the majority of the table, equal to the value in the VoteTypeId column. And there are not too many VoteTypeIds; they’re mostly upvotes and downvotes.

There are a few other ones, but they’re not terribly interesting, nor do they comprise much data in the table. Of course, then I had to create a clustered primary key on the table so that we were not dealing with some terrible heapness, which you can blame things on. And then I needed to create an index on the table to support the query that we’re about to run — or rather, that I ran ahead of time.
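From that description, the setup script probably looked something like this — a hedged reconstruction, since the actual script isn’t shown, and the index key choice is my guess:

```sql
/*Copy the Votes table into a new one*/
SELECT
    v.*
INTO dbo.VotesSkewed
FROM dbo.Votes AS v;

/*Replace the mostly-NULL UserIds with the VoteTypeId to manufacture skew*/
UPDATE vs
    SET vs.UserId = vs.VoteTypeId
FROM dbo.VotesSkewed AS vs
WHERE vs.UserId IS NULL;

/*Clustered primary key, to avoid terrible heapness*/
ALTER TABLE dbo.VotesSkewed
ADD CONSTRAINT PK_VotesSkewed_Id
    PRIMARY KEY CLUSTERED (Id);

/*Nonclustered index to support the demo query*/
CREATE INDEX vs_UserId
    ON dbo.VotesSkewed (UserId);
```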

Why did I run it ahead of time, you might ask? Because it runs for a little bit. What this query does is select from the Users table, because the Users table is kind of where most queries start — you want to figure out what user stuff belongs to, who did what.

And then we OUTER APPLY this VotesSkewed table down here, correlated on that newly populated UserId column, and we say WHERE VoteTypeId is between one and four. This is going to be part of what gives us the skew, because the VoteTypeId ends up as the UserId for these rows — these are the ones that were NULL before.
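Based on that description, the demo query would look roughly like this — a hedged sketch, since the selected columns and the body of the apply are assumptions:

```sql
SELECT
    u.Id,
    u.DisplayName,
    vs.*
FROM dbo.Users AS u
OUTER APPLY
(
    SELECT
        v.*
    FROM dbo.VotesSkewed AS v
    WHERE v.UserId = u.Id /*correlated on the newly populated column*/
) AS vs
WHERE vs.VoteTypeId BETWEEN 1 AND 4; /*the formerly NULL rows*/
```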

So this is where things kind of get fun. So if we look at the query plan for this, right, we will see that we indeed have all of the hallmarks of a parallel execution plan. And we have parallel exchanges.

Well, actually, we have one parallel exchange. OK, singular: one parallel exchange. We also have all sorts of little parallel indicators across all of these wonderful operators, telling us that SQL Server used multiple CPUs to run parts of the query. There’s one up here, too — let’s go get that.

Let’s be completists over here. Now, this one here doesn’t count, because this sort had a spill. The spill doesn’t really matter all that much for this, but it is actually kind of interesting, in a way that we’ll talk about. So let’s start up here in this part of the query plan, and let’s make this a little bit bigger so we can see what’s going on.

When we expand this actual-number-of-rows thing — node, whatever you want to call it — and zoom in, you’ll see that each of these eight threads did a fairly even amount of work.

Right. You might, you know, pick a few bones about some of these, but for most of this, it’s fine.

So in this starting part of the query plan, the scan of the Users table, we do not have any problems. And when we go into the sort — and I don’t know why it decided to do that to me — if we look at the sort operator, we’ll see that parallelism remains pretty even across all of these threads, too.

We didn’t really lose anything here. And when we go into the nested loops join, things are still OK.

We’re still doing all right. But now this is where things start to get a little wonky on us, and this is why this lower branch of the plan is where we start having issues: if we go to the clustered index seek here, we will see that all of the rows ended up on a single thread.

The reason why this happens with skewed data is that SQL Server uses this thing called the parallel page supplier to hand rows out to threads. Part of how it decides which rows go on which threads is by using sort of a modulo hash function: if this number hashes to this bucket, then you go to this thread. In this case, all of the rows just happened to hash out to thread one.
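The parallel page supplier’s internals aren’t documented in detail, so take the modulo description as an analogy. But the arithmetic is easy to illustrate: when the qualifying rows carry only a few distinct values, hashing those values modulo DOP can only ever produce a few distinct buckets, and bad luck can collapse them further. This is an illustration only, not the actual mechanism:

```sql
/*Illustration: bucket rows the way a naive modulo hash might at DOP 8.
  With only a handful of distinct UserIds surviving the filter,
  at most a handful of buckets (threads) can ever receive rows.*/
SELECT
    bucket = vs.UserId % 8,
    row_count = COUNT_BIG(*)
FROM dbo.VotesSkewed AS vs
WHERE vs.VoteTypeId BETWEEN 1 AND 4
GROUP BY vs.UserId % 8
ORDER BY row_count DESC;
```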

And this is going to be true as we go across. Now, I said that the sort didn’t really matter, but there is something interesting about it. The interesting thing is that all query plans start as serial execution plans, and parallel plan exploration isn’t started unless the subtree cost of the serial plan is higher than the cost threshold for parallelism.

That means all of the stuff that SQL Server initially assigns to a query — one of those things being the memory grant — is decided before a parallel plan is explored, and isn’t changed if a parallel plan is chosen. Now, if you have a DOP 8 query, the memory grant is not then multiplied by DOP. The memory grant is divided by DOP.

So the memory grant for this query — if we hover over here — is 359 megs, which isn’t big by any stretch of the imagination. But this number is divided by 8 for us, because this query is running at DOP 8, even though our sort operator asked for enough memory to sort all of that data. But look what happened.

Because we had one thread with all the rows on it, that thread spilled entirely: it only had one eighth of the memory. If that one thread had all 359 megs, maybe it wouldn’t have spilled. But since it had 359 divided by 8 — roughly 45 megs — it didn’t get nearly enough.

So it spilled a whole bunch. Now, this was a particularly long running spill: we had 4.6 seconds spent here — let’s just call it 4.7 seconds.

And then we have 45 seconds here, minus that 4.7 — so let’s just call this 40 seconds of this one thread spilling out to disk. This did cause a performance problem for us, because this one memory-requiring operator got one eighth of the memory.

Or rather, the memory was spread across all eight threads, so each thread got one eighth of the memory grant, and the one thread doing all the work didn’t have nearly enough memory.

So this spilled out a whole bunch. Now, aside from looking at the rows per thread, like we did in the query plan, there are some other ways that you might look at this and decide that you have an issue.

So one thing that I always tell people to look out for when they run a parallel query, and it does not run as quickly as they would hope, is to look at the CPU and elapsed time. One very, very good indicator that parallelism was ineffective is if CPU time and elapsed time are very close together.

In a perfect world — let’s say we have a DOP 8 query — we would want to see CPU time be 8x elapsed time. We would want to see all eight threads do an even-ish amount of work, all working cooperatively to produce an elapsed time — wall clock time — that is one eighth of CPU time.
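If you want to hunt for this pattern outside of a single SSMS session, a rough plan-cache query like the sketch below can surface heavy queries whose CPU time and elapsed time are suspiciously close. The thresholds are arbitrary, and this also catches purely serial CPU-burners, so you’d still open the plan and check for parallel operators:

```sql
/*Rough sketch: cached queries burning serious CPU where average CPU time
  is at least 90% of average elapsed time — parallelism (if any) isn't
  buying wall clock time. Both time columns are in microseconds.*/
SELECT TOP (20)
    qs.execution_count,
    avg_cpu_ms = (qs.total_worker_time / qs.execution_count) / 1000,
    avg_elapsed_ms = (qs.total_elapsed_time / qs.execution_count) / 1000,
    query_text = st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE qs.total_worker_time / qs.execution_count >= 10000000 /*>= 10s CPU per run*/
AND   qs.total_worker_time >= qs.total_elapsed_time * 0.9
ORDER BY qs.total_worker_time DESC;
```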

So in this case, we have 45 seconds of CPU time and 53 seconds of elapsed time, and we can kind of tell from that alone that parallelism wasn’t very effective. From a wait stats perspective, this is where things get a little disappointing.

If we look at wait stats for this query, we’re not going to see any parallelism waits in the query plan that help us figure stuff out. We are going to see a whole bunch of IO_COMPLETION waits from the sort that spilled, but we’re not going to see the wait that really makes a difference here, which is CXCONSUMER.

Microsoft screens CXCONSUMER waits out of actual execution plans because some guy who works at Salesforce says they don’t matter. But in reality, they actually do for this sort of thing.

So if you see a parallel query running and the CXCONSUMER waits are really high, and you have other indicators like the ones I’m showing you here — parallelism that is obviously skewed when you look at the rows per thread, or CPU time and elapsed time that are very close — then that’s a pretty good sign that your query has very skewed parallelism.

So these are things for you to watch out for when parallel queries are not running as quickly as you would like them to. And just to finish things out, this is what the VotesSkewed table looks like as far as distributions go.

I suppose I should limit this to something a little bit more useful, because we have WHERE VoteTypeId BETWEEN 1 AND 4 up there. So let’s add that in down here.

Let’s not be silly and foolish with ourselves — let’s look at exactly the data we care about. This is how all the rows distributed, and because of whatever hash function SQL Server chose when it was doing the parallel page supplying, every single row ended up on one thread.
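The distribution check being described is presumably a simple group-by along these lines (a sketch):

```sql
/*How the qualifying rows distribute across UserIds*/
SELECT
    vs.UserId,
    row_count = COUNT_BIG(*)
FROM dbo.VotesSkewed AS vs
WHERE vs.VoteTypeId BETWEEN 1 AND 4
GROUP BY vs.UserId
ORDER BY row_count DESC;
```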

So that’s what happened there. That’s how skewed data can lead to skewed parallelism, and how it can cause you performance problems. So thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will have a follow-up video on this one eventually about how to fix skewed parallelism for this query. So we have that to look forward to.

But tomorrow’s video is going to be a different one, where I’m going to dig into a way of troubleshooting query performance that goes a little bit deeper than just looking at query plans and wait stats and stuff, because we’re going to look at a query that doesn’t have anything helpful in that regard.

And we have to go deeper. And we have to look at some more stuff. So anyway, thank you for watching. And I’ll see you in tomorrow’s video, I hope.

I pray we all survive the night. Anyway, goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Query Behavior Under Read Committed Row Versioning

Learn T-SQL With Erik: Query Behavior Under Read Committed Row Versioning


Video Summary

In this video, I delve into the nuances of optimistic isolation levels and row versioning in SQL Server, specifically focusing on how certain query structures can lead to unexpected results when using an optimistic isolation level like `READ COMMITTED SNAPSHOT`. By walking through a practical example involving seat assignments at a dinner table, I illustrate how queries that span multiple phases (select and update) can be problematic. This video is part of a comprehensive beginner content series from Darling Data, where all 23 hours of material are currently available for pre-sale at an exclusive price of $250. With advanced content set to launch after the summer, this offer will double in value, making it a fantastic opportunity to invest in your SQL Server knowledge now and into the future.

Full Transcript

Erik Darling here with Darling Data. In this video, we are going to talk about where an optimistic isolation level — a row versioning isolation level — may not be perfect because of the way you have written a query. All right. Now, there are all sorts of things in code that may attempt to enforce integrity of some sort. It may be referential integrity, it may be business rules. It might be in a trigger, it might be in a procedure, it might be in a function — who knows where you’ve hidden these things. And that code may not work the same way under a row versioning isolation level as it would under a locking isolation level, unless you add in locking query hints. Now, before we do that: all 23 hours of this fantastic beginner content is available for you to start watching right this instant. You push-button kids, you don’t know how easy you have it. All for 250 bucks, currently, at the presale price. That’s good for the rest of your life, and it will double in value, at least, when the advanced material goes up after the summer. And you’ll wish that you had spent less now. So you should do that now. Anyway, let’s talk about this stuff. So I am going to create a table and fill it with a little bit of data, and this is what the data in our table looks like. All right. I’ve got some lucky fans in here, and what we notice immediately about this table is that there is an empty seat at it. We have five lucky fans who are going to eat dinner with someone — but who will they eat dinner with? That’s the question.

Now, when I said “because of the way you’ve written the query,” we need to talk about that. Right now, in my Stack Overflow database, I have the read committed snapshot isolation level turned off — it is disabled; you can see the off button right there. And this is the query that will be running. We’re going to declare a variable up here to hold a name, because we have to give that seat to somebody. In this case, it is going to be Eric. Sounds good to me — I like dinner. And then we are going to run a query to select the top one seat that is free.

So we had that one free seat. That seat had is_free equal to one and a NULL for the name — we don’t need to check the name, though; all we need to see is that the seat is free. And then, based on this free seat, we are going to update the table: set is_free to zero and assign that seat to a lucky contestant. And I’m going to use OUTPUT to show the changed rows, so it’s a little bit easier to see exactly what happened when the query ran.

And then, of course, that CTE gets joined to the table based on the ID that we find here. So under normal circumstances — or rather, under read committed locking, assuming you don’t have NOLOCK hints everywhere, because if you have NOLOCK hints on these queries you’re going to see weird results no matter what — you are depending on that locking to give you the behavior that you want your query to have.
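Pieced together from that description, the two-phase pattern probably looks something like this — table and column names are my guesses, since the demo script isn’t shown:

```sql
DECLARE @name nvarchar(40) = N'Eric';

/*Phase one: a select locates a free seat*/
WITH free_seat AS
(
    SELECT TOP (1)
        dp.id
    FROM dbo.dinner_plans AS dp
    WHERE dp.is_free = 1
    ORDER BY dp.id
)
/*Phase two: the update claims it, with OUTPUT so we can see what changed*/
UPDATE dp
    SET dp.is_free = 0,
        dp.name = @name
OUTPUT inserted.*
FROM dbo.dinner_plans AS dp
INNER JOIN free_seat AS fs
    ON fs.id = dp.id;
```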

If we begin a transaction and run this update, we will see that this lovely Eric right here acquired seat number one. We have set his name, and we have said that the seat is no longer free. Which means that if we come over here and try to do the same thing — we do this stuff and we try to run the same query…

This query is going to get blocked — but most importantly, it is going to get blocked here. This is where it gets blocked, because that other query has an exclusive lock to update the dinner plans table.

The select gets blocked. So if I try to run this, it’s going to be blocked, and we can tell it’s blocked because it’s just sitting there spinning and nothing good is happening. We’re not getting anything out of this at all.

We apparently have some sort of police emergency outside. Thank you. Thanks for keeping us safe. But we are completely blocked up here.

So if I come over and commit this transaction, the change that we made will be committed, and this query will find no free seats. This thing right here — we got no rows back from the OUTPUT clause.

And you might think: hey, this is great. Read committed locking — amazing. You know, this query was blocked for a while while we were talking about stuff and the federalities were outside doing something, who knows what.

But this gives you what would be a correct result. We did not lose the update that the first query ran, because this query did not overwrite it — this query did not find a free row when the select finished running. As soon as that lock released and this query was allowed to run, it said: there are no free seats.

Wah. Ah, that’s a bummer. This backwards Eric can’t possibly have dinner now. So let’s reset those tables.

We have set name back to NULL and set is_free back to one — and that’s all for seat number one. Now we are going to come over here and set read committed snapshot on. This might take a second, because we have the ROLLBACK IMMEDIATE thing happening, which is always, always the best way to change your database’s read committed state.
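The command being run is presumably along these lines (database name assumed):

```sql
ALTER DATABASE StackOverflow
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE; /*kicks other sessions out rather than waiting on them*/
```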

And let’s do this again. Let’s rerun this.

So we are going to run this whole thing, and this is going to complete immediately. We see that forwards-facing Eric has gotten this seat, and this seat is no longer free. Now, what’s going to change is what happens while this update is running.

We have not committed this update yet, and this update is generating a version of this row.

The row that we found here — this update has generated a version of it where is_free equals one and the name is still NULL. So if we come over here and run this, it’s still going to get blocked, but it’s going to get blocked in a different place now.

Before, when we ran this, the select part of the query got blocked — blocked by the update’s lock. But now the select can run and read the version of the row that got created when that update happened. What’s blocked here now is the update trying to change data in the table, actually trying to change that same row.

So if we come back over here, commit this transaction, and come back over here, we’ll see that backwards Eric has now overwritten forwards Eric. Because this query got blocked in a different place, the select didn’t get blocked.

The update got blocked. The select found a row in the row versions and said: oh, cool, look what I found — a free seat.

And then it updated the table to give that free seat to someone, even though that seat was taken fair and square by forwards-facing Eric. So that’s not good.

This is not a situation that we want to run into. Now, to make things perfectly clear: if we had just written the query like a sort of normal human being, that situation would not arise, because there wouldn’t be two phases to the query — a select phase to locate a row to change, and then an update to change that row.

If we just ran UPDATE TOP (1) to set whatever seat to Eric and set is_free to zero, we wouldn’t have had that problem, because the update would have just blocked the other update, and we wouldn’t lose it. If you have code that relies on locking for that sort of behavior and you want to switch to an optimistic — or rather, a row versioning — isolation level, you may need to look at something different.
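A hedged sketch of that single-statement rewrite, with assumed names: the locate-and-claim happens in one update, so there’s no separate select phase to go reading row versions.

```sql
DECLARE @name nvarchar(40) = N'Eric';

/*One statement: find a free seat and claim it under a single update lock*/
UPDATE TOP (1) dp
    SET dp.is_free = 0,
        dp.name = @name
OUTPUT inserted.*
FROM dbo.dinner_plans AS dp
WHERE dp.is_free = 1;
```

Note that UPDATE TOP doesn’t accept ORDER BY, so which free seat gets claimed is up to SQL Server — fine here, where any free seat will do.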

You might need to change your queries. So if we come back over here and reset this, we go back to square one: we still have read committed snapshot isolation enabled, and we are still running under the read committed isolation level.

We set this here so that the query is free to use the row versions when it runs. And let’s do the same thing now: run this up to the commit transaction, do that, and then come over here and run the whole thing.

This query gets blocked again — but now it’s the select that gets blocked, not just the update. So if we come over here and commit this transaction, and come back over here, this query goes back to not having found a row, because this time the select was blocked rather than just the update — because we added the READCOMMITTEDLOCK hint to it. The READCOMMITTEDLOCK hint makes this select obey the locking semantics of read committed locking, and not use row versions to go find this stuff.

This thing got blocked looking at the table. So if you are looking at switching to a row versioning isolation level, and you have code that looks like this, or code in triggers or other places that enforces some sort of integrity or business rules, you might need to start looking at that code to figure out whether you need to add those READCOMMITTEDLOCK hints to it.
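Applied to the dinner plans example, the hint goes on the locating select inside the CTE — a sketch, with assumed names:

```sql
DECLARE @name nvarchar(40) = N'Eric';

WITH free_seat AS
(
    SELECT TOP (1)
        dp.id
    FROM dbo.dinner_plans AS dp WITH (READCOMMITTEDLOCK) /*read the table, not the versions*/
    WHERE dp.is_free = 1
    ORDER BY dp.id
)
UPDATE dp
    SET dp.is_free = 0,
        dp.name = @name
OUTPUT inserted.*
FROM dbo.dinner_plans AS dp
INNER JOIN free_seat AS fs
    ON fs.id = dp.id;
```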

If all of your queries already have a bunch of NOLOCK hints on them, you don’t need to do that — but you do need to get rid of the NOLOCK hints, so your queries start using the row versions and stop doing dirty reads.

All right. I feel like this was a complete video — I feel like we have lived up to expectations here.

We have completed our intended course of action. So thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I will see you in the next video. We are actually done talking about transactions and isolation levels after this, at least for the beginner content.

After this, we have a bunch of programmability stuff to talk about. Of course, there’s a lot in there. So, oh boy — more videos for you.

All right. You’re welcome. Anyway, that’s it for here. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 29

SQL Server Performance Office Hours Episode 29



To ask your questions, head over here.

In Brent’s post about maintenance he said updating statistics causes parameter sniffing. Should I stop doing that? It’s very confusing.
Hi Erik. You discussed index compression in a YouTube video about 10 months ago. I do not see a video on table compression. What is your professional opinion on table compression, and do you ever use it or recommend it to clients? Thank you sir.
Hi Erik, what are your thoughts regarding AI and its long-term impact on the DBA world? I used Claude for the first time a month ago and I’m now considering retraining to be a plumber… Cheers, Oli the DBA.
Which metric best reflects actual execution time – the “Last Total Elapsed Time” in the Messages tab via SET STATISTICS TIME ON? In production, a query ran slowly despite a low time shown in the execution plan (guessing low CPU time); what factors might cause this, and how can I find and resolve them? It wasn’t just slow in SSMS but the app too. So I guess we can rule out high Client time?
I’ve been watching your T-SQL beginner content and it’s really great. I thought I was more advanced but I’m learning a lot of surprising things. I can’t imagine what the advanced content will look like. How do you remember all this stuff?

Video Summary

In this video, I dive into a series of Office Hours questions from viewers, providing insights and advice on topics ranging from parameter sniffing and statistics maintenance to table compression and the long-term impact of AI on the DBA world. Whether you’re dealing with performance issues or curious about future trends in technology, there’s something for everyone. I also share my thoughts on the T-SQL beginner content that has been well-received by many, emphasizing the importance of continuous learning and practice in the tech field. If you have your own questions or want to support the channel, make sure to check out the links provided in the video description.

Full Transcript

Erik Darling here with Darling Data, and you guessed it, you smart devil you. We are going to do an Office Hours episode. Why? Because I have five questions to answer. So I’m going to answer them. If you want to ask your own question, there’s a link right there. It’s going to look very much like a link down in the video description where you can go ask your very own questions. It’s all very nice and streamlined for you. If you want to support the channel, there’s also a link down there to do that. It’s just a nice way to say thank you for the endless hours of content that I produce. If you’re into that sort of thing. If you’re into the whole gratitude trip. If you need consulting help, and boy, do you need consulting help. I’ve seen your servers. They’re messes. I am available for all of these things and more. And as always, my rates are reasonable. If you want to get your hands on my performance tuning training, you can get all 24 hours of it for about 150 US buckaroos. That will last you for the rest of the day. The rest of your life. Another link down in the video description to do all that with. And if you want to get in on my T-SQL course, while it is still at the pre-sale price of $250, you can. 23 hours of beginner content is out and available. And this will go up to $500 when the advanced material drops after the summer. And so please buy it now before you send an email to me saying, hey, can I get it for the old price? And I say, no, no, you can’t. Anyway, the nice folks at Redgate, they’re making me leave the house this summer. So August, September and October, I will be in New York, Dallas and Utrecht with the Pass on Tour extravaganza, all leading up to the Pass Data Community Summit taking place in Seattle November 17th to 21st.

And now let’s go answer some office hours questions. Let’s zoom this in to a reasonable degree so that everyone can see exactly what’s happening. All right. In Brent’s post about maintenance, he said updating statistics causes parameter sniffing. Should I stop doing that? It’s very confusing. I know the post you’re talking about. I’m not sure that I would have ended the sentence so abruptly, but to each their own — it’s hard to write a whole book in every post. When you look at Brent, one of the first things you notice is that he is a fitness enthusiast.

What may not be obvious is that his fitness icon is a lady named Susan Powter, who was big in, I guess, the 80s and 90s, and whose catchphrase was “stop the insanity.” So what I presume was meant by that is: if you are running nightly statistics maintenance, and you come in in the morning facing some sort of server meltdown because all your queries got bad, different, weird plans, then maybe you should examine the way that you are updating statistics nightly. Perhaps rein that in a little bit; perhaps change your technique a bit. I can’t tell you exactly what to do, because I don’t know if you are facing these things.

But when you update statistics, you of course invalidate a lot of execution plans. The execution plans become uncached, or however you want to put it, and then SQL Server compiles a new plan based on whatever comes in next. So if you are facing that issue, you might consider figuring out which queries are susceptible to it.

And then you can either skip the statistics maintenance for tables involved in those queries, or use a plan guide, or force a plan with Query Store that doesn’t result in your server melting down. You could also, after maintenance, run a stored procedure or query with the parameters that get you the execution plan that you want.

There are lots of ways to do that which don’t involve stopping updating statistics wholesale — because if you wanted to stop that, you would actually have to look at turning auto update stats off, and some other stuff, which could also cause parameter sniffing. And my general feeling and experience is that statistics that are a good representation of the data in your tables and indexes are a good thing for SQL Server’s query optimizer. So it’s a little tough to balance that with not updating statistics.
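For the Query Store option, the mechanics are a couple of catalog lookups and one procedure call — the ids below are placeholders you’d look up first, not real values:

```sql
/*Find the query and the plan you want to pin down*/
SELECT
    qsq.query_id,
    qsp.plan_id,
    qsqt.query_sql_text
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
    ON qsp.query_id = qsq.query_id
JOIN sys.query_store_query_text AS qsqt
    ON qsqt.query_text_id = qsq.query_text_id
WHERE qsqt.query_sql_text LIKE N'%your query here%';

/*Then force the plan you trust (placeholder ids)*/
EXEC sys.sp_query_store_force_plan
    @query_id = 1,
    @plan_id = 1;
```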

So, I don’t know — perhaps there will be a clarifying post at some point in the future. But sorry you were confused by something someone else said. Anyway: hi, Erik.

Hi, you discussed index compression in a YouTube video about 10 months ago. I do not see a video on table compression. What is your professional opinion on table compression? And do you ever use it or recommend it to clients?

Thank you, sir. Sure. So, when you say table compression, I assume you mean applying compression to a heap, since index compression would involve clustered and nonclustered indexes — both, since they are indexes.

And a clustered index makes for a clustered table, not a copy of the table, but the table. So, the only thing left is a heap. That is a table without a clustered index. And the answer there is no.

And the reason why comes down to how compression gets applied. Just remembering the details of this: for an index, when you apply compression — assuming page compression — then as rows get added in, row compression will be applied, and then when a page is full, page compression will be applied. That could be a little foggy for me, because I haven’t looked at it in a long time.

But the thing with heaps is that data is not compressed on insert to a heap. You have to rebuild the heap in order to apply it. So, usually when I’m talking about heaps with clients, it’s not should I add page compression to this heap.

It’s should this table be a heap? Check yes or check no. If no, then depending on the size of the heap, perhaps we should explore page compression for when we decide on and add a clustered index to that table to make it a clustered table and take away its heapness.
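The usual end state of that conversation, sketched with illustrative names: the heap gets a clustered index, and compression rides along in the same operation.

```sql
/*Give the heap a clustered primary key and apply page compression
  in one shot (table and column names are illustrative)*/
ALTER TABLE dbo.FormerHeap
ADD CONSTRAINT PK_FormerHeap_id
    PRIMARY KEY CLUSTERED (id)
    WITH (DATA_COMPRESSION = PAGE);
```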

So, there we go. All right. All right.

Oh, another “hi, Erik.” Look at these polite people. Hi, it’s me. Oh, that was not a good square — that was covering up all sorts of stuff. All right. What are your thoughts regarding AI and its long-term impact on the DBA world? I used Claude for the first time a month ago.

Jeez. Late to the game. And I’m now considering retraining to be a plumber. Cheers, Oli the DBA. Well, maybe we should start calling you Oli the plumber. So, I’ve talked about this stuff a couple of times in the past.

And, you know, again, with the AI/LLM thing: you don’t really want to watch where the ball is, you want to watch where the ball is going. Right now, if you are a moderately to highly skilled DBA, AI is not going to replace you immediately. Depending on the trajectory of AI — and there are all sorts of variables and factors that will play into that trajectory — it could get very good, and end up being enough to replace moderately to highly skilled people.

Or other factors may kick in and it may have less of an impact. Right now, the thing a lot of AI people worry about, as far as things progressing, is whether LLMs on the current trajectory will just keep getting infinitely better while people stay about the same.

What a lot of people point out is that it’s a high energy expenditure. It costs a lot of money to do all this stuff and there’s not a lot of return on investment for it. So there are things that could make AI stop just infinitely getting better and kind of crater off.

But then there also might be some things that happen that improve either the energy, you know, cost, things like that and make it so that there is some return on investment for it. But I think the biggest problem that people are facing right now is that there are a lot of idiots in the world and a lot of idiots in the world are in charge of people. They are managers.

They are executives. They are C-levels. And they are not very highly technical people. And they think that AI and LLMs right now can replace people. And I’m sure that there are some replaceable people in the world.

To some degree, everyone’s replaceable. But I think the biggest thing right now is they don’t realize how not good LLMs are at deeper-level thought. And I know that because when I try to ask an LLM to do deep research, it ends up quoting my blog posts back to me.

And I’m like, there has to be something better out there. Why are you coming back to me with my own blog posts?

So, you know, depends on a lot of stuff. It could, of course, you know, be a real nightmare for people. And it, of course, could just end up being, you know, sort of a, I don’t know, like, slightly better Siri or something.

I don’t know. So, you know, it does depend a bit on a lot of things. But really, the worst part about it is the higher-ups thinking that it’s way more capable than it is.

You know, they’re under a lot of pressure to use it and keep up with technology trends and stuff. And I try to use AI for a lot of things, and I just find it severely wanting for most of the things that I try to do with it.

Anyway, let’s see what we got here. Which metric best reflects actual execution time? The last total elapsed time in the messages tab via SET STATISTICS TIME ON?

In production, a query ran slowly, despite a low time shown in the execution plan, guessing low CPU time.

No. What factors might cause this, and how can I find and resolve them? It wasn’t just slow in SSMS, but the app, too.

So I guess we can rule out high client time. Well, I don’t know. It depends on how many rows your query was returning. But I don’t really use SET STATISTICS TIME ON anymore.

Now that query plans have, like, the operator times and stuff in them, I just, I don’t see much of the point in it. You can get all that information from various parts of the query plan without having to switch over to the messages tab and look at a bunch of nonsense. But, you know, easy ways to figure that out.

So assuming that you are looking at the right thing in the actual execution plan, sometimes looking at the last operator is not a great way of figuring out how long a query took, because if it was running in either mixed row and batch mode or just all in batch mode, then there could have been some longer running operators off to the side that you didn’t see.

So if you right click on the root of any query plan, any actual execution plan, rather, there will be an actual time statistics thing in the properties window. And that will tell you the CPU and elapsed time of the query.

And that’s also a good place to look, because if you are returning a lot of rows, that will, of course, also include the time that it took for SSMS to receive and render the results. And that can take a long time.

It might be a very easy thing for you to test, just dumping the results into a temp table, assuming that it’s not millions and billions of rows, and just seeing if the elapsed time reduces because you’re sort of discarding the results. You could also try discarding the results via SSMS or assigning columns to variables or something.

But that can have an impact on the query plan overall. So I don’t usually recommend that. But, you know, I wish you had shared the plan or something with me, because then I could give you a more definite answer.

But like I said, I don’t really use set statistics time anymore. I generally rely on what’s in the execution plan. And what you see for the operators in actual execution plans is, of course, wall clock time, not CPU. For CPU, you have to look at the operator properties to determine that.
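The temp table test mentioned above can be sketched like this (the table and predicate are stand-ins for the asker’s actual query, which wasn’t shared):

```sql
/* Run the same query, but dump the results into a temp table so
   SSMS receive/render time is taken out of the elapsed time. */
SELECT
    p.Id,
    p.Score
INTO #discard
FROM dbo.Posts AS p
WHERE p.Score > 100;

/* If elapsed time drops sharply, the slowness was mostly
   returning and rendering rows, not executing the query. */
DROP TABLE #discard;
```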

All right. Last question here. I’ve been watching your T-SQL beginner content, and it’s really great.

Well, thank you. What a sweetie you are. These polite people with their “hi Eriks” and their compliments. I thought I was more advanced, but I’m learning a lot of surprising things.

Well, surprises abound with T-SQL. It’s actually what the S in SQL is. Surprise. Surprise.

I can’t imagine what the advanced content will look like. How do you remember all this stuff? Well, most of the time I don’t. Like a lot of people who work in tech, I have no internal dialogue or biographical memories.

So I write a lot of stuff down. And that’s generally how I am able to remind myself of things if perhaps I have forgotten something. But, you know, a lot of it is just the fact that I do it all day, every day.

And, you know, the more you do stuff all day, every day, the more you remember, kind of. The more it kind of just sticks with you becomes kind of muscle memory with things. And it becomes harder to forget.

Or I don’t know. I don’t want to say it becomes harder to forget because, man, there’s always a way to forget. But, you know, it does become a bit easier to have these things in the front of your mind rather than having them fly out of your ears. Anyway, that’s five questions.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video tomorrow. All right. Good night.

I love you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Read Committed, Not So Committed

Learn T-SQL With Erik: Read Committed, Not So Committed


Video Summary

In this video, I delve into a scenario where a query running under the read committed locking isolation level in SQL Server appears to return impossible results, challenging common misconceptions about read committed. I explain that read committed only ensures that your query reads data as it was at the time of taking shared locks, not that other queries cannot modify this data before or after your query runs. Through a detailed example involving a query to find top-scoring posts and their answers on Stack Overflow, I demonstrate how concurrent updates can cause unexpected results due to the timing of lock acquisitions during execution. This video is part of my ongoing effort to clarify complex SQL Server concepts for beginners and experienced users alike.

Full Transcript

Erik Darling here with Darling Data. You’re welcome. In this video, we’re going to continue. We’re going to look at a situation where a query running under the read committed locking isolation level appears to return impossible results to us, the end user. And we’re going to do this because some people still have very, very strange misgivings and misunderstandings about read committed locking isolation, and what exactly read committed means. When you get down to it, read committed only means that your query, at the time that it attempted to take the shared locks that it requires in order to read rows and pages and whatnot, read data that was committed in the database at that moment. That does not mean that other queries cannot interact with that data and change that data before and after your query has taken and released those shared locks. All 23 hours of the beginner content is available to you currently; there’s a link down in the video description for you to buy it. The presale price is still 250 bucks, with the value doubling, skyrocketing to $500 after the summer months, when my brain has officially stopped sweating and we can start thinking again. So anyway, let’s take a look at this query. Now, the goal here is to find the top scoring answer.

So we’re going to find the top scoring post that has a post type ID of two, which is answers. And what we want to find are all the other answers. We want to find all the answers for the question that has the top scoring answer to it. So we’re selecting star from the post table. We’re going to have an index hint in here just for demo stability. And we’re saying where the parent ID of the things that we want to find equals the parent ID for the highest scoring answer to a question. So all the other answers to the question, essentially. Now, because we have a sub query in this query, right, we have the post table referenced once here and once here.

This really means that the post table is going to be read from twice. Under locking read committed, that means that there are, I mean, many opportunities for reads to happen at all different points in time because the shared read locks are going to happen for one reference to the post table, right? And we’re going to take those and quickly release them. And then we are going to join one reference to the post table to another reference to the post table on the parent ID column. And that’s going to be another opportunity for shared read locks to slowly be taken and released as they get the data that they need.
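A sketch of the query shape being described, against the Stack Overflow Posts table (the demo’s exact select list and index hint are omitted here):

```sql
/* Find every answer that shares a ParentId with the single
   highest scoring answer in the table. The Posts table is
   referenced twice, so it gets read (and locked) twice. */
SELECT
    p.*
FROM dbo.Posts AS p
WHERE p.ParentId =
(
    SELECT TOP (1)
        p2.ParentId
    FROM dbo.Posts AS p2
    WHERE p2.PostTypeId = 2 /* answers */
    ORDER BY p2.Score DESC
);
```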

So let’s look at the execution plan for this thing. I’ve already run it ahead of time. The first thing I actually want to do though, is show you over here that when we run this unfettered without anything happening around it, without anything else interacting with the data, all the results that get returned over here have a post type ID of two.

Okay. Everything, everything next to me is a two. This whole column is all twos because that’s what we were looking for here. So if you, your, if your post has a parent ID, that means it was an answer to a question. If your post does not have a parent ID, that means it was the question, right?

Questions belong to… let’s try that again. If your post has a parent ID, that means it was an answer. The parent ID is the question ID that ties your answer to the question.

There we go. So the execution plan for this features two things. Well, I mean, I guess we have three opportunities in here to read strange data from the post table, don’t we? We have a clustered index scan here and this takes about five seconds.

So a single thread scans the clustered index and gets all the rows. The next thing we do is sort all of those rows. After we sort all of those rows, there’s one row that comes out of this, because we find the top one scoring answer to a question here.

Right. So we do that. We scan things, we find the top scoring thing, and then we join what comes out of this to a nonclustered index on the post table. And in this case, there are nine other answers to the question that we found with the top score here. So we go find the nine rows here that we care about.

And then over here, we touch the post table a third time because we need to do a lookup in order to get the other columns that we need to satisfy the presentation section of the query. So we have this whole output list here, right? This is all things that we’re going to do. We’re going back to the clustered index to get these columns after we’ve done a seek to find the parent IDs that we care about here.

So there are really three opportunities to read data from the post table. And they all happen at slightly different times. Like because there’s a sort, all the 11 million rows that we need to find in the post table, right?

We have to get all those rows to the sort operator. Then the sort can happen. Then we do a nested loops join to the lookup.

And then we have another sort here, which means all the rows have to show up here. And then we can start doing a lookup here. So really, there are one, two, three different points in time that we’re reading data from the post table. And during these three different points in time, all sorts of other things can interact with the data underneath you.

So let’s do this. Let’s go over what we’re going to do in the other windows. Let me scroll down here first.

So this is the one that we’re going to run when we start doing things. I’m just going to tuck this down here so it’s a little tidier on the screen in case I need to zoom in again. Over here, what we’re going to do is we are going to, this is the last row in the post table.

This is the highest post ID. And what we’re going to do is start a transaction and update the post table. And all we’re doing is adding one millisecond to the last activity date for this last row in the post table.

Right? There are no rows after this. This is the highest assigned ID. The ID is the clustered primary key.

So this is the end of the table. And then what we’re going to do is run this. So let’s pretend that this is all happening very quickly. Like in the database, it’s not me running things in separate tabs. This is a database with a fair amount of concurrency.

Right? And so we have things running at all different times. And those things need to be able to run and do things. And we don’t want to like have like a train schedule for all these queries where it’s like, you go. All right.

You go. Update. Yep. You go now. Right? Like that’s not happening. These queries are all just getting sent in and run and trying to do their thing while other queries are getting sent in and trying to do their thing. So let’s begin this transaction and run this update.

And now let’s run this query. Now, this query took about five seconds before. This query will not take five seconds this time.

Right now, this query is blocked by that update. Right? Because like one of the things that we have to do in this query plan is scan. We have to scan.

We have to touch the clustered index twice. The first thing that we did was scan through the clustered index and then sort all that data. Right now, this query has read everything in the clustered index except that last row, because that last row has a lock on it. Right?

So everything got read except that last row. So this select query running under recommitted locking is blocked from taking a shared lock on that last row by the update in the other window. So while this update is holding this lock open, let’s come over here.

And what we’re going to do is update a very coincidental nine rows, nine or ten anyway. And we’re going to set their post type ID to one. Coincidentally, these are the rows that get returned by our query: the other answers belonging to the question with the highest scoring answer in all of Stack Overflow.

This query runs successfully. Now, if we come over here and we roll this back, we don’t have to commit this. We can just say, oops, I didn’t mean to add it.

I didn’t mean to add that millisecond there. We can roll this back. And as soon as we roll this back, we get results immediately. If we look at the execution plan, well, things look a little bit different. This scan of the clustered index now took one minute and 28 seconds.

It took a minute and 28 seconds because it was like the five seconds that it normally ran for, plus the 25 or so seconds that I was trying to explain things to you. What’s neat now is if we go to the properties of the select and we go to the wait stats, the top wait that we’re going to see is LCK_M_S. Right.

So we can see that this query waited a long time on a lock. That LCK_M_S wait, the shared lock it was trying to take, was blocked by the update query that started a transaction and said, go outside and smoke a cigarette. I don’t know.

Maybe grab a beer for lunch. I don’t know. We’ll see what happens. There’s a halal cart. I don’t know. Get crazy out there. But what’s very interesting now is if we look at the results of this query, even though we specifically requested that only things with a matching parent ID, right?

So only, again, only answers have a parent ID. Questions don’t have a parent ID, right? Post type ID two is answers and post type ID one is questions. Even though this is what we asked for.

And even though the first time we ran this query, we got all post type ID two back. If we look over here now, magically, mystically, all of these post type IDs were one, which seems impossible. Because a post type ID of one, it cannot happen for an answer.

Post type ID two is answer. We looked for other answers and we found them. They all have the same parent ID too. We were not, oh boy, oh boy, oh boy.

Questions do not have a parent ID. Questions are the parent. So something terrible has happened here. And this sort of stuff can happen all the time under any sort of application concurrency using locking read committed. So be careful out there.
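Putting the whole sequence together, roughly as T-SQL across three sessions (statements reconstructed from the narration, not copied from the demo script; @TopParentId is a stand-in for the parent ID found earlier):

```sql
/* Session 1: pin a lock on the last row of the clustered index. */
BEGIN TRANSACTION;
UPDATE dbo.Posts
SET LastActivityDate = DATEADD(MILLISECOND, 1, LastActivityDate)
WHERE Id = (SELECT MAX(Id) FROM dbo.Posts);

/* Session 2: run the select. It reads everything except that last
   row, then blocks waiting for a shared lock (LCK_M_S). */

/* Session 3: while session 2 is blocked, flip the rows it already
   identified from answers to questions. */
UPDATE dbo.Posts
SET PostTypeId = 1
WHERE ParentId = @TopParentId;

/* Session 1: undo the millisecond. Session 2 unblocks and returns
   "answers" whose PostTypeId is now 1. */
ROLLBACK;
```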

Before you go and you start thinking that read committed locking is some sort of snapshot point in time, amazing, consistent, confident. Doesn’t it sound confident, right? Read committed.

What do you read? I read committed data. I don’t read uncommitted data. I’m committed data only. It sounds like an LLM level of confidence from read committed, but we do not get that. Read committed has very weak guarantees.

Those guarantees do not coincide with the start of a statement or the start of a transaction. They are just a mishmash of various points in time when your query was allowed to take shared locks. So again, be careful out there.

Your queries might be returning weird stuff all the time. This isn’t even NOLOCK, which is the surprising thing. Like you would expect this sort of thing using NOLOCK or read uncommitted, but not read committed.

It’s committed to reading committed data. Well, maybe not so much. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Reads Deadlocking Writes Under Read Committed

Learn T-SQL With Erik: Reads Deadlocking Writes Under Read Committed


Video Summary

In this video, I delve into the intricacies of SQL Server isolation levels, focusing primarily on the notorious “read committed” level and its potential pitfalls. Specifically, we explore a scenario where a read query deadlocks with a write query due to their interaction with indexes. By walking through these examples, I aim to provide you with a deeper understanding of how read committed can lead to unexpected issues like deadlocks, especially when multiple nonclustered indexes are involved. I also discuss potential solutions and highlight the benefits of using optimistic isolation levels such as Read Committed Snapshot Isolation (RCSI) or snapshot isolation, which offer more robust protection against these types of problems without requiring extensive query or index modifications.

Full Transcript

Erik Darling here with Darling Data, continuing on to talk about isolation levels, mainly focusing on what a turd of an isolation level read committed is. And in this video, we’ll be focused in on seeing a read query deadlock with a write query. There will be very similar circumstances to the previous demo, just with a little bit more involved, because it is a deadlock and not just blocking. Again, if you’ve been watching my videos and seeing stuff that I do here, you may have seen this demo before, but we’re gonna go into a bit of detail on it. So if you’ve perhaps forgotten or never knew in the first place, or perhaps you knew about this, didn’t you? It may be good for a refresher. All the same stuff over here: 23 hours of the beginner content is out there and available for you. The course is still at the presale price of $250, and that price will be going up to $500 after the summer, when the advanced material is complete. So with that out of the way, let’s look at this situation. Now, what we have going on here is, well, we have a select query. And this select query is going to do a couple neat things.

It’s going to start a while loop. And while the @i local variable is less than 100,000, we are going to assign the post ID to this one. And, let me zoom in, we are also going to increment the @i variable in here. We have an index hint for a little bit of demo stability here; nothing too interesting. And then over in this window here, we have an update query. And the update query is going to do something rather similar, except what it is going to do is basically flip this value for vote type ID back and forth between eight and zero. So we’re saying set vote type ID equals eight minus vote type ID. So this will either be eight or zero, because the vote type ID for this is currently eight. Let’s see, do we get an estimated plan for this one? We do. So if you remember the last video we did on read queries blocking write queries, it’s going to be a sort of similar situation here.
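The two sessions can be sketched roughly like this (loop count, ids, and column usage are approximations of the demo, not the exact script; @VoteId is a stand-in for the demo’s target row):

```sql
/* Session A: a select in a tight loop, repeatedly seeking into a
   nonclustered index and doing a key lookup for PostId. */
DECLARE @i integer = 0, @id integer;
WHILE @i < 100000
BEGIN
    SELECT @id = v.PostId
    FROM dbo.Votes AS v
    WHERE v.VoteTypeId = 8;
    SET @i += 1;
END;

/* Session B: flip the same value back and forth:
   8 - 8 = 0, and 8 - 0 = 8. */
UPDATE dbo.Votes
SET VoteTypeId = 8 - VoteTypeId
WHERE Id = @VoteId;
```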

We have an index seek that finds the rows that we care about. And since SQL Server estimates that more than 25 rows are going to leave this index seek, SQL Server chooses to use the unordered prefetch optimization here. And this time around, it does not opt to use optimized nested loops, just the unordered prefetch portion. But same deal as before: because we’re using the unordered prefetch, every time a row comes out of here and goes into the loop, we come down here to go get a column out of the clustered index.

And SQL Server will hold on to locks the entire time that we are doing the lookup. The lookup in this case is just to output the column post ID. Oh, I guess that’s the same thing as before. And again, because the prefetch optimization is in play, the locks on the clustered index will be held until this statement completes.

Now, over in the update window, this is where things get a little bit different. Now, if you recall the last one, the last demo on just the blocking, it was only the clustered index that was getting updated. This time, in order for there to be a deadlock, we also need to have a second nonclustered index get updated.

This is what causes the sort of deadlock embrace scenario rather than just a plain blocking scenario. Because now we have two queries that one is holding the object level shared lock. One is trying to update both the clustered index on the table and the nonclustered index on the table.

So we have this sort of circular dependency going on, which causes the deadlock this time around. So if we look at this, you’ll see that we are updating the primary key on the votes table. And we are also updating this other, I mean, we’re updating actually two other nonclustered indexes here.

But we only need one of these nonclustered indexes in order for this to happen. So what I’m going to do is start running this and start running this. And then I’m going to run sp_WhoIsActive.

And we missed it that time around. There we go. We got it.

Sweet. So here is what this looks like. Now let’s kill this one off. That got a little jumbled, didn’t it? These SSMS tabs get all weird on me: you hit one and it starts saying executing, and then all of a sudden that tab gets bigger and switching tabs around gets somehow confusing.

But very professional. So here we have our update query right here. And it is trying to take an intent exclusive lock here.

And here we have our select query. And it is attempting to take that object level shared lock here. And they are blocking each other.

You’ll note that session ID 68 is here. And session ID 61 is here. And if we move the blocking session stuff over a little bit closer so we can frame this all up at once, you’ll see that session 61 is blocked by 68.

That’s right there. And session 68 is blocked by 61. Now, of course, the choice of deadlock victim here is very easy.

SQL Server will always choose the select query as the deadlock victim. That’s what the error message that we get here is. And it does that because the select query is very easy to roll back.

The select query hasn’t done any writes. There’s no log. There’s no version store. There’s no accelerated database recovery persistent version store, rather. There’s no work to do to kill this query off, aside from to say, pow, you’re dead.

But, like, this query over here just runs and does the update until I cancel it. So, like, the update never loses the deadlock battle. It’s only ever the select that loses it.

But we can also take a look at the locks that want to happen here. And you can see for the… This was for the update, right?

Yeah, the update. The update query was granted a whole bunch of locks, right? Intent exclusive. It was granted an X lock here. It was granted an X lock here.

But then there was a couple down here that… No, sorry. There’s just one over here that it is waiting on, right? And it is waiting on this intent exclusive lock on the vote type ID index. And then for the locks for the select query, this one has been granted a few different types of locks, right?

Like, all these shared and intent shared locks have happened there. But then this one shared lock on the clustered primary key, right, where the lookup is happening to, this one is being forced to wait.

So this one is waiting because of the modification query. And this one is waiting because of the select query. And neither one can make progress, right? So they are just stuck waiting for each other.

And that’s why the deadlock occurs. So again, sort of the same situation here. We could, of course, extend the non-cluster… Sorry, we could, in this case, we could extend the nonclustered index so that there is no lookup in the select query.

That would be one way of taking care of this. Another way of taking care of this would be to, of course, use the no-lock hint, which I am, you know, pretty avidly against doing.

But, you know, like I said in the last video, it’s really not no-lock or read uncommitted that’s the problem. It’s the fact that you have to deal with all this crap under read committed that’s really the problem. Read committed is the enemy.

Right. And so a better solution would be to use an optimistic row versioning isolation level, like either read committed snapshot isolation or snapshot isolation, in order to prevent this sort of deadlocking at a grander scale.
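Turning either of those on is a database-level change; a sketch, with a hypothetical database name (note that WITH ROLLBACK IMMEDIATE kicks out in-flight transactions, so schedule it carefully):

```sql
/* Read Committed Snapshot Isolation: read committed queries read
   row versions instead of blocking on writers' locks. */
ALTER DATABASE YourDatabase
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;

/* Or let sessions opt in individually via
   SET TRANSACTION ISOLATION LEVEL SNAPSHOT: */
ALTER DATABASE YourDatabase
    SET ALLOW_SNAPSHOT_ISOLATION ON;
```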

The same way you could use that to prevent the sort of blocking we saw in the last video on a grander scale, without having to go through each and every one of your reader-writer blocking and deadlocking problems and saying, okay, well, what’s the query plan?

Can I fix the query plan? Can I do this? Can I do that? Can I do the other thing? Can I change indexes? What can I do? The optimistic isolation level thing just fixes all of it in one big go. So you don’t have to go through and nitpick every single individual query.

If you just have a couple queries that this is happening for, then you can probably pretty easily fix this without going and changing any database settings, assuming you’re allowed to change indexes.

But if you have this happening across a wide variety of queries, your life gets very, very difficult when it comes time to fixing these things, because you have to go through a lot of different stuff and fix a lot of different things. And then, you know, you’re maybe not even 100% guaranteed to fix everything.

Not every type of deadlock and not every deadlock situation is entirely avoidable. You might also need to introduce lock timeouts, or deadlock priority, or some sort of retry mechanism for your queries that end up in deadlock situations.
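Two of those mitigations can be sketched together (the procedure name is hypothetical; 1205 is the error number SQL Server raises for a deadlock victim):

```sql
/* Volunteer this session as the deadlock victim if one happens. */
SET DEADLOCK_PRIORITY LOW;

/* Retry up to three times when we lose a deadlock. */
DECLARE @tries integer = 0;
WHILE @tries < 3
BEGIN
    BEGIN TRY
        EXECUTE dbo.DoTheThing; /* the query that keeps deadlocking */
        BREAK;
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() <> 1205
            THROW; /* only retry deadlocks */
        SET @tries += 1;
        WAITFOR DELAY '00:00:01'; /* brief backoff before retrying */
    END CATCH;
END;
```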

So there’s a lot more that goes on. Like, you can make your best effort at fixing all this stuff with the things that I’ve talked about, making changes to the queries, indexes, etc.

But at the end of the day, not every blocking and deadlocking situation is entirely avoidable, unless you get to an optimistic isolation level, which resolves things in a far more comprehensive way, let’s say.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video where we will continue to pick on Read Committed, because too many people think that it is too good of an isolation level.

And, uh-uh, not on my watch, pal. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Reads Blocking Writes Under Read Committed

Learn T-SQL With Erik: Reads Blocking Writes Under Read Committed


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: What Does Read Committed Guarantee?

Learn T-SQL With Erik: What Does Read Committed Guarantee?


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 28

SQL Server Performance Office Hours Episode 28



To ask your questions, head over here.

How often have you dealt with corruption? When is the last time you saw it?
Have you ever met a query you couldn’t tune?
Which database will win the database wars?
What are the hardest kinds of queries to write?
When will Joe Obbish be back for a podcast?

Video Summary

In this video, I dive into answering five of your most pressing questions about SQL Server and beyond during another exciting Office Hours episode. I cover a range of topics from dealing with corruption to the future of database management systems like DuckDB and Postgres. Whether you’re curious about tuning queries or wondering which database might win in an imaginary “database wars,” this video has got you covered. Plus, I share some upcoming tour dates for my consulting services, including stops in New York City, Dallas, Utrecht, and the eagerly anticipated PASS Data Community Summit in Seattle. If you have any questions of your own or want to support the channel, there are links in the description to get involved.

Full Transcript

Erik Darling here with Darling Data, and we are in the midst of another wonderful Office Hours episode, coming from me to you, in which I answer five of your most pertinent, important, burning questions about SQL Server, life, love, the world around you, mysteries of the universe, anything that you’re interested in. If you would like to ask your own questions for Office Hours, there is a link right there to do that. If you would like to support the channel, there is a link as well; actually, both of those links are down in the video description. If you’re feeling very kind and you want to ask me a question, or you want to sign up for a membership, you can do either one.

If you need help with your SQL Server, if performance is just not what you always dreamed it would be, I am available as a consultant to help you with those problems: health checks, performance analysis, hands-on tuning, dealing with performance emergencies, and of course, developer training. And as always, my rates are reasonable. Performance Tuning Training is about 150 US bucks, about 75% off there. That link is down in the video description, and you will get all of that content for life.

Speaking of for life, my T-SQL course is, well, all the beginner content is done, which means it is somewhere near half done. The advanced material, of course, will be bigger than the beginner material, because there is much more to say in the advanced material, but 23 hours of beginner content is out there now. It is 250 bucks at the pre-sale price, and that price will be going up, doubling, after the summer months. The summer months, of course, will be busy times for your friend, Erik Darling from Darling Data. Red Gate is taking me on tour across several points of interest across the globe: New York City, August 18th and 19th.

Dallas, September 15th and 16th. And Utrecht, October 1st and 2nd. And then, of course, the event that we’ve all been waiting for: PASS Data Community Summit in Seattle, November 17th to 21st. But with that out of the way, let’s do these office-y, hour-y questions here. Let’s see. Let’s make sure that’s framed up nice for everyone. We don’t want anyone to have to struggle to read these things, do we? More importantly, we don’t want me to struggle to read these things.

How often have you dealt with corruption? When is the last time you saw it? Hmm, geez. So, I don’t deal with corruption a lot. It’s not really up my alley. You know, I am a performance tuning person. I enjoy performance tuning.

I do not enjoy things like corruption, high availability, disaster recovery, backup, stuff like that. I can, you know, I can work with it a bit, but it’s not, those are not things that I care deeply about. So, I have dealt with corruption a few times while consulting.

You know, most people are surprised when we come across it. Some of the initial analysis that I do when I work with people does look at places where corruption might be popping up, cropping up, or hiding, and where they just might not be able to see it. So, you know, use sp_LogHunter and look at the error log.

Use sp_HealthParser and go through the system_health Extended Event session. And then I also look at the memory dumps DMV, sys.dm_server_memory_dumps. Sometimes you look in the memory dumps, and they will also indicate data corruption.

But it has been a while since I saw it. I don’t see people hitting data corruption issues so much. I think, you know, disks have become, at least to my mind, sufficiently resilient that corruption is not as common an issue as it used to be.

You know, I don’t have any experience with that aside from the work that I do. But I do run across people who have never run DBCC CHECKDB. And, you know, maybe they have just gotten very lucky that they have no corruption.
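
The checks described above can be sketched roughly like this. sp_LogHunter and sp_HealthParser are Erik’s open source procedures (shown here with no parameters, though both accept options); the database name is a placeholder:

```sql
/* Hunt the error log for high-severity errors; corruption shows up here */
EXECUTE dbo.sp_LogHunter;

/* Dig through the system_health Extended Event session */
EXECUTE dbo.sp_HealthParser;

/* Memory dumps can also indicate data corruption */
SELECT
    filename,
    creation_time,
    size_in_bytes
FROM sys.dm_server_memory_dumps
ORDER BY creation_time DESC;

/* And, of course, check directly */
DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```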

But it’s just not something that I run into a lot. Let’s see here. Have you ever met a query you couldn’t tune?

Well, I mean, yes and no. There are certain, let’s just call them pathological, issues with an environment that make some queries much more difficult to tune than others. A lot of the time, if you are not allowed to add indexes, then tuning a query becomes a little bit more difficult, because a lot of the logical rewrites that you do depend on having some sort of reasonable index around that would allow you to access data in an efficient way.

If you are dealing with tables that are very big and don’t really have indexes on them, that becomes a little bit more difficult. But as far as whether I have ever looked at a query and thought, I have no idea how to fix this? No, it has been a very long time since that has happened to me.

But I’m trying to think. Yeah, really, it’s usually not local to just the query that’s in front of me. Usually there are extenuating circumstances with either the hardware or the database or something that make tuning efforts a lot more difficult.

But as far as, given not even unlimited resources, just some more time and some more resources, could I really do this? Yes. For now, though, we can take some shortcuts: try to get batch mode happening or something to speed that up, or crank MAXDOP up a little higher so we have more CPUs working on all those rows.
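
The batch mode shortcut mentioned here is often done by joining to an empty table with a columnstore index, which invites batch mode processing on versions before SQL Server 2019’s batch mode on rowstore. A minimal sketch, with made-up table and column names:

```sql
/* An empty columnstore-indexed table, existing only to invite batch mode */
CREATE TABLE #batch_mode_trick
(
    id integer NULL,
    INDEX ccs CLUSTERED COLUMNSTORE
);

SELECT
    bt.some_column,
    COUNT_BIG(*) AS records
FROM dbo.BigTable AS bt
LEFT JOIN #batch_mode_trick AS bmt
    ON 1 = 0 /* never matches any rows; the optimizer just sees columnstore */
GROUP BY
    bt.some_column
OPTION (MAXDOP 8); /* or crank MAXDOP up, as mentioned above */
```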

But there’s never been a query that I’ve looked at and thought, well, I just can’t do anything. We’re stuck. Let’s see here.

Which database will win the database wars? I mean, obviously, DuckDB. No, I don’t know. I mean, look, Postgres is having its moment.

And I don’t think that’s a bad thing. I do hope that Postgres having this moment will force other database vendors to perhaps realign their priorities, and perhaps think a little bit more about their pricing models and what they offer along with those pricing models. You know, I joke that everyone who uses SQL Server and has never used Postgres thinks that using Postgres will solve all of their problems.

But in general, you’re sort of just trading for a different set of problems there. It is amusing to me also that doing development work on Postgres is like working with SQL Server 2050, but monitoring Postgres is like working with SQL Server 7.

Right. It’s like there is just not a lot built in that’s terribly useful or helpful. And I know that there are plugins and extensions and things that you can add to it that surface more.

But yeah, out of the box, if you don’t have someone who knows and cares and pays attention, getting meaningful information out is a lot more difficult. Postgres query plans specifically are not as friendly and graphically inclined as SQL Server ones are, nor do they surface, I think, some of the more interesting details that SQL Server plans give you.
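
For comparison, the plan-inspection commands on each side look something like this (table names are placeholders). Postgres hands back indented text, while SQL Server hands back XML that tools like SSMS render graphically:

```sql
-- Postgres: plans come back as indented text (or JSON, with FORMAT JSON)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM some_table WHERE id = 1;

-- SQL Server: actual execution plans come back as XML
SET STATISTICS XML ON;
SELECT * FROM dbo.some_table WHERE id = 1;
SET STATISTICS XML OFF;
```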

So, do I think SQL Server is going to win? Probably not. I mean, SQL Server, to my mind, does have the best query optimizer.

Postgres’s query optimizer is not nearly as good as Microsoft’s. Not to say that it isn’t capable of doing some things better than what Microsoft SQL Server does. But just generally speaking, I would much rather have Microsoft’s query optimizer than Postgres’s.

But, you know, like I said, I think that Postgres having its moment now is a good opportunity for other big relational database vendors to perhaps have a moment of clarity about some of the things that they’ve been doing with the product over the years. Perhaps correct their courses in a way that will make the specter of a free database perhaps a little bit less appealing or something. Anyway, let’s see here.

Oh, I got a good question next, but you got to get through this one first. What are the hardest kinds of queries to write? I answered a similar question recently, I think. Anything that involves a lot of math; I don’t have any grounding or foundation in math.

So if there’s, you know, some advanced data analysis formula that you need to apply to stuff, I would have to go look at that and figure out how to write the T-SQL version of it, which I am not particularly good at, and which I would not necessarily be able to validate easily. Like, well, I wrote all the math out.

It looks right. Can someone tell me if it’s right? So the hardest kinds of queries for me to write are the ones that involve equations and algorithms and things like that, because I just don’t have any grounding in those areas. So here’s the question.

Here’s a good question. When will Joe Obbish be back for a podcast? Well, I’m hoping that Joe will be feeling in the mood for recording in, I don’t know, the reasonably near future. I don’t have an exact date, but I have at least gotten an email from him expressing interest in recording again.

So hopefully that is on the horizon. Anyway, that’s five questions. They’re short ones again this week.

All right. Well, I don’t know. I guess it’s nice when they’re not overly long and involved, because then I don’t have to think too much. But anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something and I will see you in the next video, which will be, I don’t know, something suitably SQL Server-y. All right. Thank you for watching.
