Introducing My SQL Server Performance Engineering Course


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Table Variable Deferred Compilation Depth In SQL Server


Video Summary

In this video, I delve into the fascinating world of table variable deferred compilation in SQL Server, specifically exploring how multiple inserts affect its behavior. I walk you through a stored procedure that demonstrates inserting rows into a table variable and then querying it to see if SQL Server defers compilation after each insert. The results are quite enlightening, revealing that while initial executions show promising deferred compilation, subsequent runs start behaving more like parameter sniffing, retaining cardinality estimates from previous queries. This video aims to shed light on this lesser-known aspect of table variables and might help you troubleshoot strange behaviors in your own SQL Server environments. Whether you’re using SQL Server 2019 or higher compatibility levels, understanding these nuances can be crucial for optimizing query performance.

Full Transcript

Erik Darling here with Darling Data. And we’ve spent some time this week talking about different sorts of cardinality estimates for things that don’t really get great cardinality estimates. One of the things that I talked about earlier in the week, I think it was on Tuesday, was that up to a certain point, table variables behaved a lot like local variables: they would sort of compile with the batch, but would not get cardinality estimates, or much information at all, until table variable deferred compilation came along. And this made me think a little bit because, you know, in the past I’ve definitely tested stored procedures just to be like, okay, if I run this once and the table variable gets a thousand rows in it, and the queries that touch it can now see that there are 1,000 rows in the table variable, and then I rerun this and I put, like, a hundred thousand rows in the table variable, what happens? And it turns out that the behavior of table variables turns into what was essentially the behavior of table valued parameters in the past, which was that they would sort of act like parameters, and the cardinality of the table valued parameter would get sniffed and reused over the course of a query plan.

So like to me that was sort of interesting, but then it made me wonder like, well, what if I insert rows into a table variable multiple times? Will SQL Server defer compilation for multiple inserts? And so that is the question we are going to answer today. We are going to judge the depth of table variable deferred compilation. We are going to judge its depth like we are at a powerlifting meet and we want to make sure that it is squatted deep enough to get all white lights. We want to see the crease of the hip. We want to see the butt a little below parallel. We don’t want anyone cheating on their squats saying, oh, I squatted 1300 pounds.

Your butt didn’t even break parallel, pal. Anyway, if you look down in the video description, you will see many helpful links. Most of the help that those links will provide is in the form of giving me money. Hire me for consulting, buy my training, become a supporting channel member.

And if you want to ask me questions for my office hours episodes, you can do that. Otherwise, please do like, subscribe, tell a friend. Assuming that your friends are physically existent in the world and capable of subscribing to a YouTube channel.

But if your AI girlfriend has a YouTube account, I’m willing to accept AI girlfriends. They remain your girlfriends. They remain your girlfriends, but I’m willing to accept subscribers in the form of AI girlfriends. All right. Married man. Don’t mess with me.

All right. So leaving the house, of course, I get to go to Utrecht. I don’t know. I made what I thought was a funny joke, but I don’t know if anyone laughed at it. It was, I trekked, you trekked. We all trekked to Utrecht. All right. That was funny.

I don’t know. I’m a little deflated on that one. Sorry. Delivery was lacking a little life. October 1st to 2nd, I will be there with the nice folks from Red Gate who are bringing Pass on Tour. Well, they brought me to New York City. That was very nice. Dallas and now Utrecht.

And of course, I will be in Seattle for the Pass Data Community Summit, November 17th to 21st, with two rock’em sock’em days of pre-cons with Ms. Kendra Little about T-SQL and all of the wonderful things that happen when you use T-SQL.

All right. The life changer that is T-SQL. There we go, PowerPoint. Only took you five seconds to catch up with that. Click. Great. Wonderful. Let’s party. Let’s look at table variable deferred compilation depth.

So what I wanted to test with this stored procedure is, like I said, multiple inserts. So here we have a stored procedure that accepts one user input, right? We’re not using a table valued parameter here.

We are just going to use a regular old table variable. But the table variable we are using will get the table variable deferred compilation intelligent query processing feature. And on initial compilation, we will get a guess, right?

SQL Server will give us table cardinality for that. So if I run this and, you know, run all the code, we should see something interesting, shouldn’t we? So here we have user IDs, right?

We’re going to insert into that table. We’re going to select all of the user IDs that match our user ID first from the users table. And then do a count. So we got some cardinality to estimate.

Then we’ll do everything from the badges table. And then we’ll do everything from the comments table, right? So we got that here.

Then we’ll do everything from the Posts table. Oh, Posts, that’s a big one. And then, since there are two columns in the Posts table that have user IDs in them, I decided to do both of them, right? So we’ll do last editor user ID too.

And then finally, we will do votes. And what I want to see here is if after each insert, SQL Server actually defers compilation of the count query until all the rows are inserted, right? So that’s a good one.

So let’s run this. We have actual execution plans enabled. And when we run this, it will take more than a couple of seconds, because we don’t have any good indexes in place. That’s fine, though.

We got time to kill, right? You’re not busy. Every other query will be interesting to us. So let’s look at how these pair up. So we have the clustered index insert from Users, right, into our table variable.

That inserts one row. And we get a one row estimate when it’s selecting from the table variable. That is correct so far. That is not incorrect.

One might see a one row estimate from a table variable and think, my goodness, but the insert actually did only insert one row. So as we get down further and we insert user IDs, now from the Badges table, we add another 9,363. And now look, our clustered index scan says 9,364.

This is great news. We have actually deferred compilation of the second select until later. Right?

This is great for us. Now, when we hit comments, we add 46,737. Can you imagine leaving 46,737 comments? My Lord.

What are you doing all day? But this is interesting though because now our clustered index scan says 56,101. Which, you know, I’m not a math person, but I’m fairly sure that if you added that to that, you would get that.

So that makes sense to me. And if SSMS resets this one more damn time, I am going to lose my entire mind. So now, let’s see.

Now we’re inserting into user IDs, selecting from posts where owner user ID fits up there. And we add 27,901. So this person has left almost 47,000 comments and posted almost 28,000 questions and answers. So that is insane to me.

But when we add the 27,000 to what we had before, look at our estimate. It holds up very nicely. This is great. And if we scroll all the way down here, we will see this pattern repeat where every time we add rows, we get the right sort of guess from our select from the table variable.

All right. Well, that’s good for a first execution, right? But now let’s do a second execution.

My favorite user in the Stack Overflow database is someone named Eggs McLaren. They’re a dummy account. And let’s see what happens now. So let’s do this.

We’ll enjoy ourselves here a little bit. And if we look at this execution plan now, well, this is still good, right? One of one, one of one. But if you notice down here, we seem to be retaining cardinality estimates from the past execution.

And so we’re not really deferring the compilation anymore, are we? We have sort of given up on deferring compilation, because now we’re getting guesses of, like, 69 of 9,300. And if we scroll down a little further and click here, so hopefully SSMS stops moving, we’ll see that we don’t get a good guess here.

I mean, I realize this is the insert, but like we’re retaining cardinality estimates from like not only the insert queries, but also the select queries. We can only defer compilation so much.

So now we’re getting 91 of 56,000 and 65 of 27,901. And if we scroll down further, we will see this pattern repeat itself. So at least on the initial execution of a stored procedure, we do okay, right?

SQL Server is like, well, every time I do an insert, I’m going to defer compilation of the select from the table variable immediately afterwards until after that insert completes and we get table level cardinality. Golly and gosh, that’s wonderful.

I’ve never, never been so happy in my life. The problem becomes that on further executions, the table variable acts like a parameter. And now we sort of have table variable sniffing, which I don’t know if that’s good or bad, but it seems, it just seems like a fun new problem to be aware of and have to go solve for people.

So maybe this video will arm you to go do that. I don’t know. I don’t know if you’re using SQL Server 2019. I don’t know if you’re using compat level 150 or higher.

I don’t know if you have disabled the table variable deferred compilation database scope configuration. And moreover, I don’t know if you’re using table variables, but if you are and things are behaving strangely, this might just answer some questions for you.
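If you want to check those conditions in your own environment, here is a quick sketch (the configuration name is the real one; the rest is generic):

```sql
/* compat level 150+ is required for table variable deferred compilation */
SELECT d.name, d.compatibility_level
FROM sys.databases AS d
WHERE d.database_id = DB_ID();

/* and the database scoped configuration has to be on (value = 1) */
SELECT dsc.name, dsc.value
FROM sys.database_scoped_configurations AS dsc
WHERE dsc.name = N'DEFERRED_COMPILATION_TV';

/* to turn it off and get the old fixed guesses back: */
-- ALTER DATABASE SCOPED CONFIGURATION SET DEFERRED_COMPILATION_TV = OFF;
```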

Anyway, thank you for watching. Hope you enjoyed yourselves. I hope you learned something and I will see you in tomorrow’s video where, I don’t know. I don’t know what we’re going to do yet.

I believe it’s going to be a Friday, though. So it will necessarily be amusing, right? Because Fridays have to be fun. If Fridays aren’t fun, we might as well all just…


Table Variable vs Local Variable Cardinality Estimates in SQL Server


Video Summary

In this video, I delve into the fascinating world of local and table variables in SQL Server, particularly focusing on how they behave differently when it comes to cardinality estimation. After exploring these concepts in yesterday’s video, I decided to dig deeper by comparing temp tables with table variables through a series of tests designed to reveal their unique behaviors. By running various queries that include equality and inequality predicates, we observe how the lack of statistics on table variables can lead to unexpected estimates. This exploration not only reinforces the importance of proper indexing but also highlights the subtle differences between local and table variables in cardinality estimation. Stay tuned for tomorrow’s video where I will address some lingering questions about table variable deferred compilation and its implications for query optimization.

Full Transcript

Erik Darling here with Darling Data. I don’t have a great intro for this one, I’ll be honest with you. If you watched yesterday’s video, we asked and answered the question: why do local variables behave the way that they do with cardinality estimation? And hopefully the answer was sufficient for you. But, you know, looking at that got me a little curious and thinking about table variables and local variables, and I wanted to mess around with that just a little bit more. Some details you just never think to look at in detail. If you look down in the video description, my beautiful, wonderful video description, you’ll see all sorts of helpful links for you and me. You can hire me for consulting, buy my training, or become a paid, contributing member of the channel. If you don’t feel like doing any of that stuff, would you be so kind as to ask me an office hours question? I do like getting those. It’s a fantastic mailbag; some of the questions in there have truly blown my mind. And of course, if you are titillated by this content, please do like, subscribe, tell a friend, family member. I don’t know.

Something like that. Well, by the time this gets published, I will be home from what was Pass on Tour Dallas, at least hopefully. All right. Fingers crossed. That means that the remaining travel for me will be Utrecht, the Netherlands, October 1st and 2nd, and then Pass Data Community Summit in Seattle, November 17th to 21st, where Kendra Little and I will be presenting two days of probably the best T-SQL pre-cons you’ll see for the rest of your life. Set the bar impossibly high for anyone else ever. But with that out of the way, let’s address the matters at hand here. Now, I think that I’ve covered, to a sufficient degree, the differences between temp tables and table variables. In tomorrow’s video, we’ll talk about table valued parameters.

Actually, maybe we’ll do that today. Yeah, you know what, we’ll do that today. So what I wanted to sort of go over a little bit first is how table variable cardinality estimates compare to local variable estimates. And so I wrote what I think is a reasonable series of tests, one using a temp table. We all know that temp tables get statistics on them. And because we know that temp tables get statistics on them, I want to make sure that I update my temp table statistics with a full scan.

And I also want to use a table variable, which I believe at this point I have indelibly proved does not get a statistics histogram, for the index or for anything else, even with an index on it. If I were to try to run a statistics update for a table variable down here, I would get an error. It would just yell at me. I suppose it would be funny if I tried to do it in some error handling code and had it print out something ridiculous on failure, but, you know, gotta save something for the paid material, I guess.
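A minimal sketch of that error, assuming nothing beyond plain T-SQL:

```sql
CREATE TABLE #t (id integer NOT NULL PRIMARY KEY);
UPDATE STATISTICS #t WITH FULLSCAN; /* temp tables: perfectly happy */

DECLARE @t table (id integer NOT NULL PRIMARY KEY);
UPDATE STATISTICS @t; /* table variables: fails with a syntax error */
```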

But then I’m going to run a series of queries. Series of queries. Why have I never thought about that phrase before? It’s great. Wow. It’s like the other day, someone said the phrase benchmarking to me and I was like, that’s fantastic. I want to find a way to say that somewhere.

So we’ll just run a couple of full counts from both of the tables. We will run a count with an equality predicate. The equality predicate will be using a literal value for the table variable, and for the temp table we will be using a local variable that we declared up here called zero, right? That’s that thing.

And that pattern is basically going to repeat for the inequality predicates: greater than or equal to, less than or equal to, greater than, and, of course, less than down here.
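Roughly, the shape of the test battery (the table and variable names here are placeholders, not the ones on screen):

```sql
DECLARE @zero integer = 0;

/* full counts */
SELECT c = COUNT_BIG(*) FROM @t AS t;
SELECT c = COUNT_BIG(*) FROM #t AS t;

/* equality: literal for the table variable, local variable for the temp table */
SELECT c = COUNT_BIG(*) FROM @t AS t WHERE t.id = 0;
SELECT c = COUNT_BIG(*) FROM #t AS t WHERE t.id = @zero;

/* inequality */
SELECT c = COUNT_BIG(*) FROM @t AS t WHERE t.id <> 0;
SELECT c = COUNT_BIG(*) FROM #t AS t WHERE t.id <> @zero;

/* ranges: >=, <=, >, < all follow the same pattern */
SELECT c = COUNT_BIG(*) FROM @t AS t WHERE t.id >= 0;
SELECT c = COUNT_BIG(*) FROM #t AS t WHERE t.id >= @zero;
```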

So this will be our first test to kind of see how the two compare and we’ll run all this and we’ll get all these results back. Won’t life be grand. So we’ll have a couple uninteresting areas.

The uninteresting areas are, of course, the inserts. We don’t care so much about those. And then once we get down here, we will have some vaguely interesting areas. Because I am on SQL Server 2022, I am using the 160 compatibility level.

And since I have not disabled the database scoped configuration for table variable deferred compilation, I will get a 100 row estimate from the table variable. But because the table variable has no statistics histogram on it and the temp table does, this is about where things might start to get a little interesting.

So, zooming in, we can look at some of the more interesting stuff. Selecting from the table variable with a literal value of zero and saying where ID equals zero: well, we get zero rows back, but SQL Server thought that we would get 10 rows back.

This is a stock 10% estimate, the unknown-value estimate for the, what do you call it, table variable. Cool.

Now, because our statistics histogram on the temp table accurately describes the data in here, we get a one row estimate back from the temp table when we use the local variable. SQL Server is like, hmm, I think we can only get one row here. Now, we could absolutely get a one row estimate from the table variable as well.

If we made the clustered index unique, SQL Server would look at that and say, oh, well, this is a unique index. I know that I’m only going to get one value back.

I don’t know what that value is going to be, but I’m pretty sure that I’ll get one back. So we could change that and get something a little bit different on the second one, but I’d rather just say that and move on. Now we come down to where we start having the inequality predicates, and this is where I should probably start actually clicking on the plan so SSMS stops resetting where I was before.

If we look at the inequality predicate for the table variable, we get 51. And what I did prior to recording this was try with larger numbers, because, you know, 51 out of a hundred: if you did a thousand, you might get five, one, and some other numbers.

Or if you did a million, you might get five, one, and some different numbers, but it was always five, one, and however many zeros would make it 51%. So for the table variable with an inequality predicate, we always get 51% using the default cardinality estimator. Remember, I’m in compat level 160.

So I’m using the new cardinality estimator. I say new, but it’s from, like, 2014, so screw whatever. I have a test with the legacy one in the next tab over, which we’ll look at. And then for this one, we get 90.

This was the only one that really meaningfully changed between the legacy and default cardinality estimators. This will get a slightly different number in the other tests. Everything else is just about the same.

And if we scroll down through the range predicates, the greater than or equal to, less than or equal to, all that other stuff, these will all get 30%. 30% will be true for either the table variable, where there’s no histogram, or the local variable with the temp table, where there is a histogram. So the local variable works to screw up the histogram a bit for the temp table, and the lack of statistics on the table variable screws with the literal value.

Right. So it’s kind of like dueling banjos of, ah, but why, why would you do that? So if we scroll down, this will be the same: 30, 30, 30, all over there.

The next set of tests I did, and this is where I really had to do some scrounging to make it work in the same way. If you remember, in the last one, when I did the temp table, I had an update statistics with full scan. And this test was not getting me what I wanted to show you.

So I had to create statistics on the temp table, sample zero rows, and tell SQL Server not to recompute the statistics ever. A bit of an unlikely scenario in real life, I think. So let’s run this and… we got execution.

We didn’t get execution plans. Let’s do that one more time. There we go. Execution plans. So, getting down to where things maybe start to get interesting: because table variable deferred compilation is still on, the guesses up here for things like the count are still going to say a hundred.
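The create-statistics trick described a moment ago looks something like this (a sketch; the real demo may differ in the details):

```sql
CREATE TABLE #t (id integer NOT NULL PRIMARY KEY);

/* empty statistics that SQL Server is told never to refresh */
CREATE STATISTICS s_id
ON #t (id)
WITH SAMPLE 0 ROWS, NORECOMPUTE;
```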

And I wanted that. I wanted to maintain that, because I want to have a table cardinality for the table variable to compare to the temp table. So let’s click on that so we don’t completely lose our spot.

And let’s come down here to the first point of interest. Here is where we have the literal value for the table variable, which is now guessing, with the legacy cardinality estimator, 32 rows instead of 10 rows. And when we use the local variable for the temp table with essentially no statistics on it, we also get 32 rows.

So a temp table with no statistics acts strangely like a table variable, huh? Weird, right? Bizarre.

It’s almost like I’ve said that before and I’m just repeating myself at this juncture. And then we get to the inequality predicates. Remember I said this is where it was different with the legacy cardinality estimator? In the other one it was 90, and here it’s 68, right?

So this was the same for either the table variable with the literal value or the temp table with the local variable. That was a little weird, right? Going from 90 to 68 just by changing the cardinality estimation model.

I don’t know that that’s all that great of a different guess. I’m not sure why that changed, but, you know, here we are. As for the rest of these, all the range predicates.

So greater than or equal to, less than or equal to, yada yada. These all get 30, right? So 30 all down here. Everything is 30 from there on out.

It doesn’t matter, legacy or default cardinality estimator, they all get this guess of 30% of the table. That 30%, right? Three-zero: 30 out of a hundred, 30%.

Again, when I cranked the numbers up to a thousand or 10,000 or a million, it was always just 30%. Now, of course, this leads a little bit to the question: is there a difference between table variables and table valued parameters?

And the answer is, of course, yes. But before we do that: there is something that I wondered about a little bit, and maybe you would also wonder this, because you’re a naturally curious person filled with wonder for the world around you. What would it look like if we mixed the local variable with the table variable?

So I did that, right? This is all table variable and local variable. And the answer is that because the table variable has no statistics on it anyway, the local variable really doesn’t do anything interesting.

We get all the same guesses that we would have gotten sort of between the two. For the equality predicate, it’s 10%. For the inequality predicate, it’s 90%. For everything else, it’s 30%.

Okay. Well, great. All right. Wonderful. We’ve once again found a new way to say table variables don’t get statistics on them. But what about table valued parameters?

Now, table valued parameters used to be a way to get table variables to behave more like parameters, right? It’s sort of like the way table variables behave with the table variable deferred compilation intelligent query processing feature, where table variables act a bit like parameters. But we’ll talk a little more about that tomorrow.

So, on to table variables. Now, if you’re using 2019 plus and you’re getting table variable deferred compilation, the first time your stored procedure compiles, you’ll get table cardinality for the table variable, and further executions will reuse that. Okay.

This generally works by deferring compilation of the execution plans for queries that access the table variable until after the table variable has been populated. Now, when I wrote that sentence, I had some questions, and tomorrow’s video will answer those questions.

So here is the table type that I created to have a table valued parameter: owner user IDs. We have an owner user ID column in here with a clustered primary key on it. We’re not going to make this one unique, even though we could; it wouldn’t really do much for us. I’ve already pre-run this query because, if I hadn’t, you and I would be sitting around for a while.

Now, I only put one row into our table valued parameter. Or rather, you know, our table variable that is assigned this table type, so it magically becomes a table valued parameter for us.
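The setup being described, roughly (the type name comes from the video; the column details and the user id are my assumptions):

```sql
CREATE TYPE dbo.OwnerUserIds AS TABLE
(
    OwnerUserId integer NOT NULL,
    INDEX c CLUSTERED (OwnerUserId) /* clustered, but not unique */
);
GO

DECLARE @OwnerUserIds AS dbo.OwnerUserIds;

INSERT @OwnerUserIds (OwnerUserId)
VALUES (1); /* one row only; the id is a placeholder */

SELECT c = COUNT_BIG(*)
FROM dbo.Posts AS p
JOIN @OwnerUserIds AS o
  ON o.OwnerUserId = p.OwnerUserId;
```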

And the execution plan for this just does not reflect very well a situation where cardinality estimation was done statistically. You might notice here that we get a one row estimate coming from the clustered index scan of the table valued parameter. And if I duck down a little bit here, right, this is going to show off the fact that I do, in fact, do squats in real life.

SQL Server estimated 12 rows would come out of the Posts table, but we get almost 28,000 rows. And then when we come over here, SQL Server is like, well, no, that’s still wrong. And this is obviously still wrong.

And then when we get over to Comments, well, we are just very, very wrong. SQL Server guesses 319, but really it’s a much larger number. And so this whole thing runs for two minutes and 20 seconds.

So I don’t know. I thought that comparing the temp table and the table variable with the local variable thing, and looking at the cardinality estimation, would yield some really interesting stuff.

But all I really seemed to do was further prove that the lack of statistics on table variables, and by extension table valued parameters, can really screw things up. All right. Well, hopefully tomorrow’s video will give us a little bit more to think about.

A little bit more to chew on in our lives. I’m going to talk a little bit about just how far table variable deferred compilation goes in a stored procedure. All right.

Thank you for watching.


A Little About *Why* Local Variables Give Weird Cardinality Estimates in SQL Server



SQL Server Performance Office Hours Episode 34



Questions:

* Why do parallel batch mode sorts only utilize 1 thread?

* With all of the recent content frequently mentioning NOLOCK and the read uncommitted isolation level, is there ever a situation where you would recommend using NOLOCK, or is it always avoidable?

* What’s required for my query to use batch-mode on my rowstore indexes? Like what version or compat level is required? I remember a while ago, a fake join to a columnstore index was required. Are there any hints I can use?

* Hi Erik, i admire all the work you do for the community, thanks a lot! You are one of the great minds of SQL Server.

* What happened to the old sqlperformance.com website? They just suddenly stopped posting, but the authors are still around and authoring in other places on the interwebs.

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five user-submitted questions during my Monday office hours session. Whether you’re curious about parallel batch mode sorts or the nuances of using NOLOCK hints, there’s something for everyone. I also take a moment to thank those who support my work through likes, subscriptions, and sharing with friends—every bit helps! Additionally, I share some exciting upcoming events where you can catch me live in person, including the Redgate Pass-on-Tour in Dallas and Utrecht, as well as the Pass Data Community Summit in Seattle. If you have any questions or want to see me at one of these events, make sure to check out the links provided in the video description for more details on how to get involved.

Full Transcript

Hey, it’s me, Erik Darling here with Darling Data. And in today’s video, because it is of course Monday, we do office hours and I answer five of your wonderful user-submitted questions. And that’s fun. Before we do that: if you want to ask me a question for office hours, the link to do so is down in the video description. If you would like to hire me for consulting, buy my training, or become a paid channel member to support the effort that I put into all this SQL Server content for you, all of those links are available there as well. If you are uninterested in a monetary exchange with me, you can do free things that make me just glow with happiness. You can like, you can subscribe, and you can tell a friend or two or 20 or a thousand, assuming you know that many people. I’m not sure how full your Rolodex is, but hopefully it’s not just your mom. But your mom is important.

If you would like to see me out on the road, live and in person, I will be in Dallas, September 15th to 16th, and Utrecht, October 1st and 2nd. These are the Pass on Tour dates that Redgate is putting on this year. Aside from that, I also have Pass Data Community Summit in Seattle, from November 17th to 21st, where I will be doing not one, not three, but two days of T-SQL performance pre-cons. I will be doing the pre-cons with Kendra Little, and that’ll be great for everybody. Especially you. Because you’ll finally learn T-SQL, and that’ll be great. Anyway, let us office hours away here.

All right. First up: why do parallel batch mode sorts only utilize one thread? Well, there are two parts to this. Your question is correct about one part, but parallel batch mode sorts, of course, do the actual sorting work on multiple threads. They do not do the sort on one thread. The sort only produces output on one thread.

The best explanation that I’ve ever seen in detail of why is in a Paul White blog post that I will put in the show notes about why batch mode sort spills are so slow in SQL Server. It has to do with the sorting algorithm that SQL Server uses internally and the complexity of producing parallel output from a batch mode sort. There is, of course, one operator in all of SQL Server that currently does support receiving parallel threads from a batch mode sort, and that is the window aggregate operator that gets used when window functions are processed in batch mode.

So there’s that. But I’ll put the link to Paul’s post in the video description and everything, because anything further that I said about why that is would be pure plagiarism, because the technical details there are best left up to the best SQL Server consultancy in all of New Zealand. Outside of New Zealand, it’s, of course, me, but within New Zealand. That’s Paul White.

All right. So next. With all of the recent content frequently mentioning NOLOCK and the read uncommitted isolation level, is there ever a situation where you would recommend using NOLOCK, or is it always avoidable? Well, there are. So, you know, if it’s a query that you don’t particularly care about and it’s causing problems and no one seems to really care about what the results of that query are, you are free to put no lock hints on it.

You know, that would be the first place. A second way of thinking about it would be a certain kind of situation. So one thing that’s interesting about the no lock hint

is that it allows SQL Server to use what’s called an allocation order scan, which basically reads a table’s data pages in allocation order rather than in index key order. In a nutshell, there’s a little bit more to it, but, you know, I don’t have days to talk about these things. That can be faster in some situations, especially if you are reading pages from disk, I think. And so if you have a data warehouse type workload where you have a lot of data that gets loaded, say, at night or in the morning or something, and then the data is fairly static during the day, you could use no lock hints on your big tables to encourage the optimizer to use an allocation order scan versus an index order scan, which can be faster.

So that would be one use case for it there. It would be very similar to if you changed your data warehouse to be read only once data loads were finished. That’s another way to achieve a similar result.
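
To make both options concrete, here’s a rough sketch. The table and database names (dbo.FactSales, DataWarehouse) are invented for illustration:

```sql
-- Option 1: a NOLOCK hint on a big, static-after-load reporting table.
-- NOLOCK permits allocation order scans, but also permits dirty reads,
-- so only use it where the data is not changing (or you don't care).
SELECT
    f.SaleDate,
    TotalAmount = SUM(f.Amount)
FROM dbo.FactSales AS f WITH (NOLOCK)
GROUP BY f.SaleDate;

-- Option 2: flip the whole warehouse to read-only once loads finish,
-- which gets a similar result without sprinkling hints everywhere.
ALTER DATABASE DataWarehouse SET READ_ONLY;
-- ...and back to read-write before the next load:
ALTER DATABASE DataWarehouse SET READ_WRITE;
```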

So that would be when I might use no lock. Aside from that, you know, I will judge you for using it.

What’s required for my query to use batch mode on my rowstore indexes?

Like what version or compat level is required? I remember a while ago, a fake join to a columnstore index was required. Are there any hints I can use?

So there are no hints that you can use that I’m aware of. And that would extend into saying there are no supported hints that I am currently aware of that would force batch mode to occur. The fake join to a columnstore index will still work sometimes.

So the tricks you can play on SQL Server: one of them is, of course, the fake join to a columnstore index. The other would be creating a nonclustered columnstore index on one of the involved tables with a filter that cannot possibly contain a row, like say WHERE Id = 1 AND Id = -1. No Id can be one and negative one at the same time.
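
As a sketch, the empty filtered columnstore trick looks something like this; dbo.Posts and the index name are stand-ins for whatever table your query actually touches:

```sql
-- No row can satisfy Id = 1 AND Id = -1, so this index stays empty,
-- but its mere presence can make the optimizer consider batch mode
-- for queries against the table.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode_trick
ON dbo.Posts (Id)
WHERE Id = 1 AND Id = -1;
```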

So there would be no rows in the filtered index, but you would still have a columnstore index on the table that might encourage the optimizer to use batch mode in some places. The tricks you can use, I’ve found, are not baked as deeply into the optimizer as when batch mode on rowstore happens naturally. Like, sometimes you’ll see, when you do one of the tricks, that you’ll still read from tables in row mode.

But when batch mode on rowstore happens, you can read from rowstore indexes using batch mode. So there are some subtle differences that can affect how effective the batch mode is in the execution plan. But to get to your actual question, you need to be on Enterprise Edition.

It does need to be SQL Server 2019 or better. And you do need to be in compat level 150 or higher. If you have control of your queries and you meet most of those requirements, you can use the OPTION USE HINT syntax.

And you can use the hint for query optimizer compatibility level 150, or 160, or, if you’re in the cloud or something, 170, in order to get a higher compat level for the query than what your database is currently set to, which can allow for additional batch mode stuff to kick in. Batch mode on rowstore itself has a bunch of heuristics. It looks at things like the size of the tables, the complexity of the query, the joins, and everything like that.
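
A minimal sketch of that hint syntax; it’s worth checking sys.dm_exec_valid_use_hints on your build for the exact hint names available, and the dbo.Posts query is invented for illustration:

```sql
-- Run one query at a newer optimizer compatibility level than the
-- database default, which can let batch mode on rowstore kick in.
SELECT
    p.OwnerUserId,
    PostCount = COUNT_BIG(*)
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (USE HINT ('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_150'));
```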

And it makes a runtime decision about whether to use batch mode on rowstore for your query when that happens. Like I said, you can use various tricks to encourage it, but those various tricks often do not get batch mode across the breadth of operators that batch mode on rowstore does. All right. Next up.

Hi, Erik. I admire all the work you do for the community. Thanks a lot. You are one of the great minds of SQL Server. That’s very kind of you.

This is not a question, I just want to point that out, but it is very kind of you to say that, and I appreciate the sentiment there. So thank you.

I think you’re also a great mind, but I’m not sure in what community you are a great mind. But I am sure someone benefits from you somewhere. So good job. All right.

Oh, here’s a fun one. What happened to the old sqlperformance.com website? They just suddenly stopped posting, but the authors are still around and authoring in other places on the interwebs. Well, this is another SolarWinds special.

You see, when SolarWinds purchased SQL Sentry (or SentryOne, or whatever it was called at the time), and the SQL Sentry monitoring tool was part of that package, of course, two things happened. Both the SQL Performance site, which SQL Sentry had hosted and paid people to blog on for years, and the SQL Sentry monitoring tool suddenly stopped being a monetary priority. So again, as I’ve said before, someone at SolarWinds should be in jail for neglect of these resources for the SQL Server community.

But in a nutshell, that’s what happened. You know, I guess the authors there were not interested in blogging there for free and all sorts of things on the site have deteriorated. Code examples are no longer formatted well.

Everything is a nightmare. I do know that Aaron Bertrand has migrated… well, no, that was his posts from SQLblog that he migrated.

I don’t know if Aaron has migrated any of his posts from SQL Performance elsewhere, but Paul White has moved all of his posts to his personal website, sql.kiwi. That’s sql.kiwi. Any post that Paul had on the SQL Performance site is hosted there with a very similar URL, just at the sql.kiwi domain.

And I believe for some of them, he has updated the contents to reflect newer changes in SQL Server where applicable, I suppose. Anyway, that’s about what happened there. It is a shame.

You know, that’s no longer an up to date resource for things and it’s just sort of withering on the vine. But, you know, there’s still some good stuff there from back when people cared about things. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video. I think. Can never be quite sure. But I think that’ll happen.

But, you know, maybe I’ll just get drunk and decide to retire. Who knows? It’s still early.

All right. Thank you for watching.


Learn T-SQL With Erik: Inline Table Valued Functions



Video Summary

In this video, I dive into the world of inline table-valued functions in SQL Server and explore why they might be a better choice than scalar or multi-statement table-valued functions for certain scenarios. I explain that these functions are essentially queries wrapped in a function call, much like views but with parameters, which can lead to more efficient execution plans due to their ability to be inlined directly into the query. I also highlight how inline table-valued functions avoid some of the performance pitfalls associated with scalar UDFs and multi-statement TVFs, such as preventing parallel execution and automatic inlining restrictions. By walking through examples and comparing them with other types of functions, I demonstrate their potential benefits and when they might be most useful for improving query performance.

Full Transcript

I don’t want to alarm any of you, but I have taken 10 milligrams of creatine today and I can now see the future. So what I predict is that we are going to talk about inline table valued functions today. Crazy, huh? Inline table valued functions are, of course, the one kind of function that is really only as bad as the query you put in it and the indexes that you have to support that query. There are many side effects of using scalar user defined functions and multi-statement table valued functions, which we talked about over the last couple of days, that would hopefully dissuade you from using them anywhere performance matters to your queries. Inline table valued functions really don’t have those side effects. They are just queries. They’re like views that can accept parameters. So that is an important distinction there. You know, you can put a query in there, but what people tend to put in views is a whole other matter. Inline table valued functions share that similarity: you can put a query in there, but the quality of that query depends on the author. Anyway, we’re going to talk more about that in a moment.

But before we do, we now pause for this commercial break, where if you look down in the video description, you will find all sorts of ways that you can give me money and interact with me for free. If you would like to hire me for consulting, great. I’m available. I am a consultant. I consult. There’s probably all sorts of cool Latin words for that that I’m not aware of. You can also buy my training, performance tuning training, or if you wanted to, you could buy the full course that encapsulates all of the material and more that we have talked about in these YouTube videos. You can also become a subscribed member of the channel where you can give me a little tiny bit of money every month just to support these videos. You can, for free, ask me office hours questions. And of course, if you enjoy this content in any way, shape, or form, you may like, subscribe, and tell not one, but all of your friends at once.

Send out a mass email and say, Hey, everyone. This is the best SQL Server channel on the internet. Wild, right? Anyway, things I am going to be doing outside of the home this year. Well, I gotta go away a little. The nice folks at Redgate were kind enough to say, Hey, Erik… I forget what they said. Anyway, I’ll be in Dallas September 15th and 16th, and Utrecht, the Netherlands, October 1st and 2nd.

I’ll be in the past data community summit in Seattle, November 17th to 21st, where I will have two joy-filled days of pre-cons with Kendra Little, talking about a lot of T-SQL matters and all the T-SQL that matters. I’m going to work on that pitch line a little bit. Still workshopping things.

Anyway, let’s talk about inline table-valued functions here. We must get to SSMS. So, where scalar UDFs would say something like RETURNS and a data type here, and multi-statement table-valued functions would return a defined table variable here, this just says RETURNS TABLE, right? And this is not a physical table.

One thing that’s important to understand about queries in SQL Server is that any query really returns tabular results. So, that’s why when you, say, use a derived table or a CTE or some other kind of table expression, including like a values clause, once you’ve sort of selected from that, you can nest that from in another from, and you can like manipulate the data or filter the data further with further froms and where’s and joins and whatnot.

So, this just returns a tabular result. Of course, saying returns tabular result is a bit too verbose for even Microsoft’s team of developers. So, it just says returns table.
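
For comparison, here’s roughly what the three return signatures look like; the function names and trivial bodies are invented for illustration:

```sql
-- Scalar UDF: returns a single value of a declared data type.
CREATE OR ALTER FUNCTION dbo.AddOne_Scalar (@i integer)
RETURNS integer
AS
BEGIN
    RETURN @i + 1;
END;
GO

-- Multi-statement TVF: returns a defined table variable.
CREATE OR ALTER FUNCTION dbo.AddOne_MSTVF (@i integer)
RETURNS @out TABLE (result integer NOT NULL)
AS
BEGIN
    INSERT @out (result) VALUES (@i + 1);
    RETURN;
END;
GO

-- Inline TVF: just says RETURNS TABLE; the body is a single query.
CREATE OR ALTER FUNCTION dbo.AddOne_Inline (@i integer)
RETURNS TABLE
AS
RETURN
    SELECT result = @i + 1;
GO
```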

But keep in mind, it’s not a physical table. It’s just a tabular result. And what we can do with these functions is get away with things that we maybe could not get away with if we were to write a scalar UDF. So, if you remember the video from a couple days ago, I talked about scalar UDFs and how the non-deterministic built-in functions, like SYSDATETIME, GETDATE, and that whole family of things, would prevent the automatic inlining of a scalar UDF.

We do not have that restriction when we create an inline table-valued function. Why that is a restriction, I’m not sure. But what I can tell you is that this is perfectly capable of being inlined when it is written as an inline table-valued function that just returns a table.
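
A hedged sketch of that difference; the function, and the dbo.Posts table and CreationDate column, are made up for the example:

```sql
-- SYSDATETIME() would block automatic inlining in a scalar UDF,
-- but an inline TVF with the same logic has no such restriction.
CREATE OR ALTER FUNCTION dbo.HoursSince (@d datetime2)
RETURNS TABLE
AS
RETURN
    SELECT hours_since = DATEDIFF(HOUR, @d, SYSDATETIME());
GO

-- Invoked with CROSS APPLY, the function body gets expanded
-- directly into the calling query's execution plan.
SELECT
    p.Id,
    hs.hours_since
FROM dbo.Posts AS p
CROSS APPLY dbo.HoursSince (p.CreationDate) AS hs;
```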

So, if we look at the estimated plan for this. Now, if you remember, with scalar UDFs and multi-statement table UDFs that return a table variable, when we ran those queries, there was this separate execution plan like down below, right?

So, this is our non-inlinable scalar UDF. If we were to get the estimated execution plan for the inline table-valued function, which is this thing, and we were to say, give me this estimated execution plan, there’s no separate execution plan for this, right?

It’s just the body of the function is inlined into the statement itself. There are, of course, going to be differences here where we turn on actual execution plans and run these. We’re going to see that, you know, this query spent 196 milliseconds scanning this and this.

And, you know, this compute scalar now has some time assigned to it where before it didn’t. But, because this was inlined, this compute scalar actually ran a function, so there’s time assigned to it. Now, of course, to compare these a little bit further, if we come over here, this top one, the non-inlined function, has this non-parallel plan reason, where the inline table-valued function does not have that.

So, inline table-valued functions do not have a couple of the properties that scalar UDFs do, which are the once-per-row execution and the prevention of parallel execution plans in the query that invokes the UDF. So, inline table-valued functions do not have some of the immediate baggage that multi-statement table-valued functions do.

And if we run… well, we already ran that query. Now, another important distinction between the two is when you have an inline table-valued function that touches data, right? So, for the sake of expediency, let’s just call what this thing does processing in memory, right?

Let’s just say that, because you’re not really touching anything physically. You’re not touching any physical objects on disk or in memory, right? I mean, memory isn’t really physical.

I mean, RAM is a physical object, but the stuff in memory isn’t a physical object. But let’s just call this in memory, because all this is doing is accepting some input and doing a little bit of math on it, right? So, this is just a lot of CPU processing.

We don’t have to go out and, like, read data pages from a table or index and, like, bring them into memory and do anything with them there before we start handing them out to queries. This function does, though, right?

This function goes out and touches two tables that live in our database posts and comments here. So, I think I already created this function, but we’ll do it once more for good luck. So, if we look at the scalar UDF from a couple days ago that does the same thing, remember, the estimated plan will show us what happens inside the scalar UDF, but the actual execution plan will not show us what happened inside the scalar UDF, right?

So, we’re running this scalar UDF plan, and it is taking its sweet time. And now we finally have an execution plan, but all we see is that we spent about 200 milliseconds here, but then on this compute scalar, we spent nearly 10 seconds, or a little over 10 seconds, actually, because it’s 10.288 minus 0.190, right?

And, of course, SQL Server keeps telling us that we must index the Users table to make this query faster, which we know is a bunch of hokum, because what happened in here was the bulk of the execution time.

Even though it was 0% of the cost, all the time in the plan was spent here, right? 10-something seconds. But an inline table valued function is, of course, a lot more honest with you about what it does.

If we just get the estimated execution plan here, we will see that we did indeed touch the post table, and we did indeed touch the comments table, right? We see all the work involved here.

And if we run this query and get the actual execution plan, we will see that not only is this query a lot faster, granted it was allowed to go parallel, which is part of the help, but even if we ran this at MAXDOP 1, it would still be faster than 10 seconds. I’m not sure how much faster.

But we are allowed to get a fully parallel plan here, and we do get a lot more honesty about what the execution plan did along the way, touching all these tables and indexes over to the side here. So, again, inline table valued functions are just like a view that accepts a parameter.

They are really only as terrible as the query that you put in them, and the indexes that you have to support that query. Remember, queries and indexes do tend to go hand in hand in that way.

So, we always do need to consider indexing when we are considering query performance, sometimes more than others. Anyway, that’s about it for inline table valued functions.

I do recommend, if you are in need of encapsulating some code in a functional way, that you do your best to choose these over scalar or multi-statement table valued functions, because they do not have the built-in performance baggage that those two types of functions do.

The inline table valued function does not force your query to run single-threaded or necessarily execute per row. Though, if you get a nested loops join plan, it is sort of per row anyway, but not in the same sort of ugly procedural way.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. This is getting published on a Friday, so I hope you have a great weekend. And, yeah, that’s about it for me.

All right, goodbye.


Learn T-SQL With Erik: Multi-Statement Functions



Video Summary

In this video, I delve into the world of multi-statement table-valued functions in T-SQL, explaining their structure and behavior while highlighting key differences from scalar UDFs. I explore how these functions operate within SQL Server 2017 and later versions, focusing on concepts like interleaved execution and deferred compilation. By walking through detailed examples and query plans, I aim to demystify the intricacies of multi-statement table-valued functions and their impact on performance tuning. If you’re interested in diving deeper into this topic or exploring other advanced T-SQL topics, be sure to check out my full course content available in the video description.

Full Transcript

Erik Darling here with Darling Data, finest SQL Server consultancy outside of New Zealand. I just got my BeerGut Magazine Magic Quadrant notification that I am a leader in SQL Server performance tuning consultancies. So we are proud to announce that. So, you know, we just can’t be stopped. We cannot be beat. We cannot be beaten. Even though there’s only me, we cannot be beaten. All right. Team effort. Anyway, we’re going to keep talking about T-SQL today. We’re going to expand our T-SQL horizons by talking about multi-statement UDFs. Multi-statement UDFs, of course, return a table variable. If you’ve been watching the content on my channel for any amount of time, you’ll know exactly what my feelings on table variables are. They are good for a few niche use cases. Aside from that, they are not of much use or value. But, you know, we find people using them in all the familiar ways and all the familiar places. And we are able to remedy that situation when we come across them. But multi-statement UDFs really have no choice but to use a table variable. So we need to talk about these today.

If you would like to buy this course: all the videos that I’m doing here are just small nuggets and morsels from the full course content. And if you would like to buy the full course, down in the video description, where this helpful array of fingers is pointing, you can find links to buy the full training, hire me for consulting, or become a contributing (via money) member of the channel. You can also do things that are free, like ask me office hours questions. And of course, if you enjoy this content, please do like, subscribe, and tell a friend or two. It doesn’t have to just be one. Tell only one and you’re not maximizing your efforts, and this MLM is never going to get off the ground if you don’t maximize your efforts, is it? All right. Anyway, if you would like to see me out live and in person, the nice folks at Redgate have decided to pull me from my wine cellar and drag me out to various and sundry locations across the world: Dallas, September 15th to 16th; Utrecht, Netherlands, October 1st and 2nd; and Seattle, Washington, November 17th to 21st.

Quite notably, in Seattle, I will be doing two days of T-SQL pre-cons, some of it based on this material, with Ms. Kendra Little, and it will be the best two days of your lives. All other days will pale in comparison to those two days. So you can come see me and we can compare how pale we are together. Anyway, let’s talk about multi-statement table valued functions in T-SQL. We must go to the correct spawn of SSMS in order to engage with this material. So let’s do that. And let’s talk a little bit about multi-statement table valued functions. Now, back in 2017… imagine where the time goes. SQL Server 2017. It feels like it was just yesterday. It was just one plague ago. When SQL Server 2017 was around, there was a sort of precursor feature to a later feature, called interleaved execution for multi-statement table valued functions: when you invoked a multi-statement table valued function, rather than just guessing one row or a hundred rows, depending on which cardinality estimation model you’re using, SQL Server would wait until the function had executed and then use its table cardinality to drive some of the cardinality estimates for the rest of the query.

Not a bad start, but, you know, we still don’t get column level statistics histograms with table variables, no matter if they are used in multi-statement table valued functions, if they are used normally as declared tables, or even if they are used as table valued parameters. I suppose we should talk about table valued parameters at some point, but not today. Not today, buddy.

But the precursor, the feature that this was a precursor to was, of course, table variable deferred compilation, where SQL Server would now wait for a table variable to be fully populated, and then it would use table cardinality for the table variable in order to inform various plan choices. But again, still no statistics, still no histograms.

So even though we knew how many rows were in the table variable, we still did not have any idea of the contents of them. But this is what a pretty typical multi-statement table valued function looks like. It returns a table variable with a set definition, and that’s about it.

Between the begin and end, you can put a lot of different stuff in there. There are the usual set of restrictions on functions where you can’t do a lot of side-effecting stuff, execute stored procedures, yada, yada, yada. But today we’re just going to look at a very basic single insert into a table variable and some things about the query plans.
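
As a reference point, a typical multi-statement TVF with one insert might look like this; the names and the aggregation are invented for illustration:

```sql
-- Everything between BEGIN and END populates the declared table
-- variable, which is what the function returns to the caller.
CREATE OR ALTER FUNCTION dbo.TopPosters (@minimum_posts bigint)
RETURNS @posters TABLE
(
    OwnerUserId integer NOT NULL,
    PostCount   bigint  NOT NULL
)
AS
BEGIN
    INSERT @posters (OwnerUserId, PostCount)
    SELECT p.OwnerUserId, COUNT_BIG(*)
    FROM dbo.Posts AS p
    GROUP BY p.OwnerUserId
    HAVING COUNT_BIG(*) >= @minimum_posts;

    RETURN;
END;
GO
```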

All right. So the first thing is just like scalar UDFs. When you get, say, an estimated execution plan for a multi-statement table valued function, the query that calls it will only have some sort of hidden information about the user-defined function itself, right?

You’ll see the table valued function with the name and then a table scan where data is pulled out of the function. The query plan won’t always look exactly like this. We’ll talk about why in a moment.

But then, down below, when you get the estimated plan, you’ll also have one additional estimated execution plan for the function body itself. You can see this is where we hit the post table, do some stuff with the row numbers, some filtering, yada, yada. But up here, we did not see any direct reference to the post table.

This is all happening within the body of the function. If we look in the properties here, we will see the reason why the query plan looks like this is because we have this contains interleaved execution candidates true attribute in the plan XML. That will not be there for all multi-statement table valued functions, of course.

Some of them, we will be denied access to that feature. But when we run the query and get the actual execution plan, well, this query does take a little bit to run. But the actual execution plan just shows us this.

And this is a little funny because we don’t see any time in here at all, even though the query took about four seconds to run. If we rerun this, I believe this is where we’ll see the operator times show up. And here we’ll see 4.6 seconds here and 4.6 seconds here.

So this is a very tricky thing, is when interleaved execution is used in your multi-statement table valued function, the first execution where you gather an actual execution plan will have no operator times for it. The second execution will.

So a little bit of a tricky troubleshooting thing there. So just be very mindful of that when you are query tuning. But notice that for this, we don’t get that separate execution plan down here where we had one before. This is the exact same thing like with the scalar UDFs.

All the things that happen inside the function are hidden inside these operators in the plan. Well, mostly this operator in the plan. But this one is where data was pulled out of the multi-statement table valued function.

We see the two rows leave here. So that’s where, like, the work is all sort of in here, but the stuff that gets pulled out is here. The table scan is, of course, just the definition of the table that we have in here.

So that’s fun. And this gets, of course, particularly tricky when we want to do something with a query that calls a UDF and joins it to something else. Where, you know, now we just have this very simple looking execution plan up here with absolutely no references to all the work that occurs in here.

Now, since this type of function at some point has to insert into a table variable, what we’re going to see is some sort of non-parallel plan reason. The non-parallel plan reason, if you are on SQL Server 2022-ish bits or better, will be table variable transactions do not support parallel nested transaction, which is a nice way of saying you can’t use a parallel execution plan when you modify a table variable, and that includes inserts. That is true even inside of these functions.
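
You can see the same restriction outside of a function with a plain table variable insert; this sketch assumes a dbo.Posts table of any interesting size:

```sql
-- The insert below always runs on a serial plan; the plan XML will
-- carry a NonParallelPlanReason attribute explaining why.
DECLARE @t TABLE (Id integer NOT NULL);

INSERT @t (Id)
SELECT p.Id
FROM dbo.Posts AS p;
```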

Of course, you can’t see that when you get the actual execution plan, because the query plan for the function body is not shown in there, for reasons very similar to the scalar UDF: for every row that this thing has to process, we pass data in and get data out. So if we get the actual execution plan for this thing, it’ll take a little bit longer. How long?

Who knows? Who can ever tell? Right? Could be a very long time. This function doesn’t seem very quick to me. Perhaps there is some indexing work that we could do with this function to speed it up.

The world may never know. Maybe an index on the post table would help us out here. Right?

Maybe this is just too painful. Oh my God. 27 seconds. But if we go and look at this: notice that for this query, we don’t have that sequence operator in the query plan, which means that we do not have an interleaved execution function in here. Notice that over here, we just don’t have that, what do you call it?

We don’t have the contains interleaved candidate attribute. It’s just not there. Right?

It’s not there saying false. It’s just not there. Now we just get this generic non-parallel plan reason: could not generate valid parallel plan. But when we had the estimated plan for the function body, we could see it was because of the table variable. So now we just see that generic reason in the higher level query plan there.

And we can see that we spent 27 seconds inside of here. Of course, this doesn’t look like very much of a big deal in the plan itself. Like if you were to be, if you were looking at, say, an estimated execution plan or, you know, you were like, you know, something like you got it from the plan cache or from query store where the actual runtime metrics aren’t included in the plan.

You would see this 0% cost and you would think nothing of it, wouldn’t you? You would say, my goodness, 97% of the cost is scanning the clustered index of the users table.

We have to do something about this. We must remedy this. But no matter what indexing you do up here, it is not going to help the 27 seconds that we spend in here. It’s really only by getting the plan for the function body that we can make any sort of reasonable assessment of what we might be able to do in order to tune this query using the multi-statement table valued function.

Sometimes it is good enough to, say, add an index to support whatever the body of the function is doing. In this case, even just an index on the owner user ID column would be enough to get us somewhere. But, you know, we’re not going to do that here.
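
For reference, that supporting index would be a one-liner; this assumes, hypothetically, that the function body filters or groups on dbo.Posts.OwnerUserId:

```sql
-- A hypothetical index to support the function body's query.
CREATE INDEX ix_Posts_OwnerUserId
ON dbo.Posts (OwnerUserId);
```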

This isn’t necessarily about query tuning. This is teaching you about the T-SQL that makes up these various modules and how they might mess with your query plans and queries. So, multi-statement table valued functions: generally, you don’t want to use them.

Again, they have some niche use cases just like T-SQL scalar UDFs. But generally, one should attempt to avoid them at all costs if one cares deeply about performance. So, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video, maybe. You never know. Tomorrow’s never promised.

All that stuff. Anyway, thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Scalar UDFs



Video Summary

In this video, I delve into scalar UDFs in T-SQL, discussing their potential pitfalls and how they can impact query performance. I explore the differences between non-inlinable scalar UDFs and those that are inlined, using practical examples to illustrate the hidden work these functions can do when not inlined. By walking through an example with a non-inlinable scalar UDF, I demonstrate how it executes once per row, leading to inefficiencies that can be mitigated by adding parameters or rewriting the function as an inline table-valued function. The video also includes links for those interested in my training courses and upcoming events where you can interact directly with me.

Full Transcript

I’ve been listening to a lot of Tom Petty lately. I don’t know why. Anyway, Erik Darling here with Darling Data, joined today by Bats Maru, who has some thoughts and feelings on what we’re going to talk about. We are, of course, going to be doing some more T-SQL learning. These are all little tidbits and morsels from my larger course, Learn T-SQL With Erik. Or it might just be called Learn T-SQL; I forget how much I involve myself in the course name. But that’s okay. There are more important things, like content and accuracy, that I care deeply about. But before we talk about scalar UDFs: if you would like to buy any of my training, including the full Learn T-SQL With Erik course, all the links are available down in the video description. You can find all sorts of useful ways to interact with me, both by giving me money for things and by getting things for free from me, like asking office hours questions, or liking, subscribing, and telling a friend. So look down and you’ll find links for all the useful stuff in the world.

If you would like to have some useful interactions with me in person, the fine folks at Redgate are yanking me from the comforts of home and dragging me to various business class hotels: Dallas, Texas, September 15th and 16th; Utrecht, Netherlands, October 1st and 2nd; and Seattle, Washington, November 17th to 21st, where I will be attending various PASS events and talking about SQL Server stuff, including, but not limited to, two days of T-SQL pre-cons hosted by myself and Kendra Little. So you should come to those, because you won’t be disappointed like you would be if you went to any other pre-con.

It’s the tremendous disappointment: not just the fear of missing out, the FOMO, but the knowledge of missing out that you would feel by going to a different pre-con. It would just ruin your life. It would be a complete dark downward spiral for you from that day on. You would be able to just mark it on the calendar.

Life, graph, down. If you were making a tapestry of your life, it would just be black thread from that moment onward. Frayed at the edges.

All right. Anyway, let’s talk about T-SQL. Let’s talk about something. Well, we are in dark mode here, aren’t we? Coincidence?

I don’t know. But let’s talk about scalar UDFs. Now, scalar UDFs, of course, if you have spent any amount of time in your life dealing with T-SQL and perhaps Googling, or let’s just say Googling. There’s no one using Bing.

Let’s not lie to ourselves. You will undoubtedly have come across many articles, maybe even some by yours truly, about the perils of scalar user-defined functions in T-SQL code. And you may have even read about something added in SQL Server 2019 called scalar UDF inlining.

Of course, scalar UDF inlining has many restrictions and limitations that make certain forms of functions ineligible for it. If you would like to explore the full list, Microsoft has, I think, just about all of it documented. There may be some stuff we don’t know about, that we don’t see.

But, you know, look up scalar UDF inlining. You’ll see the Microsoft Learn or Docs or whatever they’re calling it these days page. And you’ll be able to see the full list of restrictions and limitations.

So here’s what we’re going to do. Since my copy of the Stack Overflow database is currently in compatibility level 160 (I am currently using SQL Server 2022), I need to play a slight trick on SQL Server so that this function is not eligible for scalar UDF inlining. The trick I am playing on SQL Server is incorporating one of these non-deterministic date functions in the function body.
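A minimal sketch of that trick might look like this; the function name and logic here are hypothetical, but the SYSDATETIME() reference in the body is the part that blocks inlining:

```sql
/* Hypothetical scalar UDF. The non-deterministic SYSDATETIME()
   call in the body makes it ineligible for scalar UDF inlining. */
CREATE OR ALTER FUNCTION dbo.post_age_in_days
(
    @creation_date datetime
)
RETURNS int
AS
BEGIN
    RETURN DATEDIFF(DAY, @creation_date, SYSDATETIME());
END;
```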

So SYSDATETIME(), GETDATE(), any of those functions would have the same effect of disallowing inlining for this code. And what I want to show you first is the estimated execution plan for this. When we collect the estimated execution plan, we get two things back.

We get back the query plan for the query. And then we also get back an additional query plan that shows us what the scalar UDF would have done. We will not see that with the actual execution plan for this query.

Because SQL Server does two things with non-inlineable scalar UDFs. The thing that immediately affects this is that the scalar UDF is not run once per query, since it is not inlined. It is run once per row that the query has to process.

If we look at this query plan, we’ll see an index scan, a sort, and then a compute scalar. The compute scalar is the operator responsible for executing our scalar UDF.

You can see that right here. These are the properties of the compute scalar. And we will see the name of the UDF in there.

So that is where the UDF was executed once per row that the query projected out. All right. So we do not see the actual execution plan for the scalar UDF because it executes once per row.

And if we executed that UDF for a thousand rows, we would not want to get back 1,000 execution plans, would we? No. No siree, Bob.

The second thing that non-inlinable scalar UDFs will do to us is prevent the calling query from using a parallel execution plan. Not that one is warranted here, but the query is restricted from using one even if one were warranted.

So those are the two big downsides of non-inlinable scalar UDFs. To prove that this UDF does, in fact, execute once per row: that query returned six rows, and we have six executions of the scalar UDF in the sys.dm_exec_function_stats view, which only collects data about scalar UDFs.
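If you want to check that for yourself, a query along these lines against the DMV shows per-function execution counts (the column choices here are just a sketch):

```sql
/* sys.dm_exec_function_stats only tracks scalar UDFs, so any rows
   here point at non-inlined scalar UDF executions. */
SELECT
    function_name = OBJECT_NAME(defs.object_id, defs.database_id),
    defs.execution_count,
    defs.total_elapsed_time
FROM sys.dm_exec_function_stats AS defs
ORDER BY defs.total_elapsed_time DESC;
```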

Ah, I didn’t want that. One way we can work around the UDF not being inlinable is, rather than have SYSDATETIME() in the body of the UDF, we can just add a third parameter to the function. And the third parameter will get used here in place of the SYSDATETIME() function.
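Sketched out, the inlinable rewrite just moves the non-deterministic call from the function body to the caller (names here are illustrative, continuing the hypothetical function above):

```sql
/* With SYSDATETIME() moved out of the body and into a parameter,
   nothing blocks scalar UDF inlining anymore. */
CREATE OR ALTER FUNCTION dbo.post_age_in_days_inline
(
    @creation_date datetime,
    @as_of datetime2
)
RETURNS int
AS
BEGIN
    RETURN DATEDIFF(DAY, @creation_date, @as_of);
END;
GO

/* Callers pass the non-deterministic value in from outside: */
SELECT dbo.post_age_in_days_inline(u.CreationDate, SYSDATETIME())
FROM dbo.Users AS u;
```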

And if we do that and pass in SYSDATETIME() from outside of the function call, then it will be perfectly inlinable. So if we run this query and look at the execution plan, it will look a little bit different. Rather than seeing the compute scalar with the scalar UDF in it, now we have some extra stuff in here.

And this is where the function was inlined into the query. All right, these constant scans and the nested loops join and stuff: this is where the body of the function ran, but inlined into the execution plan.

So now we have this little bit of additional information. If we get an estimated execution plan for this, we no longer have that second plan down here, right? We can only prove a negative so much, but there’s no second execution plan that shows us what the body of the scalar UDF would have done.

And if we go and interrogate the sys.dm_exec_function_stats DMV, it will be empty for the inlinable UDF. All right, so that is one peril of scalar UDFs: when they are not inlinable, they hide some work from you. Of course, the work that they hide from you is only obvious when the function touches user tables.

Right, so here you can see that in the body of this function, we touch the Posts table and we touch the Comments table. But we’ll only see that if we get the estimated execution plan, right?

So the query plan up here with the UDF in it just shows us a clustered index scan, a compute scalar, and a select. Then there’s the estimated execution plan for the function, right? This is our non-inlinable scalar UDF here.

This shows us two index scans of Posts and Comments, some parallelism, some stream aggregates, some compute scalars, and so forth. And when we execute this query and get the actual execution plan, we will once again not have the benefit of the execution plan for the UDF.

We will have absolutely no information that it went out and touched the post table and the comments table and scanned the clustered index. That will not be visible to us. All we’ll see is that we scanned the clustered index of the users table, which took zero seconds.

And then we spent 1.2 seconds in the compute scalar. And of course, the 1.2 seconds that we spent in the compute scalar was executing our user defined function once per row that came out of the original query, which for our results was just once, but that one execution took 1.2 seconds.

So think about that: if we returned 10 rows, we’re looking at maybe 12 seconds at that point. If we executed this 20 times, well, what’s 1.2 times 20? It’s a higher number, isn’t it?

Scalar UDFs are a particularly nasty piece of work, and given the number of limitations and restrictions on inlining, we will have work to do rewriting these as inline table-valued functions for many years to come. A really smart fellow named Sam Arch wrote a paper a little while back about scalar UDF inlining; I think it’s called something like “scalar UDF inlining: first start with outlining.”

And it was actually implemented in DuckDB. And what it did was it took cases where scalar UDF inlining broke like queries completely and fixed them. So it was a very, very cool paper.

So it’s Sam Arch, A-R-C-H. If you have the wherewithal to read a white paper about query optimization, I highly suggest checking that one out. Very, very cool thing.

Anyway, scalar UDFs: you can pretty much tell when someone has used one, because it just returns a data type like integer or date or datetime or varchar(max) or something. And when you see these in your queries, you better hope that they’re simple and that SQL Server can rewrite them. Otherwise, you’re going to have to hire a young and handsome consultant with reasonable rates to come and rewrite them for you.

Isn’t that a shame? Isn’t that something there? All right.

Anyway, thank you for watching. Hope you enjoyed yourselves. Hope you learned something. And I will see you in tomorrow’s video. I believe we’re going to be talking about multi-statement table-valued functions. And we’re going to learn a little bit about how the T-SQL that goes into those interacts with the queries and query plans that you see presented to you.

When you go trying to tune queries and whatnot. Anyway, thank you for watching.


A Little About ROW_NUMBER Filtering Performance in SQL Server



Video Summary

In this video, I dive deep into an intriguing SQL Server performance question that came up during yesterday’s office hours. Specifically, we explore why a `ROW_NUMBER() = 1` query performs differently from a `ROW_NUMBER() <= 1` query by analyzing their execution plans in detail. You'll see how the choice of operators and execution modes can significantly impact performance, with one query finishing in just over five seconds compared to another that completes in less than two seconds. I walk you through the nuances of windowing functions, batch mode vs. row mode operations, and how SQL Server optimizes these queries. Whether you're a seasoned DBA or just starting out, this video offers valuable insights into optimizing your T-SQL code for better performance.

Full Transcript

Erik Darling here with Darling Data, and I’m very excited. Very excited today. Hugely excited. Probably the most excited I’ve ever been. I’m going to be answering, in longer form, a question from yesterday’s office hours about why ROW_NUMBER() = 1 has different performance than ROW_NUMBER() <= 1. And of course, we will be looking at query plans. We’ll be looking at performance, and we will be answering this question in as much detail as I can muster off the top of my head. So good stuff there, right? All right. I was going to say something else funny here; now I forgot it. See, this is the problem with the top of my head: things just fly off it. It’s like frying an egg. Little bubbles. Anyway, if this sort of stuff is interesting to you and you think, gosh, this Erik Darling guy just might know a thing or two about SQL Server, maybe he could help me with SQL Server. Maybe he could do it. Maybe he can be the one to finally change my life. Well, there are all sorts of ways for me to accomplish that. And if you look down in the video description, you will find all of these ways for me to change your life. You can hire me for consulting. You can buy my training.

You can become a contributing member of the channel. And then, for free, you can ask me office hours questions. Also for free, you can like and subscribe and tell a friend about the magnificence and glory of this channel and its content, and of course the young and handsome consultant behind the majority of the content therein. If you would like to see a young and handsome consultant live and in person, well, golly and gosh. Won’t I just be a good friend? Here’s a handsome consultant, everywhere you want to be: Dallas, September 15th and 16th. Utrecht, the Netherlands, October 1st and 2nd. And Seattle, Washington, November 17th through the 21st. These are all PASS events put on by the magnificent, lovely, also glorious folks at Redgate, who were kind enough to say, hey Erik, we like you a little bit. Why don’t you come talk about these things? And I said, yes, yes, please. I’d love to go places and talk about things, because, you know, there’s only so much time one can spend trapped in a PowerPoint or in an SSMS tab. It starts to jar the brain a little bit. So with that out of the way, let’s talk about this ANY aggregate thing. Let’s make sure that I go to the right instance of SSMS. Now, it took me a little while to get this demo to the place where there was a nice noticeable difference in performance between these two queries. And part of it was having a slightly imperfect index key to support the windowing function that we’re going to be using. So that was the first thing. The second thing was getting this index to create in a reasonable amount of time by filtering out all these zeros, because there are a lot of zeros in the comments table.

And then of course, writing the query in a way where there would be a noticeable difference in performance between a query where the row number function is filtered to equals one, and a query where the row number function is filtered to less than or equal to one. All right, so we’re going to, well, we’re not going to create this. We’re not going to go do all this stuff over again, because that’s just silly. We are going to make sure that we have actual execution plans enabled, which we do. We can tell that via the lovely purple highlighting there. That is a nice touch. I do, I do like purple. If Erin Stellato had as much to do with picking the color purple for that highlight as I think she did, my kudos go out to Erin Stellato for her choice in purples, because that is a great purple. There’s a nice, there’s like a grimace purple. And that is, that is the perfect purple for me. That is purple rain to me. So let’s run these queries. Now what these, now just to get rid of the query results, because I hate waiting for query results. And the discard query results thing is a bit too heavy handed for me, because then I have to remember to turn it on and back off and on and back off. It’s all too much. We are going to be dumping the results of these queries into different, slightly different temp tables, because we don’t want to overcrowd a single temp table.

And let’s do this and look at the performance difference. We’re going to execute these and wait, oh, around about seven seconds total, I think, if I remember things correctly. There we go. All right, look at all those rows affected. See, sometimes having NOCOUNT off is nice, right? You can make sure that the same number of rows got affected. Look at that beautiful symmetry. Wow, we’re so good at this. All right. So that was indeed about seven seconds.
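For reference, the two query shapes being compared look roughly like this; this is a sketch assuming the Stack Overflow Comments table, since the exact demo queries aren’t reproduced in the transcript:

```sql
/* Equality predicate, and the window column is not projected:
   eligible for the ANY-aggregate / grouping transformation. */
SELECT x.PostId
FROM
(
    SELECT
        c.PostId,
        n = ROW_NUMBER() OVER
            (PARTITION BY c.PostId ORDER BY c.CreationDate DESC)
    FROM dbo.Comments AS c
) AS x
WHERE x.n = 1;

/* The same query with <= 1: the optimizer keeps the usual
   windowing plan shape instead. */
SELECT x.PostId
FROM
(
    SELECT
        c.PostId,
        n = ROW_NUMBER() OVER
            (PARTITION BY c.PostId ORDER BY c.CreationDate DESC)
    FROM dbo.Comments AS c
) AS x
WHERE x.n <= 1;
```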

And here is the first query plan. Now, if you’re familiar at all with SQL Server execution plans, you will immediately recognize that for a query with a windowing function in it, we are missing some potential operators. Usually, when there is a windowing function involved, you will see some combination of segment and sequence project. You may even see a window spool for some of them. Here we do not have those. Here, all we have is a stream aggregate. What this means is that SQL Server transformed the row numbering into a grouping operation. We did not actually generate a row number. We just grouped the whole thing. Now, there are some rules that SQL Server follows in order for this to happen. You must filter the windowing function with an equality predicate, I think that’s one of them, but there’s other, more important stuff, like you can’t be presenting the window function column in the results. So if I had the column x.n in here, which is what the windowing function is aliased to, then we would not see this transformation get applied. But you can tell from this plan that we just did a stream aggregate. Now, this all took around about five and a half seconds: 5.356 seconds, which is close enough to five and a half for me. Part of why this query took the five and a half seconds is that it happened entirely in row mode. We processed quite a lot of rows, all in row mode. We did not use batch mode for this. Other signs that we did not use batch mode are the parallel exchanges in the plan. These are often omitted to some degree when you have a query plan that uses batch mode: usually fully omitted if the plan is just about fully batch mode, sometimes only partially omitted if the plan is a mix of row mode and batch mode.
Another good sign is that we have a stream aggregate operator. The stream aggregate operator, of course, does not support batch mode; only the hash aggregate and hash join support it. Nested loops and merge joins also do not support batch mode. Now, if we come down here and look at the second query, well, let’s actually make this a little bit easier on the eyeballs. What do you say? I say that’s a great idea. This query finished in 1.6 seconds.

Hmm. That’s a meaningful improvement, isn’t it? I would say so. From about five and a half seconds to about one and a half. That’s pretty good, right? We saved about four seconds there, if my math is correct. And if you look at the various operators in here, you might notice some visual indicators that SQL Server used batch mode. The prime visual indicator is that we have a window aggregate operator. The window aggregate is a batch mode operator that SQL Server uses when you have a windowing function and batch mode is chosen as the execution engine for the query.

So if we look at the details of this operator, we will see that it executed in batch mode. We will see that the sort executed in batch mode. We will see that the window aggregate, of course, executed in batch mode, and even the filter operator filtered in batch mode. So, to answer the question from yesterday’s office hours about why performance is different: these are the types of things that you would want to look for in your query plans in order to ascertain why performance is different for you locally.

Like most things, the answer is in the execution plan. So you should look at them and take note of some of the details, take note of some of the operator times. I realize that there is nothing in the query plan that’s going to tell you that your windowing function was transformed to an ANY aggregate by SQL Server’s query optimizer. But this is just a piece of knowledge that you will have to stick on the shelf in your brain where you store your SQL Server knowledge, and come back to when you see this sort of thing happening. Anyway, I hope that answers your question in a little bit more detail. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video where I reckon, I do reckon, I do hereby reckon, we will talk more about learning T-SQL. So, hope to see you there.

Anyway, thanks for watching.


SQL Server Performance Office Hours Episode 33




Questions:

* What are your favorite sandwich condiments?

* First off, big fan. Also, why does <= 1 on a query with a ROW_NUMBER window function perform so differently than = 1? I find it’s a trick that almost forces SQL Server to compute all row number data before doing something else, like pushing down the predicate of a join if the window query is used as a join “table”. I tried IN (1, NULL) and > 0 and < 2 without the same results. Thanks!

* If you could change something about the world, not related to Perf Tuning SQL Server, what would it be?

* What do we need to be aware of when enabling snapshot isolation in a large production DB? Should we expect any downtime? Thanks!

* I enjoyed hearing you beg a viewer to switch from DPA to SQL Sentry. Can I hear you say more about what DPA does wrong? I haven’t used it in years.

To ask your questions, head over here.

Video Summary

In this video, I dive into some fascinating questions from the community during our office hours session. We tackled topics ranging from sandwich condiments to SQL Server performance tuning and database management best practices. Specifically, we explored why less than or equal to one in a query with row number window function behaves differently compared to equals one, delved into enabling snapshot isolation in large production databases without expecting downtime, and discussed the shortcomings of DPA as a monitoring tool compared to SQL Sentry. If you have any more questions or topics you’d like me to cover, feel free to ask during our next office hours session!

Full Transcript

Erik Darling here with Darling Data, and boy oh boy, do we have an exciting, exciting day planned for you. More exciting than you could imagine, because we are going to have an office hours. We’re going to answer some questions, five of them, this many, and we’re going to see what we got in the old mailbag today. Before we go and answer questions: if you would like to give me money in some meaningful way, there are all sorts of helpful ways to do that down in the video description below. You can hire me for consulting, buy some training, become a contributing member of the channel. Of course, you can do some stuff for free too, like ask me office hours questions. That doesn’t cost anything. That is a free service. And if you enjoy any or all of what I do here, you can of course like, subscribe, and, the most important thing, tell a friend. That is how communities work. That is how communities get built, isn’t it? You spread the good word. All right. I will be leaving the house a little bit over the next couple few months. Well, it is September, isn’t it? Oh boy, I gotta start working on some stuff. We have Dallas, September 15th and 16th; Utrecht, Netherlands, October 1st and 2nd; and Seattle, November 17th to the 21st. These are all PASS events put on by the nice folks at Redgate.

And I look forward to hopefully seeing you at any or all of them. Well, while we’re on the subject of anys and alls: with that out of the way, let’s do our office hours dance here. Let’s zoom on in and get things properly situated so that everyone can focus on these things correctly. All right. What are your favorite sandwich condiments? Well, this brings up an interesting question, which is what you define as a condiment. Of course, you have your traditional spreadable condiments like mustard, mayonnaise, and stuff like that, of which I really only have a preference towards mustard. My favorite sandwich would be a roast beef and Swiss with mustard. That’s about as good as it’s going to get. Then you have your relishes and your hots and other things like that. They all have their place. A well-made muffaletta sandwich is probably one of the better things in life. You know, my personal favorite sandwich is, of course, the peanut butter and onion on a nice potato bread. That’s about as good as things get for me.

But, you know, I think mustard would probably take the top spot there. Mustard is probably number one for me. Good, solid deli mustard, and we’re good to go wherever we go. Wherever we go, there we are. All right. Oh, well, that was a line, not a square. Thank you, ZoomIt, for betraying me once again. First off, big fan. Well, that blows. Ha ha ha. Why does less than or equal to one on a query with a ROW_NUMBER window function perform so differently than equals one?

I find it’s a trick that almost forces SQL Server to compute all row number data before doing something else, like pushing down the predicate of a join if the window query is used as a join table. I tried IN (1, NULL) and greater than zero and less than two without the same results. Thanks. Well, so I’m not sure why other attempts at this did not work in the same way. But two things here. One: I think you have earned yourself a full video explanation. So I’ll crack one of those out tomorrow.

But really, the thing is that when you use less than or equal to one, you prevent SQL Server’s optimizer from using an ANY aggregate transformation in the query plan. So, I did a video about this some time ago, and I forget the title of it, of course, because there are somewhere over 600 of these things. It’s hard to remember the names of them all. You know, I love my children, but there are too many.

But Paul White also has a good post on ANY aggregate transformations, where he talks about ways that SQL Server can take a windowing function and, under certain circumstances, transform it into an aggregation rather than producing the full segment, sequence project, window spool query plan that’s typical for a windowing function. So that’s probably what’s happening, but I will do a demonstrative video about this tomorrow. So you have that to look forward to.

I will postpone all my other important work and do that, because I care deeply about you. And, you know, as much as I do want people to buy my training, I cannot bear to just tell you to buy it and hope that you stumble upon the correct thing. If you could change something about the world, not related to perf tuning SQL Server, what would it be?

Oh, I mean, there’s some obvious stuff isn’t there. You get rid of cancer, you know, you get rid of egg salad, you know, you Thanos snap government employees. Uh, I don’t know.

Some other, I don’t know. I think, I think, I think that would, that would just about clear everything up for me. Yeah.

Um, yeah, that about does it. I don’t know. Protein should be free. Socialized protein, I guess. Or alcohol or cigarettes.

I don’t know. One of those things. Whatever makes you happy. You get to pick one thing in life and that one thing is free for you. Everything else you have to pay extra for. All right.

What do we need to be aware of when enabling snapshot isolation in a large production database? Should we expect any downtime? No, you shouldn’t expect any downtime. Enabling snapshot isolation is, of course, different from enabling read committed snapshot isolation in that you do not require exclusive access to the database.

When you enable read committed snapshot isolation, it is most beneficial to use the WITH ROLLBACK IMMEDIATE addition to the command in order to do that. But with regular snapshot isolation, you do not require any such thing. There could, of course, be some hiccups along the way.
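As a sketch, the two commands being contrasted here (database name hypothetical):

```sql
/* Snapshot isolation: no exclusive access to the database needed. */
ALTER DATABASE StackOverflow
    SET ALLOW_SNAPSHOT_ISOLATION ON;

/* Read committed snapshot isolation needs to be the only active
   session in the database, hence WITH ROLLBACK IMMEDIATE. */
ALTER DATABASE StackOverflow
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;
```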

You know, you might see increased CPU if you have a busy system and you start generating a lot of row versions right off the bat. If you have more of a slow-moving system, you probably wouldn’t see much of that. If you’ve enabled accelerated database recovery, you may see some growth in your user database.

Also in the transaction logs, since the persistent version store is fully logged. So you might see some transaction log growth in there. If your transactions suffer a lot of errors or aborts, or if you have certain types of transactions that do not qualify for accelerated database recovery to work on them, then you might see some additional transaction log growth in there.

But no, I wouldn’t expect anything like downtime. You might just see some increased usage and utilization and other stuff like that. But usually those things settle out over time.

If you leave things to their own devices for a bit, that is. But of course we are working with databases, and there is no such thing as a risk-free change. All of these changes that you make to your database involve trade-offs.

It’s up to you to make sure that the trade-offs you are willing to make are fair and square for your workload. So always test in a development environment first, and then beat the tar out of your development environment. Really smack it around.

All right. Last but not least here. I enjoyed hearing you beg a viewer to switch from DPA to SQL Sentry. First off, I don’t beg.

I instruct. I command. No, I’m kidding. I do really wish that people would step away from DPA, but that’s a personal thing. Can I hear you say more about what DPA does wrong?

I haven’t used it in years. Well, you’ve had some happy years then, haven’t you? All monitoring tools probably collect roundabout the same set of information, some more than others.

I do think SQL Sentry collects a bit more, since it is very specific to SQL Server, and DPA is more of a generalized database monitoring tool. Of course, you can use it on other platforms, which may appeal to some people, but does not particularly appeal to me. I think my biggest gripe with DPA is the web interface.

It is very difficult to navigate around and isolate actual problems. You find yourself just twiddling around through snapshots, and everything’s in a different place. The way it displays blocking and deadlocks just doesn’t work for me.

It just doesn’t mentally jibe with the way that I prefer things. Not to say that SQL Sentry is a perfect monitoring tool either. Certain aspects of the way that it displays blocking and deadlocks have a sort of unfortunate Visio weirdness to them.

There’s too many arrows to click through to see stuff, and having the deadlock graph constantly at the bottom is not terribly helpful for me. The reason why I originally wrote sp_BlitzLock, and the reason why I wrote sp_HumanEventsBlockViewer, is that a lot of the information about blocking and deadlocking really does make more sense when it’s laid out in front of you, rather than you having to click through a million different things.

The visual representations just don’t work well for me for those things. For things like query plans and other stuff like that, it works very well.

DPA uses some strange interpretation of Justin Pealing’s HTML plan viewer, which is woeful. I’m not saying Justin’s thing is woeful. I’m saying DPA’s weird implementation of it is quite woeful.

Things are mislabeled, and things that can be very useful when looking at execution plans are not shown in the tool. Often you have to download whatever execution plan DPA has saved off and open it in another tool of your choosing to view the query plan with more details. And of course, neither of these tools captures what I would call an actual execution plan.

SQL Sentry’s trace will capture some execution details about long-running queries. So sometimes you’ll be looking through the sort of Plan Explorer-esque query plan view when you open up a query plan in there, and some of the steps will have CPU and duration and stuff assigned to them, which is helpful, but you’re still not getting the actual execution plan with runtime metrics in there.

You just have some additional information at the trace level, like how long that query ran for, and CPU, and stuff. So there are helpful things in there, but it’s still not as good as getting an actual execution plan. Really, it’s mostly the way that DPA presents the data, and the steps that you have to take to dig in further with DPA in order to find problematic things.

To be able to come to some reasonable conclusion about what went wrong and what you can do to fix it. Also, if there’s one thing that I really hate about what DPA does, it’s the canned advice that’s in the tool. Whoever wrote that, I don’t think, knows anything about SQL Server.

It doesn’t seem like they have ever actually worked with SQL Server. It seems like they maybe got most of their information from very old blog posts and just copied and pasted some inherited old wisdom into the tool, which doesn’t really make much sense these days. So, that’s about it there.

So, I don’t know. I am quite anti-DPA as a monitoring tool. I still haven’t found anything that does what SQL Sentry does with the ease that SQL Sentry does it. And again, I do think that whoever is in charge of SQL Sentry at SolarWinds should be sent to some sort of federal prison for their neglect and malfeasance.

For not keeping that tool up to date with the more modern SQL Server information that is available. But that’s just me. Anyway, that’s about it for this Office Hours.

Thank you for watching. I do hope you enjoyed yourselves. I hope you learned something, and I will see you tomorrow, where I will do a full example of why you might see some of this stuff get weird. So, with that out of the way, I’m gonna go do something else.

Mainly, record that second video while it’s on my mind, so I don’t forget. And I will see you next time! Alright, thanks for watching, bye!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.