Query Plan Archives | Darling Data

Meet Darling: Free, Headless Fleet Performance Monitoring for SQL Server

Posted on July 24, 2026 (updated July 22, 2026) by Erik Darling


Watching one SQL Server is easy. Watching fifty is where monitoring vendors smell blood. The price is per server, per year, and it climbs every time your environment grows. Your reward for paying it: your performance data gets shipped to somebody else’s cloud, where you look at it through dashboards built by people who’ve never tuned a query in their lives.

Darling is my answer to that. It’s the new flagship edition of my free, open source SQL Server Performance Monitor. One SQL Server or five hundred, one product, no per-server tax, and your data never leaves your network.

What Darling is

Darling is a headless Windows service. You install it on one monitoring host, point it at your servers, and it collects around the clock. Nobody has to be logged in. Nothing gets installed on the monitored servers for its own storage.

It brings its own database. The installer bootstraps a managed PostgreSQL instance with TimescaleDB and runs it for you. There’s no repository server to stand up, no schema to deploy, no extra license to buy. Darling replaces the old SQL-Server-backed Dashboard edition, which needed a SQL Server of its own to store what it collected.

What it collects

36 collectors, the same shared library Lite uses: wait stats, query stats from the plan cache, Query Store, active query snapshots, blocking, deadlock graphs, execution plans, tempdb, memory grants and clerks, file IO latency, CPU, Agent jobs, server configuration, and more. Deltas are computed for you, so you see the work done between snapshots instead of staring at cumulative counters.

The store is yours

Everything lands in a PostgreSQL store you own. Not an API. Not an export wizard. Not a vendor data lake. Point any SQL client at it and query.

-- Top waits across the fleet, last 24 hours
SELECT server_name, wait_type, sum(wait_time_ms) AS total_ms
FROM collect.wait_stats
WHERE collection_time >= now() - interval '1 day'
GROUP BY server_name, wait_type
ORDER BY total_ms DESC LIMIT 10;

Views are included for the common questions, but you’re not limited to them. It’s just tables. Join them however you want, feed Power BI, export to Excel.
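
As another sketch of what querying the store directly can look like, the PostgreSQL query below rolls one wait type up by hour. It uses only the collect.wait_stats columns that appear in the fleet query earlier in this section, and it assumes wait_time_ms holds per-snapshot deltas. PAGEIOLATCH_SH is just an example wait type.

```sql
-- Hourly totals for one wait type, per server, over the last 7 days.
-- Assumes collect.wait_stats stores per-snapshot deltas in wait_time_ms.
SELECT
    server_name,
    date_trunc('hour', collection_time) AS collection_hour,
    sum(wait_time_ms) AS total_ms
FROM collect.wait_stats
WHERE wait_type = 'PAGEIOLATCH_SH'
AND   collection_time >= now() - interval '7 days'
GROUP BY
    server_name,
    date_trunc('hour', collection_time)
ORDER BY
    server_name,
    collection_hour;
```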

Alerts without a babysitter

A real-time alert engine runs continuously: blocking, deadlocks, poison waits, long-running queries, tempdb space, long-running Agent jobs, high CPU, and servers that stop answering. Since Darling is headless, alerts go out by email and webhook (Slack, Teams, or any endpoint you point it at). Emails carry the query text, blocking chains, and deadlock XML. When a condition clears, it tells you that too. For the alerts that cry wolf, there are mute rules by server, metric, database, query, wait type, or job, with optional expiration.

An MCP server that can do things

Both editions ship a built-in MCP server, so an AI assistant like Claude can read your performance data directly. Darling’s can also write. An agent can build Custom Views, tune alert thresholds and mute rules, and onboard an entire fleet, all over MCP. Standing up monitoring for twenty servers is a sentence, not an afternoon.

Watch it from a browser

An optional read-only web dashboard shows the fleet from any browser, no install. It’s off by default and binds loopback-only until you deliberately expose it, token-gated and scoped to the network range you allow.

Custom Views and notebooks

This is the part the desktop app never had. Compose your own views over the collected data: pick the metrics, filters, grouping, and charts, or build notebook pages that mix charts with commentary. Make them by hand, or have an AI make them for you over MCP.

How to get it

Download the Darling zip, run the scripted install from an elevated prompt, and it bootstraps the Postgres store and starts the service. Add servers by pasting a list into the viewer, or over MCP. SQL Server 2016 through 2025, Azure SQL Managed Instance, Azure SQL Database, and AWS RDS are supported. Every release is signed.

Download Darling

There’s no paid version, and no locked features. If your compliance team needs a vendor agreement and a support contact on file before anything touches production, there’s a support subscription for that. The software is identical either way.

And Lite is not going anywhere

Lite, the desktop app with its local DuckDB store, is still here and still supported. Same 36 collectors, same shared brain. Darling is the answer for fleets and headless monitoring. Lite is the answer for the one machine you’re sitting in front of. Pick the one that matches your day.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Don’t Be Slack With Data Types

Posted on July 23, 2026 by Erik Darling


Chapters

00:00:00 – Introduction to Data Type Mismatch Issues
00:02:45 – Using Date Functions with Incorrect Data Types
00:06:27 – Plan Shape Catastrophe Example
00:10:38 – Martin Smith’s VARCHAR(50) Demo
00:11:29 – Conclusion and Next Steps

Full Transcript

Erik Darling here with Darling Data, and in today’s video we are going to talk about how you should not be slack with data types. By that I mean you should always very carefully match your data types. It can be important both for performance and logical correctness when you write your queries. And we are going to look at some examples around datetime and datetime2 and stuff like that.

So with that out of the way, if you like this material, it is available as a whole video course and there is a link down in the video description where you can pick it up today at this very second. It is just available to you for $100 off down below. There are also other helpful links there if you would like to engage with me in other ways.

You can hire me for consulting. You can become a supporting member of the channel for $4 to $10 a month. It is a heck of a way to say, here is a little tip jar.

Say, thanks Erik for all the free stuff. You can ask me office hours questions. Keep that gravy train rolling. And of course, I always do appreciate as the channel grows. So if you would not mind doing some level of liking, subscribing, and telling a friend, I would be momentarily grateful for you.

Not eternally. Just… Just a couple of seconds.

Hey, look, the number went up. That is cool. Back to work. If you would like a free SQL Server performance monitoring tool, I have got one. I have been working on it for, oh, I guess, six, seven months now.

So things are maturing pretty nicely. Certainly competitive with all the paid T-SQL, SQL Server monitoring tools out there in the world. And of course, the price tag on mine is way better.

So, you know, if you are curious, you can go download it and start testing it out. And of course, if you run into any issues, have any questions, have ideas that you would like to see in the monitoring tool, just throw them up on GitHub. And my robot companions and I will respond just as quickly as we can.

They do not sleep, but I do. But anyway, let us continue our voyage through the allergic heat death hell of June.

And I have got the ACs blaring, absolutely blaring. That is why I am not shiny. All right.

So I have created an index here on the creation date column in the comments table in the Stack Overflow database. And we are going to look at the difference in performance when we are slack with data types versus when we are not. So this is the current sort of method that you would use to flatten dates.

And notice that I have to do a little bit of extra work here because I am just using passed-in string values. If you are writing strings, you should be careful about making sure that your strings are unambiguous and formatted in a way that leaves SQL Server no guesswork about them. Make sure that you are using the style of CONVERT that you need: 112 for dates, and I think it is 121 for datetime, datetime2, and stuff like that.

So we want to make sure that we are doing these things because they help SQL Server make the best possible choices. And they keep you from running into weird issues with ambiguous data. If you ever have to internationalize your audience, you will find very quickly that dates become a very murky subject.

And I am not just talking about time zones, but we will talk about time zones later. You have got that to look forward to. Woohoo.

High five. Time zones. Nothing better. Yeah. But using this method of date flattening and converting specifically to date times, we get a nice, tidy, easy seek into our index on the comments table. And all is fairly well with this query.
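
A minimal sketch of that pattern follows. The dbo.Comments table and its datetime CreationDate column come from the Stack Overflow sample database. The literal values and variable names are made up for illustration and are not the ones from the video.

```sql
/* Unambiguous string-to-date conversions using explicit CONVERT styles. */
DECLARE
    @start_date datetime = CONVERT(datetime, '20130101', 112), /* style 112: yyyymmdd */
    @end_date datetime = CONVERT(datetime, '2013-02-01 00:00:00.000', 121); /* style 121: yyyy-mm-dd hh:mi:ss.mmm */

/* Both sides of each comparison are datetime, so an index on
   CreationDate can be used for a plain range seek. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.CreationDate >= @start_date
AND   c.CreationDate <  @end_date;
```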

Now, like I said in the last video, DATETRUNC returns a dynamic data type. So if we do not convert this from what is obviously a datetime2, based on the number of fractional-second digits that we have here, SQL Server will return it as a datetime2. And when we start comparing datetime2 values to datetime columns, the performance does get a little bit worse here.

This isn’t like the end of the world. But notice that this plan does look a little funny. All right.

It is a constant scan. We have got a compute scalar. And then we go into a nested loops join this many times to go find the rows that we care about. This is because we are being slack with our data types. We lose that nice, tidy seek plan and we get this plan with all this extra stuff to it.

We are essentially creating a row set and joining that over and over again. That is not fun. That is not the kind of execution plan you want to see.

But if we are careful with the data types, we get a nice, tidy, easy plan. We are taking advantage of a function that arrived in SQL Server 2022, though I am actually using brand-new SQL Server 2025 at this point.

I think it is finally enough cumulative updates in where I feel pretty safe running demos and everything on it. But if we run this, we will go back to our nice, tidy seek plan because we are converting to a datetime up here.

Good for us. We at least get back to the seek plan that we wanted before. So that is exactly what we want to see. Similar caution does need to be shown when assembling a datetime or datetime2 from parts.
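
To make the DATETRUNC comparison concrete, here is a rough sketch of the two predicates, assuming SQL Server 2022 or later and the same dbo.Comments table from the Stack Overflow database. The variable and its value are illustrative. Because the input is a datetime2(7), DATETRUNC hands back a datetime2(7) unless the result is converted again.

```sql
DECLARE
    @some_date datetime2(7) = CONVERT(datetime2(7), '2013-06-15 12:34:56.1234567', 121);

/* Slack: a datetime column compared to a datetime2(7) expression. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.CreationDate >= DATETRUNC(YEAR, @some_date);

/* Matched: the truncated value is converted to datetime first. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.CreationDate >= CONVERT(datetime, DATETRUNC(YEAR, @some_date));
```

Comparing the actual execution plans for the two queries is the way to check the shapes described here. The exact operators can vary by version and compatibility level.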

You have a variety of functions at your disposal to assemble a date from parts. You have DATEFROMPARTS, DATETIMEFROMPARTS, and DATETIME2FROMPARTS. And if you are just throwing some strings into those, things can get rather perilous for your queries.

Just a couple of examples here. If I run these, this is the first one that is using DATEFROMPARTS. And we are back to this sort of weird plan with the Constant Scan, Compute Scalar, and the Nested Loops join over to here.

I mean, it takes like 300 milliseconds, which again, this is not the end of the world. This is not supposed to show you a drastic performance change. But it is there to show you the plan shape and what you want to look out for in your queries when you want to get things right.

Notice down here, when we use date time from parts, we are back to our simple seek plan. We get a parallel plan from this, which is good given the number of rows that we are hitting. You can sometimes get parallelism with these, but the optimizer support for it is not so great.
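
The two from-parts variants might look something like the sketch below. It reuses the dbo.Comments table and passes integer literals as an assumption, since the exact inputs from the demo are not shown in the transcript.

```sql
/* DATEFROMPARTS returns date: mismatched against the datetime column. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.CreationDate >= DATEFROMPARTS(2013, 1, 1);

/* DATETIMEFROMPARTS returns datetime: matches the column's type.
   Arguments are year, month, day, hour, minute, seconds, milliseconds. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Comments AS c
WHERE c.CreationDate >= DATETIMEFROMPARTS(2013, 1, 1, 0, 0, 0, 0);
```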

But one thing that I want to show you is this plan, which is a real catastrophe. What I want to show you is what SQL Server is kind of doing when you mismatch data types badly, especially dates.

So this is sort of the plan shape that I warned you about before, where you’ve got Constant Scan, Compute Scalar, Merge Interval, and then a Nested Loops join to go find stuff.

And this is because down in here, we created our table with a datetime data type for the column. We converted that column to a date. And then we asked where it was between a date and a datetime2(7).

So SQL Server does have all sorts of stuff to do. This stuff doesn’t show up if you just look at the graphical query plan, but if you open up the plan XML, it is there.

You’ll see stuff like this in query plans where SQL Server has to do extra work: GetRangeThroughConvert, GetRangeWithMismatchedTypes. These are optimizer rules that SQL Server has built in to try to help queries that use mismatched data types do the right thing.
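
A sketch of the kind of query being described is below. The table, column, and variable names are hypothetical. Only the data types follow the description: a datetime column, converted to date, compared to a date and a datetime2(7).

```sql
/* Hypothetical table: names are made up, only the types follow the description. */
CREATE TABLE dbo.DateMismatch
(
    id integer IDENTITY PRIMARY KEY,
    some_date datetime NOT NULL,
    INDEX ix_some_date (some_date)
);

DECLARE
    @range_start date = '20260101',
    @range_end datetime2(7) = '2026-02-01T00:00:00.0000000';

/* datetime column converted to date, then compared to a date and a datetime2(7). */
SELECT
    records = COUNT_BIG(*)
FROM dbo.DateMismatch AS dm
WHERE CONVERT(date, dm.some_date) BETWEEN @range_start AND @range_end;
```

One way to spot the extra work is to search the plan XML for GetRangeThroughConvert or GetRangeWithMismatchedTypes.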

You can see where SQL Server is converting stuff and all that. So it’s extra effort for the optimizer to have to deal with your queries. This is a very interesting problem that my friend Martin Smith ran into with strings.

If you go to this link, you’ll be able to see the issue that Martin opened up here. Martin Smith, very smart fella. Incredible with SQL Server stuff.

One of my absolute heroes. And what he found was a very interesting problem where we have a varchar(50) column collated like so. It is nullable.

And then what we would do is insert 20 null values into them. Get a count from the table. And then we would select another count from the table. And we would say where problem child equals this or problem child is null.

Now, this one is actually a little bit perilous because I verified this on SQL Server 2022. But like I said, I just started using SQL Server 2025. If it doesn’t repro here, good job, Microsoft.

If it does, you stink. Just kidding. You’re busy. You got a lot of Fabric weaving to do. So let’s see if this thing still repros on SQL Server 2025. Let’s see.

It does. So we get so when we do a regular count from the table, we return the 20 rows, right? Because, I mean, just like you can see, I am very much limiting this to 20 rows going in, right?

Top 20. Generate series 1 through 20. When we do a count from the table, SQL Server correctly says there are 20 rows in there. But when we say where problem child equals, we have this Unicode string in there, right?

So we have an implicit conversion. Or it’s null. SQL Server counts 40 rows instead of 20 rows. Very interesting stuff.

And, of course, the execution plan looks just like some of the other execution plans. Well, I guess there’s an extra operator in there, but it looks a lot like the other query plans I’ve shown you, with this sort of weird Constant Scan, Concatenation, Top N, Merge Interval situation over here. And then we have…

I should have highlighted the select so it stayed where it was supposed to be. And then over here we have our dynamic seek where SQL Server is doing an implicit conversion on our VARCHAR column. And it’s saying, is it this or is it null, right?

And, of course, we get back twice as many rows as actually exist in the table. So thank you, Martin Smith, for that wonderful demo. High five from wherever you are in the world.

I believe, according to his Stack Exchange profile, he lives in Rugby, England. Hopefully that’s not doxxing. I never want to give away too much information. But thank you, Martin.

You are a wonderful, smart, brilliant human being. And I don’t know. I wish we got to spend more time together. You probably don’t feel the same way. I understand. But I think you’re great.

Anyway, that’s enough for today. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you next week on Tuesday for Office Hours. All right.

Thank you.


Learn T-SQL With Erik: DATEDIFF Considerations

Posted on July 22, 2026 by Erik Darling


Chapters

00:00:00 – Introduction
00:02:15 – T-SQL Basics
00:04:18 – DATEDIFF Considerations
00:06:14 – DATEADD and DATEDIFF
00:07:56 – SQL Server 2022 Improvements
00:09:35 – Time Boundaries and Chasms
00:11:31 – Data Type Considerations
00:12:58 – Conclusion

Full Transcript

Erik Darling here, with Darling Data, trying to enunciate so the robots stop transcribing my name, and stuff all weird. Stupid robots. I probably shouldn’t say that, they’re gonna transcribe that and get mad at me. Robot, if you can hear me, I apologize.

Anyway, in today’s video we are going to continue our learning journey through the T-SQL language. That’s how we talk to our SQL Servers. And we’re going to talk about some considerations around DATEDIFF that I think are interesting.

This is of course just small snippets of the full course material, so if you are interested in going beyond what I’m talking about in these videos, I would encourage you to go down to the links below, where you will find a link to buy this entire course for $100 off the manufacturer suggested retail price. On your way there, all sorts of other great links.

You can hire me for consulting, you can become a supporting member of the channel for anywhere between $4 and $10 a month, if you feel so inclined. You can ask me office hours questions. And of course, another thing that I would encourage you to do.

Actually, I have two things that I would encourage you to do. One of them is of course to like, subscribe, and tell a friend. And the other one is to check out my free SQL Server performance monitoring tool. It’s all the stuff that I think is important to monitor for performance in SQL Server.

And it is all stuff that I think is pretty well thoughtfully laid out. I am working on some cool new features now that you should see bubbling up. And it’s just a real good time and a very fulfilling process building a popular community tool.

I’m over 11, close to 12,000 downloads at this point. So I’m pretty excited about that. And with all that stuff out of the way, let’s continue our summer journey here.

And let’s, of course, you know, what do you call it there? Let’s talk about T-SQL.

That sounds like a good idea to me. All right. We’re going to go to Management Studio. And let me just do a little bit of cleanup over yonder here. So working with dates and times, if you’ve ever had a weird experience, like maybe you haven’t, you know, used the right matching data types, and you’ve had a performance issue, or you’ve hit weird bugs, there’s all sorts of stuff that should rightfully strike fear into the very hearts, minds, and souls of developers when they’re working with these things.

So we’re going to talk about just some not terribly advanced stuff, but stuff that is at least worth making sure that everyone understands when it comes to the DATEDIFF function.

There’s not terribly a lot of advanced things to say about it, but who knows where you’re starting off. So one thing that seems to get on some people’s nerves is deciding on a boundary.

So when you say, I only care about a year of data, you need to think carefully about how you do that. All right.

Do you just say DATEDIFF year minus 1? Do you say DATEDIFF month minus 12? Do you say DATEDIFF day minus 365? These things can all measure slightly different things, depending on where you measure your boundaries.

Not so much for dates, of course, but if you have datetime or datetime2 values, more precise data types than just a date, these things become quite important to think about.

You might even figure out how many seconds are in a year and measure that precisely if you need an absolute year from when your query runs. These are things that not a lot of people consider.
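
As a small illustration of how those choices drift apart, the sketch below assumes the lower boundary is built with DATEADD and uses a made-up leap-day value for @now. Year minus 1 and month minus 12 both land on February 28, 2023, while day minus 365 and the equivalent number of seconds land on March 1, 2023.

```sql
DECLARE
    @now datetime2(7) = '2024-02-29T13:45:30.1234567';

SELECT
    year_minus_1   = DATEADD(YEAR, -1, @now),        /* 2023-02-28 13:45:30.1234567 */
    month_minus_12 = DATEADD(MONTH, -12, @now),      /* 2023-02-28 13:45:30.1234567 */
    day_minus_365  = DATEADD(DAY, -365, @now),       /* 2023-03-01 13:45:30.1234567 */
    seconds_back   = DATEADD(SECOND, -31536000, @now); /* 365 * 86400 seconds, same as 365 days */
```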

This stuff all becomes… I think much more interesting when you’re dealing with somewhat narrower spans of time, like is the last day of data just day minus 1? Is it 24 hours minus 1?

Is it however many minutes or seconds minus those numbers? You have to think about this stuff. Then I think one thing that is somewhat excruciating about DATEDIFF is just, I mean, it’s not like you weren’t warned, but if you look at these two dates, we have 2025-12-31 at 23:59:59 and a bunch of 9s, and then we have 2026-01-01 and a bunch of 0s.

If we measure this very last moment of 2025 and compare it to the very first moment of 2026, there’s a whole lot of DATEDIFF boundaries that will be true and will come back with a 1.

Let’s see. Let’s blow this up a little bit. The DATEDIFF between 2025-12-31 and 2026-01-01, well, it says that’s a year apart, which technically it is.

We crossed a year boundary. It’s also one month apart. We crossed one month boundary. It’s also one day, one hour, one minute, one second, one millisecond, and one microsecond apart because we have crossed all of those boundaries, but the difference between all of them is, of course, 1.

And this is just stuff that you need to sort of be aware of. When you are using DATEDIFF to look at the differences between two things, the results can get a bit surprising to some people.

So you should always think quite carefully about which chasm you are attempting to span when you are date diffing.
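
The boundary-counting behavior can be reproduced with a sketch like this one. The variable names are made up. The two values are the last datetime2(7) moment of 2025 and the first moment of 2026, which are 100 nanoseconds apart.

```sql
DECLARE
    @last_moment  datetime2(7) = '2025-12-31T23:59:59.9999999',
    @first_moment datetime2(7) = '2026-01-01T00:00:00.0000000';

/* DATEDIFF counts boundaries crossed, not elapsed time,
   so every column from years through microseconds returns 1. */
SELECT
    years        = DATEDIFF(YEAR, @last_moment, @first_moment),
    months       = DATEDIFF(MONTH, @last_moment, @first_moment),
    days         = DATEDIFF(DAY, @last_moment, @first_moment),
    hours        = DATEDIFF(HOUR, @last_moment, @first_moment),
    minutes      = DATEDIFF(MINUTE, @last_moment, @first_moment),
    seconds      = DATEDIFF(SECOND, @last_moment, @first_moment),
    milliseconds = DATEDIFF(MILLISECOND, @last_moment, @first_moment),
    microseconds = DATEDIFF(MICROSECOND, @last_moment, @first_moment),
    nanoseconds  = DATEDIFF(NANOSECOND, @last_moment, @first_moment); /* 100: the real gap */
```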

The other thing that you can do that becomes interesting with both DATEADD and DATEDIFF is flattening dates.

Again, not terribly advanced stuff, but there’s been cheat sheets and all sorts of stuff throughout the years published where people will tell you all different ways to figure out how to add some span of time to make sure that everything is working.

But in SQL Server 2022, they put in this function called DATETRUNC. And DATETRUNC, look how nice and compact this is, right? We’re just going to truncate this to the year, which beats the pants off of all the other code that you would have to write in order to do this in the past. You would have to DATEADD the DATEDIFF in years between 1900-01-01 and your date, and then all this code, right?

It’s a lot of stuff. But the good news is DATETRUNC makes life a lot easier, at least for truncating back to a date, right? So both of those pieces of code, this being far more succinct, give us the same thing.
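
Side by side, the two approaches look roughly like this. The variable name and value are illustrative, and DATETRUNC needs SQL Server 2022 or later.

```sql
DECLARE
    @test_datetime datetime2(7) = '2026-06-08T12:34:56.1234567';

SELECT
    /* Old way: count year boundaries since 1900-01-01, then add them back. */
    old_way = DATEADD(YEAR, DATEDIFF(YEAR, '19000101', @test_datetime), '19000101'),
    /* New way: truncate straight to the start of the year. */
    new_way = DATETRUNC(YEAR, @test_datetime);
```

Both columns point at the first of January 2026. The data types differ, though: the old expression comes back as datetime because of the string literal, while DATETRUNC keeps the datetime2(7) of its input.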

We just go back to the first of the year of 2026. But getting to the end of a thing is a bit more challenging, right? So I actually opened up a support issue saying that it would be cool if DATETRUNC accepted a third parameter where you could add whatever span of time you wanted to this.

So like, if this were DATETRUNC(YEAR, @test_datetime, 1), then we would add one year to the test datetime and go forward a year. We don’t have that currently. We may never have that because apparently Fabric is more important than SQL Server.

But DATETRUNC is there and it does work pretty well. You can flatten with DATETRUNC down to all sorts of wonderful things. But if you want to add time onto that, you still need to do a little bit of extra annoying stuff.

So what I’m doing here is adding one year to DATETRUNC. Like if we had that third parameter, we could skip all this stuff and we could just say add one year to it and truncate to that, which would be so much more nice and compact.

But one thing that occurs to me is I don’t know that the people who make SQL Server actually use SQL Server. You got to keep bothering them for things.

You got to keep making feature requests and all sorts of other annoying stuff. But just to be extra precise here, 50 nanoseconds just rounds to 100 nanoseconds, right? 100 nanoseconds is the smallest unit of sort of measure that you can get out of these things.

But if you do this, then we will get to our one year thing, right? So this will get us to the very end of 2026-12-31, right? We could have just added a year or something.

But if you just wanted to get to the very last moment of 2026, that’s how you would do it. If you just wanted to get to the first of 2027, well, of course, that’s a little bit less math, which is always a good thing.
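
The extra arithmetic described here might be written like the sketch below, assuming a datetime2(7) input. The variable name and value are illustrative.

```sql
DECLARE
    @test_datetime datetime2(7) = '2026-06-08T12:34:56.1234567';

SELECT
    /* First moment of next year: 2027-01-01 00:00:00.0000000 */
    first_of_next_year = DATEADD(YEAR, 1, DATETRUNC(YEAR, @test_datetime)),
    /* Back off by the smallest datetime2(7) tick, 100 nanoseconds:
       2026-12-31 23:59:59.9999999 */
    last_moment_of_year = DATEADD(NANOSECOND, -100, DATEADD(YEAR, 1, DATETRUNC(YEAR, @test_datetime)));
```

For range filters, comparing with less-than against first_of_next_year avoids the nanosecond math entirely.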

More math and more problems, right? But you still need to think about the chasms of time that you care about. So three months, 12 weeks, 90 days, like what actual span of time do you care about?

It’s an important question because you might be either including data that you shouldn’t or not including data that you should. And these are important considerations when you are measuring times with these boundaries.

So if I run these queries, or I guess this is one query with just a bunch of things selected in it, we have what is right now, right?

So this will bring us back to the first of June, which is just about a week ago now, at least when I’m recording this. When this gets published is, of course, in the future. We have the right now date, which is how you would do this if you just wanted to convert this from a datetime with all the stuff in it, or I guess that’s a datetime2, technically, to just a date.

You can get to the end of the month with the EOMONTH function. Pretty handy thing there. We don’t have any other functions, right? It says EOMONTH. There’s no EODay, EOYear, right?

Can we just get to the end of the month? Sure. So this is the reason why I think that DATETRUNC should accept a third parameter: EOMONTH takes an optional offset like that.

This is an example of adding three months to EOMONTH, and this is an example of subtracting three months from EOMONTH, and this just gets us some slightly different things. So the end of three months into the future would be September 30th, and the end of three months ago would be the end of March, right?

So EOMONTH has this neat second input that you can use, but DATETRUNC has nothing like it. So it would be nice if DATETRUNC did, right? That’d be good stuff.
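
For reference, the EOMONTH offset works like this. The variable name and the June 8 date are made up to match the recording timeframe mentioned in the transcript.

```sql
DECLARE
    @right_now date = '2026-06-08';

SELECT
    end_of_month       = EOMONTH(@right_now),     /* 2026-06-30 */
    three_months_ahead = EOMONTH(@right_now, 3),  /* 2026-09-30 */
    three_months_back  = EOMONTH(@right_now, -3); /* 2026-03-31 */
```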

DATETRUNC returns a dynamic data type. So you do have to be careful with how you’re doing this, because if you put in a system function like SYSDATETIME that returns a datetime2 and you compare that to a datetime column, you may find yourself in rather awkward situations, both with data correctness, of course, and with query performance. The little optimizer rules like GetRangeThroughConvert and GetRangeWithMismatchedTypes often result in very awkward query plans where you have these constant scans and these merge things and then an awful little nested loop into your table billions and billions of times.

It’s not fun. So we have to be very precise with data types. But we can use this sp_describe_first_result_set procedure, and what we’ll see is that in the first one, where I’m converting this string to a datetime2(7), and this one, where I’m converting it to a datetime, SQL Server will respect whatever you convert it to.

So that first one comes back as a datetime2 and the second one comes back as a datetime. So just be very, very careful because that is a dynamic data type. You can control it, because it is dynamic based on what goes into it.
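
A sketch of that check is below. The string literals are illustrative, and DATETRUNC needs SQL Server 2022 or later. The system_type_name column in each result shows the type that DATETRUNC returns.

```sql
/* Input converted to datetime2(7): system_type_name comes back as datetime2(7). */
EXECUTE sys.sp_describe_first_result_set
    @tsql = N'SELECT d = DATETRUNC(YEAR, CONVERT(datetime2(7), ''2026-06-08 12:34:56.1234567'', 121));';

/* Input converted to datetime: system_type_name comes back as datetime. */
EXECUTE sys.sp_describe_first_result_set
    @tsql = N'SELECT d = DATETRUNC(YEAR, CONVERT(datetime, ''2026-06-08 12:34:56.123'', 121));';
```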

Anyway, not a lot of fireworks in this one, but some good stuff to think about if you haven’t spent enough time thinking about working with dates. We’re going to look at some more interesting stuff tomorrow.

This is just, you know, some things that you have to say to clear the air first. All right. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I hope you will think quite carefully about what chasms of time you are hoping to span when you start date diffing and date adding. And I will see you over there. I will see you over in tomorrow’s video.

All right. Thank you. Goodbye.


SQL Server Performance Office Hours Episode 69

Posted on July 21, 2026 (updated June 30, 2026) by Erik Darling


To ask your questions, head over here.

Chapters

00:00:00 – Introduction
00:02:15 – Office Hours Promo
00:04:05 – Personal Interests
00:06:06 – TV Shows
00:07:39 – EXISTS vs JOIN
00:08:45 – Reducing Compile Time

Full Transcript

Erik Darling here, with Darling Data, for another exciting episode of Office Hours, and that is where I wake up early in the morning and I answer five user-submitted questions, and if you want to ask your own question, you can find a link to do that down in the video description, that’s where that lives. There’s a link there with Office Hours, right in the words, so it’s very easy to figure out where to go to ask a question. On your way to find that link, it’s not surprising, you will find all sorts of other helpful links, where you can hire me for consulting, you can purchase my training videos, which are arguably the best SQL Server training on the internet.

You can support this YouTube channel with as few as $4 a month. I believe you could also do up to $10 a month, if you’re feeling particularly generous. Perhaps you’ve gotten lucky with the lottery, a relative has died, something along those lines, and you’re just feeling like spreading the joy.

And of course, if you don’t feel like doing any of those things, or you are not so monetarily inclined towards me, then you can, of course, just do the usual liking, subscribing, and telling of friends. If you would like a free SQL Server performance monitoring tool, you can go get it and start monitoring the performance of your SQL Servers in a way that is just as good as what all the paid tools do.

I’m adding some very exciting stuff to that right now. It’ll be out in the next release. I’m hoping that’ll be this week, but we have to tidy some things up first and make sure that everything is working correctly before we go and do that.

Anyway, I’m skipping the speaking promo slide because at this point I have nothing for several months. If some exciting opportunity arises, then by gosh, I will go ahead and do that. But for now, we’re just going to admire our strange databases going crazy in a field with allergies.

Anyway, that’s enough of that. We need to go to the Excel file and talk about this: does the option for in-memory tempdb make sorts faster? I’ve never seen that claim before.

Well, sorts don’t always use tempdb. If they spill to tempdb, I suppose it won’t make the spill any faster. It used to be that on older versions of SQL Server you could run into tempdb contention from lots of queries running at the same time and spilling, so at least that goes away. But no, that doesn’t do anything. I forgot to highlight that this question here was the one I was just answering.

Let’s see here. Someone’s being funny. Ah, someone from a foreign land is being funny. I see that “u” in colour. All right: select count from Erik’s closet (at least you spelled my name right) where type equals t-shirt and colour equals black and logo equals Adidas.

Well, I do not have any t-shirts that are not black, and I do not have any t-shirts that do not say Adidas on them. And I keep all my t-shirts in a drawer, not in the closet. If I hung these things up, it would look insane. I send my laundry out, get it folded in perfect squares, and I put it in a drawer and everything works out pretty well.

I believe I have somewhere around 20 of these that I wear at various points. Wear them everywhere: go to the gym, wear them all day when I work. Don’t have to think about anything. It’s wonderful.

Erik, tell us about you. Where do you live? I live in New York City.

What do you like to do in your spare time? Well, I have some probably rather generic interests. I do like going to restaurants, and I like traveling, going to museums, and hanging out with my family. Apart from that, the only real hobby that I have is barbell training, which I’m as dedicated to as databases. At least I hope I am. So that’s about me in a nutshell. I’m a rather simple man. Heard that’s the way to be.

All right. Oh wait, there’s another one. What TV shows do you like? Oh, well, that’s where things get interesting, isn’t it? I spend a lot of time watching nostalgic television from my youth, like Cheers, like The X-Files. I recently re-watched all of The Nanny. That was a very good time. Fran Drescher in the 90s, that’s a tough one to top, man.

I don’t know. I think 30 Rock is probably about the funniest TV show I’ve ever seen. Community had a pretty good run. Stuff like that is up my alley. I like weird sci-fi shows. I’ve been watching Widow’s Bay lately. I forget where that’s streaming, but that’s been a nice treat. It’s been a pretty good show so far. I attempted to watch the last season of Euphoria, but it was some of the worst television that I’ve ever seen in my life, so I just read the spoilers.

Anyway, let’s see here. Is it bad to nest EXISTS statements within each other rather than having one big select with lots of inner joins? Not on the face of it, no. I’m okay with nesting EXISTS. I’ve never found anything that is pathologically wrong with doing that.

But of course, running the query is the real tale of the tape here. If you are able to nest your EXISTS and get a good, fast query with a reasonable query plan, then keep on doing it. If you need to switch things around, I understand. I’ve had to do plenty of switching around in my life. Wouldn’t be the first time, wouldn’t be the last time.

The thing with EXISTS is that they are, I think, generally a little bit more sensitive to indexing, especially given the row goals that often get introduced with them. Of course, the difference between a join and EXISTS is that EXISTS only cares if a row is there or not, while joins find every match in a one-to-many or many-to-many relationship. So the row goal that often gets introduced there can certainly inflict some weird plan stuff.

The other thing that I find is that if the stuff that you’re checking the existence of is rare data, data that is not regularly occurring in your database, you can get some pretty choppy execution plans from all that. So of course, look at your execution plans. You have my blessing to try these things. You just have to do the performance testing yourself, unless you choose to hire me, in which case I can do that for you.

Oh boy. I have a large view, about 900 lines. It’s pretty large, right? Charles Barkley said that’s like one of those San Antonio ladies. With about 60 joins that take six seconds to compile. You are lucky it only takes six seconds to compile. That’s like one second for every 10 joins. The business logic makes it near impossible to break this into smaller chunks. How can I reduce the compile time?

Well, I disagree with the business logic making it near impossible to break this into smaller chunks. Even in the case where you have junction table joins, it’s quite possible to break these things down.

The first thing I’ll say is that if this is what you’re dealing with, my first instinct is to say that you have chosen the incorrect vehicle for this logic. That’s a lot of action for one query unless things are really carefully done.

If they’re all inner joins, you could create an indexed view. If they’re not all inner joins, you could take whatever inner joins are in there and make an indexed view that would at least reduce some of it. You’re really going to want to lean on the NOEXPAND hint there.

You could force all the queries that hit it to run with OPTION (FORCE ORDER) so that SQL Server does not think about the ordering of 60 joins, but you would have to very carefully write those joins in the order that gives SQL Server the best execution plan for them. You would have to look at a fast execution plan for this thing, figure out the exact order that SQL Server joins the tables together in, and then write your joins in that order.

You could try that weird bushy join syntax, but I don’t know that that’s really going to get you much of anything. It’s something to think about, but for me, even that’s a dicey proposition.

The first thing that I would do, and this is a good use case for the AIs, is I would probably feed that query to them and
and ask them uh how you could if you could break it up to some degree um i do believe that the appropriate vehicle for something like this often is a stored procedure uh where you can dump little bits of things into temp tables and then carry on from there but having never seen your your view i don’t know exactly i don’t know precisely what is possible uh under under the local conditions that you have in your database anyway that is five questions you’re welcome thank you for watching i hope you enjoyed yourselves i hope you learned something and i will see you in tomorrow’s video where we are going to continue uh the learn t-sql with eric material from my course which you can buy from the video description for 100 bucks off uh we’re gonna start talking about dates and date math and time zones and stuff so got a lot to look forward to in there don’t we sure do all right thank you for watching
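
For the nested EXISTS question in this transcript, here is a minimal sketch of the two query shapes being compared. The tables and columns (dbo.Users, dbo.Posts, dbo.Comments) are assumed from the Stack Overflow sample database and are not the asker’s schema. Which one is faster depends on your data and indexes, so compare the actual execution plans.

```sql
/* Nested EXISTS: only asks whether at least one matching row is there. */
SELECT
    u.Id,
    u.DisplayName
FROM dbo.Users AS u
WHERE EXISTS
(
    SELECT
        1
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
    AND   EXISTS
    (
        SELECT
            1
        FROM dbo.Comments AS c
        WHERE c.PostId = p.Id
    )
);

/* Join version: finds every match, so it needs DISTINCT
   to get back to one row per user. */
SELECT DISTINCT
    u.Id,
    u.DisplayName
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
  ON c.PostId = p.Id;
```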

Learn T-SQL With Erik: Aligning Queries and Indexes Part 6

Posted on July 16, 2026 by Erik Darling

Chapters

00:00:00 – Introduction
00:00:22 – Funny Story
00:01:11 – Missing Index Requests
00:03:13 – Query Plan Overview
00:04:02 – Impact Number Explanation
00:05:20 – Getting an Execution Plan
00:06:20 – Understanding Selectivity
00:07:37 – Ordinal Position Importance
00:08:14 – Example Query Analysis
00:11:00 – Execution Plan Scans
00:12:01 – Reversing Indexes
00:13:01 – Taking Missing Index Requests
00:13:57 – Conclusion

Full Transcript

Erik Darling here with Darling Data, the one and only, the original monitoring tool mogul, just kidding, that doesn’t make any sense. In today’s video, we are going to finish up talking about the query align and design thing series of videos by talking about missing index requests. Before we get into anything else, this is where I get to share a funny story with you.

In one of my consulting engagements many years ago, I was talking to a developer about indexes. And they said, Erik, I’m going to be honest with you. And I said, there’s a first time for everything, isn’t there?

And they said, I don’t really know anything about indexing. I just follow whatever the missing index requests say. And I was like, back then I thought, oh, so you just follow whatever the computer says, and now you don’t know anything about this important thing. And I was like, wow, that’s just like today.

Anyway, it’s a good thing you all got AI ready with me, so now you don’t have that problem. But in today’s video, we’re going to talk about how missing index requests will often leave much to be desired. We’re not going to do all of the content because this is, of course, a video.

It’s a small snippet from the larger Learn T-SQL with Erik course, which if you would like to buy that course, if you think this material might be worth your time and money, aside from these free passing moments on YouTube, you can purchase the course down in the video description.

There’s a coupon code on there for $100 off. I don’t know. Saving $100 is kind of cool. But there are also other links. You can hire me for consulting.

Perhaps you would like me to look at your missing index requests and make fun of them. Or give you better missing index requests, which I can do. I am capable of that. But instead of offloading that to the robots, you would be offloading that to me.

But at least I’m happy to teach you about these things. You can also buy my other training, become a supporting member of the channel for as low as $4 a month. If you just like this material enough to say thanks here and there for $4, you can do that.

You can also ask me office hours questions. And of course, as always, please do like, subscribe, tell a friend. Tell a family member.

I don’t know. If you think this content just absolutely sucks, tell your worst enemy and drag them down into the abyss with you. If you would like a free SQL Server performance monitoring tool, I have one.

Offer one. It is at my GitHub repo. The link is down again in the video description. Absolutely free, open source.

All of the performance monitoring metrics that you would ever care to have in front of you, and a way for your robot friends to look at them, because again, we are offloading everything to the machine. So let the robots figure it out.

And then you, I don’t know, I guess go do stuff. I mean, when you’re watching this, I will be at Data Saturday Croatia like tomorrow basically. So that’s fun.

I think anyway. But I’m not there now. I’m at home now recording this ahead of time so that the content flows while I’m traveling. But after that, I will be home for a bit and then I will be in Seattle in November.

So yeah, again, up in the air on how we’re going to fill that time. We might need to think of some things to do. But now it is time to continue losing our summer minds in this summer heat and talk about missing indexes.

So on missing indexes, generally, if you’re looking just at a query plan, you only see one up at the top. But there might be many. The impact number that you see is an estimate of how much it will reduce another estimate.

And that is a cost. So the impact number is really an estimate of an estimate. It’s a very, very fuzzy number.
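
The same impact estimate is also exposed by the missing index DMVs, alongside how often each request was registered. The video works from the query plan, so using the DMVs here is an addition; this is a minimal sketch, and the column choice is only one reasonable way to look at it.

```sql
/* Missing index requests for the current database.
   avg_user_impact is an estimated percent reduction of an estimated cost. */
SELECT
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    gs.avg_user_impact,
    gs.avg_total_user_cost,
    gs.user_seeks,
    gs.user_scans
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
  ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY
    gs.avg_user_impact DESC;
```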

Uses is more based on the plan cache than anything else, like actual queries using something. And the thing to keep in mind here is that they are quite opportunistic and they are not very thoughtful.

Missing index requests occur as part of query optimization. And if you let the optimizer spend a long time thinking about indexes during query optimization, you would probably be unhappy at how long it takes for the optimizer to produce a query plan for you. So the missing index requests are a bit like a dating game where the optimizer is like, well, I really want an index that’s six feet, makes six figures, and has a six pack of Miller Lite, Natty Ice, Steel Reserve.

But I don’t know. So it just figures out, well, I have this index, but I would prefer this index.

So it’s almost like a little bit of infidelity there, if you ask me. So you just really don’t want it to take a long time to do that. And if you see a missing index request, the best way to validate it is to run the query that is requesting it, get an actual execution plan, and then look at whether the slow part of the query plan lines up with the index that SQL Server is asking for.

In a lot of examples that I go through in my Stack Overflow database, SQL Server will ask for a missing index on like a scan of the users table, which is only a couple million rows.

And you would go from like, let’s say, 100 milliseconds to like 20 milliseconds. It’s not a big meaningful difference. You want to make sure that that index takes some meaningful chunk of time out of your query by being there.

Because indexes, they need to pay for themselves a little bit. Because once you create an index, now that index is costing you in transaction log space, in space in the buffer pool, in writes, in locking.

So there’s all sorts of costs to having that index, and you want to make sure that that index pays for itself by making queries faster. But there are some issues with missing index requests.

And one of them is that they don’t really understand selectivity at all. What you get from them is key columns based on the WHERE clause. And to some extent, like the other things, but really the key of the missing index request is just like the WHERE clause stuff that you see.

And then like anything else just goes in the includes as a mishmash. It doesn’t matter if you’re joining, sorting by it, grouping by it. It does not, SQL Server does not pay much attention to that.

Equality predicates will always go first in the missing index request, and inequality predicates will always go second. Anything else ends up in the includes. The order of columns within the equality and inequality predicate chunks, or groups, some people might call them, has nothing to do with the current predicates at all.

It just orders them by each column’s ordinal position in the table. There is a great question by a fellow named Brian Reebok, who I haven’t spoken to in a while, but I hope he’s doing well, over here.

If you ever feel like reading it, just click that link on your screen with your forehead. It’s really hard. Just click it with your nose or something, right?

You can do it. But by ordinal position, I mean this. When you create a table, the order that you list off your columns in is the ordinal position in the table.

So like in the Posts table, these are the ordinal positions. It would make a lot of sense if I just said c.column_id. I typed it right, there we go.

If I did this, then you would see this is the ordinal position. So SQL Server will order the columns in your equality and inequality predicates by this, not by how selective anything is.
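
A minimal sketch of the kind of query being described: it lists a table’s columns by ordinal position using sys.columns. The table name dbo.Posts is assumed from the Stack Overflow sample database.

```sql
/* column_id is the ordinal position of each column in the table. */
SELECT
    c.name,
    c.column_id
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Posts')
ORDER BY
    c.column_id;
```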

So sometimes that might work out in your favor. Because if we run this, answer count of 518, one row has that. Post type ID 1 has 6 million rows.

So if we look at the execution plan, there is indeed a missing index request. And answer count comes first, qualifies for one row. And if we created this index, we would be able to seek right to that answer count and then find the post type ID associated with it.

So that would be cool. That would be a very efficient seek. Seeking through 6 million rows, maybe a little bit less so. You don’t know, right?

You never can tell. Other times, that might not pay off. For example, if we asked for a parent ID of 0, which has 6 million rows, and a post type ID of 8, which has 8 rows, SQL Server will say, Hey, give me a missing index on parent ID and then post type ID.

Which is, again, not what we would want, right? Because if we were designing indexes, we would most likely want to have the more selective predicate out in front.
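
The two equality examples can be sketched like this. The exact query text from the video is not in the transcript, so the select lists and index names below are hypothetical; the column names come from the Stack Overflow Posts table, where AnswerCount and ParentId are both listed before PostTypeId.

```sql
/* Works out by luck: AnswerCount comes first by ordinal position
   and is also the selective predicate. */
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.AnswerCount = 518
AND   p.PostTypeId = 1;

/* Does not work out: the request would lead with ParentId (about 6 million rows)
   only because it is listed before PostTypeId (8 rows) in the table. */
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.ParentId = 0
AND   p.PostTypeId = 8;

/* What we would more likely design by hand for the second query:
   the more selective column leads. */
CREATE INDEX post_type_parent /* hypothetical name */
ON dbo.Posts
    (PostTypeId, ParentId);
```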

Now, this missing index request, I believe, because we have, again, we have two inequality predicates here, one on creation date and one on score.

So these are going to be grouped together. Before, we had equality predicates, so those were grouped together. But now, with these inequality predicates, this is obviously nonsense. And the reason why I say that is because not a whole lot of posts have a score greater than or equal to 25,000.

But every single post in the Posts table is between 2007-01-01 and 2013-12-31. At least in my copy of the Stack Overflow database, it is where the world ended on January 1st of 2014.

There are no posts beyond that date. So every single post in this table is between those dates. Almost no posts have a score greater than or equal to 25,000, right?

So if we look at the execution plan, we again have a missing index request. But SQL Server has said, No, give it to me on creation date and then score. Well, okay.

Let’s do what SQL Server asks. Let’s add this index on creation date and then score. So that’s what we’re doing. Here we’re also going to include owner user ID so we don’t have to deal with anything else.

I can’t remember if owner user ID was in the includes up there. I didn’t pay that much attention. But with that index in place, if we run these three queries, what we will see are three execution plans.

And none of these take terribly long. But notice that we scan through all of these. They are parallel plans.

And each one, let’s just say, they take close enough to 500 milliseconds across the board. Is this a particularly good strategy? I don’t know. Because we read 17 million rows from the table.

Because our leading key of the index, our initial seek predicate, was everything. And then we have this residual predicate come back to me, my love, where score is greater than or equal to either 5,000 or 10,000 or 25,000.

And all of those numbers are more selective than the entire creation date range in the table. But again, this is stuff we don’t want the optimizer thinking about at run time.

This is not something we’re like, optimizer, but this is a huge date range. We don’t want it thinking about those things. We want it thinking, come up with a good query plan. All right.
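
As a rough sketch of what this part of the demo looks like: the index follows the missing index request (creation date, then score, including owner user ID), and the query is a guess at the shape of the three demo queries. The index name, the select list, and the date literals are assumptions; the score thresholds are the ones mentioned above.

```sql
/* The index as SQL Server asked for it. */
CREATE INDEX creation_date_score /* hypothetical name */
ON dbo.Posts
    (CreationDate, Score)
INCLUDE
    (OwnerUserId);

/* The date range covers every row, so the seek on CreationDate reads everything
   and Score is only checked as a residual predicate. */
SELECT
    p.OwnerUserId
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20070101'
AND   p.CreationDate <= '20131231'
AND   p.Score >= 25000; /* also try 5000 and 10000 */
```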

So, but that’s where we come in, right? Where we’re going to create, we’re going to reverse that index and we’re going to make one on score and then creation date, right? Because we are going to evaluate our data and our query.

And we’re going to say, I think our predicate on score is way more selective. And we’re going to be right, aren’t we? Are we ever wrong?

No. If we were ever wrong, it would be a terrible demo. Or I don’t know, maybe it’d be a very amusing demo for you. But now when we run these three queries, which are identical to the previous three queries that we ran, we will get much tidier, much more efficient query plans that seek and seek and seek.

And these all take zero milliseconds. Why? Because we seeked to a smaller range of data first, and then we applied our range predicate, our larger range predicate second.
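
A minimal sketch of the reversed index, with score leading and creation date second. The index name, select list, and date literals are assumptions made for illustration.

```sql
/* Score leads, so the seek lands on the small range of high-scoring rows first. */
CREATE INDEX score_creation_date /* hypothetical name */
ON dbo.Posts
    (Score, CreationDate)
INCLUDE
    (OwnerUserId);

/* Same query shape as before; only the index changed. */
SELECT
    p.OwnerUserId
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20070101'
AND   p.CreationDate <= '20131231'
AND   p.Score >= 25000;
```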

So when it comes to missing index requests, there are a lot of things for you to watch out for and be careful for. If anything, don’t take them literally. There are many other, there are many things where, you know, you should maybe not take them literally, but maybe take them as sort of like an instruction or something where you might say, hey, SQL Server is throwing a missing index request.

Perhaps there is something I should pay attention to here. Perhaps it is trying to hint me towards something. What’s that, Lassie? Timmy’s in the well? You go over there and like, you know, I don’t know, maybe Timmy’s in the well.

Maybe he’s, maybe Lassie’s just messing with you because Lassie knows every time she, she barks, you go and look in the well and she’s like, it’s funny to me. Dogs, you know, what are you going to do? Anyway, lots of things to take with a grain of salt with missing index requests.

Again, you may use them as a hint or an indicator that something about your query, the indexing for your query could be improved, but there are many things that you need to be careful of and many things that you should know about indexing before you go and create those missing index requests.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I will see you next Tuesday. I should stop saying that for office hours.

Hopefully I make it home alive from Croatia and all that stuff, you know, air travels. It’s been a little weird lately. But anyway, thank you for watching.

Learn T-SQL With Erik: Aligning Queries and Indexes Part 5

Posted on July 15, 2026 by Erik Darling

Chapters

00:00:00 – Introduction
00:02:15 – Actual Execution Plans
00:04:56 – Parallel Plan Analysis
00:07:09 – Memory Grant Reduction
00:08:41 – Query Rewriting Benefits
00:13:09 – Logical Units in Queries
00:14:20 – Parallelism Trade-offs

Full Transcript

Erik Darling here, with Darling Data, the best SQL Server consultancy outside of New Zealand, lest we forget, lest we forget the reasonable rates that Darling Data offers in exchange for making your SQL Server faster. Fair trade, believe in that. I think it’s a fair trade. Thankfully a lot of you do too, so that’s nice. But today we are going to get back to learning T-SQL, aligning queries and indexes, because these are good things to do, right?

Make queries faster, make your queries and your indexes line up better so that things happen for you that are good and fast and performant. You’ll get there someday. Anyway, if you want to purchase the full training, remember these are just little dribs and drabs and tidbits of the brilliance of the full course. There’s a link down below where you can purchase the course for a hundred bucks off.

It’s a nice coupon, right? Because I care about you and your well-being, your mental sanity and whatnot. You might have to learn this stuff so that you stop getting outwitted by the robots, right? You get smart enough to challenge the robots when they’re full of malarkey.

You can also hire me for consulting. If you’re interested, I can help you out. If you’re just not down with the robots and you just don’t have time to learn all this stuff on your own, I have crammed my head full of this stuff for like 20 years now.

So I’ve got a lot of things in there and I can do a lot of stuff pretty quickly, right? So that’s cool too. If you’d like to support this channel with money, like four bucks a month, you can do that.

Again, link down below, somewhere down there. I forget. It’s like a subscriber member button thing. If you want to ask me office hours questions.

Like I answered in yesterday’s video. At least I think it was yesterday. It’s hard to tell these days. Everything blurs and blends together. Or if you don’t want to do any of that stuff, maybe you could find it in your heart, your withered, sad, grinch of a heart, to like, subscribe, and tell a friend.

I wasn’t wishing like coronary failure on you. I’m just, like, making some jokes. If you want free, this is how much I want you to like me.

This is how desperate I am for friendship and companionship. If you want the free SQL Server monitoring, you can get it from your best friend, Erik Darling. You go to the GitHub.

The link is down there. You download it. You point it at SQL Server. It tells you everything you could possibly want to know about the performance of your SQL Server. It is fantastic.

It is a great time. And again, it’s free, right? You don’t need to talk to me. You don’t need to email me. It’s just there.

I will be, well, this week, yeah, in like a few days, magic, at Data Saturday Croatia, where I will be doing my advanced T-SQL pre-con. If they haven’t stopped selling tickets, you should buy a ticket, if not to my pre-con, then maybe to the Data Saturday event.

Maybe we could hang out there, and maybe we’ll be best friends in real life, too. Who knows? We don’t just have to be best friends on YouTube. And then, of course, PASS Data Summit in Seattle, Washington, November 9th to 11th, which seems an impossibly far way off, but so far as I can tell, it keeps getting closer.

So that’s cool. But now it is June. We will suffer.

We will suffer summer. That’s also a good band name. Maybe not now. The reason why I’m not a professional band namer, I guess. Anyway, we have this index currently.

It is on the Posts table, on the columns owner user ID and then creation date. And we’re going to write a query. First, we’re going to turn on actual execution plans because these are important things to have on in life.

When we’re attempting to talk about SQL Server performance, if anyone ever tries to talk to you about performance without turning on query plans, I would be very suspicious of them.

Right? That is a strange thing to do. But this is probably a reasonable way for most people to write a query. When we look at the execution plan, we’ll notice a few things.

One, we got a parallel plan. This thing went parallel, ran at DOP 8. It spilled a wee little bit on this sort.

We went from 72 milliseconds up to 119 milliseconds. 8, 20, 30, 40-ish milliseconds of time in the sort plus the spill. Not the end of the world.

This finishes in 120 milliseconds total. Is this good? Is this bad? Is this ugly? Well, I mean, to me, it depends a bit on a few things, not PowerShell. It does not depend on that.

But it depends on a few things. Right? We get a 406 megabyte memory grant, which in and of itself is not terrible. It’s not a big deal.

But we asked for 400 megs and we still spilled. Maybe I’m not in love with that. And this goes parallel to DOP 8. So in order for this query that returns 1,000 rows, SQL Server’s best idea for executing this was a parallel DOP 8 query with 400 megs of memory behind it.

I have two problems with that. One, it’s not that fast for 1,000 rows. And two, why do we need memory for this at all?

We have an index that has all the data that we care about in order. It’s on owner user ID and creation date descending. And we are looking for owner user IDs.

We are seeking the three owner user IDs and ordering by creation date descending. Why in God’s name does SQL Server need to sort this data? It is sorted for us. Crikey.
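
For reference, a sketch of the index and the original query shape being described. The index name, the select list, and the three user IDs are placeholders, since the transcript does not give them.

```sql
/* The index described above: OwnerUserId, then CreationDate descending. */
CREATE INDEX owner_creation /* hypothetical name */
ON dbo.Posts
    (OwnerUserId, CreationDate DESC);

/* The "reasonable" first attempt: an IN list plus an ORDER BY. */
SELECT TOP (1000)
    p.OwnerUserId,
    p.CreationDate
FROM dbo.Posts AS p
WHERE p.OwnerUserId IN (1001, 1002, 1003) /* placeholder IDs */
ORDER BY
    p.CreationDate DESC;
```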

One kind of fancy way to rewrite this query is rather than an IN clause, we could put those values that we care about into a VALUES clause.

We could value our VALUES clause. And then we could take those values and we could cross-apply out and we could correlate to them. Remember our values are aliased up here as U and the column that we’re naming these values is ID.

So U.ID is really just those three values. And if we run this, we will get a query plan that is no longer parallel.

Right? We can see all of the visual indicators of parallelism have disappeared from this plan. And it has gotten faster. We went from 120 milliseconds down to 33 milliseconds.

But notice that we still, SQL Server is still like, ah, now we got to sort that data. That seems silly to me. A bit. And that sort takes, I don’t know, whatever 33 minus 9 is.

Right? Oh. Now we did okay here. We improved things. But, SQL Server should have better ideas for these things.
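
A minimal sketch of the VALUES and CROSS APPLY rewrite. The alias u and column name Id follow the description above; the select list and the three IDs are placeholders.

```sql
SELECT TOP (1000)
    p.OwnerUserId,
    p.CreationDate
FROM
(
    VALUES
        (1001),
        (1002),
        (1003) /* placeholder IDs */
) AS u (Id)
CROSS APPLY
(
    /* Correlate each value to the index on OwnerUserId. */
    SELECT
        p.OwnerUserId,
        p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
) AS p
ORDER BY
    p.CreationDate DESC;
```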

One way that we can rewrite this query, to again, give SQL Server a different way of thinking about this, would be to do this.

And we could say, where the owner user ID is in select union, select union, select. Right?
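
A sketch of that form of the query. The transcript only says “union,” so the use of UNION ALL here is an assumption, as are the select list and the placeholder IDs.

```sql
SELECT TOP (1000)
    p.OwnerUserId,
    p.CreationDate
FROM dbo.Posts AS p
WHERE p.OwnerUserId IN
(
    /* Each value arrives as its own SELECT instead of a literal list. */
    SELECT 1001
    UNION ALL
    SELECT 1002
    UNION ALL
    SELECT 1003
)
ORDER BY
    p.CreationDate DESC;
```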

And if we run this query, remember that, we went from 120 milliseconds to 33 milliseconds. We still didn’t get rid of that sort. And before I run this maybe, we could just look at how much memory this thing takes.

1,024 KB of memory. So we went down from 400 megs to 1,024 KB. That’s pretty good. Right? And so, not only do we have like, you know, this query got, I don’t know, like 60 something percent faster.

But we still have this sort, even though the memory grant went way down. Like granted, the memory grant, way down. So good job there. But, if you write the query like this, we get a very interesting execution plan.

If you’re sitting at home thinking, gosh, that returned 1,000 rows very quickly, you would be right. So we went from 33 milliseconds down to 3 milliseconds. And we get a very interesting execution plan.

Where for each of those union values, we now have one, two, three constant scans. And they each produce a singular seek through our indexes.

And we have this merge join. So, normally, I would, I crap all over merge joins. But remember, from the Office Hours video, I actually, there was, one of my, one of the things I said, is that sometimes SQL Server will choose merge in order to satisfy the ordering requirements of downstream operations.

And it has, my goodness. It’s almost like I know what I’m talking about sometimes. Right? So we don’t have a single sort in this query plan.

This query plan is still single threaded. We have no visual indicators that parallelism has occurred here. If we look in the tooltip for the select, there is no memory grant anymore.

The memory grant is gone. And our degree of parallelism is one. So our query went from a 120 millisecond, 400 meg, DOP 8 parallel plan to a single threaded plan with a sort still at 33 milliseconds.

The sort, the memory grant went way down. So like, we’re still cool there, right? It spilled a little bit, but you know, it was still, still pretty decent to a query plan that now has no sorts, no memory grant, still run single threaded, and is now down to three milliseconds.

That is a pretty good deal. This is an admittedly funny rewrite. I get it. I maybe, you know, wouldn’t suggest you rewrite all of your in clauses to do this, but you know, sometimes it’s, you know, jibba jab there.

One thing that I think is missing from the optimizer is just the ability for it to spot these sort of patterns and apply these sort of patterns to our query plans. Because really, it would be identical to writing this query, right?

If we said select the top 1000, U dot star, and we could, I mean, the P dot stars in here don’t mean anything because we’re selecting U dot star out here, right?

So like we could, we could do this, right? And say, look at that. SQL Server comes up with the same basic execution plan, except we don’t have constant scans in these, right?

The plan is simplified a little bit. It still takes three milliseconds. There’s still not a single sort to be seen. And we’re still at no memory grant, DOP 1. But this is just one of the things that the optimizer is sort of not good at spotting and working out.
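
One plausible shape for the query being described is a derived table aliased as u that unions three single-user seeks, with the TOP and ORDER BY applied outside. This is a guess from the transcript, not the author’s exact text; the select list and IDs are placeholders.

```sql
SELECT TOP (1000)
    u.*
FROM
(
    /* One seek per user, each of which can come back in index order. */
    SELECT p.OwnerUserId, p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = 1001

    UNION ALL

    SELECT p.OwnerUserId, p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = 1002

    UNION ALL

    SELECT p.OwnerUserId, p.CreationDate
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = 1003
) AS u
ORDER BY
    u.CreationDate DESC;
```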

It goes for like, like almost like the lowest common denominator plan sometimes. And that’s, it’s a little aggravating. We could also rewrite this query.

This is admittedly much, much more verbose, but it gives us the same basic execution plan that we’ve seen in the other rewrites that have given us this plan shape, with the merge join concatenation, the three separate index seeks, no sort, no parallelism, and again, finishing in three milliseconds.

So a lot of the query performance tuning that you might have to do in your life is recognizing patterns that the optimizer is perhaps bad at dealing with, and writing your queries in ways that make the optimizer not really have a choice in how it’s going to deal with those things, because you have taken control and rewritten your queries in ways that the optimizer cannot ignore.

A lot of that stuff is going to come down to breaking your query up and, and, and sometimes, you know, uh, like obviously temp tables are a way to break a query up, but sometimes breaking your query up into, uh, even like logical units within the query itself is a pretty smart thing to do.

Coming back to the way that this query was first written, right? If we run this, we look at the query plan, we look at the seek, right?

We have three, three separate seek keys in here. This is a multi-seek or a dynamic seek, and this is something that I’ve talked about before, but it essentially takes away SQL Server’s ability to preserve the order of things, uh, as it, as it goes through because we’re, we’re, we’re not just doing one seek into the index and finding one range of data.

We’re doing three separate seeks into it and we don’t, it doesn’t maintain the order when, when we do that, which is why we needed to sort out here. So even getting rid of the, the spill on this sort, we are still at 107 milliseconds.

We still have the parallel plan. Now, I don’t want to crap on parallel plans because I use parallelism a lot. Um, all the time to make queries faster. It doesn’t make every query faster though.

This is unfortunate. Sometimes it, it does, it does have overhead and, uh, that overhead does, uh, you know, the, the, the, the assembling and reassembling of rows across exchanges and buffers and, you know, sending worker threads out and having them all work and do stuff.

It does have overhead, but that overhead amortizes itself in larger queries, where parallelism is a more practical element.

And, and it makes the query faster because you’re spreading lots and lots of rows out across parallel threads. In a lot of cases, when SQL Server chooses a parallel plan, it’s just not getting so many rows that spreading them out across multiple CPUs is a really good strategy.

So we end up with times when a query running at MAXDOP 1 can be a lot faster for the small amount of rows than a parallel query, right?
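
If you want to test that on your own query, one way is a query-level hint. This is only a sketch for comparing timings; the query text and IDs are placeholders, and the rewrites discussed in this video got to a serial plan without any hint.

```sql
SELECT TOP (1000)
    p.OwnerUserId,
    p.CreationDate
FROM dbo.Posts AS p
WHERE p.OwnerUserId IN (1001, 1002, 1003) /* placeholder IDs */
ORDER BY
    p.CreationDate DESC
OPTION (MAXDOP 1); /* force a serial plan for this one query */
```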

There are times when that absolutely happens like the, the one I just showed you, right? But this thing really does, the way that this query is originally written, the two problems with it are one, the in clause doesn’t allow us to, with the multi-seq does not allow us to preserve the order of the, the nonclustered index that we have, whereas other query forms do.

Anyway, a few different interesting ways to rewrite a query with an IN clause. You thought it was just so simple. The optimizer is not good at sort of unraveling these and reasoning its way to better query plans for them, and that, I think, you have just learned how to do.

And, uh, one, one thing that it probably is worth saying here is, uh, these are not query tuning strategies that our, our robot friends often come up with.

And, uh, the reason for that is because our robot friends, well, they are terribly, terribly logical and they excel, I think, mostly at logical situations.

Uh, they, they are not clever. They are not clever like you and I. So, we still have hope out there. On top of the fact that, um, they’re, they’re getting very expensive and companies are like, wait a minute, now that this costs money, I don’t think I like it.

Uh, anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you next week on Tuesday for office hours. You’re great. And I hope your weekend is too.

All right. Goodbye.

Going Further

SQL Server Performance Office Hours Episode 68

Posted on July 14, 2026 by Erik Darling

To ask your questions, head over here.

Chapters

00:00:00 – Introduction
00:00:38 – Video Description Link
00:02:18 – Data Saturday Croatia
00:03:14 – First Up
00:04:54 – Merge Considerations
00:06:06 – Delayed Durability
00:07:35 – Adding Information
00:09:44 – Parameter Sensitivity
00:10:28 – Tire Pressure
00:11:11 – Plan Analysis
00:12:49 – Hash Join Comparison

Full Transcript

Erik Darling here, Darling Data, and it is, once again, one of the most miraculous days of the week, Tuesday. Wow. Whoever invented Tuesday, you deserve some kind of prize, medal, award. I hope you get the historical commendations that you deserve, so richly deserve. But that means it is time for Office Hours, where I answer five of the most important questions that you, the SQL Server community, have submitted to me to answer, for free.

Down in the video description, if you want to ask your own question, the link to do that is right down below, there. Just keep looking down. It’s not a trick, I promise. I’m wearing pants. You can also do other things, like hire me for consulting, buy my training, become a supporting member of the channel if you want. I hope you enjoy my endeavors and efforts here. And, of course, you can also, for free, like, subscribe, tell a friend.

I guess there’s some effort on your part still involved there, but it’s quite minimal. It’s some clicking, right? Some clicks here and there. Just a random click. If you just love free stuff, you cheapskate you, you can download my free SQL Server performance monitoring tool. Absolutely free, open source, no email, no form.

It’s not a phone home. It’s not telling me anything weird about your server. It’s just grabbing all the important performance monitoring metrics that you would ever want, right? Wait stats, blocking, deadlocks, query performance, CPU, disk, memory, all the stuff that you care about.

When you’re like, why is this SQL Server being such an incredible jerk right now? You can figure out why using my free performance monitoring tool. And if you can’t figure out why looking at pretty charts and graphs, then you can have your robot companion do it for you.

Have your robot companion friends use the built-in MCP tools to look at your performance data and give you a helping hand. Perhaps that is what you need. I don’t know. Data Saturday Croatia.

Swiftly, swiftly coming towards us. Actually, as I record this, it’ll be coming out on Tuesday the… Let’s just look at a calendar here. Let’s make sure.

Tuesday the 9th. So I will already be overseas. And then it will be this Friday coming up. So after, I guess, I record the next couple of videos, I’m going to have to edit this slide again.

This slide used to be so full of life and travel and possibilities in the world. And now it’s two, just going to be one soon, I guess. I don’t know.

Maybe I’ll just stop talking about where I’m going to be since I won’t be in Seattle until November. That seems a little silly to talk about that for five months. Ha ha.

Maybe when it gets closer. Marketing. Some crazy stuff. But for now, we are in the throes of June. Be a good band name. And we’re going to do office hours.

All right. First up, let me surround you correctly. I know lots of good reasons to avoid merge.

Okay. Should performance be one of them? Have you ever saved the day by removing merge? Yeah.

In fact, I have saved the day by removing merge. So, you know, performance, it’s always something you got to watch with merge. You know, depending on maybe the amount of data you’re merging, performance would be more of a thing.

If you are just doing a pretty stock and standard upsert statement with, like, a set of values. I don’t think performance is going to be hampered all that much. That’s just where all the other reasons kick in.

If you are merging large amounts of data, yeah, I’d be pretty opposed to doing, like, a big, like, even just, like, leaving it at, like, the upsert thing. Not even deleting stuff. Let’s just say it’s a million rows.

Right? Or rather, let’s say it’s two million rows. One million rows will get updated. One million rows will get inserted. Like, you’re still doing two million. You’re still doing two million modifications in one query.

Right? It could really blow up on you. So, is merge performance usually the first thing I think of? No.

But I will say that in most practical circumstances, I have never run into a situation where separating out the insert and the update performed worse than the merge. Right? So, on top of all the sort of, you know, getting to sleep soundly at night stuff.

First, by removing the merge statement, you may also see a performance benefit. So, let’s see. This is a weird one.
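As a rough illustration of separating the update from the insert, here is a minimal sketch. The tables dbo.Target and dbo.Staging and their Id and Val columns are hypothetical names chosen for the example.

```sql
/* Hypothetical tables: dbo.Target (Id, Val) and dbo.Staging (Id, Val) */
BEGIN TRANSACTION;

    /* Step 1: update the rows that already exist in the target */
    UPDATE t
        SET t.Val = s.Val
    FROM dbo.Target AS t
    JOIN dbo.Staging AS s
      ON s.Id = t.Id;

    /* Step 2: insert the rows that are not in the target yet */
    INSERT dbo.Target
        (Id, Val)
    SELECT
        s.Id,
        s.Val
    FROM dbo.Staging AS s
    WHERE NOT EXISTS
    (
        SELECT
            1
        FROM dbo.Target AS t
        WHERE t.Id = s.Id
    );

COMMIT TRANSACTION;
```

This sketch leaves out error handling and any locking hints that a concurrent workload might need. For large sets, batching each step is also worth considering.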

I’m using SQL Server 2025 for running integration tests. No important data is being stored. And reliability doesn’t need to be 100% guaranteed. Are there configuration options on the instance or database level that can improve performance for this scenario?

Well, so, I’m going to attempt to read this signal. Read the signal that you are sending here. And that signal is you don’t care much about your data.

I believe that’s a fair statement based on what you’re saying here. And depending on the nature of the integration tests, one setting that might help you. But it’s not a 2025 setting.

It is a 2014 setting. It might be setting delayed durability to forced at the database level. If you have multiple databases, that might be something to consider. But it really does depend on what your integration tests are running.

If they’re testing like data changes, like inserts, updates, deletes, even merge. Removing merge. Not a database or system level setting, though.

If you’re writing a lot of data, delayed durability could be helpful for you here. It essentially allows SQL Server to hold off on writing to the transaction log before it tells you the commit is done. It’s like SQL Server saying, I’m just going to hang on to this.

I’ll make this data durable later. That might be one. But it really does depend a bit on the rest of what you’re integrating. If it’s a bunch of select queries, it would really depend on what sort of performance issues your select queries are currently hitting.
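For reference, a minimal sketch of that database-level setting. IntegrationTests is a hypothetical database name, and this only makes sense where losing the last few committed transactions is acceptable.

```sql
/* IntegrationTests is a hypothetical database name */
ALTER DATABASE IntegrationTests
    SET DELAYED_DURABILITY = FORCED;

/* Confirm what the database is currently set to */
SELECT
    d.name,
    d.delayed_durability_desc
FROM sys.databases AS d
WHERE d.name = N'IntegrationTests';

/* Run in the context of that database to harden
   any pending log records to disk on demand */
EXECUTE sys.sp_flush_log;
```

Setting it back to DISABLED restores fully durable commits.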

So, one important piece of this puzzle is, aside from the strong signal you’re sending. You don’t care about your data. Is what problem you are trying to solve, right?

Like, are you having a specific performance problem with your integration tests that I just need to guess? Like, what could it be? I don’t know.

Drop MSDB. Maybe. I don’t know. But, I mean, you know. I’m going to give you a chance to ask this question with a little more detail. That’s what I’m going to do.

I’m a kind and forgiving person. I’m a merciful office hours professor. I’m going to give you a chance to add a little bit more information to this question. If you’d like.

And maybe tell me if there is a specific performance problem that you are having with your integration tests. And if there is a setting that might help solve those particular problems. Otherwise, I will be here all day talking through every single potential database and server level configuration option.

And speculating on when they might help solve it. Which I do not have the energy to do. I’m sorry.

What usually causes a query to suddenly start spilling when it never used to before? Well, this sounds. Going out on a limb here.

This sounds like a fairly common parameter sensitivity issue. Or. Or.

Could also be that your query. Because the data. The data in your database has changed. To some degree or another. And you have hit one of the very famous tipping points in cardinality estimation.

Perhaps. SQL Server has started choosing a different query plan. Right? Crazy things have happened.

I think one example that I can think of off the top of my head where I saw this. Which would fit either the parameter sensitivity or the I’m suddenly just choosing a new plan all the time motif. Would be.

Let’s say your query was always using a nested loops plan with no sort operator in it. Or maybe there was a sort operator. But. I don’t know. Maybe. Maybe things were just working out well for that sort operator.

And now you’re doing a merge join. And maybe. SQL Server is choosing that merge join stupidly. Where it has to sort data from one or both inputs. To make the merge join happen.

That would be. That would be one thing. That could. It could certainly. One illustration of the problem that you were describing here. So. My.

My guess. Parameter sensitivity. Or. SQL Server just choosing a new plan. Right? Two. Two possibilities there. Do. Do.

I see memory grant feedback kicking in. But performance still stinks. What gives? Well. Maybe it’s not the memory grant my friend. There are so many other things that can make performance. It’s like.

I painted my car blue. But it still stinks. It’s a little slow. All right. Maybe it’s not the paint job. Put it.

Put some air in the tires. So. Memory grant feedback. It’s a cool feature. Mostly. You know. It’s gotten some nice revisions over the years. But.

But perhaps the problem is not the memory grant. Perhaps you need to look elsewhere. Perhaps something else in the query plan. May. May give you a fair indicator. Or a fair warning. Of what is wrong here.

But. It doesn’t sound like it’s the memory grant. It’s like. When. When people. Don’t have a single parameter. Or local variable in a query. But they’re like.

Option recompile. I’m like. Go on. Then you go. What. What. What do you. What do you. What do you think is going to happen?

Ah. Oh. Ah. Well. There we go. Why does SQL Server sometimes pick merge joins that look absolutely terrible? Ah.

You know. I’ll be honest. It’s terrible to me. I mean. Maybe not everyone. I mean. I guess. There’s. There’s some benefit to. To orderly data flowing through your plan.

But. Man. I. I. I. I hate a merge join most of the time. Ah. But. You know. It all comes down to costing.

And perhaps the. Other requirements within the plan. Um. You know. It’s like. Sometimes SQL Server will choose a merge join to keep data in order. So it doesn’t have to sort data later. And you’re like. Oh. Okay.

But. Man. God. God help you. If that’s a many to many merge join. There. There are all sorts of strange things that go on with that. Um. Yeah. Man.

I. I fail you on this one. Um. Most of the time. When I see SQL Server pick a merge join. I’m like. Some. Something went wrong. Like. Something.

Something wrong is happening in your life. SQL. Today’s SQL Server. You. Ah. Yeah. Yeah. Ought to not do that. But. You know.

It’s a lot of testing. Right. Like. You know. Just like. Or. Like. Coming back to the orderly data thing. Right. Like.

Uh. Let’s say. You have a rather large result set. And. Uh. If you did a hash join. That large result set would become disordered. And. Then you have to like. You know.

Sort that data. And now. But if you did a merge join. You wouldn’t have to sort that data again. Or. Perhaps to support a stream aggregate. Without having to sort data. Again. That would be another reason why SQL Server might. Choose a merge join.

That’s usually what I see. It’s usually wrong. About that being the best possible idea. But. That is usually. The thought process. That at least I can identify. So. Ah. Man.

I. If you. If you were. If you were here in the room with me. I would. I would high five you. Man. I would. Um.

Um. I’ve. I feel this one deeply. Alright. Well. Before my feelings get too intense. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video.

We’ll see you then. Where we will learn. Some more T-SQL. With Erik. That’s me. Alright. Thanks for watching.

Going Further

Learn T-SQL With Erik: Aligning Queries and Indexes Part 4

Posted on July 9, 2026June 30, 2026 by Erik Darling

Chapters

00:00:00 – Introduction
00:00:21 – Consulting Services
00:01:10 – Free Tools Mention
00:02:18 – Upcoming Event
00:03:16 – Tipping Point Queries
00:04:00 – Query Execution Time
00:05:35 – Query Plan Analysis
00:07:01 – Optimizing the Query
00:08:20 – EXISTS vs IN Subquery
00:09:10 – Adaptive Join
00:10:12 – Nested Loops Join
00:11:00 – Original Query Performance
00:12:24 – Conclusion

Full Transcript

Erik Darling here with Darling Data and today’s video we are going to carry on in our task which is learning how to better align our queries and our indexes. If you need help aligning your queries and your indexes, boy do I have options for you. You can hire me for, aside from watching these videos, you can hire me for consulting, do this stuff all day.

You can also purchase my training. The videos that you’re watching here are just tiny little snippets from the full course material in the Learn T-SQL with Erik course. The link to buy that for a hundred bucks off is down in the video description if you feel like doing that sort of thing and watching more videos of me. It’s crazy. You can also become a supporting member of the channel, ask me office hours questions, and I guess outside of the downstairs links you can also do other things that would make me think of you as a more useful human being.

Such as liking this video, subscribing to this channel, and forcing all of your friends. Hijack their browsers and force them to love me as well. If you need SQL Server performance monitoring, I got you covered.

There’s nothing Erik Darling won’t do for you. Maybe a couple of things. But this thing I’ll do for you. I would do anything for you but I won’t do that.

Anyway, I don’t like that song. Totally free, open source. You can see everything it’s doing. It’s free. It’s right out there on GitHub. It’s a bunch of T-SQL collectors.

They all run on a schedule. They collect important performance information from your SQL Server, put it into pretty charts and graphs, and allow you to talk via your robot companions using MCP servers to do that analysis on your performance data. The MCP stuff is all opt-in.

It is not on by default if you don’t want it broadcasting that it’s there. But it’s just, you know, gives you a different way of… figuring out what’s up with your SQL Server aside from just looking at the pretty charts and graphs and doing your own form of analysis.

So, all that good stuff. If you want to see me live out in the world and you happen to be in the Croatian area, I also got you covered. June 12th and 13th.

I will be at Data Saturday Croatia. I have an advanced T-SQL pre-con. It’ll be the material that you’re seeing here and more. If you come to the class, you get all of the T-SQL stuff. All of the T-SQL videos that I publish as part of the full course.

So you show up, you hang out with me for a day, and then you get like 100 hours of videos to go watch at home. But until then, let’s continue our maddening descent into heat brain leaking hell. I guess that’s what this is.

Maybe it’s just allergies. I get those too. The databases are just allergic as hell to everything. Especially users and developers. Just like me.

Anyway. We’re going to look at some interesting sort of tipping point queries. And this video is going to explore both rewriting queries to get better performance and tweaking an index to get better performance. So you get a twofer on this one.

Don’t say I never did nothing for you aside from all this stuff I already do for you. Anyway. We’re going to start by running this query. And we are going to use drop clean buffers.

Not because this one ends up terribly. Because the next one will end up terribly. So we’re just like this worst case scenario. This has a little go to after it.

So it executes twice. And if you look in the messages tab, you will see this handy little message here. Beginning execution loop. Batch execution completed two times. Thank you. But the first query, it is a little bit slower. It does take about 1.2 seconds to run. And the second query takes about half that time.
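The cold-then-warm test pattern can be sketched like this. The query itself is an assumption: the table, date predicate, and TOP value are stand-ins inferred from the rest of the transcript, not the demo script. Running DBCC DROPCLEANBUFFERS empties the buffer pool, so this belongs on a test server only.

```sql
CHECKPOINT;            /* write dirty pages so they become clean */
DBCC DROPCLEANBUFFERS; /* empty the buffer pool: test servers only */
GO

/* Hypothetical tipping-point query */
SELECT TOP (1000)
    p.*
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20130318'
ORDER BY
    p.CreationDate,
    p.Score DESC;
GO 2 /* run the batch twice: first from disk, then from memory */
```

GO with a count is a feature of client tools like Management Studio and sqlcmd rather than a T-SQL statement.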

And this is just the effect of cached data. Right? And what’s kind of funny is it’s like when you look at these things, it’s like almost hard to spot where they really go astray. Like sure, this takes 60 milliseconds.

This takes 237 milliseconds. Somehow we end up at 922 milliseconds in the nested loops join. So the nested loops join did spend some extra time in there. I’ll talk about why in a minute.

But if you look down here, really the big difference in time. It’s not this, right? That’s about like 12 milliseconds different. That’s actually 60 milliseconds slower somehow, right? 237 to 295.

But this is at 460 milliseconds. Now part of that is because the nested loops join is responsible for a little bit more work than it lets on if you are just looking at the graphical execution plan. If you right click and you go into the properties, you will see this prefetch attribute assigned to your nested loops join.

This one just happens to be unordered. The same thing would happen if it were ordered. But this is just essentially telling SQL Server to go out and read a bunch of data ahead of time and get some extra stuff that we might need to make this query run and return stuff.

So the nested loops join here doing a little bit more work than in this one. We’ll forgive it though. But this isn’t really like the crappy one.

The crappy one comes. So this is looking through 2013-03-18. This is looking through 2013-03-19. And if we run this one, this is where things get demonstrably worse, right?

Because we have hit a tipping point when SQL Server is no longer willing to give us the query plan that we had before. It is no longer willing to do that key lookup. It just goes ahead and scans the clustered index.

Scanning the clustered index on the Posts table for me takes about 8 seconds when I’m reading from disk. When I’m not reading from disk, it takes about 618 milliseconds.

I know which one I prefer. I also know that I’m pretty sure that I would prefer if SQL Server chose that lookup plan a little bit more reliably. How can we do that?

Great question. If we wanted to influence the optimizer to avoid the clustered index, we might rewrite the query like this, right? So what we’ll say is, again, sort of almost doing the same sort of self-join technique.

But we can just use an IN. We’ll say, just give me the top 1000 rows that would qualify for our original query. And just say where the ID from the outer Posts table is in this list of IDs.

And this will influence SQL Server to use that same fast query. Use our nonclustered index instead of the clustered index, right? We’re going to go seek right into that bad boy over here.

Find the rows that we care about. And narrow it down to just the 1000 that we need to satisfy our query. And then go get the columns from the Posts table via the self-join here.

And we return all that out. And that’s even a bit faster than either of the ones that we did before at 147 milliseconds. Now, IN and EXISTS often behave as far as the execution plan goes identically.
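The shape of that rewrite looks roughly like the following sketch. The date predicate and the ordering columns are assumptions pieced together from the transcript, so treat this as the pattern rather than the demo query.

```sql
/* The subquery only needs columns the nonclustered index has,
   so it can seek and stop after 1000 rows. The outer query then
   fetches the remaining columns for just those 1000 Ids. */
SELECT
    p.*
FROM dbo.Posts AS p
WHERE p.Id IN
(
    SELECT TOP (1000)
        p2.Id
    FROM dbo.Posts AS p2
    WHERE p2.CreationDate >= '20130319'
    ORDER BY
        p2.CreationDate,
        p2.Score DESC
)
ORDER BY
    p.CreationDate,
    p.Score DESC;
```

The outer ORDER BY is still needed, because the IN subquery only decides which rows qualify, not the order they are returned in.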

Often, right? But not in this case. When you have a top 1000 in an IN subquery, you look at this.

Again, the query plan, it looks like this. You see a top operator in it, right? SQL Server is like, oh, I need to limit this to a top 1000. If you do that with EXISTS, though, and I’m just going to get the estimated plan here.

Because if I run this query, things will not go as maybe they look here. There is no top operator present in this. SQL Server will go and find all of the rows and figure out which ones exist.

The top is just ignored inside of EXISTS. SQL Server just throws that away.

It’s not valid to use top in there. So this does not turn out probably as you might expect or as you might have planned on it turning out. This would run for a long time and return a lot of rows.

Because we’re just essentially asking for everything from the Posts table where the IDs exist. Not just the top 1000 here, but all of the rows that this would match. So we could do this, right?

But even this won’t turn out so great. What we’ll do is, no, I’m in the right place. There we go. We’ll say, we’ll put the top 1000 on the outer part of the query where SQL Server can no longer just dispose of it and throw it away and say, you’re not valid.
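A hedged sketch of the two EXISTS forms being described, using the same assumed predicate and ordering as the rest of this walkthrough:

```sql
/* Form 1: TOP inside EXISTS. Per the transcript, the TOP is thrown
   away, so this returns every qualifying row, not 1000. */
SELECT
    p.*
FROM dbo.Posts AS p
WHERE EXISTS
(
    SELECT TOP (1000)
        p2.Id
    FROM dbo.Posts AS p2
    WHERE p2.Id = p.Id
    AND   p2.CreationDate >= '20130319'
    ORDER BY
        p2.CreationDate,
        p2.Score DESC
)
ORDER BY
    p.CreationDate,
    p.Score DESC;

/* Form 2: TOP on the outer query, where it cannot be discarded */
SELECT TOP (1000)
    p.*
FROM dbo.Posts AS p
WHERE EXISTS
(
    SELECT
        1
    FROM dbo.Posts AS p2
    WHERE p2.Id = p.Id
    AND   p2.CreationDate >= '20130319'
)
ORDER BY
    p.CreationDate,
    p.Score DESC;
```

The second form returns the right rows, but the limit is applied after the join instead of before it, which fits the slower timing described here.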

But if we run this, it’s still a little bit clunky, right? We’re back up to like a second on this. We had this tuned nicely with that IN subquery.

If we’re not in a place where SQL Server might use, I should probably stop here for a moment. We get a batch mode adaptive join for this query, right? So good for us, right?

We’re on developer edition. So we’re getting that enterprise edition class for free. That’s cool. But we get a batch mode adaptive join here. SQL Server has chosen batch mode for the query.

And it said, well, I’m going to figure out. The best join strategy based on, at runtime, how many rows come out of one thing or the other. And then I will choose the correct join type based on how many rows leave here.

Great. You may not always get that. If you don’t always get that, you will most likely end up with a hash join here. And the hash join takes, on its own, just about the same amount of time.

Most of the stuff in here does still run in batch mode on rowstore. So you’re still getting just about the same improvement. Just without the join choice at runtime.

The join choice at runtime doesn’t add anything bad here. But it doesn’t add anything good here either. Batch mode makes this thing, like, still okay, but not where it was before. We did a much better job.

We could also force a nested loops join here if we wanted. And we could get down to an okay amount of time. But still 678 milliseconds.
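Forcing the join type is a query-level hint. A minimal sketch, with the query text assumed as before:

```sql
/* Hypothetical query: every join in the statement is
   restricted to nested loops by the hint at the end */
SELECT TOP (1000)
    p.*
FROM dbo.Posts AS p
WHERE EXISTS
(
    SELECT
        1
    FROM dbo.Posts AS p2
    WHERE p2.Id = p.Id
    AND   p2.CreationDate >= '20130319'
)
ORDER BY
    p.CreationDate,
    p.Score DESC
OPTION (LOOP JOIN);
```

Because the hint applies to the whole statement, it is a blunt instrument for queries with more than one join.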

That’s not really what we had before. If you recall. It was several queries ago with our beautiful IN clause query with the top 1000 in it. This all ran in 135 milliseconds.

So that’s really more the time to beat. Everything is 600, 800 milliseconds. That’s a regression. It’s not a huge one. But, you know, it’s not really one.

We don’t tune queries to make them regress, do we? We tune queries to make them faster. It’s a crazy concept, I know.

Now, one thing that I want to point out that is kind of funny about the original query. In it, and in all of the other ones, the order by elements. Yeah.

Mouthful of marbles. Are creation date and then score descending. If we just run this query ordered by creation date and score, no longer descending on the score column, our original query still runs really quickly.

Actually, it runs faster than ever. Interesting. Well, we spent a lot of time rewriting this query to sort of have it suit the index that we had available better. But sometimes, every once in a while, you might be able to change an index.

And if we change our index definition, or rather we’re going to create a new index, I guess, to creation date and then score descending, so this fits the query that we were writing, better suits the query that we had originally, then we get the same fast execution as we did when we changed our query.
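An index whose key order matches that ORDER BY could be sketched like this. The index name is made up, and the existing index it would sit alongside is not shown in the transcript.

```sql
/* Hypothetical index name. The key order matches
   ORDER BY CreationDate, Score DESC exactly. */
CREATE INDEX p_cd_sd
ON dbo.Posts
    (CreationDate, Score DESC);
```

The sort direction is part of the index key definition, so an index on (CreationDate, Score) with both ascending cannot supply rows in CreationDate ascending, Score descending order without a sort.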

So, sometimes there are ways to rewrite your query to better suit the indexes that you have. Other times, if you have options and choices, you might choose to change your indexes up a little bit so that they better suit the queries that you have.

All right. I reached the end of the file. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you next week on Tuesday for Office Hours.

All right. Have a great weekend, everybody.

Going Further

Learn T-SQL With Erik: Aligning Queries and Indexes Part 3

Posted on July 8, 2026 by Erik Darling

Chapters

00:00:00 – Introduction
00:02:15 – Index Alignment and Design
00:04:33 – Query Tuning Magic
00:06:54 – Parameter Sniffing Issue
00:09:40 – Execution Plan Variations
00:11:55 – Relational Columns and Aliases
00:14:28 – Optimizing Queries
00:17:14 – Memory Grants and Indexes

Full Transcript

Erik Darling here with Darling Data, and we’re going to continue our endeavors to learn T-SQL, and in our T-SQL learning endeavors, we are going to continue looking at how we can better align our queries and our indexes. This video is actually one of my favorite demos. It’s a fun one, and it’s a good sort of mental exercise when you’re tuning your own queries and perhaps wondering why SQL Server sometimes uses your indexes and sometimes it doesn’t. So, down in the video description, you will see all sorts of helpful links. Of course, because I am a helpful human being, arguably one of the most helpful human beings to have ever walked the face of the planet.

And humble to boot. And down in the links, you will find a way to buy the full course content. All of the material that you’re seeing around the index align and design stuff is part of my Learn T-SQL with Erik course. There’s a link down in the video description where you can save a hundred bucks off the course if you buy it from there.

You can also do other things that would make me happy, like you could hire me for consulting. I can bring that up. I can bring the same query tuning magic, magic energy to your SQL Servers.

You can choose to support this YouTube channel if you feel like the content is helping you in some way that is worth money. As few as four smackaroos a month. It’s USD.

That’s American for dollars. You can do that. And you can, of course, ask me office hours questions. And if you happen to just like what I’m doing, like what I get up to over here, please do like, subscribe.

And tell every single one of your friends, whether they use SQL Server or not. Perhaps they will find other reasons to come to this channel and watch me. Maybe they’ll just be casually entertained by the shape of my head.

You never can tell. If you are in the market for free SQL Server performance monitoring, my free monitoring tool, totally open source, no sign up, no weird telemetry. Just all the stuff that you would care about monitoring if you want to figure out why SQL Server performance maybe is good, bad, ugly, somewhere in between.

Got a cool NOC-style dashboard. If you want to just do a little sanity check, make sure all your servers are up and running and all that good stuff. And if you’re a fan of today’s robot companions, I don’t know, maybe before the prices go up, you can also use those things to do some built-in MCP stuff.

And you can do some MCP server analysis of your performance data. And it’s isolated to just your performance data. It’s not going out and running weird queries on your SQL Server to look at stuff.

So, coming up in, wow, that is like, it’s like, I mean like 10 days away. I will be at Data Saturday Croatia, June 12th and 13th. And I have an advanced T-SQL pre-con there.

You can, you know, jump in there. Learn about that. Learn about T-SQL from me live and in person. And if you show up to that, you get free access to the Learn T-SQL material that we’re covering today in this video.

I will also be at PASS Data Community Summit in Seattle, Washington, November 9th through 11th. So, you know, not quite on my birthday, but pretty close there. But for now, it is June and we are Juning about going crazy from the heat.

I mean, I’m not going crazy from the heat. It was like 60 degrees in New York today or something. I don’t know, whatever.

Anyway, let’s go learn something about SQL over here. Ah, that’s Management Studio. Now we got it. So, like I said, this is one of my favorite demos. I’ve got a few different things that I build off of this one stored procedure.

But right now, we have got this computed column that I’ve added to the Posts table. And we’ve got this index that I’ve added, which I know there’s a little red squiggle here, but I promise you this index has been created and the active period column is there.

So, anyway, I don’t know why that was highlighted. We’ve got this stored procedure called top lookup. And this stored procedure is doing something.

I mean, okay, so like, look, we’re doing select star here. I’ve talked in other videos about how select star is just a convenient shortcut. For a lot of this stuff.

You could list out all the columns if you wanted, and it would still be the same thing. Or you could even just put a few columns that aren’t in your index in here. And you would run into the same potential plan switcheroo.

Because no matter how many columns your nonclustered index is missing, SQL Server has to do lookups to find those columns from the clustered index. And it costs one column the same way that it costs 150, or 1,024 columns.

It doesn’t matter how many columns are outside of your index. The key lookup costs the same.

Regardless of the number of columns or the data types of those columns. So, I’ve got a hint on this query. And that hint is to set it to compat level 140. I’ll explain why.

The parameter sensitive plan optimization that came around in compat level 160 just makes this demo really confusing. And I’ll talk about why in a moment.

But for now, just understand that this procedure takes two parameters. One of them is post type ID and the other one is gap. The gap parameter works off the computed column.

This date diff year, creation date, last activity date. That works off the computed column and index that I created up here. So, let’s turn on actual execution plans.
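Since the script is not shown in the transcript, here is a hedged reconstruction of the setup. The column name ActivePeriod, the index name, the procedure name dbo.TopLookup, the TOP value, and the ORDER BY are all assumptions made so the sketch is complete.

```sql
/* Non-persisted computed column: years between creation and last activity */
ALTER TABLE dbo.Posts
    ADD ActivePeriod AS DATEDIFF(YEAR, CreationDate, LastActivityDate);
GO

/* Hypothetical index supporting both predicates */
CREATE INDEX p_pti_ap
ON dbo.Posts
    (PostTypeId, ActivePeriod);
GO

CREATE OR ALTER PROCEDURE dbo.TopLookup
    @PostTypeId integer,
    @Gap integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    /* SELECT * needs columns the index lacks, so the optimizer must
       choose between key lookups and a clustered index scan */
    SELECT TOP (500)
        p.*
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = @PostTypeId
    AND   DATEDIFF(YEAR, p.CreationDate, p.LastActivityDate) = @Gap
    ORDER BY
        p.Score DESC
    /* optimize as if at compat level 140, which predates the
       parameter sensitive plan optimization */
    OPTION (USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));
END;
GO
```

The hint changes only how this one statement is optimized. The database compatibility level itself stays where it is.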

Let’s feel fully actualized here. And the reason why I have the compat level set is because the alternative just muddies this demo up a bunch.

What happens is the parameter sensitive plan optimization kicks in. And you get one plan for the most common value. One plan for the least common value.

And then all of the post type IDs in the middle share the same plan. And it’s just really a bad situation. And it just makes it harder to explain the point of what I’m doing here.

So, to just sort of visualize that breakout if I run this query and we get these Oh, I scrolled the wrong way. We get these results back. Go away SQL prompt.

Come on. Be a pal here. We look in here. So, just to sort of explain a little bit. You get the three plan variants. Post type ID 2 at 11 million rows gets plan variant 3.

Post type ID 8 at 2 rows gets plan variant 1. And all of the post type IDs in the middle ranging from 4 rows to 6 million rows get the second plan variant.

This is not a good situation. And looking at these numbers you may start to understand why that particular feature muddies this demo up quite a bit. So, we’re not going to do that.

Now, the first thing we’re going to look at is the 9-year gap. So, the 9-year gap is very uncommon. Gap ID 9, post type ID 1. And we run this.

This returns very quickly. We get this plan here. It is a parallel nested loops plan with a key lookup to fetch all of the columns that we care about from the post table after we seek both to the post type ID that we care about and the gap years parameter that we care about.

The key lookup down here, there’s no predicate on that. If I move my big head out of the way, there’s no predicate on that up here. It is just outputting columns, right? But it’s outputting all the columns in the post table, which maybe isn’t a lot of fun.

But this takes 28 milliseconds to run and it gets 500 rows and everyone’s pretty happy. If we run this for a gap of 0 years, most posts do not have 9 years of time between when they were first posted and when they were last edited.

A gap of 0 years is much, much more common. And this query runs for about 4 1⁄2 seconds here. Excuse me.

Very dry. 4 1⁄2 seconds. And a lot of the time is spent in this sort that spills because the memory grant that we assign to the initial execution of this procedure maintains for this execution.

So that takes some time. Now, if we recompile this thing and we run this in reverse and we ask for a gap of 0 years first, this does a bit better, right?

About twice as fast plus another 500 milliseconds faster for a gap of 0 years. The plan changes quite substantially, though.
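The run order matters because whichever call compiles the plan decides the plan and the memory grant that the other call reuses. A sketch of the two test sequences, assuming a hypothetical procedure named `dbo.GapDemo` and assuming post type ID 1 is used for both gap values:

```sql
/* Sequence one: the uncommon value compiles the plan, the common value reuses it. */
EXECUTE dbo.GapDemo @PostTypeId = 1, @Gap = 9;
EXECUTE dbo.GapDemo @PostTypeId = 1, @Gap = 0;

/* Mark the procedure so it gets a fresh plan on its next execution. */
EXECUTE sys.sp_recompile @objname = N'dbo.GapDemo';

/* Sequence two: the common value compiles the plan, the uncommon value reuses it. */
EXECUTE dbo.GapDemo @PostTypeId = 1, @Gap = 0;
EXECUTE dbo.GapDemo @PostTypeId = 1, @Gap = 9;
```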

I mean, we still have a parallel nested loops join plan. But notice that we fully scan the clustered index over here.

And then we have this strange filter operator over here. And the filter operator is on both the post type ID and the gap parameter. Part of the reason for this is because the computed column that I added to the post table was not persisted.

SQL Server expands the definition of the non-persisted computed column. And if I were to change that to a persisted computed column, we could avoid this part.

But we would still get the same basic plan shape. Now, I think probably the biggest downside of this late filter operator, and if I’ve said it once, I’ve said it a million times, always be suspicious of a filter operator in a query plan, is that we have to fully scan all 17 million rows from the post table, drag them across the couple compute scalars, and then filter stuff out over here.

Okay? What really sucks is that reusing that plan for the gap of nine years, in other words, the very uncommon one, this used to take 28 milliseconds, and now this thing takes almost a second.

Right? And it’s obviously the same plan gets reused. Whenever I’m talking about parameter sensitivity problems, a lot of people get this big idea in their head. It’s like, well, why not just use the big plan for everything?

The big plan for small amounts of data is often (not surprising to me, but surprising to other people who see this stuff) not a good sort of trade-off there.

It’s not a good fit. So we have, you know, essentially two execution plans. Neither one is a particularly good fit for the amount of data that we’re dealing with. So the mental exercise that I like to put people through when we’re doing this is to mentally in your head separate informational columns from relational columns.

Right? And what we’re going to do in the query below is we’re going to write a self join between the post table and itself. Right?

One alias will be responsible for the relational activities in our query, the join, the where clause, the order by, stuff like that. And the other alias will just be responsible for providing the select list.

Right? And if you can start mentally separating the duties of your queries and the columns in your queries between purely informational stuff that’s only in the select list and columns that are used for relational activities in your queries, you can do a lot of cool query tuning stuff.

This is just one of them. Right? So let me create this and then we will talk through the code just a little bit up here.

So I have the post table joined to itself. ID is the clustered primary key in the post table so we can get away with this. It is a unique column so doing this is very, very easy.

And from P1, right, that’s this one, this is all of the relational stuff. Right? P1 is there.

P1 is there. P1 is the where clause. P1 is further down here in the where clause and in the order by clause down here. The only place we reference P2 is up here in the select list.

Right? This is our star. Right? So if we run this query now, both 9 and 0 will be fast doing this. Right?
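The shape of the rewrite being walked through looks roughly like this. It is a sketch under assumptions, not the exact code from the video: the procedure name, the `GapYears` column, and the `ORDER BY` column are hypothetical, and any other joins in the demo query are left out.

```sql
/* Hypothetical names; the point is which alias does which job. */
CREATE OR ALTER PROCEDURE dbo.GapDemo_SelfJoin
    @PostTypeId integer,
    @Gap integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT TOP (500)
        p2.* /* informational: p2 only supplies the select list */
    FROM dbo.Posts AS p1
    JOIN dbo.Posts AS p2
      ON p2.Id = p1.Id /* Id is the unique clustered primary key */
    WHERE p1.PostTypeId = @PostTypeId /* relational: p1 does the filtering... */
    AND   p1.GapYears = @Gap
    ORDER BY
        p1.Score DESC /* ...and the ordering */
    OPTION (USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));
END;
GO
```

Because `p1` only touches columns that live in the narrow nonclustered index, the optimizer is free to find and sort the qualifying rows first, then join back to the clustered index for just the rows that survive the top.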

The one for 9 got even faster. The one for 0, instead of taking 4.5 seconds, takes 1.2 seconds. Right? So we’ve kind of come to a little bit better of a situation here.

Now, we still have this sort over here and this sort still spills. Right? So, you know, it does slow this query down a bit but we’re not at 4.5 seconds of sort of crappy spilling.

We’re at one point, actually it’s about 900 milliseconds of crappy spilling there. So this is a lot more tolerable. If we reverse things, just like we did before and we do 0 first and then 9, the plan actually gets a little bit better.

So in this case, for the first query, not only does the sort spill a bit less because we get the memory grant for the larger amount of data that comes out of the post table, but we finish in about 200 milliseconds.

Now we also get this parallel nested loops plan. And the same thing goes for this gap down here. So this is a pretty reasonable rewrite to get better performance out of both queries.

You might start playing some other tricks with this query. If you really wanted to, like, optimize, optimize this, you could even say like option optimize for gap equals 0.
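In T-SQL, that idea is spelled as an `OPTIMIZE FOR` query hint. A minimal sketch, reusing the same hypothetical procedure, column, and ordering names as assumptions:

```sql
/* Hypothetical names; the OPTIMIZE FOR hint is the only new part. */
CREATE OR ALTER PROCEDURE dbo.GapDemo_SelfJoin
    @PostTypeId integer,
    @Gap integer
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    SELECT TOP (500)
        p2.*
    FROM dbo.Posts AS p1
    JOIN dbo.Posts AS p2
      ON p2.Id = p1.Id
    WHERE p1.PostTypeId = @PostTypeId
    AND   p1.GapYears = @Gap
    ORDER BY
        p1.Score DESC
    OPTION
    (
        OPTIMIZE FOR (@Gap = 0), /* always compile as if the common value was passed */
        USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140')
    );
END;
GO
```

The trade-off is that the plan no longer reflects whatever value happens to arrive first, which is the whole point here, but it also means the hint needs revisiting if the data distribution changes.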

So you keep getting that plan for the 0 value of the gap parameter.

To talk through some of the important differences between the original and the rewrite, I’m going to put both of them into the same stored procedure. And then I’m going to run them for the gap of nine years. And the thing that I just want to talk about here a little bit is the plan shape.

So what happens in the original query where we do the lookup is we find 4,500 rows here and we do 4,500 nested loops to do the key lookup here.

Key lookups in query plans are very, very tightly coupled operations. SQL Server cannot move these around, right? SQL Server has to do this stuff all in one place at the same time.

It can put a sort before the nested loops join, it just doesn’t here. And then after we find all that data, then we do our sort. And this is where we start to sort of narrow stuff down for the top, right?

Down here in the one that we rewrote in order to, what do you call it, do the self join.

We still have the same seek, but notice that the post table immediately joins to the users table here. And what’s nice is that this sort cuts down the results to just about all the ones that we need for the top very much earlier on, right?

And then after we figure out relationally what rows we care about maintaining for this query to get the top 500, we do our nested loop.

We do our nested loops for only those 500 rows back to the post table here. And again, there is about a 20 millisecond difference between these. This isn’t big night and day performance tuning stuff, but you do see a general improvement.

And what I think is nice too is you see that general 20 millisecond improvement with the serial execution plan. In other words, you don’t need to get a parallel plan here in order for this to be competitively fast, right?

This plan up here, it runs, goes parallel, gets a DOP eight query plan. And this is one of those like, oh, well, you’re using a bunch of extra resources, but your query’s not getting faster, right?

It’s kind of a weird thing. Anyway, when you’re working with queries, especially parameter sensitive ones, one of the biggest differences that you will see in those queries aside from stuff like the type of join and the size of the memory grants and all that other stuff is going to be the sort of like index usage, right?

And if you can’t get SQL Server to reliably use your narrower nonclustered indexes because you are selecting columns that are outside of them and SQL Server now has this clustered index scan versus key lookup choice, it might just totally be worth rewriting the query to do a self join so that you can get all of your relational work done that narrows the rows down to just the ones you care about.

And then do the self join back to get the stuff later because the key lookup, again, very tightly coupled. When you write a self join, that tight coupling is sort of removed a bit.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where we will talk some more about very similar techniques. All right, thank you for watching.

SQL Server Performance Office Hours Episode 67

Posted on July 7, 2026 by Erik Darling

To ask your questions, head over here.

Chapters

00:00:00 – Introduction
00:04:08 – Memory Grants and Sorts
00:07:59 – Suspicious Plan Operators
00:10:26 – Batch Mode vs Row Mode
00:12:19 – Conclusion

Full Transcript

Your best friend, Erik Darling, here with Darling Data, and we have another absolutely thrill-filled episode of Office Hours For You. This is where I answer five SQL Server user-submitted questions about anything you want to ask about. It doesn’t have to just be SQL Server. If you want to know anything about me, my life, aside from SQLs, I mean, I know that, like, you know, it’s cool to get free advice from someone who charges for advice about SQL Server, but, you know, if you ever just want to get to know me better, the human Erik Darling, you can ask other questions.

Anyway, my hair looks weird today. I don’t know why. It looks lumpy for some reason. I can’t quite figure that out. Ah, I’m not going to get over that. Anyway, we’re going to keep rolling here. I’m going to try to contain my embarrassment at the shape of my head, and we’re going to soldier on here.

If you would like to hire me for consulting, buy my training, become a supporting member of the channel, or continue to ask me Office Hours questions, all of the links to do that are down below in the video description. Every single link you possibly need to achieve any of these goals is down below.

And of course, if you would like to help my channel gain a larger audience, then please do like, subscribe, and tell every single one of your friends to do the same. If you would like free SQL Server monitoring, boy, howdy, I got it.

I’ve been doing a lot of internal work on that thing lately, not a lot of user-facing stuff, but stuff to make my life easier and maintaining the project, hopefully. And I am working on some exciting new stuff, which will all be revealed in short order.

But for now, if you want to download the current implementation of things, you can go to my GitHub repo. Again, this link is all down in the video notes. Video description, rather.

Totally free, open source, no strings attached. You download it, you point it at a server, you start getting useful performance information. I recently hit the 10,000 download mark, which I am very happy about. The emails about people divorcing their paid monitoring tools and using this instead have been steadily coming in, which of course brings my heart great joy and glee, because there is no reason that people should be paying for so much junk these days.

Anyway, out in the world, Data Saturday Croatia, steadily getting closer. Boy, oh boy, just a couple of weeks away at this point.

And then I guess I’ll be doing nothing for a little while unless something interesting comes up. And then, of course, there is PASS Data Community Summit in Seattle, Washington, November 9th through 11th. So we will have a joyous time there.

And since this will be publishing… in June, we have switched to the June image in which our database is including this faceless wonder over here. I don’t know where your face went, database, I’m sorry.

I tried to… I forget what the prompt was, but the end result just made it look like a bunch of databases were, like, in that movie Midsommar having, like, a real bad time in a field somewhere.

Yeah, I think they’re all going heat crazy or something. Let’s just leave it at that. Sort of some strange leg situations over here.

I’m not sure what happened with this one. It’s like some sort of appendage there and then over here gets a little dicier. I don’t know.

Anyway, let’s answer some questions before I get too involved in the analysis of that picture. Because, you know, try not to judge art too harshly around here. Anyway, let’s see.

First question here. How do you decide between a stored procedure versus a function versus a view for things like WhatsUpLocks and sp_QuickieStore? Well, the stored procedure one is easy. If I need to have a variety of parameters and variables and temp tables and loops and error handling, dynamic SQL and stuff, then I need a stored procedure.

I think for most other things, I would prefer to write an inline table-valued function. But I guess the design choice there is maybe…

Again, I know I have, like, WhatsUpLocks, which is a table-valued function. And I think the reason that I chose that there is because there were a very common set of parameters that I would usually pass into that.

And it was easier to sort of just remind myself of them in the definition there. I do have one that is a view called WhatsUpMemory. And since I didn’t find myself often filtering on things in there, usually just, like, show me everything and sort it, I just left that as a view.

I wish that more people would use inline table-valued functions instead of views because they offer sort of better placement of parameters and stuff. So most of the time, I would prefer to not use a view because, I don’t know, then I’d be one of those view people that everyone yells at.
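To make the view versus inline table-valued function point concrete, here is a small sketch. The object names are made up for illustration and have nothing to do with the actual WhatsUpLocks or WhatsUpMemory code. The only difference between the two objects is where the filter lives.

```sql
/* Hypothetical objects for illustration only. */
CREATE OR ALTER VIEW dbo.ActiveRequests_View
AS
SELECT
    der.session_id,
    der.status,
    der.wait_type,
    der.database_id
FROM sys.dm_exec_requests AS der;
GO

CREATE OR ALTER FUNCTION dbo.ActiveRequests_Function
(
    @database_id integer
)
RETURNS table
AS
RETURN
    SELECT
        der.session_id,
        der.status,
        der.wait_type,
        der.database_id
    FROM sys.dm_exec_requests AS der
    WHERE der.database_id = @database_id; /* the parameter sits right in the definition */
GO

/* With the view, the caller has to remember which columns to filter on. */
SELECT arv.* FROM dbo.ActiveRequests_View AS arv WHERE arv.database_id = DB_ID();

/* With the function, the signature tells you what it expects. */
SELECT arf.* FROM dbo.ActiveRequests_Function(DB_ID()) AS arf;
```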

Anyway, when there are multiple seeks on an index, does that suggest that the index gets seeked once per seek key? Yes, I have a couple of videos about that.

It is called multi-seeks or dynamic seeks. And that does indeed mean that the index gets seeked through once per seek key, which can hide a lot of work that you may otherwise avoid with just a single seek through the index.
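One common way to end up with multiple seek keys is an `IN` list against the leading key column of an index. A sketch, assuming the Stack Overflow `dbo.Posts` table and some index that leads with `PostTypeId`:

```sql
/* Assumes an index whose leading key column is PostTypeId.
   The plan shows a single Index Seek operator, but its properties
   list one seek key per value in the IN list. */
SELECT
    records = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.PostTypeId IN (1, 2, 3);
```

It looks like one tidy seek in the graphical plan, so it pays to open the operator properties and count the seek keys.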

So careful on that.

How much do sorts actually hurt? And when should I worry about them? Well, they can hurt quite a bit. Sorts are what’s referred to often as a size of data operation because you must write down all of the columns that you are selecting in order of the columns that you are ordering by.

And over a large enough result set, that can turn out to require quite a lot of memory to do. You sort of don’t want to end up in a situation where you have very, very large memory grants stealing lots of memory away from your buffer pool, because, like I’ve said in many videos, most servers that I look at are undersized from a memory perspective. And, you know, they need a lot of help to sort of balance the data to memory ratio out.

But as far as when sorts start hurting, well, you know… So let me go through this in a little bit of mental detail here. So obviously large sorts are the obvious one that you would want to track down and sort of look at, right?

You can look at the plan cache or Query Store by the size of the memory grant that a query is getting. The plan cache, for some reason, offers more details about how much of the memory grant the query actually used, but you can still look at the size of the memory grants in both. You would look for anything using a large grant and see if there’s a sort operator in there.
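On the plan cache side, a starting point might look like the query below. It is a rough sketch rather than a finished script: it just orders cached queries by their largest memory grant and shows how much of that grant was actually used, using the grant columns in `sys.dm_exec_query_stats`.

```sql
/* Top cached queries by largest memory grant.
   A big gap between max_grant_kb and max_used_grant_kb suggests an oversized grant. */
SELECT TOP (10)
    deqs.execution_count,
    deqs.max_grant_kb,
    deqs.max_used_grant_kb,
    deqs.max_ideal_grant_kb,
    query_text = dest.text
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY
    deqs.max_grant_kb DESC;
```

From there, pull the plan for anything near the top and look for sort (or hash) operators asking for the memory.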

That being said, in versions of SQL Server older than 2019, where you don’t have the in-memory TempDB stuff available to you, even small sorts could hurt and cause TempDB contention. I saw this on servers where there was a little execution plan, and the sorts would sometimes get an undersized memory grant and just spill a little bit, but if this query ran a ton altogether concurrently, then all of those sorts would cause TempDB contention. If you’re seeing that, then you could address that either via indexing, or, if you’re on 2019 plus, the in-memory TempDB metadata feature helps quite a bit with that.

But large sorts are a bit easier to pin the tail on that donkey. The small sort TempDB contention thing, that’s a very high concurrency problem that you might have to deal with. I think in general, I’d probably just look for the big sorts first because those are the ones that are most likely to cause issues with the workload because, A, they’re going to be stealing memory from the buffer pool, and B, refilling that buffer pool space means knocking a bunch of data pages out of the buffer pool and bringing them back in later.
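For reference, the in-memory TempDB metadata feature on SQL Server 2019 and later is a server-level setting. A minimal sketch of turning it on and checking it; the change only takes effect after the instance restarts, so treat this as something to plan for rather than run casually:

```sql
/* SQL Server 2019+ only. Requires a restart of the instance to take effect. */
ALTER SERVER CONFIGURATION
    SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;

/* Returns 1 once the feature is active after the restart. */
SELECT
    is_enabled = SERVERPROPERTY('IsTempdbMetadataMemoryOptimized');
```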

That’s going to cause a lot of variance in your query performance.

What plan operators instantly make you suspicious something is wrong? Well, in a select query, I would say spools of any variety make me think, I could do something better here. But very specifically, the eager index spool or the lazy table spool, aka the performance spool, those are two things that give me pause on every single occasion when I see them. Again, not because the spool is evil. The spool is just trying to help. It’s Microsoft that’s evil because they haven’t touched the code in spools since SQL Server 7. Not 2007, SQL Server 7, that is the 90s. And so spools have not benefited from any of the sort of improvements that other temporary objects in tempdb have had, like temp tables and table variables and stuff.

So spools are one of them. In a modification query of any variety, spools are usually there for Halloween protection, so they don’t catch my eye as much. There’s those. I think the sort of constant scan, merge concatenation, sort into a seek thing, that’s one that always catches my eye because that usually either means that you have a join with an or clause, or maybe you have a mismatched data type, like you’re comparing dates and date times. Like maybe you have a date time column or a date column, and you’re comparing it to a non-precise match, like a datetime2 or some other temporal data type.

That’s another pattern that I look for. Parallel merges, parallel merge joins, that’s another big one, stuff like that. Top above a scan is another pattern. So it’s not often a single operator outside of spools. Often it’s a pattern of operators that I see that makes me suspicious.

Is batch mode always better, or are there workloads where row mode wins? For me, batch mode is really only better when you have a lot of rows that you need to do something with. For me, batch mode is not necessarily better if I’m doing very tiny OLTP-ish things, little seek point lookups. There’s just not a lot of benefit to batch mode in there.

You know, batch mode used to unlock a lot of extra optimizer stuff that row mode has slowly been inheriting. So for me, no, batch mode is not always better. Batch mode is a very specific optimization for large scans, aggregations, hashes, hash joins, hash aggregates, stuff like that.

You know, I guess the adaptive join thing is nice, but you know, if you’re just working with like sort of reliable OLTP data where there are not large skew-sensitive portions of the data, then batch mode doesn’t really offer all that much as far as, you know, benefit goes. Anyway, that is five questions. One, two, three, four, five. I did it. Counted right this time. Good for me. Five fingers, five questions. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where we’ll talk a bit more about aligning and designing indexes and queries. All right. Thank you for watching.