SQL Server Performance Office Hours Episode 19




To ask your questions, head over here.

Tell me about the perils of using SNAPSHOT for writes. Is learning how to deal with conflict detection really worse than the pains of pessimism?
Why doesn't SQL Server have parallel rollback?
How can you be for SORT_IN_TEMPDB and for Accelerated Database Recovery? If tempdb is good for performance, then isn’t ADR bad for performance?
Your experience says to prefer the legacy cardinality estimator. What is fundamentally wrong with the new estimator?
You previously discussed why you use SORT_IN_TEMPDB. Would you default to that in Azure SQL Database given the lack of control over tempdb (including sizing) in that product?

Video Summary

In this video, I dive into answering five questions submitted by my YouTube community during an Office Hours episode. We cover a range of topics, from the practical aspects of using Snapshot isolation in transactions to the more technical question about why SQL Server doesn’t support parallel rollbacks. Additionally, we explore the differences between the legacy and new cardinality estimators, discussing their performance and personal preferences based on testing. The session also delves into the use of SORT_IN_TEMPDB in Azure SQL Database, weighing the pros and cons despite potential limitations. Whether you’re a seasoned SQL Server professional or just starting out, there’s plenty to learn from these real-world questions and my insights.

Full Transcript

Erik Darling here with Darling Data and, well, we are background folk again, aren’t we? We’ve got a dumbbell, barbell behind us. It’s not a dumbbell. Dumbbells don’t bend like that. Barbells bend like that. And we’ve got a Darling Data logo, so that must mean we are doing another Office Hours episode, where a helicopter of some sort is flying overhead. Hopefully it’s not a drone strike. Microsoft’s finally after me. Anyway, we’re going to do an Office Hours episode where I answer five entire questions submitted by you, my adorable users. If you want to submit your own question, you can go to this link, which is down in the video description. If you would like to support my channel, if you’re like, wow, this man deserves to get paid for all the work he does, well, you can do that also down in the video description. If you don’t feel like I deserve to get paid, maybe I am well deserving of a like or a comment or a subscription to the channel. I don’t know. I think we’re still up to around 60 paid members and a little over 7,000 unpaid subscribers.

So I think a few of you might like the channel. All right. If you need help with SQL Server: health checks, performance analysis, hands-on tuning, dealing with performance emergencies, and whipping your developers into shape so you have fewer emergencies. Well, I happen to be pretty good with a whip. Just saying. Doesn’t have to end there. If you would like to buy my performance tuning content, you can get all 24 hours of it for 75% off. That is about 150 USD, and that is for life. You can do that with the link down in the video description. And of course, I have a new T-SQL course with me, Eric. Ignore that. I need to fix that at some point. I’ll remember to do that someday. Funny story with Podia. But anyway, it’s on pre-sale price now. 250 bucks. It’s going to go up to 500 bucks when the advanced material drops after the summer. If you’re attending the PASS pre-cons Kendra and I are teaching, you get access to all of the content here with the price of admission. So that’s a nice deal for you.

If you want to catch me live and in person, I will be at all three of the PASS On Tour events. That’s New York City, Dallas, and Amsterdam, taking place August, September, and October of this year. And then, of course, at PASS Data Community Summit, taking place in Seattle in November of this year. So you could see me four times this year if you were really ambitious.

I don’t know. That might be too much for both of us. I know how social you people are. Let’s not push it. Anyway, let’s go answer some of these questions. Has anything ever been less useful than the Dropbox badge that shows up here? Like, there’s nothing useful about that. Nothing useful about that has ever happened.

All right. Anyway, let’s start here. That’s right at the very top. Let’s see. One, two, three, four, five. All right. Five questions. Tell me about the perils of using Snapshot for writes. Is learning how to deal with conflict detection really worse than the pains of pessimism?

Well, you know, it does depend a little bit on your, I guess, skill and comfort as a developer. Like, the main peril of using Snapshot for writes is that you get errors when two queries try to update the same thing in a Snapshot transaction. If you are cool enough with dealing with those errors, then it’s not a big deal.

You know, of course, the pains of non-snapshot writes, under most isolation levels, can be things like lost updates, right? Like, one query could do something and another query could immediately overwrite it, and that’s not a good time. So, really, it’s just kind of like picking your poison.

Like, what’s a bigger problem? If you’re cool with dealing with the errors that come along with write conflicts, then cool, go with it. I’m not going to try to talk you out of it.

Me, personally? It really depends. For me, it’s more of an application expectation issue. What would make the most sense to the end user? What is the most sensible end result of two queries trying to update the same thing?

Is it one query failing, or is it one query overwriting what the other query just did? It really just comes down to that for me. Let’s see here.
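For anyone who wants to see the shape of that failure mode, here is a minimal sketch. The table and update are hypothetical; the real moving parts are SNAPSHOT isolation (the database needs ALLOW_SNAPSHOT_ISOLATION set to ON) and error 3960, which is what SQL Server raises on a snapshot write conflict.

```sql
-- Hypothetical dbo.Accounts table, just to show the handling shape.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Accounts
    SET    Balance = Balance - 100
    WHERE  AccountId = 1;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    /* Error 3960 is the snapshot write conflict. The transaction is
       already doomed at this point, so roll back, then retry it or
       surface the error, whatever the application expects. */
    IF ERROR_NUMBER() = 3960
        PRINT 'Write conflict detected; retry the transaction.';
    ELSE
        THROW;
END CATCH;
```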

Why doesn’t SQL Server have parallel rollback? I don’t know. That’s Microsoft. Do I look like Microsoft? I can’t tell you these things. They just didn’t implement it.

It’s doable. I don’t know why they haven’t. Maybe it’s hard or something. Why don’t you go work for Microsoft and write it into the product, if it means that much to you? All right.

How can you be for SORT_IN_TEMPDB and for accelerated database recovery? If tempdb is good for performance, then isn’t ADR bad? Are you drunk?

These are… What? Huh? This doesn’t even make sense. I can’t answer this.

This is… It’s mind-blowing. Anyway. Ah. I’m just going to forget that. I’m going to start drinking after that one. Maybe I am too sober to answer that question.

I should get drunk and try to reread that one. Your experience says to prefer the legacy cardinality estimator. What is fundamentally wrong with the new estimator? Well, fundamentally, it doesn’t estimate things as well most of the time.

You know, like when I’m writing demos for my classes and for my videos and all that other stuff, I always give both estimators a chance to see which one does a job that I am happier with. And probably 75 to 80% of the time, it is the legacy cardinality estimator that does the better job. The “default cardinality estimator,” as Microsoft calls it? I don’t call it that.

I just call it the new one because that’s all it is. It’s new. Most of the time, the new one, meh, just doesn’t do it for me.

It’s either like a guess that’s close enough to legacy or it’s a guess that’s way wronger, way more wronger-ish than legacy. So, you know, I don’t have any specific things to, like, show you these differences. It’s just, you know, just a general testing that I’ve found.
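If you want to run that kind of side-by-side test yourself, the query-level hints below are the documented way to flip between estimators without changing any database settings. The query is just a stand-in, assuming the Stack Overflow demo database’s Posts table.

```sql
-- Same query, once per estimator. Compare the plans and the estimates.
SELECT p.OwnerUserId, COUNT_BIG(*) AS records
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

SELECT p.OwnerUserId, COUNT_BIG(*) AS records
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (USE HINT ('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));
```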

Anyway, the new one is just not quite as good. You previously discussed why you use SORT_IN_TEMPDB. Would you default to that in Azure SQL Database, given the lack of control over tempdb (including sizing) in that product?

Yeah, I think I still would, or rather I still do. I don’t really see a need not to. Honestly, I can’t think of a good reason why either of the things you mentioned would prevent me from doing that.

You know, the tempdb limitations that I really care about have nothing to do with that. Like, I think neither Managed Instance nor Azure SQL DB allows for the in-memory tempdb metadata, and sorting in tempdb would have no effect on that.

If TempDB performance, like, if you test it and you find TempDB performance is worse for creating indexes or whatever you’re doing with indexes when you sort in TempDB, then certainly stop. But, like, for me, from just, like, a general, like, I’m going to create this index perspective, I would still prefer to sort in TempDB, regardless of the locality of my database, unless testing proved otherwise.
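For reference, this is the index option in question. The index name and column here are illustrative; the option itself works the same in Azure SQL Database as it does on-prem.

```sql
-- The sort work for building the index happens in tempdb instead of
-- the user database's data files. ONLINE = ON where the edition
-- supports it.
CREATE INDEX ix_posts_owner
ON dbo.Posts (OwnerUserId)
WITH (SORT_IN_TEMPDB = ON, ONLINE = ON);
```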

There may even be times on-prem when a sort in TempDB would be like, hey, why is this slow? I don’t know.

TempDB is created on, like, an old pile of boar’s head Swiss cheese. It’s on some rye bread and salami in there. It’s like, I don’t know.

TempDB sucks, don’t go there. If it’s okay, go there. You know? It’s like, again, it comes back to, like, the public restroom metaphor for TempDB. If you open the door and you don’t like what you see, close the door.

All right? Walk away. Go pee behind a tree or something. Anyway, thank you for watching. Thank you for sending in questions, by the way.

I hope you enjoyed yourselves. I hope you learned something. For the person who asked the question about accelerated database recovery, I hope you have sobered up by now. Perhaps you could restate that question in a way that a sober person could understand.

Not that I’m, like, permanently sober. Just when I do these, I tend to be. So perhaps you’re just on a different wavelength there. Anyway, thanks for watching.

I will see you in another video. Doing another thing, I suppose. Makes sense then. All right. Cool. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About TOP WITH TIES In SQL Server



Video Summary

In this video, I dive into the concept of `TOP (1) WITH TIES` in SQL Server and address some common confusion around why it might return more rows than expected. I explain how the lack of an appropriate tiebreaker can lead to returning all matching rows instead of just one. I walk through a practical example using the Stack Overflow Posts table, demonstrating that without a proper tie-breaking column in the ORDER BY clause, `TOP (1) WITH TIES` will return all rows that match the criteria, not just one. The video also covers how to use window functions like `DENSE_RANK()` to better understand and control which rows are returned when using this query construct.

Full Transcript

Erik Darling here with Darling Data, and in today’s video I’m going to attempt to answer a question, because I posted a video where I described using top one with ties a little bit some time ago and some people still didn’t get it. Did not understand why top one with ties would return a lot of rows sometimes. So we’re gonna talk a bit about that. I don’t know. Honestly, it’s Saturday here and I’m not feeling terribly creative and I just need something easy to do right now. So screw it. If you like this channel and you would like to support my endeavors to bring you usually very thoughtful, energetic SQL Server content with the occasional screw it, you can sign up for a membership. There’s a link down below. It says join or become a member or something. I forget what. I don’t watch these things. You think I’m crazy, listening to my own voice, seeing my own face? Woof! Why would I put myself through that? If you would like to support this channel in some other way, perhaps $4 a month is just too rich for your blood, you can like, you can comment, you can subscribe. And of course you can ask me questions for free, privately, that I will answer publicly during my Office Hours episodes, where I answer five user-submitted questions at a time. You can ask whatever you want. I don’t care. If you need consulting help with SQL Server, I am still the number one SQL Server consultant outside of New Zealand. Beer Gut Magazine says so, so it must be so. Whether you need health checks, performance analysis, hands-on tuning, dealing with SQL Server performance emergencies, or teaching your developers to not be such dimwits so you have fewer performance emergencies, all of these things become possible through yours truly at a very reasonable rate. So, get at me with that. Anyway, if you would like to get some performance tuning training content from me, you can get all 24 hours of my currently available stuff at that URL.

Well, with that discount code, it comes down to about 150 USD, and that lasts you for the rest of your life. There is no subscription necessary. If you would like to get in on the presale price for my upcoming T-SQL course, you can get it now for 250 bucks. That will not last forever. And if you wait and it goes up to 500 bucks and you’re like, hey, can I get a discount? The answer is no. You missed out. You will have had months to do this. I urge you to do it now rather than later, when it will cost you twice as much. This is, of course, companion material to the pre-cons that Kendra Little and I will be teaching in Seattle this November about T-SQL. So, if you’re going to attend those pre-cons, you get all this stuff for free. Well, not for free. You get it with the price of admission, which, if your company is paying for it, basically makes it free.

But this is work-related stuff, so I would hope that your company would pay for or at least reimburse you for buying this. All right. It would be kind of crazy to not. Anyway, speaking of leaving the house, I will be on tour with Redgate all summer long. I feel just like Lars Ulrich. I think that’s how you say his name.

New York City, August 18th to 20th. Dallas, September 15th to 17th. Then Amsterdam, October 1st to 3rd. And all that leading up to the main event at the PASS Data Community Summit, Seattle, November 17th to 21st. Come hang out. Watch me be a SQL Server monkey. Live and in person.

With that out of the way, though, let’s talk about this top one with ties business. And don’t worry, we’re not going to discuss anything lascivious in this video having to do with ties. We’re not going to say any dirty stuff like four-in-hand knot.

Because we don’t tie ties like we’re going to our 8th grade dance. We’re adults. We use half Windsors because we are grown people with necks. We’re also not in a Talking Heads video.

So in the post table, for post type ID 3, there are, I think, 167 rows. So if we run this query, we will get back all 167 rows of post type ID 3. All right. 167 right there.

And then the way that top one with ties works, and actually I should show you this first. So the first thing I’m going to do is run this query without the order by. And honestly, this is perhaps something that should happen with any query where top is involved.

But we actually get an error when we do this: The TOP N WITH TIES clause is not allowed without a corresponding ORDER BY clause. Perhaps this should happen for any top that we use.

I don’t know. But the point is that if we order by post type ID, and we’re filtering to post type ID 3, we’re only going to have one post type ID value in the results.

So there is no tie breaker when we do this. Is there no semicolon there or there or there? I don’t know. I feel very foolish now. But when we do this, we still get back all 167 rows.

If we make that a little bit wider and we scroll on down, without some sort of tie breaker in place, there is no tie to break. This is all post type ID 3 going out through all the results.
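Here’s a rough reconstruction of that query against the Stack Overflow Posts table; the exact row count will vary by database version, but the behavior won’t:

```sql
-- Ordering only by the column we filtered on means every row ties
-- with every other row, so all of the PostTypeId = 3 rows come back.
SELECT TOP (1) WITH TIES
    p.*
FROM dbo.Posts AS p
WHERE p.PostTypeId = 3
ORDER BY p.PostTypeId;
```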

So there is nothing to break our tie. We could break the tie very early if we added in a unique column to the order by. So the ID column is the clustered primary key in this table, which means that it is all unique and every row is unique.

And so we don’t get past the first row when we do this. This just says one row. Cool.
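In query form, that early tiebreaker looks something like this, assuming Id is the clustered primary key:

```sql
-- Id is unique, so the tie breaks on the very first row, and only
-- one row comes back.
SELECT TOP (1) WITH TIES
    p.*
FROM dbo.Posts AS p
WHERE p.PostTypeId = 3
ORDER BY p.Id;
```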

We broke the tie early. There were no ties after that. There were no duplicates in ID. So there are no ties after the ID broke the ties very, very early. We could also add in sort of a late tie breaker, right?

So if we run this and we say select the top one with ties, and we order by owner user ID, we don’t get back 167 rows anymore. Eventually, after waiting some indeterminate amount of time, we get back 164 rows.

So if we come down here, we will see that we only got 164 instead of 167. There are three missing rows here. Now the owner user ID for all of these is negative one, which if you’re keeping track at home, I think that’s the ID for the community bot for Stack Overflow.

So this is all negative one. The only thing that we can infer from this is that something different than negative one eventually happened, and then we broke the tie there, and then we got nothing further back, right?

So we were able to not return three rows that happened after the tie was broken. If you want to see where the tie breaks, you might want to consider using the dense rank windowing function. What dense rank does is rank everything together where the ordering value is the same, and then as soon as that value changes, we get a new rank, right?

So rather than row number, which will give us a contiguous number going up, and rank, which will give us a weird, broken set of numbers when a new ranking starts after any ties, dense rank gives you contiguous numbers, right?

So if we run this, and we say I want the dense rank of all this stuff partitioned by post type ID, ordered by owner user ID, and then we order by owner user ID on the way out, this first column is our dense rank.

So like I said, unlike row number, which would give us like contiguous numbers counting one through whatever until we got to a new partition thing, this just gives us all one, right? And this kind of makes sense for how the tie is broken, because eventually after all these ones, we get down to a new owner user ID, and then we get 234.
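That query looks roughly like this; the partition and ordering columns match what the video describes:

```sql
-- Every row with the same OwnerUserId inside the partition shares a
-- rank, and the rank only ticks up when a new OwnerUserId appears.
SELECT
    DENSE_RANK() OVER
    (
        PARTITION BY p.PostTypeId
        ORDER BY p.OwnerUserId
    ) AS tie_rank,
    p.*
FROM dbo.Posts AS p
WHERE p.PostTypeId = 3
ORDER BY p.OwnerUserId;
```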

Now if we used rank, this would give us non-contiguous results after the tie. I think there’s a start of like three or something. Actually, let’s just find out real quick. All right, we’ll just take the dense out of that, and I’ll show you.

All right, that makes the most sense to do, right? So if we run this now, oh no, it goes 165. So we had 164 rows of one, and then we have 165, 166, and 167.

Boy, was I silly. So learn T-SQL with Eric. So we have these numbers in here.

So if we put the dense rank back, which is what we wanted to do here anyway, then we will get back all the ones, right? And then 234.

And of course the 234, if we scroll across a little bit to owner user ID, we’ll see that’s where we started getting new values back. So this is what top one with ties does. As long as your ordering elements continue to supply the same value over and over again, you will continue to get rows back until you reach the end of the result set.

Or, I mean, presuming in this case that you never return a tiebreaker like we did with just ordering by post type ID, you’ll just keep getting rows back until you reach the end of the result. Only when you add in a column that eventually breaks the tie do you stop returning ties because the tie has officially been broken and you have found all of the ties available.

So anyway, I hope you enjoyed yourselves. I hope you learned something. I mean, I just spaced on the rank function. That’s my bad.

Oh, man. I’m still going to publish this too. That’s where I’m at. So this is how top one with ties sort of works. And this is how you can get lots and lots of results back if you do not have an adequate tiebreaker at some point in your query.

What that adequate tiebreaker is is between you and your database. I can’t tell you what it should be. I can’t tell you what you should use there.

All I can say is use your best judgment. All right. Cool. Thank you for watching. Goodbye.

Bye. Thank you.


A Little About Subquery Unnesting In SQL Server




A Little About Forwarded Fetches In SQL Server



Video Summary

In this video, I delve into the fascinating world of SQL Server heaps and forwarded fetches. Starting off with a brief detour through my upcoming speaking engagements and consulting offerings, I transition to explaining why understanding these concepts is crucial for database administrators and developers alike. We explore how rows in a heap can move around due to data growth, leaving behind “forwarded record pointers” that point to their new locations, a phenomenon that doesn’t occur with clustered tables. By creating a sample table and inserting a specific number of rows, I demonstrate the process of updating these rows and observing the resulting forwarded fetches through DMVs (Dynamic Management Views). This hands-on approach helps clarify how nonclustered indexes inherit certain information from the base heap or clustered table, adding depth to our understanding of SQL Server’s internal workings.

Full Transcript

Your friend, Erik Darling here with Darling Data. And we’re going to take a small break from the T-SQL preview material to talk about… Well, I mean, it’s all SQL Server here all the time, isn’t it? We are not traitors to our beloved SQL Server. I don’t know, maybe someday I’ll get bored enough to do some videos about like DuckDB or something. I don’t know. Well, my luck, you know, I do a video about DuckDB and then like Databricks would buy them for $10 billion. Uh, I don’t know. It seems, it seems like the fastest path to becoming a billionaire these days is just to fork Postgres and do something.

So perhaps my business plan for 2026 will be: fork Postgres, do something, become billionaire. Seems pretty logical to me. Anyway, today we’re going to talk about something special that happens with heap tables, tables that do not have clustered indexes on them in SQL Server, called forwarded fetches. And we’re going to try to answer the question of why SQL Server uses forwarded fetches rather than, you know, well, something else.

So, I don’t know. Buckle up for that, I guess. We’re here. We’re here for it, aren’t we? It’s the whole reason we woke up this morning, just to do this. If you like this channel and you would like to sign up for a membership, you can do so via the links down in the video description.

So you should use it. Probably. If you would like to hire me to do some consulting work for SQL Server, I do that. And of course, according to the Beer Gut Magazine magic quadrant, I am still the number one SQL Server consultant in the world outside of New Zealand.

So we’re still very proud of this magic metric. If you need a health check, performance analysis, hands-on tuning, dealing with SQL Server emergencies, and of course, training your developers, whipping them into fine shape so that you don’t have SQL Server performance emergencies anymore and can finally see your family on the weekend.

Well, I can do all that stuff. And as always, my rates are reasonable. If you would like to buy some performance tuning content from me, I have about 24 hours of it. It’s available for about 150 US dollars, or thereabouts, when you use the 75% off coupon code right there.

This is also fully assembled for you down in the video description. If you would like to, you can get in on the presale for my new T-SQL course, T-SQL with Eric.

It’s a me. You can buy it at the presale price of $250. That will go up after the summer, when the advanced material becomes available. And then it’ll be $500.

This is, of course, companion content to the pre-cons that Kendra Little and I will be teaching about T-SQL at Pass Data Community Summit in November. And speaking of speaking, I am doing a whole bunch of speaking.

But Redgate is kind enough to hire me as a full-time roadie on their PASS On Tour event. So I’m going to be helping them load in and load out. And they’ve apparently given me an empty room to just talk to myself in.

So if you want to come hear me talk to myself about SQL Server, you can do so in New York City, August 18th to 20th, Dallas, September 15th to 17th, and Amsterdam, October 1st to 3rd. And then, of course, the grand event, the grand bon event, not the grand mal event.

That’s not, those aren’t good. PASS Data Community Summit, taking place in Seattle, November 17th to 21st. And with that out of the way, let’s talk about these heaps and these forwarded fetches and, I don’t know, whatever else comes up along the way.

So we’re going to do a little bit of talking first, and then I’m going to demonstrate some things for you. Because what’s life without demonstrations? One wonders.

One wonders aloud about these things. So with heaps, all right, that is a table that does not have a clustered index on it. It may have nonclustered indexes on it. It may even have a nonclustered primary key.

It may have unique constraints. It might have foreign keys. But it is still a heap. It is still an unorganized clump of pages with no clustered index. SQL Server still requires a way to uniquely identify each row in the heap.

This is an important thing in databases. It is an especially important thing for SQL Server to do things like lookups, right? You can still create a nonclustered index on a heap, on a few columns, and you might have a query that queries those few columns plus a bunch of other columns.

And then SQL Server still has to go get those columns from somewhere. SQL Server needs a way to identify which rows to go look up and provide those columns for. To do so, it uses something called a RID, a row identifier.

Wow, that is an apt name for something. Remember when Microsoft used to be good at naming stuff? Not so much anymore.

But that RID is based on three components, three metadata components about the row itself. That is the file ID of the file that the row lives in, and the page ID of the page that the row lives on.

And the slot ID of where on that page the row lives, more specifically. So we have all that going for us, right? Microsoft has called this storage mechanism very efficient, or incredibly efficient, or some modifier plus efficient.

And I forget the exact wording they use. But when you have a unique clustered index or a clustered primary key, the key of that index is used instead. Right?

In place of a row identifier, it uses the key of the clustered index. When you have a non-unique clustered index, then it’s the key plus a unique-ifier. Unique-ifier is the single hardest word in the English language to spell. The day that they throw that out at one of those national spelling bees, I’ll start believing that those words actually mean something.

A unique-ifier is a four-byte integer. Microsoft was a real cheapskate on that one.

Because you can run out of them. But the unique-ifier is a four-byte integer that gets added to rows when a non-unique value is inserted into your clustered index. Right?

So if you create your clustered index without saying it’s unique, but you never insert a duplicate value, then you don’t have to worry about the unique-ifier. Right?

You maybe should have declared it unique, but you didn’t. You know, you forgot the word. Uh-oh. And so you have this potentiality for a unique-ifier getting added. And like I said, the unique-ifier is an integer.

Four bytes. And you can run out of them. There’s even an especially devilish message that tells us about this. And that message is, of course, message ID triple six.

If ZoomIt will wake the hell up. And that message says that the maximum system-generated unique value for a duplicate group was exceeded for an index with some partition ID. That partition ID is a %I64d, a big int, isn’t it?

And dropping and re-creating the index may resolve this. Otherwise, use another clustering key. Just upend your whole life, ruin your whole weekend, because Microsoft couldn’t be bothered to use a big int. Thanks, Microsoft.

They’re really great at ruining weekends, aren’t they? Those of you with availability groups know this especially well, don’t you? You’re particularly keen on just how many nights and weekends have been ruined by your high availability solution. But anyway, the point here is to sort of answer the question of why SQL Server uses forwarded record pointers when rows move around within a heap.

Right? And these forwarded record pointers of course occur when the values on a row grow to the point where that row no longer fits on the page it was initially assigned to.

SQL Server will move it, not necessarily to a brand new data page, just to a different data page that has enough space on it. It might be another existing data page that has adequate space, or it might be a brand new page.

Who knows, right? Crazy things happen. But when that happens, when a row moves, it leaves behind this little pointer that says, I have moved here.

I live down here now. I’ve changed, I’ve left my apartment. I’ve moved out. This is my new address. It forwards that down there. So that’s different than what happens when you have a clustered table.

Now, the term clustered index is good because, you know, it is an index. But it does lead people down this strange path of thinking that the clustered index is a separate copy of the table.

That it’s just another index on the heap, when it’s not. I find the terminology clustered table prepares people mentally more correctly for exactly what a clustered index is. So when you have a clustered index on a table and a row no longer fits on a page, SQL Server will take roughly 50% of the rows on that data page and put them on a new data page.

All right, that is called a page split. Now, one of the key things about why SQL Server has these RIDs and does these things is that nonclustered indexes on your tables inherit certain information from the base table.

The base table could be a heap. The base table could be a clustered table, right? In other words, a table with a clustered index on it. When you have a heap, the RID is added to your nonclustered index. When you have a clustered table, for non-unique nonclustered indexes, the clustered index key gets added as a key column.

Like, you know, if you have just a single key column index, your clustered index key will be the index key after that. If you have a multi-key index, then your clustered index key will be added to the key after whatever columns you’ve explicitly listed out.

If you have a unique nonclustered index, then the clustered key gets added as include columns to the index. All right. With all that stuff out of the way, let’s move on and create a table. We’re going to drop it if it exists, I suppose, and then we’re going to create it.

And then we are going to insert a very special number of rows, which is not really special at all, into the table: 1,048,576. All right.

Okay. That many rows went into the table. There’s nothing special about those rows. And now we are going to add a nonclustered index to that table, on that date column right there. Okay.

And so we’re going to add this, and our table is still a heap. We just have a nonclustered index on it now. All right. Nothing terribly special about that. Now let’s look at the table itself. I’m going to run this query.

And this query is going to tell us some stuff about our heap. All right. Namely, it’s going to tell us that our table is named El Heapo. It lives in the dbo schema.

Right now we have zero forwarded fetches. We have had 1,048,576 records inserted into the table. And we have had no updates or deletes against the table. I’m not entirely sure why this number does not quite match this number, but, you know, what’s a few rows’ difference amongst friends, I guess.
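The demo script itself isn’t reproduced on this page, so here’s a rough reconstruction of the setup. The column names and sizes are assumptions; the DMV and the 1,048,576-row count are from the video.

```sql
-- A heap (no clustered index), a pile of rows, and a nonclustered
-- index on the date column.
DROP TABLE IF EXISTS dbo.ElHeapo;

CREATE TABLE dbo.ElHeapo
(
    id int NOT NULL,
    some_date datetime NOT NULL,
    a_string nvarchar(200) NULL
);

-- Deliberately short strings, so the rows have room to grow later.
INSERT dbo.ElHeapo WITH (TABLOCK)
    (id, some_date, a_string)
SELECT
    x.n,
    DATEADD(MINUTE, x.n % 100000, '20240101'),
    REPLICATE(N'i', x.n % 10)
FROM
(
    SELECT TOP (1048576)
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_columns AS ac1
    CROSS JOIN sys.all_columns AS ac2
) AS x;

CREATE INDEX ix_el_heapo
ON dbo.ElHeapo (some_date);

-- Index id 0 is the heap itself. forwarded_fetch_count starts at zero.
SELECT
    ios.forwarded_fetch_count,
    ios.leaf_insert_count,
    ios.leaf_update_count,
    ios.leaf_delete_count
FROM sys.dm_db_index_operational_stats
     (DB_ID(), OBJECT_ID(N'dbo.ElHeapo'), 0, NULL) AS ios;
```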

So to start things off, because our table has just been freshly inserted into, there has been no opportunity for rows to move around, for a forwarded record pointer to occur. All right.

So if we run this query and we look at the dm_os performance counters DMV for forwarded records, this number will be the same before and after this query runs. All right.

So we have, oh boy, that’s tough to frame, isn’t it? We have a counter value. Let’s move that up a tad. Let’s rearrange things a little bit. There we go. That should make life easier. We have 460,251 up here and 460,251 down here.
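That check is a straightforward query against sys.dm_os_performance_counters. One caveat worth a comment: despite the “/sec” in the name, the value exposed here is a cumulative count, so you compare before-and-after samples.

```sql
-- Cumulative count of forwarded record fetches since instance startup.
SELECT
    pc.cntr_value
FROM sys.dm_os_performance_counters AS pc
WHERE pc.object_name LIKE N'%Access Methods%'
AND   pc.counter_name = N'Forwarded Records/sec';
```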

So this counter has not budged. Now, one thing I do want to point out is that the a string column is an envarchar 200, right? So it’s 200 potential bytes of characters.

But when we did our insert, we didn’t fill up every single row with the full size, right? We did some, we did a little bit of funny math on here. And I don’t quite understand why IntelliSense is giving me red squiggles when everything worked correctly.

Perhaps SSMS 21 will eventually fix this issue. I’m not sure, but anyway, we didn’t fill up every row to its potential, which means that we can update rows to their full potential.

And what does that mean? We can have some forwarded records. So what we’re going to do is come over here.

Now, what I’m doing in this window is pointing out two things. And this is going to look different when we add the clustered index and come back to this.

One is that page splits do not occur with a heap. Heaps are just big flat structures, right? There’s no order to them. They’re not B trees the way indexes are, like both clustered and non-clustered. So we don’t have page splits when we have heaps.

So let’s run this. And what we’re going to do is update a bunch of rows in this table. And we’re going to see after the update runs that there are no page splits that happen, right?

But because we don’t have page splits now, remember this, there were zero forwarded fetches here when we started. Now, when I rerun that query, we have had a bunch of updates. And those updates have created forwarded fetches, which means when we come back over here, right, and we run this query, which hits our heap table, we are going to have our forwarded fetch counter.

I should remember to move that, shouldn’t I? Our forwarded fetch counter is going to increment after this query runs. We start with 546105, and now we have 55184. Now you may notice a slight difference here too, in that all of these columns now have a whole bunch of stuff in them that was not in there before.

The i’s look a little bit shorter, but that’s really only because they’re much thinner. These big fat letters down here take up way more visual space on the screen, but I promise you there are 200 characters going across there.
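The update itself is simple. Here’s a sketch of it, using the same assumed columns as the setup above:

```sql
-- Grow every row's variable-length column to its full 200 characters.
-- In a heap, rows that no longer fit on their page get moved, leaving
-- forwarded record pointers behind instead of causing page splits.
UPDATE e
SET e.a_string = REPLICATE(N'Z', 200)
FROM dbo.ElHeapo AS e;
```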

So anyway, if we look at the query plan for that select, we will see some of the stuff that I had pointed out earlier, where we have a seek into the nonclustered index. Next, then we have a nested loops join, and that nested loops join joins us back to the heap in order to fetch the columns that are not in the nonclustered index that we created.

Remember, the nonclustered index that we created was just on that date column, but we are selecting everything from the heap table. And when we look down here, we will see the seek predicate of the…

This is the bookmark, right? The BMK. And this is the RID that I was talking about earlier, right? So the RID lives in there now.

Now, you’re not going to see anything about the forwarded fetches in the execution plan. But now is as good a time as any to tell you about why SQL Server does forwarded fetches rather than just move the row entirely.

The answer is because of what I was talking about with that inheritance, right? So think about how SQL Server stores a RID, right? It’s the file ID, page ID, slot ID.

If every time we updated a row and it moved, we didn’t use a forwarded record pointer, we would have to not only update the heap, we would have to update all of the nonclustered indexes to rearrange things so that the row identifier was rewritten to the new file ID, page ID, slot ID, right?

So it saves a lot of work with nonclustered indexes to do that. Now, I guess, you know, if you have a heap with no nonclustered indexes, it might make sense to just, you know, like, rewrite the thing since it’s just the heap anyway, but we are not that lucky.

But, oh, stop zooming, fool. But if we look over here, right, and if we look at the execution plan for the update that we did, right, like, we certainly scanned the heap and we certainly update the heap, but we do not also update the nonclustered index because we updated that string column, but we did not update any columns that are part of the nonclustered index.

So we only had to update the heap for this. So that is why SQL Server uses forwarded fetches. It is an optimization that saves us from doing unnecessary write work to nonclustered indexes, rewriting all of them so that they identify rows with the new file ID, page ID, slot ID of wherever the row moved to.

We can just use a forwarded record pointer to go find that row when we scan through the table and hit it. Now, what we can do to get rid of those forwarded record pointers is rebuild the table, right? Oops, we are not in the right database, are we?

Shame on me. But when we rebuild the table, notice that we rebuild both the table and any nonclustered indexes on the table. I don’t know why that just showed up there.

I’m not sure what key combination I hit. So this is rebuilding the, the heap, right? We rebuild the whole heap here and we rebuild the nonclustered index on the heap here. Doing this will get rid of the forwarded fetches.

This will rewrite both of the indexes so that the new file ID, page ID, slot ID for all of these rows just kind of, you know, becomes the real row location, right? Rebuilding the heap removes all the forwarded record pointers and rewrites the rows with their new file ID, page ID, slot ID, and rebuilding the nonclustered index gives it those new links. So let’s come back a little bit.
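In script form, that’s just this, using the assumed names from the sketch above. Rebuilding a heap also rebuilds its nonclustered indexes, which is the “both” being described:

```sql
-- Rewrites the heap, removing forwarded record pointers, and rebuilds
-- the nonclustered index so it points at the rows' new locations.
ALTER TABLE dbo.ElHeapo REBUILD;
ALTER INDEX ix_el_heapo ON dbo.ElHeapo REBUILD;
```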

And now, what I’m going to do is run this whole portion to drop the table. And now this is going to be a clustered table. And because this table has a clustered index on it, this query is no longer going to return anything.

And you know, if I run this query now, rather than having a bookmark lookup, we have a key lookup. This is looking up based on the key of the clustered index rather than a bookmark or RID lookup for the heap. But what I do want to do real quick before we close this thing out is show you what an update does when there’s a clustered index.

So before, when we ran this, there were no new page splits after the update query ran. But now, when we run this with the clustered index, you will see that there were quite a few page splits that occurred. Now, are page splits good, bad, ugly?

Well, I guess, you know, if you have enough of anything happening in a database, it can get bad. Right. You can certainly produce a lot of extra work if you have a lot of page splits happening.

I guess transaction logging and, you know, I/O and stuff. But whether it’s a significant enough problem for you to attempt to address in some way is left as an exercise for the reader. So anyway, that is about as much as I can fit into this video about why SQL Server uses forwarded fetches in heaps rather than doing anything else, and sort of some differences between clustered tables and heap tables.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video, where we’re going to talk a little bit about subquery unnesting and why the optimizer is kind of a goofball about that unless you do some extra stuff. So anyway, I’m going to just throw this out there and we’re going to go do this.

Honestly, we’re just going to wait for this to upload and then record another video. So whatever. I’m going to catch my breath first.

Anyway, goodbye.


Learn T-SQL With Erik: Derived Tables



Video Summary

In this video, I dive into the world of T-SQL and specifically focus on derived tables, explaining why they are a preferred tool over common alternatives like Common Table Expressions (CTEs). Derived tables not only simplify complex queries but also provide a cleaner way to reference expressions across different parts of your SQL statements. I discuss how logical query processing works in SQL Server and highlight the limitations of using aliases in certain clauses, such as `GROUP BY`, which can make writing clean T-SQL code challenging. By leveraging derived tables, you can avoid redundancy and improve readability without affecting the underlying query plan or performance. I also touch on some personal frustrations with modern tools like SQL Prompt that try to analyze my code, emphasizing that sometimes old-school techniques are still the best approach.

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we are going to talk a little bit more about the old T-SQL and why you should probably learn it from me and not some other useless hump out there trying to pretend that they know about T-SQL, who really just stinks at T-SQL and probably just had an LLM write everything anyway. And in today’s video, we’re going to talk about T-SQL. We’re going to talk about derived tables. Derived tables are, of course, my generally preferred mechanism over CTEs, for a variety of reasons. The number one reason is irking people who white knight for CTEs. That’s top of the charts for me. But they can simplify a lot of things in queries that would otherwise require quite a bit of syntax. Now, part of the reason why derived tables are necessary is sort of because of, again, the way that logical query processing does stuff. Like, we talked about how it starts at FROM rather than SELECT and, you know, makes its way down through joins and WHERE and GROUP BY and HAVING, and then SELECT, and then ORDER BY. And because of that, there are certain things that you write in the select list that aren’t available in the group by, unless you nest them, right? Unless you nest the query and you give, like, another FROM to start the logical query processing thing over with again.

Which, you know, like, I make fun of people who, like, are like, we’re rewriting SQL and we’re gonna have it so you write from first and, like, you can pipeline syntax. Because, like, every time someone critiques SQL, the only two things they ever come up with, despite, like, decades of, like, time that they could have, like, thought of something different. It’s always the same two things. It’s always, oh, from first, pipeline syntax.

Now, the pipeline syntax thing I sort of get, but it’s the way that they, the way that they name it is stupid. You know, it just looks like crappy PowerShell and I hate it. DuckDB has a much smarter name for it.

DuckDB calls it function chaining or expression chaining, where in one select list you can take the result of one function, alias it to something, and then use it in another function call within the exact same select list. Oracle also has the ability to reference aliases in the group by clause.

So you don’t have to rewrite expressions in the group by clause, which is amazing, because you can write some pretty gnarly expressions and all of a sudden you’re like, wait a minute, I have to group by that now. Jeez. And so you have to copy and paste it and remember to take the aliases out, because you can’t group by an alias of some expression. SQL Server freaks out.

It can’t handle it. So derived tables can, like, at least for the way that T-SQL is engineered today, be useful for simplifying that stuff. What they don’t do, though, is change the way that your query plan physically looks, right?

Because SQL Server still has to, like, process the query the same way, just with an outer sort of layer of things. So this is the query that we’re going to start with, right? And we’re going to select post ID and we have this upvotes thing, which is a sum, and then this downvotes thing, which is a sum.

And then, like, we’re not going to bother putting those in the group by in here, because we’re not going to group by those, right? Those aren’t things that we need or want to group by for this query. But if we wanted to write a having clause that did some math on those, we would have to write a, oh, get out of here.

We would have to write a lot of, like, extra code to do all this stuff, right? Like, this is a, just honestly, this is a nightmare, and it makes me completely understand why people get mad at SQL. What makes things worse is that, like, when you talk logical query processing, like, you can absolutely reference expressions in the order by most of the time.

But if we were to unquote this, we would immediately be greeted with red squiggles. And if we tried to run the query with, like, the alias downvotes minus upvotes descending, this would, like, this, like, we would get errors there. So we have to, like, actually redo the whole order by thing in order to do that.
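To make that concrete, here’s roughly what the painful version looks like. I’m assuming the Stack Overflow Votes table, where VoteTypeId 2 is an upvote and 3 is a downvote; the point is that every expression has to be spelled out again anywhere you want to reuse it:

```sql
-- Without a derived table, the SUM expressions get repeated in HAVING
-- and ORDER BY, because those clauses can't see the SELECT aliases.
SELECT
    v.PostId,
    SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS up_votes,
    SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS down_votes
FROM dbo.Votes AS v
GROUP BY v.PostId
HAVING SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) > 0
AND    SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) >
       SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) * 10
ORDER BY
    SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) -
    SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) DESC;
```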

Where derived tables can come in useful is to sort of simplify that thing. For some reason, oh, man, so, like, I just switched to the SSMS 21 general availability release, and I just upgraded SQL Prompt to a new major version. But it’s revolting on me, and it’s trying to analyze my code, and I don’t need your code analysis.

SQL Prompt, I know what I’m doing. I teach the T-SQL here. You don’t teach the T-SQL to me, SQL Prompt.

But we can simplify things a bit with a derived table: we just write our inner query like we would normally here. But then outside of the query, we can reference these aliases in what I think is a lot cleaner way, right? Just where upvotes is greater than zero and downvotes is greater than upvotes times 10.

And then we order by downvotes minus upvotes the way we did before. But before, it was all expressions, and it was a big mess. It gets very unclear and very tangled up very quickly.
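Wrapped in a derived table, the same query (same assumed tables and columns as above) gets a lot easier to follow, and the plan comes out identical:

```sql
-- The aliases are real column names to the outer query, so the filter
-- and the sort can reference them directly.
SELECT
    x.PostId,
    x.up_votes,
    x.down_votes
FROM
(
    SELECT
        v.PostId,
        SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS up_votes,
        SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS down_votes
    FROM dbo.Votes AS v
    GROUP BY v.PostId
) AS x
WHERE x.up_votes > 0
AND   x.down_votes > x.up_votes * 10
ORDER BY x.down_votes - x.up_votes DESC;
```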

But what I want you to notice about the query plans for both of these, if the good Lord will allow me to live long enough to highlight them both, is that they’re the same, right? Like, this doesn’t change the query plan. This doesn’t make the query faster, right?

This doesn’t help SQL Server, like, do anything. Just like CTEs, the results of derived tables aren’t materialized in any way. It doesn’t matter as much for derived tables, because you can’t re-reference a derived table the way you can re-reference a CTE and force the query to be re-executed.

But, like, the point is that, like, you still have to apply filters the same way to this. It doesn’t help you, like, make, like, push filters down, like, any step further, right? So, like, if we look at what this filter does, it’s just, like, where expression is greater than expression times 10 and expression is greater than 0.

SQL Server has to do the same thing here, right? Like, it’s the exact same filter that we have to apply to both of these, right? And, like, we still have to sort by up here, right?

When we look at this, this is our order by clause. It’s Expr1003 descending. And this one, it is still Expr1003 descending. So no matter which way you write it, this is more of a query cleanliness thing.

This is a hygiene thing. When we talk about things that make queries easier to read and easier to understand, this is what does it for me. It’s having these expressions just written once in the main part of the query and then just being able to reference those aliases, because we have the nesting. We have the derived table, and then we can talk to those aliases outside of the derived table, right?

So that’s where they really come in handy. That’s, like, where they can really make a big difference. Now, there’s all sorts of reasons why you would put a derived table in a query, of course. There’s many, many uses for it.

This is just one good sort of code hygiene one. So, important things about derived tables: they’re much cooler than CTEs. Good job on those.

And, of course, they can make your queries a lot more compact, cleaner, easier to read and understand. It’s not going to change query performance. It is going to change your performance, though.

It’s going to change how fast you can figure out what the hell that query is doing. So thank you for watching. Hope you enjoyed yourselves. I hope you learned something. All of this content is still available at the presale price using the link down in the video description.

It is still companion content to the pre-cons that Kendra Little and I will be teaching in Seattle. And if you are attending those, of course, you will get free access to this content with pre-con admission. So I think that’s everything, right?

Do I have anything else to say? I don’t think so. I don’t know. Check out this neat SSMS 21. Yeah.

Yeah. Look at that. Real, real nice looking. Dark mode. Got co-pilot up there that sucks. God, what a piece of crap that is.

Anyway, before I get too far off track, I’m going to go now. All right. Goodbye. Bye.


SQL Server Performance Office Hours Episode 18




To ask your questions, head over here.

What are the pros and cons of using clustered indexes on #temp tables (as opposed to non-clustered)?
I am a junior DBA, and I have been reading about performance tuning online. Can you please let me know how I start looking for the queries that are worst performing (timing out), and how do I see their actual execution plans?
What’s the best way to deal with ascending key issues? Would using OPTIMIZE FOR UNKNOWN or local variables be a good approach?
Hi Erik! Azure SQL Database vs SQL Server on Azure VMs. What do you think are the pros and cons of Azure SQL Database?
I’ve fallen in love with adaptive joins. They’ve solved problems with parameter sensitivity for me. I’ve never seen you mention them in that context. Any reason why?

Video Summary

In this video, I delve into the world of temporary tables and indexes, specifically addressing the pros and cons of using clustered versus non-clustered indexes on temp tables. I also share tips for identifying poorly performing queries through Query Store and offer advice on dealing with ascending key issues in SQL Server. Additionally, I provide a candid opinion on Azure SQL Database, expressing my strong preference for SQL Server on an Azure VM over Azure SQL Database due to the latter’s numerous shortcomings. The video wraps up with a Q&A session where I answer five questions submitted by viewers, covering topics from performance tuning and query optimization to the nuances of adaptive joins and parameter sniffing. Whether you’re a seasoned DBA or just starting out, there’s plenty of valuable information here to help improve your SQL Server skills.

Full Transcript

Erik Darling here with Darling Data. And you might be able to tell, I don’t know, there’s a background today. So, we’re doing Office Hours. This is where I answer five entire whole questions that you submit. Where do you submit them? That is by itself a great Office Hours question. If you go to this link, which is down in the video description, you can submit questions that I will answer here. Okay? So that’s exactly how it works. No doubt. It’s very different any other time. Usual stuff. If you’d like to support my channel, you can sign up for a membership, also down in the video description. If you like the content, but you’re not like, wow, I would totally pay this guy to keep talking, there are other options. You can like, you can comment, you can subscribe, and all that other good stuff. If you need help with your SQL Server beyond, far beyond what a YouTube can do, then I am available for consulting help: health checks, performance analysis, hands-on tuning of your SQL Server malfeasance, dealing with SQL Server performance emergencies, and of course, training your malfeasant developers so that you have fewer SQL Server emergencies. If you would like to buy my performance tuning training content, you can do that for 75% off. That’s that combination of link and code there, which is hopefully assembled for you down in the video description.

And if you would like to get in on my new T-SQL course, you can buy that at the pre-sale price. Videos for that will start dropping. Actually, by the time this publishes, they will already have started trickling out onto the internet. If you’re going to attend, well, I mean, that information is obviously wrong. Things got held up a little bit with tech review, and there’s a funny little quirk with Podia where, when you put a release date on something, you can’t release before the release date. So ignore that, because that’s wrong. Obviously, obviously. If you’re attending the PASS pre-cons Kendra and I are teaching in Seattle, you will get access to all this material. This material is considered companion material to that. So, you know, get in on all that stuff while supplies last. If you want to see me live and in person, I will be at all three PASS On Tour events. That’s New York City, Dallas, and Amsterdam. You can see the dates right there. If you can’t, for some reason, send me an email, and I’ll call you and read them to you.

And, of course, PASS Data Community Summit in Seattle, November 17th to 21st. That one’s important enough to say the dates for, I guess. But with that out of the way, let’s answer some questions from you, my lovely users. So here’s the first one.

What are the pros and cons of using clustered indexes on #temp tables as opposed to non-clustered? Well, if I’m going… So if I have decided that I am going to test indexing a temp table, perhaps because that temp table gets loaded with a significant enough amount of data, or the way that the temp table is queried tells me that an index would be appropriate here, I am only going to start with a clustered index.

Why? Well, great question. I’m glad you asked. When you create a nonclustered index on a temp table that does not have a clustered index, you have a heap, and you have a nonclustered index. Typically, your nonclustered index is going to be on a rather narrow array of columns.

If that is completely satisfactory to your query, well, maybe that’s good enough for you. But if it’s not, if your nonclustered index is not what we call a covering index that covers all the columns that your query requires, then SQL Server is completely at its own optimization will to ignore your nonclustered index and scan the heap anyway.

So if I was going to offer any advice here, it would be rather than starting with a nonclustered index, start with a clustered index. And if you’re going to start with a clustered index, my preference is to create the clustered index after populating the temp table because it’s generally easier to get a fully parallel insert into the temp table, which can be meaningful if you are loading a lot of data into it.

That’s a lot easier when you just have the heap. And when you create the clustered index afterwards, there’s an additional benefit that you create the index with full scan statistics of the temp table rather than just getting sampled statistics for the temp table when a query finally hits it in a way that SQL Server needs to do cardinality estimation.
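To make that concrete, here’s a minimal sketch of the pattern. The table, column, and index names are made up for illustration; they’re not from the video:

```sql
/* Load the temp table as a heap first, which makes a fully parallel insert easier to get. */
SELECT
    u.Id,
    u.Reputation
INTO #u
FROM dbo.Users AS u
WHERE u.Reputation > 10000;

/* Create the clustered index afterwards. The index build reads every row,
   so the statistics that come along with it are fullscan, not sampled. */
CREATE CLUSTERED INDEX c ON #u (Id);
```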

So there we go. Next question. I am a junior DBA.

I’ve been reading about performance tuning online. Well, that’s your first mistake. There’s nothing but bad advice online. Can you help? Let me know. How do I start looking for the queries that are worst performing? And how do I see the actual execution plan of them?

Well, two separate questions there. My big preference for finding poorly performing queries is to use Query Store. I’ve got my stored procedure sp_QuickieStore available at code.erikdarling.com.

That is great for mining through Query Store data. By default, it gives you the top 10 queries that have used the most average CPU over the past seven days. There are, of course, all sorts of parameters that you can use with my stored procedure to search through Query Store data in ways that are not available in the Query Store GUI, because Microsoft hates you and I love you.
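As a quick sketch of what a call looks like (the parameter names are the ones I remember from the procedure’s documentation, and the database name is hypothetical):

```sql
/* Default-ish behavior: top queries by average CPU over the last week. */
EXEC dbo.sp_QuickieStore
    @database_name = N'YourDatabase';

/* Or slice the data differently. */
EXEC dbo.sp_QuickieStore
    @database_name = N'YourDatabase',
    @sort_order    = 'duration',
    @top           = 20;
```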

So, that’s the first part. I prefer to look in Query Store for that. And then how do I see the actual execution plan for them?

Well, Query Store is a bit on par with the plan cache for, like, which query plan it shows you, which is, like, the cached plan or the, like, estimated pre-execution plan, which doesn’t have all the actual runtime metrics in there. So, if you want that, you still have to run the query and get the actual execution plan. Now, if the query is so poorly performing that, like, you can’t get it to finish, you can still cheat a little bit here.

So, what you’d want to do is start running the query, but turn on actual execution plans for the query. Don’t turn on live query plans. Live query plans are a buggy pain in the butt.

Turn on actual execution plans and start running the query. Then, in another window, use sp_WhoIsActive with the @get_plans parameter turned on. And then what that’ll do is look at, like, the in-flight query plan.

And sometimes, like, you can catch enough of the bad stuff that’s happening, like, while the query is running. Like, it’ll start updating with metrics on, like, wait stats and, like, operator times and stuff like that. So, you can get, like, a partial view of the actual execution plan while the query is running.
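For reference, the second-window call is just this. If I remember the parameter values right, 1 grabs the plan for the running statement and 2 grabs the plan for the whole batch:

```sql
/* Run in another window while the slow query executes with actual plans turned on. */
EXEC dbo.sp_WhoIsActive
    @get_plans = 1;
```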

You, of course, could do other stuff if you know what queries you want to catch. It sounds like you don’t, but if you knew what queries you wanted to capture, then I would recommend sort of targeted extended events to capture specific, like, post-execution plans for specific stored procedures or queries or something like that. But it sounds like if you’re just on the hunt for ones that are bad, then I would use Query Store to find the worst-performing ones.

All right. Another one here. What’s the best way to deal with ascending key issues? Would optimize for unknown or local variables be a good approach?

I mean, like, you hate to say no because, like, clearly if you’re dealing… Well, actually, this is an interesting one. So, like, the local variable or optimize for unknown thing would not help specifically if you are using the new cardinality estimator or, as Microsoft so presumptuously calls it, the default cardinality estimator.

Because for ascending key issues where… I assume you’re talking about, like, index key values that are off the histogram, right? So, like, let’s say your histogram goes, like, it’s 200 steps, so let’s just pretend it goes from 1 to 200.

You’re worried about numbers that are over 200 that are not in the histogram, right? So if you’re using the new cardinality estimator, you are already getting the density vector guess, which is the sort of, like, nerdy name for optimize for unknown and local variables. So you’re already getting that.

If you’re using the legacy cardinality estimator, you usually get a guess of one row for those off-histogram steps. So, like, with optimize for unknown and local variables, you would get the density vector guess regardless of what the off-histogram value is. But I don’t know if that’s going to be a good guess for you either, right?

So, like, I don’t, like, I hear that and I’m immediately like, well, I’ve just seen optimize for unknown and local variables cause too many bad cardinality estimates. They still might be better than the guess of one that the legacy cardinality estimator gives you for those things. But, like, I still don’t think they’re going to be, like, great all around for all the other queries that are actually hitting the histogram, right?

Like, those you’re like, well, use the histogram, buddy. So, like, my first approach would probably be more frequent stats updates. There’s also, like, some database scoped configuration stuff and some trace flag stuff you can do to help SQL Server with the ascending key stuff. But, like, I have clients where I’ve set up stats update jobs specifically on some statistics for, like, every 20, 30 minutes, because data is in flux so much and, like, bad query plans happen frequently for those.

We’re like, like, just updating stats to keep, like, keep up to date with a few specific ones was the best option. So, like, don’t overlook that as a possibility. All right.
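For a sense of what one of those targeted jobs runs, here’s a minimal sketch. The table and statistic names are hypothetical, and which trace flags or database scoped configuration options help depends on your version, so verify before leaning on any of them:

```sql
/* Agent-job-style targeted refresh, scheduled every 20-30 minutes,
   for a statistic on a column that constantly gets new, higher values. */
UPDATE STATISTICS dbo.Orders OrderDate_Stats WITH FULLSCAN;
```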

Next question. Let’s see what we got here. Hi, Erik. Azure SQL Database versus SQL Server on Azure VMs. What do you think are the pros and cons of Azure SQL Database? What’s that old saying?

I’d rather have a sister in a whorehouse than a brother in the Navy? Well, that’s how I feel about Azure SQL Database. I would rather not have a brother or a sister using Azure SQL Database.

I hate that thing. I mean, it’s hard to tell if I hate it more or less than managed instance. Maybe less only because I had such high hopes for managed instance and Microsoft screwed that one so badly.

Like, oh, God. Like, whoever is in charge of that. I don’t know. It’s, I don’t know.

Bad stuff. Anyway, no. I would much prefer SQL Server on an Azure VM to Azure SQL Database. Anytime someone tells me they’re using Azure SQL Database. Like, aside from the fact that, like, Microsoft was like, hey, the default isolation level here is read committed snapshot isolation because we don’t want to deal with your stupid blocking problems.

Like, that’s nice, but, oh, the rest of it. Oh, the rest of it. Mm.

Mm. Have you ever seen a company, like, screw up an offering for their own product as badly as Microsoft has screwed up their platform as a service for SQL Server? Like, what happened?

Like, how is AWS so much better at this than you are? It’s embarrassing. Like, your pants left you. Like, how?

How did that happen? Oh, it’s a joke. Anyway, I’ve fallen in love with adaptive joins. Well, there’s someone out there for everyone. In the case of adaptive joins, there’s two someones apparently out there for you. There’s hash joins and nested loops joins.

They’ve solved problems with parameter sensitivity for me. Well, that’s nice. I’ve never seen you mention them in that context. Any reason why? Well, there are a few reasons why.

And to start with the reasons why, you kind of have to start with where adaptive joins really kick in, which is, of course, like, you know, either like something where, like, columnstore is involved because, like, you know, you do need something batch mode-y for adaptive joins to happen. And then where batch mode on rowstore kicks in, which is compat level 150 plus enterprise edition only. And only when SQL Server’s internal heuristics say, hey, I think some batch mode on rowstore might be good here.

And, hey, we could use an adaptive join here because we’re using batch mode on rowstore. So, like, there’s times when it might kick in and be very helpful. Even in some of my demos about parameter sensitivity.

I, like, you don’t have to use the old compat level to have them still work. It’s just that they work differently, right? So, the thing with the heuristic thing, like, if you create, like, some sort of columnstore index or object so that you sort of force the optimizer down the batch mode path, that’s one thing.
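One common way to do that (this is the empty filtered columnstore trick that floats around the community, not something shown in this video, and the table name is a stand-in) is a nonclustered columnstore index with a contradictory filter, so it stores nothing but still opens the door to batch mode:

```sql
/* The WHERE clause can never be true, so the index stays empty,
   but its presence makes batch mode (and adaptive joins) possible. */
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode
ON dbo.Posts (Id)
WHERE Id = -1 AND Id = -2;
```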

If you don’t, and you’re relying purely on batch mode on rowstore, my experience is that when you compile for the little plan first, you don’t get batch mode on rowstore very often, right? Because SQL Server’s like, this is a tiny amount of rows. We don’t need batch mode anything.

We’re just going to do some nested loops, join some key lookups, and move on with our life. So, like, my experience with it is that, like, if it happens to kick in and solve a problem for you, that’s great. But you have to consider situations where the query might run and batch mode on rowstore wouldn’t kick in and you wouldn’t get an adaptive join where all of a sudden you’d be using, like, the old, like, just row mode, no adaptivity type execution plan, and it wouldn’t work for you.

So, like, if you’re going to consider that, like, a solution for parameter sensitivity, you better think real hard about, like, what luck you ran into or what you did to get SQL Server to consistently use a query plan where adaptive joins happen. So, like, when they kick in, they can absolutely be useful because SQL Server’s like, well, I’m going to start with this hash join, and if I don’t get enough rows, I’m going to switch to nested loops, which is great because you have that choice in there, right? Like, SQL Server’s like, oh, do this.

But if that doesn’t kick in, like, reliably for you, which I find is the case for a lot of the stuff I do, that, like, the heuristic-based stuff doesn’t kick in reliably enough for me. And then when I suggest a non-clustered columnstore index, people are like, oh, but I’ve read 17 million blog posts about why I shouldn’t use non-clustered columnstore indexes because I do an update. Oh, dear God.

I updated some data, and now I can’t use a columnstore. Like, people make things much harder on themselves than they need to be most of the time. All right.

Anyway, that’s five questions, right? One, two, three, four, five. Cool. I feel like I’ve done my good deed for the day. I’m going to call this one here. Thank you for watching.

Thank you for sending in questions. I hope you enjoyed yourselves. I hope you learned something. And I will see you in another Office Hours video probably, probably, like, next week. All right.

Cool. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: EXISTS, NOT EXISTS, and Row Goals

Learn T-SQL With Erik: EXISTS, NOT EXISTS, and Row Goals


Video Summary

In this video, I delve into the nuances of using `NOT EXISTS` in T-SQL queries, particularly focusing on why many data scientists and developers often misuse the `EXISTS` pattern to find rows that don’t exist in another table. I explain the inefficiencies of the traditional approach and how Microsoft has introduced optimizer rules like “Left Outer Join to Left Anti-Semi Join” to optimize these queries. Additionally, I demonstrate practical examples using SQL Server 2025’s compatibility levels and query optimizer hot fixes, showing how `NOT EXISTS` can significantly improve performance compared to left joins. The video also touches on the complexities of row goals in the query optimizer and provides a workaround for scenarios where index modifications are not feasible.

Full Transcript

Erik Darling here with Darling Data, and you wouldn’t believe what we’re going to talk about today. More T-SQL stuff. Why? Because I need to promote two things. One, the T-SQL course that you can buy at the presale price down in the video description. And two, the pre-cons that Kendra Little and I are doing in Seattle this November. The material that you will see here and that you can buy at the link down there is, they’re related. They’re like cousins of some variety. And today we’re going to talk about, well, we’re going to talk specifically about not exists for the most part today. This is, of course, the beginner material. This is, of course, just a small preview of what’s in the full course material. Because if I just put all the course material here, what would be the point of buying? It’d be quite silly. All right. Anyway, let’s talk about this. So almost every time I see a, like, learn-T-SQL-with-smart-data-scientist person, they screw up. And they say, this is how you find rows in one table that don’t exist in another table. And they even use the words don’t exist. And for some reason, they use the words don’t exist, but they completely leave the SQL pattern not exists out of anything they say. And it makes me want to hit them with hammers.
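The demo script isn’t reproduced in this transcript, but the pattern under discussion is the classic left-join-plus-null-check, which against the Stack Overflow database looks something like this:

```sql
/* Count users who have no posts: the LEFT JOIN ... IS NULL version. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE p.Id IS NULL;
```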

They’re the, why? What is the point of you? So let’s talk a little bit about why this query pattern doesn’t often work out. And then something that Microsoft actually did to try to help idiots like that who write queries like this. And then, of course, how to write this query with not exists to avoid the issue. So good, good, good. All things that we can look forward to. So let’s run this query. And let’s get a result. And let’s take careful note of this result here: 1030987. Cool. Here’s the query plan. And here is why I generally don’t like this query pattern. So SQL Server was kind enough to execute this query pretty much without reordering joins or anything. So let’s look at this. And we’ll see that we, zoom it, we have a clustered index scan on the Users table. And we take all 2.4-ish million rows out of this thing.

And we have an index scan on the Posts table. And we take all 17 million-ish rows out of this thing. And then we fully join all of those rows together, right? We join everything from the Users table. We join everything from the Posts table. And we do a hash join here. SQL Server chooses not to pre-aggregate any of the OwnerUserId values coming out of the Posts table, even though OwnerUserId has a lot of dupes in it. And then over here, we have a filter operator. And look how many rows leave this filter operator prior to aggregating them down to do the count.

1, 0, 3, 0, 9, 8, 7. That is exactly how many rows we got for our query result. Now, the reason why this is terribly inefficient is because SQL Server joins every single row together, and then only later removes rows that don’t match with this, right? So where p.Id is null, right? Those are the rows that we’re removing in the filter.

This query pattern is so prolific and so profoundly used that Microsoft actually went and added a couple of new optimizer rules, whose names are abbreviations along the lines of LOJ-to-LASJ and ROJ-to-LASJ. And what do those mean? Left outer join to left anti-semi join and right outer join to left anti-semi join. And what do these do?

Well, if you’re on SQL Server 2025, which recently went into public preview, right? So that’s where we are now. Those rules exist.

If you’re on compat level 170, which is the new highest compat level for SQL Server 2025, or you enable query optimizer hotfixes (that’s a USE HINT, but there are also database-level settings that control this), those rules can kick in. But let’s look at the differences in these three execution plans.
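For reference, those switches look like this. The hint and the database scoped configuration names are real; compat level 170 assumes you’re on SQL Server 2025:

```sql
/* Database-level options: */
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 170;
ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = ON;
```

And per-query:

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE p.Id IS NULL
OPTION (USE HINT ('ENABLE_QUERY_OPTIMIZER_HOTFIXES'));
```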

I’ve just put a little bit of staging data on my SQL Server 2025 instance. I haven’t, I haven’t fully moved or fully restored Stack Overflow to it and started doing anything with it there. I just want to show you these couple things.

So using compatibility level 160, which is the previous high for SQL Server 2022, we have a fairly similar plan to what we had in, well, when we were actually just running this on SQL Server 2022, where we have a join here that fully joins both tables together. And then we have a filter that brings all the rows out.

And we have the same number, 1030987, here. But for the two queries below, the first one where I have the enable optimizer hotfixes hint, and the second one where I have compat level 170 in use, these two query plans do what it would actually look like if you wrote the query as not exists, for the most part.

Now, these did get moved around a little bit, right? Like, SQL Server reordered our joins because the post table is on the outer side of the join for both of these. And the user’s table is on the inner side of the join for both of these.

And now, SQL Server has actually been kind enough to pre-aggregate some of those duplicate owner user IDs from the post table for us. But notice that in both of these queries, there is no longer a filter operator. And that’s because SQL Server has replaced that left join with a right anti-semi join, right?

So we have taken, we have removed that, like, weird, like, full join null check with a filter and just changed it to an anti-semi join, which is about the equivalent query pattern that you would get if you wrote your query using not exists to begin with, right? So that’s this pattern, right?
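Written out, that pattern is the following (again a reconstruction, since the demo script isn’t in the transcript):

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
WHERE NOT EXISTS
(
    /* The SELECT list in EXISTS is never evaluated; only row existence matters. */
    SELECT 1/0
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = u.Id
);
```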

So if we run this query, we’ll see something fairly similar, not precisely similar, but fairly similar, where, you know, SQL Server has, like, here it left the join order intact where users is on the outer side and post is on the inner side. But here, SQL Server has chosen to pre-aggregate some of the owner user IDs.

And then we have just about the same pattern as before, where there’s an anti-semi join, where the non-matching rows are removed at the join rather than in a filter operator later, right before getting counted. And you’ll notice that at the join, we get down to our 1030987 rows rather than having to fully join all the rows together.

I forget what the full number was for this one. Let’s take a quick look. Yeah, so 17945522, that is an eight-digit number of rows that come out of there.

And so that’s no fun, right? Now, when you’re writing queries that need to find rows that don’t exist, if you’re living in, you know, compat-level 170 land, or you enable optimizer hotfixes in, like, a cloud product like Azure SQL Database or Managed Instance where the compat level 170 stuff is available, you may not have to worry so much about, like, how you write the query, because SQL Server will kind of fix it for you.

But if you’re not living in that world, you should always test the not exists pattern first. Generally, not exists works out faster than the left join. So coming back to the execution plans, the difference here, I agree, is not dramatic, but, you know, the drama will increase or decrease depending on the state of your server.

This takes about 1.2 seconds, and the not exists pattern down here takes about 600 or so milliseconds. So this is twice as fast locally, though the twice-as-fast difference isn’t huge here, right? It’s about, like, 1.2 seconds versus 600 milliseconds.

Now, there are some cases where you’ll use exists or not exists or something like that as sort of a check in your stored procedures or whatever query you’re writing. It could be an ad hoc batch, too, where it’ll not go so well.

And the reason why it won’t go so well is going to have something to do with another optimizer thing called row goals. Row goals are sort of like the optimizer saying, I bet I can find all the rows that I need for this really quickly, and I’m going to use a slightly different execution plan because I bet I can find these rows quick, which, when it works out, is great.

When it doesn’t work out, it’s terrible. So here’s an example of a row goal not working out well using if exists, right? So if you have a stored procedure or some other ad hoc batch and you’re like, why the hell is this so slow?

Well, one thing that you should look at is the query plan because the query plan will tell you most everything you need to know. So if we look at what happens here, notice how terrible this estimate is. SQL Server thinks that it’s only going to get three rows out of this, and it’s going to be able to figure out everything that it needs with those three rows, but it doesn’t.

This actually gets 3, 7, 5, 6, 5, 9, 7. That is 3.7 million rows here, right? And down here, well, we go into a nested loops join 3.7 million times, and we find nothing here.

This takes about five seconds, and this takes about five and a half seconds, which means this query as a whole takes just about 12 seconds in total to finish, 11.6 seconds. Now, you might be wondering why this thing has such a terrible cardinality estimate, and it’s not necessarily just a terrible cardinality estimate, but it is related to SQL Server making this guess on row goals.

So SQL Server bets that it can find, right? It estimates that it will take 2.8 rows, which gets rounded to three, but notice this estimated rows without row goal here, right?

That’s where things get messed up. This is where things get messed up. Introducing the row goal means that this thing, SQL Server is like, I bet I can do it in three rows, but it didn’t do it in three rows.

It took all the rows, right? It took all the rows to do it. So these things can certainly backfire and cause problems. Now, if you wanted to fix this without creating indexes (because sometimes you’re in a situation where you can’t add new indexes immediately on the fly, or you’re just not allowed to add new indexes, period), you can change the query to force SQL Server to do some more work, like, say, for instance, counting all the rows. And this will keep those row goals from being introduced, because SQL Server is like, crap, I’ve got to count all these rows.

I can’t just bail out and think that I can get what I need with three of them. So what I’m doing here is a little sloppy, I admit, because this will return a decimal 38, 0, but this is the big int max for SQL Server.
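Put together, the reshaped check looks roughly like this. The table and predicate are placeholders, since the demo script isn’t in the transcript:

```sql
/* Instead of IF EXISTS (SELECT ...), force a full count so the optimizer
   can't introduce a row goal and bail out early. */
IF
(
    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = 22656 /* hypothetical predicate */
) BETWEEN 1 AND POWER(2., 63) - 1 /* bigint max, as a decimal(38, 0) */
BEGIN
    PRINT 'Rows exist; do the thing.';
END;
```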

So saying SELECT POWER(2., 63) - 1 gets you to that big number. But when we do that, and we say IF (SELECT COUNT(*) …) here, where the count is between 1 and that bigint max, then SQL Server can’t choose that row goal plan, because now it has to count all those rows. And instead of taking 12 seconds, this takes about 1.5 seconds, and notice that now SQL Server is like, oh yeah, 3.7 million.

That sounds more realistic to me anyway. All right, cool. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in the next video, where we will talk about equally interesting and exciting things for another 7 to 12 minutes.

All right, cool. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Subqueries With NOT IN

Learn T-SQL With Erik: Subqueries With NOT IN


Video Summary

In this video, I delve into the intricacies of subqueries in T-SQL, focusing on a specific scenario involving the `NOT IN` operator and nullable columns. I provide a detailed walkthrough using the Stack Overflow data dump as an example, illustrating how null values can lead to unexpected results when used with `NOT IN`. The video explores different query approaches to achieve accurate results, including modifications to handle nulls effectively and comparing `NOT IN` with `NOT EXISTS`. By examining execution plans closely, viewers gain insights into why certain queries perform differently and how to write more robust T-SQL code. This is part of a preview for upcoming beginner-level material, which can be purchased now at half the future price. If you’re attending the PASS Data Community Summit in November, this content will align well with the pre-cons taught by Kendra Little and me, offering free access to all materials.

Full Transcript

Erik Darling here with Darling Data. In today’s video, we’re going to learn a little bit more about T-SQL. In this video, we’re going to talk about subqueries. Now, of course, this is a preview of the beginner material that I’m putting out. This is for presale down in the old video description, somewhere about in this area, this vicinity, somewhere over in here. You can purchase everything now for half the price that it will be when the material is fully published. This is, of course, companion material (not, like, full material, but companion material) to the pre-cons that Kendra Little and I will be teaching at PASS Data Community Summit this November. And if you’re coming to those, you will get all of this stuff for free, because they will mesh fairly well. Anyway, let’s talk a little bit about the particular form of subquery with not in. Now, it’s at least fairly well known amongst SQL people that when you use not in and one side is, or both sides are, nullable, things get kind of weird and you need to think a little bit harder about how you write your queries. So I promise, that isn’t just the lesson I’m going to teach you. We’re going to go further than that. But that’s sort of the starting point. All right. So if we run this query, what we’re looking at here is a query where both sides of the not in expression are nullable. So in the Votes table (wake up, zoom it; come on, buddy, you can do it), the UserId column is not only nullable, but contains lots of nulls. That’s because this table is heavily scrubbed in the public data dump, because, like, it would be unfair to publish people’s voting records out in the wild, right? You vote and it’s supposed to be a private thing, right? How you vote is supposed to be between you and the piece of paper you jam a hole in.

Don’t want any hanging chads in the Stack Overflow community. And then, like, in the Comments table, some of the user IDs are null. This column is again nullable and contains nulls, right? So there are nulls on both sides of this not in expression. Though, of course, it’s only if there are nulls on the inner side of the not in expression that we get the sort of incorrect-result-y stuff from it, right? So if we run this, it’s going to run for about four seconds and it’s going to return a count of zero. So it’s going to look like there are no matches in these.
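For reference, “this” is a query shaped roughly like so (a reconstruction, with both sides of the NOT IN being nullable UserId columns):

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Votes AS v
WHERE v.UserId NOT IN
(
    SELECT c.UserId
    FROM dbo.Comments AS c
);
```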

But of course, we look at the execution plan. We’ll see that — I mean, we’re not going to see that there were matches. What we’re going to see is why SQL Server came to the conclusion that there were no matches, right? So up at the very, very top here, this would be the actual join between Votes and Comments, right? This here, this is the Comments table. This is the Votes table. And this is where they would get joined together to produce a result from that subquery.

But you can see about right here is where SQL decides that there are no matches whatsoever, right? That’s where we, that’s where we first hit a zero. So let’s examine the query plan a little bit. What we have down in this section where SQL Server decides that we have zero rows is, like, we have the votes table, which gets left anti-semi joined to a, to the comments table here. And there’s a row count lazy spool here that counts a whole bunch of rows.

And then there’s another left anti-semi join, which we just talked about, which is where we hit zero. And then down here, there’s another copy of the comments table and another row count spool. So a row count spool just does exactly kind of what the name implies. It counts rows.

This iteration of the, or rather this reference to the comments table is just counting all of the rows in the comments table. It counts all, like, 24.5 million rows or something. Then this one down here is counting something a little bit different.

This one is looking for nulls, right? Because this column is nullable, but may not contain nulls, right? Just because a column allows nulls does not mean anyone has put a null in there.

SQL Server needs to do this to figure out if there is a null. And then if there is a null, right? Like, this is where the semi joins get kind of funny. So, like, here we’re just counting rows, right?

And we’re joining that count of rows to the votes table here, right? This left anti-semi join, which means, like, not exists, not in. And this one, but the predicate on this one is just where the votes, the user ID column in the votes table is null, right?

So, v.userID is null. So, it’s like, okay, are there nulls here? I don’t know.

And then this one doesn’t actually have a predicate on it. What are we left anti-semi joining to? Well, we’re just figuring out if there are nulls in this table. And if there are nulls, then this returns no rows, right?

This is where we go from 3, 5, 8, 2, 7, 1 to 0. So, what we can do to get a somewhat correct result, somewhat, is do this, right? We can say, select count from votes where the user ID column is not in this again.

But this time, we’re going to get rid of nulls from the comments table, okay? Now, I want you to pay attention to what this number turns out to be. Do-do-do-do-do.
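Roughly, the tweak looks like this:

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Votes AS v
WHERE v.UserId NOT IN
(
    SELECT c.UserId
    FROM dbo.Comments AS c
    WHERE c.UserId IS NOT NULL /* screen nulls out of the inner side */
);
```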

This does not really get all that much faster. It may be, like, about a second faster. The last query took about 4 seconds. This one takes about 3 and change. But the result that we get back is 293716. I just want you to keep that number in mind.

Because it is a number, but is it the right number? We’re going to figure that out. And we’re going to ask a probing question about data to help you write your queries correctly, right? So, if we look at the execution plan for this one, we are down to one row count spool right here.

We don’t have the two row count spools anymore. And this one is just looking, this one is just counting where c.userID is not null. And then we have our left anti-semi join here to where v.userID is null.

And then we have our actual join up between votes and comments here on userID. So, we get 293716 back from this. Is that number right or is that number wrong?

Well, one way to sort of visualize what that thing is actually counting is to write the query like this. Oh, and I have all these ridiculous things popping up on my screen now. So, if we add in this predicate, we will get rid of both row count spools; we will have no row count spools in the query execution plan anymore.

Because SQL Server will no longer have to track this. And we’re going to get the same number back. But this is where I want you to ask yourself if this is the right number or not.

Because SQL Server can only count where userID is not a null mark, which is what it gave us in the previous query. And it’s exactly what we get in this query too. Right?

We run this. It still takes about three seconds. We still get 293716. If we look at the execution plan, there are no more row count spools. There is just one join here.

And the number of rows that leave that join is 293716. And that’s what we count here with the stream aggregate. Is this right or wrong? Well, one way to find out if you think this is right or wrong is to write the query a little bit differently.

Now, this does help us move into the next topic that we’re going to talk about, which is going to be exists and not exists. But what happens if we run this query? Because, see, whenever people talk about not in and how it can return incorrect results if there are nulls on the inner side of the not in, they will tell you to either screen out nulls from the query itself.

Right? That’s one way to protect yourself from getting incorrect results. Or they’ll tell you to use not exists.

But not exists will fundamentally give you a different result here. Okay? So let’s look at, if we run this query, this is going to run a lot faster. Right?
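Reconstructed, the NOT EXISTS version is:

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Votes AS v
WHERE NOT EXISTS
(
    SELECT 1/0
    FROM dbo.Comments AS c
    WHERE c.UserId = v.UserId
);
```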

This finishes in about a second. But look at the number we get back. Right? Four, nine, six, three, nine, six, five, five. That is an eight-digit number of rows that we get back from this count, which is much higher than the 293,716, I believe, that we got back from the other query. So this is counting all of the rows in the votes table where the values don’t exist in the comments table for user ID.

Right? And that includes the nulls. Remember, not exists handles nulls in a much better way than not in does.

So we still do that join, but we’re counting all of the rows in Votes that are still null. Right? Because those don’t match.

So depending on what you want to actually count, you might need to write this query differently in either case. If you want a count of all the rows, including nulls, in the Votes table where the UserId doesn’t exist in the UserId column in the Comments table, then just write not exists. But what happens if we change that not exists query to also get rid of nulls from the outer side?

Right? So we’ll no longer count null rows from votes. What do we come back to?
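That is, something like:

```sql
SELECT COUNT_BIG(*) AS records
FROM dbo.Votes AS v
WHERE v.UserId IS NOT NULL /* now also screening nulls from the outer side */
AND   NOT EXISTS
(
    SELECT 1/0
    FROM dbo.Comments AS c
    WHERE c.UserId = v.UserId
);
```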

293716. So you can kind of get a sense here that, like, which query results are correct depends on what you’re actually looking for. If you’re purely looking for matches between Votes and Comments, then you would want to screen out nulls from the Votes table.

If you want to count all the rows, including nulls from the votes table that don’t match the user ID in the comments table, then you could just write not exists and leave it at that. So it really does depend on what you’re looking for here, how you want to write this query to get back the correct results. Anyway, that’s about what I wanted to talk about here.

Thank you for watching. Again, this is just a small portion of the material. You can see we’re down in, like, the 300s here. There’s a whole bunch of other stuff that I didn’t cover because it’s a preview. It’s supposed to urge you to buy the video course so you see all the content.

Cool. So, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that you will now have something very interesting to think about when you are writing your queries.

And, what else? I don’t know. I think that’s about it. Anyway, it is Saturday here, so I’m going to go do something else now with my entire day.

So, thank you and goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Getting Defensive with CASE Expressions

Learn T-SQL With Erik: Getting Defensive with CASE Expressions


Video Summary

In this video, I delve into the importance of writing defensive case expressions in SQL queries, drawing from my experience at Darling Data. I highlight how unexpected or problematic data can lead to errors and bad results, emphasizing the need for robust coding practices. By using examples like dividing by zero or handling null values, I demonstrate practical techniques such as CASE expressions and NULLIF functions to ensure your code remains error-free even when faced with unforeseen data scenarios.

Full Transcript

Erik Darling here with Darling Data, and in today’s video we are going to talk about defensive case expressions. Now, I don’t mean the kind of case expressions that, you know, practice beating up mannequins and whatnot. I mean the kind of defensive case expressions that keep you from hitting all sorts of weird errors and bad things that happen in your queries. Writing defensive code is a big part of working with databases, because you never know what’s going to end up in them. So be prepared, right? Anyway, this particular module (there’s way more to it in the full material) is very near and dear to me, because I have had to do this in so many of my stored procedures, where I started running into things that I didn’t anticipate and thought, that doesn’t make any sense. Like, why is that an error now? Oh, there’s a zero, right? Or, oh, there’s something wrong with this string. Okay, well, we’ve got to be more defensive with our code. So if you look through the code that I write, a lot of it is defensive by default, because I have been to too many bad places. Not talking about restaurants in Boston. So let’s create a table and let’s put some initial data into that table. Just 10 rows is all we need to start with.

But now let’s say that we want to figure out the percentage of something from this table, right? So we’re just going to run this query. And honestly, it’s a very simple table, very simple query. We’re not going to see anything amazing or groundbreaking here, but I just want to show you what happens when unexpected or even unwanted data ends up in our databases. Now, up in the table create statement, there were a couple notes like, hey, should we have a check constraint to reject values of zero? Are zeros valid data points? So some stuff to think about, but not necessarily stuff that you need to think about immediately. One thing before we go on is that when you’re returning results like this, you do need to be explicit about converting data to the type that you expect. If you look at the results that we got back here, we got 20 with a whole mess of zeros.

All right. Do we need all those zeros? Do we want all those zeros? Do those zeros add anything for us? No, but because of the way that SQL Server chose to implicitly convert that division math, we ended up with a whole bunch of zeros. So if we want to control our results, we must convert our results to display the way that we want. So here we go with a percent of 20 with only two zeros. And again, honestly, those don’t add much to the equation. We could just have an integer there, honestly, but you know, it makes it a little bit more interesting.

But now let’s pretend that a very problematic row appears in our table. And all of a sudden we are inserting a number zero into our table, right? And this is going to mess up everything. The whole works are going down. I thought the Titanic had a bad night. Wait till we start running this. Now, if we try to run this query, we are of course going to get a divide by zero error. All right. SQL Server will have returned some results, right? We get the first 10 rows back, but row 11, no can do, no can do. Now, of course we could write a where clause.

We could add this where clause in and say, hey, divisor greater than zero only. We don’t get any errors, but now we’re missing that row from the results. And we don’t necessarily want that either. So this is where we have to get a little defensive in our code, right? Again, this is where we get to strike the mannequin or the boxing dummy or whatever you call it. What we can do here is use a case expression and say, when the divisor equals zero, then just replace the result with 0.00.

And the else condition would be to do our convert to a decimal(10, 2) and then just have our division math in there. And this will get us error-free results back, including row 11, without any issues, right? No red text. That’s the big important thing.
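The guarded version comes out something like this. The table and data here are stand-ins, since the demo script isn’t shown:

```sql
CREATE TABLE #demo (amount integer, divisor integer);
INSERT #demo (amount, divisor) VALUES (20, 100), (15, 0); /* the problem row */

SELECT
    pct =
        CASE
            WHEN d.divisor = 0
            THEN 0.00
            ELSE CONVERT(decimal(10, 2), d.amount * 100. / d.divisor)
        END
FROM #demo AS d;
```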

Another way you can do this is by using the NULLIF function, and I’m totally okay if you want to use NULLIF. Where this gets a little bit wonky, though, at least for me, is in how NULLIF reads: the first argument is the thing being tested, right? NULLIF means, if these two arguments are equal, then it’s going to emit a null.

Right. So NULLIF(divisor, 0): if divisor equals zero, then we’re going to emit a null here. This does run successfully. And this does produce the results that we want, kinda, except now we just have a null down here. And where this gets kind of annoying is that now we have to wrap this whole thing in ISNULL.
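Wrapped up, it looks something like this, with the same stand-in table as above:

```sql
SELECT
    pct =
        ISNULL
        (
            CONVERT(decimal(10, 2), d.amount * 100. / NULLIF(d.divisor, 0)),
            0.00
        )
FROM #demo AS d;
```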

And we end up with a much, much larger expression than we did when we just used case to do the CASE WHEN zero THEN 0.00. So we can run this and get back what we want, with ISNULL replacing the null at the end with 0.00. And all is well here. So again, when you’re writing queries and there’s anything involved (whether it’s division, SUBSTRING, LEFT, you know, so many different ways that you can write a query where the results might hit some sort of internal error, whether it’s mathematical or, like, invalid substring chopping or byte placement), please be very, very careful.

Please always write your code defensively. Even if you know the data very, very well as it exists today, that does not stop bad data, or just different data than you expect, from eventually ending up in there, and your code all of a sudden throwing problems. So always think ahead a little bit when you’re writing these queries, and always try to write your queries in as defensive a way as possible, so that you don’t end up getting surprised by weird errors.

And all of a sudden, like, getting paged in the middle of the night and someone saying, hey, the ETL process won’t run. Everything’s dividing by zero. It’s all falling over. It’s burning. Help us. Save us. We can’t live without you. And then you go find that one place where you’re doing some division, and you fix it, and then you look like a big hero and the ETL runs, and, I don’t know.

You have a dashboard in a day. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in the next video where apparently we’re going to talk about subqueries. What are we going to talk about?

Well, you’ll just have to show up, won’t you? All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Getting Fancy With CASE Expressions

Learn T-SQL With Erik: Getting Fancy With CASE Expressions


Video Summary

In this video, I delve into the world of case expressions in T-SQL, dispelling any confusion around their terminology by clarifying that they are indeed “case expressions” and not “case statements.” I walk through a simple count query example to illustrate how using subqueries can lead to multiple table scans, significantly impacting performance. To address this issue, I demonstrate how to use case expressions to selectively apply aggregates, ensuring more efficient execution plans and reducing the number of times the table is accessed. The video also explores the nuances between using `COUNT` and `SUM` with case expressions, emphasizing the importance of correctly handling null values for accurate results. Throughout the discussion, I highlight common pitfalls when defining ranges in these expressions to avoid data exclusion or double counting issues, providing practical advice on how to write robust and efficient queries.

Full Transcript

Erik Darling here with Darling Data, and in today’s video we are going to continue on talking about, this time, case expressions. I suppose it’s good to get it out of the way, first and foremost, that they are case expressions, they are not case statements. If you run into anyone in the wild who calls it a case statement, you can very, very safely disregard their opinion about basically anything and everything that they might say. So it is kind of a nice litmus test for a person. Call it a case statement, out you go. Anyway, let’s say that we have a, just a simple count query like this. But the problem with this count query is that we need to count all different sort of reputations in different ways.

You might end up, if you’re kind of new at T-SQL, writing a query that looks a little bit like this, right? Where, you know, you’re like, well, I’ll just (I know it’ll work) write some subqueries that count for reputation equals one, and then reputation between one and 9,000, and then reputation over 9,000. So this will work. This is perfectly fine.
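A reconstruction of that shape (the exact ranges in the demo may differ):

```sql
SELECT
    rep_one  = (SELECT COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation = 1),
    rep_mid  = (SELECT COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > 1 AND u.Reputation < 9000),
    rep_high = (SELECT COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation >= 9000);
```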

If we run it, for the small table that we’re working with, this is fast enough. But if we look at the execution plan, we will see that we hit the Users table once, twice, three times. And each time we hit the Users table (oh, the tooltip went away), we read the whole thing, and, you know, we have sort of a not good time, right? So if your queries are bigger and more complex and slow for each one of these subqueries, you will have a 3x slowdown for your query, rather than just hitting the table once.

Now, what you can do is use case expressions to apply aggregates selectively. But you have to be a little bit careful with which aggregate you choose. Like, up here, we’re really just getting a count of all the users whose reputation falls into, you know, these various segments. If you use count like this, you’re going to get wrong results back.

Right? Because we know when we ran this query for reputation equals one, we got 1,090,000 or so users back. If we run this query, what we’re going to get back is 2.46 million. Why? Because count is counting the ones and the zeros. Count doesn’t exclude zeros. It’s like: is it a value? Then it gets counted. If you wanted to use count accurately for this, then you would have to suppress non-one reputations with null.

And then you could do this and get the right number of rows back. We get that 1,090,000. A lot of developers find it a little bit more intuitive to use sum rather than count, right? Because with sum, you’re saying if reputation equals one, then add one to it. If reputation doesn’t equal one, then add zero to it. And that gives you a little bit easier time getting the correct count back.
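Side by side, the two correct forms look something like this. A sketch, not the demo script:

```sql
SELECT
    /* SUM: 1 for a match, 0 otherwise. */
    rep_one_sum   = SUM(CASE WHEN u.Reputation = 1 THEN 1 ELSE 0 END),
    /* COUNT: the ELSE must produce NULL, or every row gets counted. */
    rep_one_count = COUNT(CASE WHEN u.Reputation = 1 THEN 1 ELSE NULL END)
FROM dbo.Users AS u;
```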

But what that’s leading up to is that you can do something even more interesting with both sum and count (sum is usually the more common one to do this with), where you don’t have to just supply a static value for your then or else expressions. You can actually use, like, a column value. So let’s just say we wanted to sum up everyone’s reputation for 2013.

We could say where creation date is greater than or equal to 2013-01-01. And if that’s true, then we sum the reputation column. And if that’s not true, then we add in a zero, right? So you can sum up a column doing that. Now, what that’s leading up to is that when you want to do that across a range of things, there’s a very easy way to hit the entire Users table, but only hit it once, and just use case with the selective aggregates to get you what you want.
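Roughly like so. The closed upper bound on the date range is my addition, to keep the sum to 2013 only:

```sql
SELECT
    rep_2013 =
        SUM(CASE WHEN u.CreationDate >= '20130101'
                  AND u.CreationDate <  '20140101' /* upper bound assumed */
                 THEN u.Reputation ELSE 0 END)
FROM dbo.Users AS u;
```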

The only thing that I’d say here is, just when you’re writing stuff like this, be really careful, right? Because you can easily not write the ranges correctly. Like, this query is very simple, so this query is hard to mess up. But as you write more complicated expressions for these things, you have to be very careful that you don’t write overlapping expressions, at least where you don’t intend to.

So, like, it’d be very easy to make a mental error and accidentally say less than 9,000 here and greater than 9,000 here. And then we would miss everyone with a reputation of exactly 9,000, right? We would leave them completely out of the results. Same thing if we accidentally did greater than or equal to 9,000 here: we would be double counting 9,000s across both of these.

So just when you’re writing these things, take your time and be careful when you’re setting up these ranges that they don’t overlap or exclude things accidentally. But this can be a very handy way of getting you a full result set like we saw, without having to write repetitive subqueries and without having to hit the Users table multiple times. This just does one big scan of the Users table, which makes life a lot easier and simpler, especially if the query that you need to pull those things from gets big and complicated, right?

This is just a very simple single table. If you have like a big long set of joins and you’re joining in views and functions, who knows what else then you know, like just having to hit that once rather than three, four more times that can be a lot easier on your server. Anyway, that’s just a little bit about case expressions and sort of more interesting ways of using them.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video where we will talk about using case expressions defensively in queries to avoid errors and other strange issues. So, hope you’ll join me there. Anyway, thank you for watching. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.