Learn T-SQL With Erik: Derived Tables

Learn T-SQL With Erik: Derived Tables


Video Summary

In this video, I dive into the world of T-SQL and specifically focus on derived tables, explaining why they are a preferred tool over common alternatives like Common Table Expressions (CTEs). Derived tables not only simplify complex queries but also provide a cleaner way to reference expressions across different parts of your SQL statements. I discuss how logical query processing works in SQL Server and highlight the limitations of using aliases in certain clauses, such as `GROUP BY`, which can make writing clean T-SQL code challenging. By leveraging derived tables, you can avoid redundancy and improve readability without affecting the underlying query plan or performance. I also touch on some personal frustrations with modern tools like SQL Prompt that try to analyze my code, emphasizing that sometimes old-school techniques are still the best approach.

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we are going to talk a little bit more about the old T-SQL and why you should probably learn it from me and not some other useless hump out there trying to pretend that they know about T-SQL. Really just stinks at T-SQL and probably just had an LLM write everything anyway. And in today’s video, we’re going to talk about T-SQL. We’re going to talk about derived tables. Derived tables are, of course, my preferred, my generally preferred mechanism over CTE for a variety of reasons. Mostly, number one reason is irking people who white knight for CTE. That’s top of the charts for me. But they can make queries, they can, rather, they can simplify a lot of things in queries that would otherwise require quite a bit of syntax. Now, part of the reason why derived tables are necessary is sort of because of, again, the way that logical query processing does stuff. Like, we talked about how, like, you know, you have, like, it starts at, like, from rather than select and, you know, makes it way, like, down to joins and where’s and group by and having an order by or select and then order by. And because of that, there are certain things that you write in the select list that aren’t available in the group by, like, unless you nest them, right? Unless you nest the query and you give, like, another from to start the logical query processing thing over with again.

Which, you know, like, I make fun of people who, like, are like, we’re rewriting SQL and we’re gonna have it so you write from first and, like, you can pipeline syntax. Because, like, every time someone critiques SQL, the only two things they ever come up with, despite, like, decades of, like, time that they could have, like, thought of something different. It’s always the same two things. It’s always, oh, from first, pipeline syntax.

Now, the pipeline syntax thing I sort of get, but it’s the way that they, the way that they name it is stupid. You know, it just looks like crappy PowerShell and I hate it. DuckDB has a much smarter name for it.

They, DuckDB calls it, like, like, function chaining or expression chaining where you can, like, like, take, like, in one select list, you can take, like, like, the result of one function, like, alias to something and then use it in another, like, function called within the exact same select list. Oracle also has the ability to reference aliases in the group by clause.

So you don’t have to rewrite expressions in the group by clause, which is amazing because, like, you can write some pretty, like, gnarly expressions and all of a sudden you’re like, wait a minute, I have to group by that now. Jeez, wait. And so, like, you just, like, you have to, like, copy and paste it and remember to take the aliases out because you can’t have, like, you can’t, like, group by some expression, like, or rather, like, some, something equals some expression and my SQL Server freaks out.

It can’t handle it. So derived tables can, like, at least for the way that T-SQL is engineered today, be useful for simplifying that stuff. What they don’t do, though, is change the way that your query plan physically looks, right?

Because SQL Server still has to, like, process the query the same way, just with an outer sort of layer of things. So this is the query that we’re going to start with, right? And we’re going to select post ID and we have this upvotes thing, which is a sum, and then this downvotes thing, which is a sum.

And then, like, we’re not going to bother with the group by thing in here because we’re not going to group by those, right? Those aren’t worth, those aren’t things that we need to group, need or want to group by for this query. But if we wanted to write a having clause that did some math on those, like, we would have to write a, oh, get out of here.

We would have to write a lot of, like, extra code to do all this stuff, right? Like, this is a, just honestly, this is a nightmare, and it makes me completely understand why people get mad at SQL. What makes things worse is that, like, when you talk logical query processing, like, you can absolutely reference expressions in the order by most of the time.

But if we were to unquote this, we would immediately be greeted with red squiggles. And if we tried to run the query with, like, the alias downvotes minus upvotes descending, this would, like, this, like, we would get errors there. So we have to, like, actually redo the whole order by thing in order to do that.

Where derived tables come in, it can be useful is to sort of simplify that thing. For some reason, oh, man, so, like, I just switched to, like, the SSMS 21 general availability, and I just upgraded SQL prompt to a new major version. But it’s revolting on me, and it’s trying to analyze my code, and I don’t need your code analysis.

SQL prompt, I know what I’m doing. I teach the T-SQL here. You don’t teach the T-SQL to me. SQL prompt.

But what we can simplify things a bit with a derived table is we can just write our inner query like we would normally here. But then outside of the query, we can reference these aliases, like, what I think is in a lot more clean way, right? Just where upvotes is greater than zero and downvotes is greater than upvotes times 10.

And then we order by downvotes minus upvotes the way we did in the thing. But it was all, like, expressions, and it was a big mess. It was very, like, it gets very unclear and very tangled up very quickly.

But what I want you to notice about the query plans for both of these, if the good Lord will allow me to live long enough to highlight them both, is that they’re the same, right? Like, this doesn’t change the query plan. This doesn’t make the query faster, right?

This doesn’t help SQL Server, like, do anything. Just like CTE, the results of derived tables aren’t materialized in any way. It doesn’t matter as much for derived tables because you can’t re-reference derived tables in a way that you have to re-execute the query.

But, like, the point is that, like, you still have to apply filters the same way to this. It doesn’t help you, like, make, like, push filters down, like, any step further, right? So, like, if we look at what this filter does, it’s just, like, where expression is greater than expression times 10 and expression is greater than 0.

SQL Server has to do the same thing here, right? Like, it’s the exact same filter that we have to apply to both of these, right? And, like, we still have to sort by up here, right?

When we look at what we’re, like, this is our order by clause. It’s expression 1, 0, 0, 3 descending. And this one, it is still expression 1, 0, 0, 3 descending. So no matter which way, like, you write the, like, this is more like a query cleanliness.

This is, like, a hygiene thing that this makes, like, when we talk about things that makes queries, like, easier to read and easier to understand, like, this is what does it for me. It’s, like, having these expressions just written once in the main part of the query and then, like, just being able to reference those aliases because we have an outer, we have, like, we have the nesting. We have the derived table and then we can talk to those aliases outside of the derived table, right?

So that’s where they really come in handy. That’s, like, where they can really make a big difference. Now, there’s all sorts of reasons why you would put a derived table in a query, of course. There’s many, many uses for it.

This is just one kind of, like, good sort of, like, code hygiene cleanliness one. So important things about derived tables, they’re much cooler than CTE. Good job on those.

And, of course, they can make your queries a lot, like, more compact, cleaner, easier to read and understand. And they can, like, you don’t, like, it’s not going to change performance. It is going to change your performance.

It’s going to change how fast you can figure out what the hell that query is doing. So thank you for watching. Hope you enjoyed yourselves. I hope you learned something. All of this content is still available at the presale price down using the link in the video below.

It is still companion content to the pre-cons that Kendra Little and I will be teaching in Seattle. And if you are attending those, of course, you will get free access to this content with pre-con admission. So I think that’s everything, right?

Do I have anything else to say? I don’t think so. I don’t know. Check out this neat SSMS 21. Yeah.

Yeah. Look at that. Real, real nice looking. Dark mode. Got co-pilot up there that sucks. God, what a piece of crap that is.

Anyway, before I get too far off track, I’m going to go now. All right. Goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.