SQL Server Performance Office Hours Episode 35
Questions:
* I have a large table where 60% of the rows are defunct, the table is updated nightly by an ETL process that takes several hours. To improve query performance during the day, would you recommend using filtered indexes, or an archive table?
* Hello there! If a table that has a computed (not persisted column) that uses GETDATE(), is used in a scalar function, that function is never going to be inlineable no matter what, regardless if that column is not utilized. A covering index will not work either.
* What features would you like to see added to constraints in SQL Server?
* I often performance tune reasonably simple parameter sensitive queries that search date ranges. Before I was touched by your magnitude, I would approach this by dumping the sub-queries into a temp table. Now, I use dynamic SQL. Do you ever use temp tables for simple parameter sensitive queries? Why/why not?
* Why do so many people hate business logic in databases, and even stored procedures?
To ask your questions, head over here.
Video Summary
In this video, I dive into a series of community-submitted questions during my Office Hours session, providing insights and advice on topics ranging from managing large tables with defunct rows to the intricacies of computed columns in SQL Server. I also discuss the potential benefits of using temp tables for parameter-sensitive queries and address common misconceptions about business logic in databases. Whether you’re looking to optimize your database performance or just curious about best practices, there’s something here for everyone. If you have a question of your own that wasn’t covered, make sure to check out the video description where you can find links to submit your Office Hours questions directly.
Full Transcript
Well, through the magic of time travel, when you’re watching this, I will be in Europe. So I will be at Red Gate, Netherlands stuff. So that’ll, that’s fun to think about. All right. Next week at this time, where will I be? Anyway, we are doing, because it is time for that, Office Hours, in which I answer five community-submitted questions. Wonderful, right? Good stuff. Look at me. If you want to submit a community question, look down on the video description. You’ll see all sorts of helpful links. One of them is to ask me Office Hours questions, and you can, you can go get your question answered on this thing, if you, if you have one that you, you deem worthy of my time. Of course, just about anything is worthy of my time. So don’t set the bar too high. The bar is somewhere in hell, probably. With some damn soul twisting on it. So submit your questions. Thank you. There are also other helpful links in there. If you want to hire me, buy my training. I recently dropped my performance engineering course, which is, there, there is an on sale link down there. If you’re interested, if you’re interested in that. And, you know, some other stuff too. Like I said, by the, when you’re watching this, I will be on, like, I will be in Europe for the Utrecht event.
So, uh, what you should do is if you, if you’re in the Utrecht area, uh, maybe, maybe this is your last chance to buy a ticket. So check that out. Uh, and of course, after Utrecht, I will be home for a long time, uh, relatively speaking, and then going to Seattle, uh, November 17th to 21st for the past data community summit. Where, uh, Kendra Little and I will be delivering, delivering like a fastball, uh, two days of T-SQL, uh, pre-cons. And then I will be home, uh, I don’t know, doing things. What will I be doing? I don’t know. Figure it out when I get there. Anyway, let’s answer these questions because that is what we do for office hours.
And, uh, you know, try to do our best. Anyway, uh, first question here is, do, do, do, do, do, do. I have a large table where 60% of the rows are defunct. What a, what a nice word, defunct. Uh, the table is updated nightly by an ETL process that takes several hours.
Well, I mean, my rates are reasonable. Just saying. Uh, to improve query performance during the day. All right. So it’s a little mixed bag of information here, isn’t there? At night, this process takes several hours. During the day though, uh, would you recommend using filtered indexes or an archive table?
Well, if 60% of the rows are defunct, um, it seems like they’re just taking up space, useful space in this table. There could be, you could, you could be doing other stuff with it. Um, I mean, sure. Filtered indexes are nice. They’re smaller, right? They’re more compact. And if your queries meet all of the, you know, sort of prerequisites for, uh, filtered index matching, then, uh, they, they could be quite useful to, to screen out 60% of the rows there.
But, uh, general, general, if I were you, I would probably be working on an archive process because, um, you know, there are all sorts of rules for filtered indexes, what, what kind of columns you can make them on, determinism, uh, you know, um, whether queries match to them appropriately, uh, stuff like that. So, uh, I mean, I would probably go with an archive process.
If you want to read about a really neat trick for that, uh, I actually have a post on my site. So if you search on erikdarling.com for, uh, simplifying archival processes, uh, you’ll see a neat piece of sort of nested DML query that will make archiving data a lot easier for you. Um, you know, uh, if you’re too lazy, then sure, filtered indexes might help depending on stuff.
But, uh, you know, generally, uh, I would probably, if I, if I had more than 60% of the rows in a table not being useful to queries, I would probably just want to get rid of them. All right. Uh, do, do, do, do, do, do. I don’t see a question mark here, but that’s okay. Uh, hello there. Hello back.
If a table that has a computed, not persisted column, well, it can’t be persisted if it uses get date because it’s not deterministic, uh, is used in a scalar function, that function is never going to be inlineable no matter what, regardless if that column is not utilized. A covering index will not work either. Um, yes. Uh, one of, one of the restrictions on UDF inlining, uh, is that, uh, you can’t have non-deterministic functions in there.
Things like get date, sysdate time, new ID, ran, things like that. Uh, those make, those make for non-deterministic results because they are, uh, they are decided at runtime, not, uh, stored forever and ever in a way that would make them deterministic. So, um, what, so since you can’t persist that computed column, then, uh, enabling trace flag 176, Paul White has a wonderful, uh, post called properly persisted computed columns, uh, in which he discusses how to, uh, get around what you’re talking about, but, uh, get date there is going to screw it up, uh, in the permanence.
So, um, maybe don’t do that. If you need other options, my rates are reasonable. Uh, here we go. Oh, this one has a question mark. What features would you like to see added to constraints in SQL Server?
Uh, I’m not that guy. Um, I, I realize that there are all sorts of, uh, constraints that other databases have that SQL Server doesn’t. Uh, I just don’t spend a lot of time, uh, creating them. Uh, so, you know, like, you know, I realize check constraints have some deficiencies when compared to other, uh, databases, particularly to Postgres.
There are no, like, real domain constraints, things like that. Um, you know, they’re just, like, really complex constraints, um, you know, don’t often work out well. Uh, you know, even weird little shortcomings, like foreign key constraints, uh, can’t be filtered or something like that. You know, there are, like, all sorts of things that might be nice to have, but, uh, I, uh, you know, I question the, uh, the amount of effort that would go into adding them to their product, uh, based on, or versus, uh, how many people would actually use them since, uh, I don’t see a lot of people generally using constraints all that well or all that wisely now. So what would I like to see added? I don’t know.
There might be some neat stuff in there, but, uh, in general, um, I don’t really see people using them anyway. Um, some temporal constraints might be interesting, like a really easy way to figure out, like, the most recent active row or something, or, you know, something like that, but, uh, you know, people are strange. People are strange. Sorry. I don’t have a more explosive answer for you there.
I just, I just don’t get excited about constrained data. It makes me just think of tight underwear. Uh, I often performance tune reasonably simple parameter sensitive queries that search date ranges. Before I was touched by your magnitude, oh, well, speaking of tight underwear, uh, I would approach this by dumping the subqueries into a temp table. Now I use dynamic SQL. Do you ever use temp tables for simple parameter sensitive queries? Why, why not? Yes. Um, the answer is absolutely yes. So, uh, where temp tables make for, uh, a good, um, sort of parameter sniffing, uh, sensitivity issue is, uh, uh, let’s say that, um, like, just to make things simple, let’s say you have a store procedure that accepts one parameter and, uh, the table that, and that parameter is a sensitive one and the table that you hit, uh, you know, depending on how many rows that, uh, that parameter produces could be a nightmare.
Right. And let’s say that, you know, like, you’re like, we’re not just selecting from that table. Let’s say there’s some like, I don’t know, joins or exists or not exists or other stuff like that. Um, it can absolutely help to use a temp table to dump the results of the select for, from that table into there, and then do your more complicated part of the query that doesn’t have any parameters touching it with the results of that temp table. Indexing that temp table could also be useful depending on what is going on elsewise in the query, of course, but, uh, you know, I can’t tell you that based on your question. Um, the only, the only thing you have to really be careful of is if, uh, you know, how sensitive that parameter thing is, if it’s a difference between like, you know, a few hundred to a few thousand to even like, maybe even like a couple million rows, um, then a temp table can be perfectly fine. But if the, the magnitude of your, uh, sensitivity is, um, in the tens of millions or hundreds of millions of rows, then, um, you know, probably don’t want to dump that into a temp table. So be a little bit cautious there. Know your data or something like that.
Love your data. Uh, let’s see here. We got, okay. This is question number. Let’s make sure we got two, one, two, three, four, five. So this is the, this is five questions. I did not screw that up this week. Lucky me. Why do so many people hate business logic and databases and even store procedures? Well, people tend to hate what they’re bad at. That’s it. Uh, I hate PowerShell.
I’m bad at PowerShell. Uh, I hate using it. I hate typing it. I even hate using an LLM to generate it because I just hate the way it looks. I hate dealing with it. Uh, um, I am not good at figuring out what PowerShell is doing and I have no interest in getting better at. Uh, I, I frankly think it’s kind of a crappy language. So I hate PowerShell. So I don’t want to use PowerShell. So I don’t use PowerShell and I hate PowerShell. So when people talk about like things like in, in a very generic sense, like business logic, uh, or store procedures being bad, it’s probably because they’re bad at them. Right. And if, if, if I were terrible at SQL and I was very good at C sharp or some other programming language, uh, Java, maybe, I don’t know. I could code in crayon with Python or something.
No, we got, I’d probably be like, oh, screw that database. Oh, it’s hard to do all that stuff. I can do all this stuff in application code that I know well. So, um, you know, they’re probably just people who are bad at SQL. They’re, they’re, they’re bad at designing things and within the database, that’s about the end of it. Um, I I’ve never seen, uh, I’ve never seen a valid argument for it. Uh, again, sorry. I’ve never seen a valid argument against business logic in databases or store procedures that didn’t, um, also express what would seem like willful ignorance about SQL as a language or the possibilities of SQL as a language. Um, or the rather the capabilities of SQL as a language, um, that like I’ve never just never seen one. Uh, they’re all just like, Oh, I did this one thing once and it was bad and it didn’t, it was slow or like long rambling things. And you’re like, well, if you, if you were any good at this, you would have done it different. Right. You tried this one thing and it sucked and you were like, never again.
Right. It’s like, I don’t know. It’s a weird take. It’s a very weird take. Like don’t, don’t have your database do anything with data. Don’t put logic where the data is. Okay. Okay. Well, you know, uh, you know, I guess a little bit further to that, if, if you’re that type of person, um, you know, you could very easily find, uh, teams that agree with your sentiment and go, go work there. Um, but I think, uh, if you are a ma, if you are managing that kind of team and you start making rules like that, um, that that’s, that’s a bad idea. You, you should, you should allow your team to, um, to work where they are most comfortable and most proficient, do not set arbitrary rules about where, uh, certain logic should live or which APIs or procedures artists, another API, uh, should be used to interrogate the, the data in a database because you’re, you’re just hamstringing them. If you say, no, you can’t do, you have to use an ORM now and you have to put all the logic in some other piece of code, then they’re, they’re not gonna, they’re not gonna do their job as well. So don’t be stupid. Anyway, that is five questions. I think we’re done here. Thank you for watching. I hope you enjoyed yourselves.
I hope you learned something. And once again, if you would like to ask a question here, uh, please look down in the video description where the, the link to ask questions is, because that’s, that’s the way to ask them. Otherwise I cannot, I cannot accept psychic questions here.
All right. Goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.