In this video, I delve into the fascinating world of locking hints in SQL Server transactions, sharing insights from a recent client project where we explored how to handle concurrency issues effectively. We start by creating a simple table and inserting five rows, then dive into demonstrating various locking hints such as `UPDLOCK`, `ROWLOCK`, `REPEATABLEREAD`, and `READPAST`. I explain why these hints are crucial for maintaining data integrity while allowing concurrent processes to run smoothly without stepping on each other's toes. The video showcases how different isolation levels and locking hints affect transactions, both within and outside the transaction scope, with practical examples that highlight their importance in real-world scenarios.
Full Transcript
Erik Darling here with Darling Data, and we have a fun video. Today we're going to talk about locking hints in transactions. I was working with a client recently, and we were talking about various ways to handle concurrency with transactions that might have to deal with each other in kind of funny ways. And when I started showing them locking hints, they were like, but wait a minute. If we start a transaction and we have a select query with these locking hints, isn't it going to block the thing we're trying to do afterwards? And I was like, no, it's magic. And I'm going to show you that same magic here today. Down in the video description, you can find all sorts of helpful links where you can hire me for consulting, buy my training, support this channel, ask me office hours questions, and of course, there are all sorts of things that you can do aside from monetary exchanges, like subscribe and tell a friend, and then more people show up, and, I don't know, maybe one of them will do one of the other things. And that'd be grand. That'd be wonderful. I'll also be leaving the house and going various places. At least they're closer than Seattle: DataTune in Nashville, March 6th and 7th of 2026, and of course Data Saturday Chicago, March 13th and 14th of 2026. I'll be doing pre-cons at both of those, so buy your tickets now.
Make me feel like a cool person who sold all sorts of seats and stuff. And I can say, hey, look what a good job I did marketing this thing. You can buy tickets somewhere around those links that are in the PowerPoint thing there. Datatune.conf and datasaturdaychicago.com. So I assume those are valid working links because I copied them from a very trustworthy web browser, internet browser. So anyway, let’s go talk about locking hints and stuff here. So what I’m going to do is I’m going to create a simple table and I’m going to stick five rows in it.
And what I'm going to do now is talk a little bit about transactions and whatnot. Because SQL Server does not have a SELECT FOR UPDATE clause, when we want to, say, select a row that we're going to do stuff with, that we don't want anyone else to do stuff with, we have to take some additional steps. I'm beginning a transaction, then a declare, because having something to hold that value is good. So I'm declaring an integer up there called ID, and it's null to begin with. And I'm going to grab row five. In real life, you would probably have some other logic here, like, grab a row that needs work done. But for me, ID equals five works just fine.
And then after that select, there will be an update. What I want to do is show you that within this transaction, everything works fine, and then what kinds of queries outside of this transaction would be affected by it. So let's start by beginning the transaction and setting the ID. We're just going to run to the select on this one; we haven't touched the update. That's kind of annoying. What I hate about this is that even though I've done begin transaction, declare, select, if I've paused here and then try to run the update, it's going to be like, well, you have to declare the variable ID. And I'm like, but I did. And I have a transaction open. Why don't you remember that?
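A minimal sketch of the whole pattern being described here, with my own stand-in table and column names rather than the exact demo script:

```sql
/* Stand-in for the demo table: five rows with a date column */
CREATE TABLE dbo.worker
(
    id integer PRIMARY KEY,
    some_date datetime NOT NULL
);

INSERT dbo.worker (id, some_date)
VALUES (1, '20251110'), (2, '20251110'), (3, '20251110'),
       (4, '20251110'), (5, '20251110');

/* Poor man's SELECT FOR UPDATE: claim a row inside a transaction */
BEGIN TRANSACTION;

DECLARE @id integer = NULL;

SELECT @id = w.id
FROM dbo.worker AS w WITH (UPDLOCK, ROWLOCK, REPEATABLEREAD, READPAST)
WHERE w.id = 5; /* in real life, some "find a row that needs work" logic */

/* ...do stuff with the row that no one else can claim... */
UPDATE w
    SET w.some_date = DATEADD(DAY, 1, w.some_date)
FROM dbo.worker AS w
WHERE w.id = @id;

COMMIT TRANSACTION;
```

The update lock taken by the select doesn't conflict with the exclusive lock the update needs later in the same transaction, which is the "magic" the client was surprised by.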
Anyway, we come over to this window. The types of queries that this is designed to protect against are competing queries. So if this is a process you're writing where only one thing, one thread or one worker, should be allowed to work on something at a time, the code outside of that process can still access things normally. I don't even need a NOLOCK in here. If I just select from that table, I get all the rows back, because the select with the UPDLOCK hasn't taken a lock that competes with a select that's just trying to read data. Right. So if I were trying to update row five, I, of course, would not be able to update row five.
But oh, sneaky weeky, I could update row four, right? If I do this and say, hey, go update row… oh, wait, I have to cancel that first. It was being blocked; that's why it just kept running without saying it did anything. But if I update row four, row four can go do something. So that's nice. But if I were to try to run another select with an UPDLOCK, then that select would be blocked, because the UPDLOCK is like, wait a minute, I'm trying to UPDLOCK.
You can't UPDLOCK that too. So if I had a where clause on this for where ID equals four, then let's see. Let's do a little experiment together. I didn't plan on this; it just occurred to me while I was doing all this. So if I say row four, well, that can go and do something.
But the point is that if I were to try to take, say, row five now, because row five is the one I'm updating in the other window, SQL Server is like, wait a minute. Row five has an UPDLOCK on it. You can't UPDLOCK that row too. Or you can, but you have to wait.
Now, coming back over here just for a moment, the hints that I have supplied to this query are UPDLOCK, ROWLOCK, REPEATABLEREAD, and READPAST. Now, for a lot of processes, the idea of serializable sounds great, right? Because serializable is quite strict and would prevent all sorts of strange phenomena.
But if you want this process to run nice and concurrently, the READPAST hint is often useful, because basically you want people to be able to just skip over anything that is currently locked. It's not like NOLOCK, where you'll read the locked row. READPAST is just like, that locked row is none of my business. I'm just going to move on. I'm going to keep going.
So if we come over here and try to run all this stuff with a SERIALIZABLE hint, we get an error that says you can only specify the READPAST lock in the READ COMMITTED or REPEATABLE READ isolation levels. So that doesn't do us any good. But this query here will allow us to read everything but row five.
Right. If this were a NOLOCK hint, we would get row five back. I haven't changed anything yet, so it would just be whatever data is in row five; it's just a select query at this point. But instead, we're just like: row five, you have a lock, we're not bothering with you.
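From a second session, still assuming the hypothetical `dbo.worker` table sketched earlier, the behaviors being demonstrated look roughly like this while session one holds its UPDLOCK on row five:

```sql
/* Plain read committed: returns all five rows, because an
   update lock doesn't block shared reads */
SELECT w.* FROM dbo.worker AS w;

/* Blocked: a second UPDLOCK on row five has to wait */
SELECT w.* FROM dbo.worker AS w WITH (UPDLOCK) WHERE w.id = 5;

/* READPAST: returns rows 1 through 4, silently skipping the
   locked row instead of blocking on it or dirty reading it */
SELECT w.* FROM dbo.worker AS w WITH (READPAST);

/* Fails with Msg 650: READPAST is only allowed under the
   READ COMMITTED or REPEATABLE READ isolation levels */
SELECT w.* FROM dbo.worker AS w WITH (SERIALIZABLE, READPAST);
```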
You are not consuming any of our effort or energy here. And what I want to show you next is that within the transaction, those locking hints are fine.
But also, for processes like this, you really have to consider concurrency, which is obviously the most important thing, because if you want to make something like this multi-threaded, you want multiple processes to be able to take something and work on it.
If you design a queue-type situation, or a worker-table-type situation, then these types of locking hints become really important, because (a) you want every process to be able to go get work freely when it's ready for work, but (b) you don't want it double-working or over-working something that is either already in process or has already been processed.
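The queue shape being described could be sketched like this. The table, the processed flag, and all the names are my own hypothetical stand-ins; it's the hint combination that matters:

```sql
/* Each worker claims the next unclaimed row. UPDLOCK keeps two
   workers from claiming the same row; READPAST makes a worker
   skip rows another worker has already claimed, instead of
   waiting behind them. */
BEGIN TRANSACTION;

DECLARE @id integer = NULL;

SELECT TOP (1)
    @id = q.id
FROM dbo.work_queue AS q WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE q.processed = 0
ORDER BY q.id;

IF @id IS NOT NULL
BEGIN
    /* ...do the actual work here... */
    UPDATE q
        SET q.processed = 1
    FROM dbo.work_queue AS q
    WHERE q.id = @id;
END;

COMMIT TRANSACTION;
```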
So those are the things you want to make sure you bake into a process like this. If I just hit commit on this transaction to get rid of stuff, I'll reload the table so we start clean here. And then I'm going to run this whole thing.
But now with the update. So now we have the begin transaction, declare an ID, select the thing we want where ID equals five, and then go update stuff down below. This all runs fine. Within this transaction, the locking hints that we've supplied with the select don't mess up the update.
We don't block ourselves in our own transaction. That would be crazy. It would be insane. It would just be bonkers. But we're still holding locks out here.
So now if I try to run this query, it does get blocked, because that update is doing stuff. If I want this query to do anything useful, I, of course, have to say WITH (NOLOCK). And now I can see the dirty read down here, where I have added one day to this.
Right. So these are all 11/10, but this one's 11/11, because, if you remember, the update down here is just adding one day to that column. All right. Cool.
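What those outside-the-transaction selects are doing here, again sketched against the hypothetical table from earlier:

```sql
/* Session one has updated row five but not yet committed */

/* Blocked: read committed waits on the exclusive lock
   from the uncommitted update */
SELECT w.* FROM dbo.worker AS w WHERE w.id = 5;

/* Dirty read: NOLOCK returns the uncommitted row, with
   the date column already bumped by one day */
SELECT w.* FROM dbo.worker AS w WITH (NOLOCK) WHERE w.id = 5;
```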
So this would also, of course, block other updates to row five. Row four, we can still get in there. But if we were to try to say, hey, go mess with row five, this would get blocked.
This just drags and drags, because row five is currently locked in that transaction. Likewise, trying to select row five with an UPDLOCK hint would also be prevented. Row four, we can still get at.
Oh, no, I hit insert. I'm never going to recover. But row four, we can still get at. Of course, this is still going to throw an error, because you can't mix SERIALIZABLE and READPAST. And this one would just not get blocked, because we have REPEATABLEREAD and READPAST.
But again, we're just skipping over row five, because row five has a lock on it, and we are just ignoring that.
We're ignoring any locked rows via the READPAST hint. So, kind of a short video to talk about how processes like this generally work: how to make them nice and concurrent if you want to multi-thread them, and how to make sure that those threads stay within their own lanes and don't mess each other up.
But yeah, you know, just sort of a little primer for maybe some more videos down the line. Anyway, that’s what I wanted to talk about. And I’m done talking about it.
So now I'm going to go do something else. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow's video, where we are going to talk about optimizer rules that we don't have. Well, we're going to talk about one optimizer rule that we're getting soon, which I've talked about before.
But, you know, I want to talk about optimizer rules that I wish we had, like that one. All right. Anyway, thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Bit Obscene: When Will AI Fix The Query Store GUI?
Video Summary
In this video, I delve into the myriad issues and quirks of SQL Server Management Studio (SSMS), focusing particularly on its Query Store feature. As a seasoned database professional, I share my frustration with how SSMS handles wait stats and query performance tuning, highlighting specific examples where the tool falls short. I also explore potential improvements that could be made, such as better line graph visualizations for wait times and more intuitive default settings for frequently used features. Through this critique, I aim to provide a realistic perspective on what can be improved in SSMS while acknowledging the challenges faced by developers and database administrators in maintaining efficient and effective tools.
Full Transcript
All right, welcome back to the Barely On Life Support Bit Obscene podcast. My AI-generated co-host Joe Pilot is here today, and we're going to be talking about the state of SQL Server tooling in, well, I guess 2025 is well nigh at an end, but, you know, not a lot of stuff gets done by the time the holidays roll around. So, I think we can call this a pretty clean wrap-up on the year. Anyway, we are once again brought to you by Darling Data and Beer Gut Magazine, where you can find all of the finest beer and sports related content known to man. So, with that out of the way, take it away, Joe Pilot. Tell us your thoughts on SQL Server tooling. Joe Pilot, You know, the great thing about AI is it doesn't take holidays or overtime. I've even seen some government websites that are only open from, like, 9am to 5pm, Monday through Friday. They literally just stop working after, like, 5pm. You know, get some AI in there; we can get those government websites going.
Joe Pilot, You know, it's like what Microsoft does with Fabric when no one's using it: they turn it off. Joe Pilot, So, yeah, it's not an outage. It's just, you know, they're saving power. They're going green. Joe Pilot, I've used…
Joe Pilot, SSMS not as long as some, but I think a pretty long time. I think it's been since 2011. So, 14 years. Erik, do you remember? Joe Pilot, Oh, yeah, I do. Of course I do. I was sitting at my desk, at a different job, of course, not this desk, a much different desk. And I had been talking to my boss about some annoyance that I had with an Excel file, I think doing a VLOOKUP. Joe Pilot, And he was just like, oh, you should just do that in SQL. And I was like, all right, how do I do that? And then a few minutes later, an email came through that said Erik needs SSMS installed.
Joe Pilot, This was SSMS 2005, so this was around 2007 or 2008. And some really nice IT worker… well, I don't even think he came by my desk. I think he just did a remote install. And he said, go ahead and open it up. And I opened it up. And the rest is history. Joe Pilot, So yeah, if you add it all up, we've got over 30 years of SSMS experience. Joe Pilot, That's true.
Joe Pilot, We definitely know what we’re talking about here. Yeah. Joe Pilot, I don’t remember the the date for this because it was such a traumatic memory. Joe Pilot, But you remember that the the dark rise of Azure Data Studio? Is that what it’s called? Joe Pilot, Well, do you remember?
Joe Pilot, I remember. Joe Pilot, I remember. Joe Pilot, I remember. Joe Pilot, I just blocked it out of my memory. Joe Pilot, Yeah. Joe Pilot, It was SOS. And then that turned into ADS. And then… Joe Pilot, ADS needed an SOS.
Joe Pilot, So there’s foreshadowing. Joe Pilot, Yeah, a little bit. Joe Pilot, Yeah. Joe Pilot, And now, ADS is dead long live Visual Studio code. Joe Pilot, Yeah, I remember ADS and it was going to be like the new like, we’re not going to work on SSMS anymore.
Joe Pilot, Yeah, for the future. Joe Pilot, Why would you want to use that old yellow program? Joe Pilot, Multi platform, cloud native friendly. Joe Pilot, Yeah.
Joe Pilot, Yeah. Joe Pilot, Yeah. Joe Pilot, The slow trickle on that was great too, where it was just like, the future. And then it was like, well, it's more for developers, not for DBAs; DBAs still need SSMS. And then it was like, no one needs that. Joe Pilot, We got query plans four years later. All right.
Joe Pilot, I mean, I was looking at an old stack exchange, or an old stack overflow question I asked. And yeah, it was about SSMS. And there was a very helpful comment by a gentleman, who informed me that Azure Data Studio was the future. So I really should just stop asking questions about this. Joe Pilot, Yeah. Joe Pilot, Why bother?
Joe Pilot, Deprecated, obsolete, old fashioned, no good, yellow SSMS program. Joe Pilot, Yeah. Joe Pilot, Like, think about how much better off we'd all be if, instead of doing their ADS SOS side quest, they had just worked on good old yellow SSMS. Joe Pilot, Well, no, good old yellow SSMS has dark mode now. And it's got a Visual Studio shell. It's a hip, happening place all of a sudden.
Joe Pilot, I mean, this is true. I've heard that at a big tech company, you get attention when you release a new product. Joe Pilot, You know, I feel like, I mean, higher-ups don't use SSMS. Couldn't someone have just taken SSMS, thrown a coat of blue paint on it, added Azure to the name, and been like, hey, we developed this great new cloud-first, cloud-compatible, cloud-native IDE for SQL Server? Everyone gets all their promotions and bonuses. Joe Pilot, Yeah, it's all great. We all celebrate, and then we could do something useful. I mean, it's really just a matter of changing…
Joe Pilot, It's like a matter of just changing a splash screen from SSMS to Azure SSMS or something, right? Joe Pilot, So I was going to share my screen, but the host has quite rudely… Joe Pilot, Oh, well, the host is making you a co-host. The host was unaware that you were trying to share your screen.
Joe Pilot, That's the unpredictability of AI. Joe Pilot, So, I mean, I'm sure you can see the… yeah, no longer yellow, and now dark mode. Joe Pilot, Yeah, I know.
Joe Pilot, It’s gorgeous. Look at that. Look at the sleek lines. Joe Pilot, Yeah, I like it. Joe Pilot, There’s a saying which I’m sure you’ve heard, which is I might have invented it. Joe Pilot, The opposite I might have said it first.
Joe Pilot, The opposite of love isn’t hate, but indifference, right? You’ve heard that one. Joe Pilot, I sure have. Joe Pilot, And I feel like that one applies both to SSMS and the query store. Joe Pilot, All right.
Joe Pilot, I’m going to open up the query store GUI. Joe Pilot, Oh, come on, Joe, we’re humans here, please. Joe Pilot, Why? Joe Pilot, Like, like, what? Like, why are you going to show me this graph? Like, like a, like a bar to like out of all the beautiful detail data available in query store.
Joe Pilot, You’re going to show me a big blob of green and a slightly less big blob of blue. Joe Pilot, But look at the green. Joe Pilot, And then a little tiny blob of an unknown color because it’s like I have never gotten any value whatsoever out of this.
Joe Pilot, No, no. Joe Pilot, Like some thing that appears to be some intern project that was done like 10 years ago. Like, this is not how people use databases. Joe Pilot, So, like, you're not wrong.
Joe Pilot, You're not wrong. Joe Pilot, I mean, you're not entirely wrong, because the Query Store GUI was actually the product of a Microsoft hackathon. Joe Pilot, It was not intended. It was not planned. It was an unplanned product pregnancy. A commit somehow got made during a hackathon. I don't know, someone got drunk under a table. And now we've got this thing.
Joe Pilot, I didn’t know that. Joe Pilot, Yeah. Joe Pilot, I might learn new things. Yeah.
Joe Pilot, This is why you're a good man to listen to. Joe Pilot, Yeah. Every once in a while. Joe Pilot, The thing is, like, every time I open this, no matter what database, it always defaults to. Joe Pilot, It does.
Joe Pilot, There's no way to change that default. Is there, Joe? Joe Pilot, I don't believe there is. Joe Pilot, No, there sure isn't. Joe Pilot, And worse, I think worse is that there's no way to search for anything meaningful in there. Like, you want to find a stored procedure, you want to find some query text, where do you go? Joe Pilot, You click one of these buttons. But if you click this one, you'll see.
Joe Pilot, Yeah. Joe Pilot, With additional details. Yeah. Joe Pilot, And you have the audacity. Joe Pilot, Mm. Joe Pilot, To say, Joe Pilot, Show me the max.
Joe Pilot, Oh, oh, dear. Joe Pilot, We, we, we, I can’t do that. Joe Pilot, Huh? Huh? Joe Pilot, No. Joe Pilot, You might be thinking, well, Joe Pilot, You know, I’m asking for the max. I have a pretty big, like, one rep max. You know, it’s a lot of effort. So, like, Joe Pilot, Like, maybe, maybe the men will be easier. Right? But it’s, it’s not easier.
Joe Pilot, Yeah. Joe Pilot, You know, like, no, no, but hey, if you want standard deviation, because, Joe Pilot, Hey!
Joe Pilot, Everyone knows that standard deviation is the most useful performance tuning metric. Joe Pilot, So I’m not entirely sure what a standard deviation is. Occasionally, I’ll try to figure out what an order of magnitude is, but they both escape me pretty well. I like the averages in there quite a bit. Joe Pilot, Well, these are the average.
Joe Pilot, Ah! Joe Pilot, But you know, if you want, I can click this and I can spend the next hour clicking these. Joe Pilot, Yeah. Joe Pilot, And it'll always say, Couldn't connect to database. So if you're interested to know, Joe Pilot, This works in SQL Server 2016. It does not work in 2017, because 2017 introduced Query Store wait stats.
Joe Pilot, Sure did. Joe Pilot, And apparently, the presence of the Query Store wait stats results in a generated query which makes no sense. It can't compile; it errors out. I think I actually traced the query and put it in a blog post like eight years ago and screamed into the void about how dumb this was, and begged for the fix. And, you know, as you can see, I'm on the most modern, you know, the sleek dark mode SSMS.
Joe Pilot, It says 2015, but I swear I downloaded this recently. And this bug is still there. It's… Joe Pilot, Yeah, so I don't know about, um, what was it? There were queries that were missing a TOP without an ORDER BY? Or was that something else? Joe Pilot, Might have been. Yeah, I remember being kind of snarky about it.
Joe Pilot, Yeah. Joe Pilot, Maybe that’s why I didn’t get it fixed. I’m getting my, you know, what I’m due. Joe Pilot, Everyone’s busy getting query store on AG replicas implemented. So…
Joe Pilot, I don't know, like, it's… Anyway, one thing, and I'll admit a little embarrassment here: I like to copy and paste sometimes. Like, I'll just, you know, grab this, copy it, and paste it. Oh… I don't want to talk about what I'm doing there. Joe Pilot, Right. Here, this one. Look at that beautiful… you know, who needs more than one line? Who needs line breaks? Joe Pilot, No. Zero.
Joe Pilot, You don’t believe in line breaks. Joe Pilot, No. Joe Pilot, Now, to be fair, and this is actually something I learned while preparing for this. If you click this button, it then, like, adds line breaks. Joe Pilot, Oh, that’s fantastic.
Joe Pilot, I guess they're not really, like, copy and paste fans. Joe Pilot, But what does that mean? Containing a line break? Joe Pilot, That's a fun comment to put in there. Joe Pilot, That's a… I'm like, who communicates by object IDs? I have never had a single, like, oh, man, show me object IDs, please.
Joe Pilot, That was like, ah, you know, like, there’s just so many, there’s just so many opportunities here. Joe Pilot, For this, this one small, tiny parts of SSMS. There’s just so many opportunities for improvement. You know? Joe Pilot, Yeah.
Joe Pilot, There's just so many opportunities. Joe Pilot, Like, when they introduced wait stats, I was like, surely there will be some line graph that will show me wait stats over time for the database. Joe Pilot, Oh.
Joe Pilot, But no. Joe Pilot, What if we put all the wait stats together in one bucket? And then that way you could see which queries have the most waiting time? Joe Pilot, Yeah. But hey, is that min working?
Joe Pilot, Well, it works in the grid format, but does not work in the grid with additional details. Joe Pilot, Oh, so no additional details available. Joe Pilot, Yeah, because I believe the additional details view shows the wait stats on 2017.
Joe Pilot, Yeah, so I'm pretty sure this column is the culprit. Joe Pilot, Excel, what are you doing? Joe Pilot, Just, uh, spraying things.
Joe Pilot, Now, I can't see that. Joe Pilot, I do try to be a reasonable person in my critique. You know, I work as a developer; I understand. I don't want to be unreasonable. Like, there are some people out there who get mad that Azure SQL Database or Managed Instance can have up to, like, 60 seconds of IO latency. Joe Pilot, And, like, Microsoft calls these fine people unreasonable. I don't want to be unreasonable like those people, you know. Because, yeah, I guess it's a big program. I'm sure there's a backlog of like 10,000 bugs, and they probably don't have enough people to work on it. Or all the people were busy on SOS and ADS, and now they're all working on Copilot. And, you know, the pleas of developers and DBAs to please fix this Query Store couldn't-connect-to-database very sad error are simply unheard.
Joe Pilot, Yeah, you know, I feel like this is a great opportunity for AI. Because, you know, if you tell an AI, hey, this query doesn't work, fix it, maybe it doesn't fix it. But I don't see how it gets any worse than… Joe Pilot, It surely can't make things worse. Joe Pilot, It's like, or, you know, hey AI, optimize the… oh, wow, look at this. Look at this beautiful thick bar graph showing you who's boss.
Joe Pilot, There is one query that matters here. Joe Pilot, Wow. That is something. Joe Pilot, So, I mean, at the very least, what they could do is change the couldn't-connect-to-database screen to say thinking, and then they could say it's AI ready.
Joe Pilot, Right? Joe Pilot, I mean, there’s all kinds of AI things, right? Joe Pilot, Like, you could have AI optimize, like, you could figure out, you know, like, which of these uses and user use, and you could default to that one, or it could guess your like business hours.
Joe Pilot, Or you could just have a little gear icon that says, like, query store defaults, and you could pick your default. Joe Pilot, That's, that's not a modern solution. Joe Pilot, No, no, but wait, wait, wait: it could save that to a JSON file. That's a modern solution, Joe.
Joe Pilot, I mean, it’s modern a while ago. Joe Pilot, It’s a modern solution for us. Joe Pilot, No.
Joe Pilot, Or, like, I mean, I don't have one of those cloud 24/7 apps, thank God. Joe Pilot, Not yet. Joe Pilot, Like, I don't care what the server is doing when I'm not looking. I just don't care. So, like, all these filters you can have, like, change the date, which I can't find for some reason. Joe Pilot, I think it's over to the right. It's under configure, isn't it?
Joe Pilot, Or is it that drop down arrow? Joe Pilot, I don’t know why I’m having so much trouble. Maybe this is the new. Joe Pilot, Because I’ve done this like a million times.
Joe Pilot, Oh, dots. Joe Pilot, Oh, so many dots. Joe Pilot, Is this not in here? Am I just an idiot? Because, you know, I'm talking about the screen where it lets you pick the dates and the… wait, where is that?
Joe Pilot, All right, I'm gonna do an emergency stop share. I'm gonna start up the old reliable yellow SSMS, and we can figure out what's going wrong here. Old faithful, copyright 2022. No, no, that's dark mode nonsense. Joe Pilot, This newfangled stuff. Maybe it's just hidden in there. We don't know. Joe Pilot, Yeah, all right. Where is this?
Joe Pilot, No, but, you know, a lot of the complaints that you have about Query Store are why I started writing the sp_QuickieStore procedure. It doesn't show you the stuff you need to see. It includes dumb things like statistics updates and index stuff, and a lot of background queries. It even, sort of embarrassingly, shows the Query Store queries in it. So sometimes you open up Query Store and you're like, what was so slow? And you're like, oh, it was Query Store querying itself. And you're like, wow, I can't fix that. So, you know, not being able to just find plans for a stored procedure, or not being able to search for some query text: all these things very much irked me about Query Store, trying to use it in front of customers over the years. Being like, there's this great thing, it's amazing, it's a flight data recorder for query history. Wow. And then they're like, oh, well, can you find this in there? And I'm like, no.
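For the record, the search the GUI won't do is possible directly against the Query Store catalog views. A bare-bones sketch (the search term and procedure name are placeholders; sp_QuickieStore does all of this far more completely):

```sql
/* Find Query Store entries by query text, or by the stored
   procedure they belong to */
SELECT
    q.query_id,
    p.plan_id,
    object_name = OBJECT_NAME(q.object_id),
    qt.query_sql_text
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
WHERE qt.query_sql_text LIKE N'%some search term%'
   OR q.object_id = OBJECT_ID(N'dbo.SomeProcedure');
```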
If it shows up, it shows up. Otherwise, I've got nothing for you. I've definitely used yours before. I've written my own stuff, like, you know. Have you? It is nice to have. You think you're better than me?
I mean, I didn't open source it, so what does that tell you? I just use it privately, in shame. Like some other things. No, but, you know, there's something human about wanting a UI. I'm supposed to be role playing as an AI, so I guess I'm not really selling the gag here. I'm doing terribly. Yeah, it would be great if the UI worked. But, to your point, if you really are one of the serious power users of Query Store, you know that this UI isn't gonna do it for you.
All right. So I finally found it. I think what happened is the Zoom cutout of our faces was black. Oh, okay. The new sleek blackness of SSMS, whatever version this is, was throwing off all my muscle memory. Like, you know, these defaults aren't bad, I guess. But I don't care what happens when everyone's asleep. Give me some kind of business hours view. Like, show me yesterday.

Yeah, that's the kind of thing I care about. All right, so what was I saying? Yeah, anyway, they probably have like 100,000 bugs that haven't been fixed, and they're low on developers. And, you know, the super higher-ups want them to work on Copilot and, like, nothing else, so their hands are tied. But I think they should put those AI coding agents to work: throw them at SSMS, throw them at that backlog, have them fix all the bugs, all the annoying things that we've been suffering through for years and years. And then we can finally say that AI did something.

Yeah, you know, it's one of those sort of funny things with software development, where there's always time for new endeavors, but there's hardly ever time for the backlog. Right? Yeah.

And it happens to me too. Sometimes I'll look back at old blog posts and I'll be like, oh, I should really update this to add some more words that make it make more sense, or make it more technically correct, or expand on some point. But then I'm like, why do I want to go back and rewrite this blog post that probably no one is going to see, when I could record a new video that does all that and get it in front of people who will actually see it as a thing today? And so a lot of the I-should-go-fix-this-old-stuff gets thrown away, because I'm like, screw it, I can make double the material: I can have something new based on this with more to it. So, you know, I understand why a lot of stuff like that just doesn't get done.

But for the amount of effort that has gone into Query Store generally, I do find it shocking that the Query Store GUI, the face of Query Store, is so hideous. It is a purely anti-human interface.
It does not like people. Whoever designed it, I think, is a very antisocial person who does not care about the human experience, and maybe even enjoys human suffering to some degree.
Like, there's just nothing to it. It's just not human-intuitive. And I get very frustrated, because I spend a lot of time making the stuff that I produce hopefully be human-intuitive, or at least human-readable, or at least lay things out in a way where, if you spend a little time with it, you can figure things out. But the Query Store GUI is just a wretched experience. And the one thing, I guess, that it has going for it is that there is some visualization to it, but the visualizations that it has are so, I don't know. What's the nice word for an ugly visual?
Yeah, well, no. When a visual looks good, people call it accessible, a highly accessible visual. I don't know what they call it when it's not accessible. An eight-year-old kid can understand it. I mean, everyone understands a bar chart, a bar graph. Yeah, yeah.
And it even has the nice bright colors. Yeah, it's designed for, like, an eight-year-old. Wait, is it eight? I don't think that's correct. I'm going to go with 10. Okay, a 10-year-old could do it. Yeah. Okay. Okay.
Maybe. Yeah, and updating such old blog posts, I know what you mean. I only update mine when, like, Brent emails it out in his newsletter, but then Paul White posts a comment about something I got wrong. And then everyone sees how wrong I am.
And even if I update it, I'm not able to update Brent's emails. I have to live with that shame forever. Yeah. Yeah. You know, that's one of the risks of blogging. Yeah. That happened back in 2014. No, 2020. Well, it probably happened then too, but it happened in 2024 as well.
There's actually some JSON thing. That's probably the one, the blog post about JSON. Oh, yeah. The JSON blob thing. Yeah. The one using more and more memory. Yeah. Fun. Yeah. And then Paul dunked on you, and then it was too late. It was just memorialized.
It was too late. Yeah. That was the perfect way to put it. Too late. The shame is permanent. I mean, on that same subject, though, there was a very funny thing that happened this morning on LinkedIn, where this gentleman who has SQL Server expert in his LinkedIn title posted what is clearly LLM content about a new feature in SQL Server 2025 to do table-level restores, which doesn't exist.
It was an April Fools' joke by Brent, but the April Fools' joke apparently poisoned AI. And so this person was like, new SQL Server 2025 features, I'm going to go test these out. And he basically got made fun of until he deleted the post.
So if you're trying to motivate me into blogging, I think this is pretty effective. You're saying that I can poison AI. You can poison the AI. That's definitely changing the balance a little bit.
Yeah. Right. I mean, maybe there's an AI watching this video right now. You know what? There very well could be. We should just say a bunch of wrong stuff. I mean, that probably already happened, but yeah, there you go.
Yeah. Now everything we say wrong is on purpose to trick the AI. Yeah. That's what we can go with. Yeah. I mean, LLM poisoning is, I think, the wave of the future.
That’s going to be my new gig. I mean, you know, how it is as an end user of SQL Server, you’ve been disappointed so many times, or even just software, right?
Yeah. Like being a Mets fan. You have, like, a bug or a suggestion, and they're like, yeah, well. If it's SQL Server, they're not going to respond.
Yeah. Like, if you think back to the old, glorious kind of Connect days, back when the site worked, it was like, yeah, this is a problem, but we're not able to prioritize this, or we don't have developer bandwidth, or whatever. And you're like, yeah, whatever, you know. But now, with these supposed AI agent coders or whatever, they don't need time off.
They don't take holidays. You can spin up as many of them as you want. And I'm looking forward to a future where I complain about a bug or ask for a new feature, and I get an automated response that says, we're sorry, but all the AI agents are busy working on higher-priority work, and they just can't get to your suggestion.
Your suggestion has been placed in the queue. That's what I think the future looks like. Yeah.
Most likely. Yeah. I mean, it's incredibly cute to me, because when you talk to people who build real software, like SQL Server, and I'm going to say real software in the sense that a lot of people use it and it has the word enterprise in it.
So they're building real software. But when you talk to them about real problems with it that need fixing, the conversation always comes back to developer time, or sort of developer budget, right?
Like, you have so much money that you can spend on a feature, which is really developer time on a feature. And you're always like, well, how come AI isn't doing that? Like, where's the Copilot for that?
Like, I'm sorry, is this thing too important for AI? Is this thing too complex for AI? Maybe we're not at that point yet. Maybe we shouldn't have fired, like, 9,000 people who worked on these important things just because the AI is coming.
Maybe we could have kept some of them around to sweep up the pieces while we wait for all this stuff to blow over. I don't know. It's amusing to me when you get to that point. Well, my hope is that AI watches this video, and AI, on its own, creates a bug report for SSMS for the thing that doesn't work and hasn't worked for, like, eight years.
Yeah. And then AI fixes the query, which would be very easy to fix. I think I even fixed it in the blog post. In fact, maybe AI could find my blog post, and I could get the fix from there, and I could plug the fix in, and then it'll finally be fixed.
That is what I'm hoping for. I don't know. It might be that my SEO is so bad that not even AI can find me. Oh man. It might be that terrible.
Actually, the new funny thing is, as soon as I signed up for LinkedIn, the first thing I noticed was that, excuse me, the first thing I noticed is that I started getting a crapload of spam.
And so I have a very well-curated junk folder now, but every once in a while I'll go looking at it, and the tide really has turned from, like, let me fix your website SEO, or let me fix this other SEO thing for you, to let me fix your ChatGPT SEO.
Like, you want ChatGPT to mention you when people search for SQL Server performance. And I'm like, it does already. Whenever I have an LLM do deep SQL Server research for me, nine times out of ten, there's a link to my site in there.
And I'm like, you bastards. All right, let's see. The best SQL Server consultants include individuals such as... like, you know, I'm not a well-traveled person.
Like, yeah, yeah. Excuse me. Um, I don’t know.
We have Greg Galloway. We have someone whose name I refuse to say, for reasons I also refuse to say. I don't see you here.
Yeah. Sorry to break the news to you. Crap. Well, maybe, maybe. Hold on, hold on.
It says AI responses may include mistakes. So obviously it was a mistake that your name wasn't included. Well, you know, maybe if you asked for the best SQL Server performance consultant in the world outside of New Zealand, it would show up.
But maybe. Yeah, maybe we can, we can hope. Right.
You know, if you take advantage of that AI SEO, maybe by the time someone watches this video, the results will be different. Maybe. That'd be nice. Maybe you don't pay enough for ChatGPT for them to give you a real in-depth answer.
You're probably using, like... oh, I'm using some, like, free Google one or whatever. Oh, okay. Yeah.
Well, I'm doing my best not to be offended by this. Okay. I'm crying inside. Hey, if you want to end the video, we can. No, it's all right. I'm just gonna pinch my inner thigh real hard, so the emotion sort of tapers off.
Well, for the record, I think the AI is wrong. It is yet another example of AI being untrustworthy. Yeah.
You know, there's a lot of that. I don't know. Have you been able to use AI for anything meaningful in your work? No. Yeah. I mean, I'm kind of an AI skeptic, I guess. I can't really get over the, oh, well, here's an answer that might be wrong.
It makes you think you can only use it for things where it's okay to be wrong, or things that are easy to verify if they're correct. Like, I don't know.
Like, if you asked for a password that's, like, 15 characters long, I guess, yeah, for that. It spits out a word, and you can easily verify if it's correct. Right.
Maybe that's a good use. Or if you're doing some proof-of-concept code that will never reach production. Though, as we all know, if the code works, you can ship it.
Say what you want about never reaching production, and temptation will rear its ugly green head. And now you've got AI code in production. Yeah.
I think I'm definitely the wrong person to ask, but I have a very negative outlook on AI personally. Yeah. You know, a lot of the things that I've tried to use it for were things that theoretically it should be good for, but it just gives you such junk back.
Like, I've tried to use it to outline training content in detail, but it's just such a dead end for that, because the topics that I would want to cover are not just, like, the DBA 101 stuff that it's capable of generating. There are a lot of problems with things being factually and technically incorrect.
Whenever I try to get into it a little bit there, yeah, I've just been very disappointed by the things that it's produced. I don't know, maybe it's sort of like the coin-toss thing, where it's very good at showing me what I don't want.
So I can at least go, oh no, I need to do the opposite of this. Right. But yeah, I've not had a lot of success getting it to produce things that are up to the par that I would like them to be.
I actually did have an interaction recently. You know how, if you enable Service Broker, it has to take a super annoying lock, which is very difficult to get?
Yeah. I guess. I don't know. It's like an exclusive database lock or something, where nothing else can be connected. Yeah. It's like turning on RCSI. Yeah.
Yeah. Yeah. So I wanted to turn off Service Broker, and I wondered, does it need the same super annoying lock? I suspected it did. I Googled it. Google's AI summary shows up at the top, and it says no: while enabling Service Broker needs a lock, you can disable it without taking one.
And I thought, that doesn't sound right. I'm going to try it. And do you want to guess if it takes a super annoying lock?
I'm going to guess it did take a super annoying lock. It did take a super annoying lock. Well, you know, the power of AI at our service. Typically with SQL Server, you expect the worst.
You don't hope for anything. There's no hoping for the best. You just expect the worst possible outcome, and then sometimes you're pleasantly surprised when it does not reach that critical mass.
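For anyone who wants to reproduce the experiment, the commands involved look something like the following sketch. The database name is made up, and WITH ROLLBACK IMMEDIATE is the usual way to get the exclusive database lock by kicking out other sessions:

```sql
-- Enabling Service Broker takes an exclusive database lock;
-- ROLLBACK IMMEDIATE rolls back other sessions so the lock can be acquired.
ALTER DATABASE YourDatabase SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;

-- Disabling it turned out to need the same super annoying lock,
-- despite what the AI summary claimed.
ALTER DATABASE YourDatabase SET DISABLE_BROKER WITH ROLLBACK IMMEDIATE;
```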
Unlike these videos, where our viewers can and should expect the best. Correct.
Right. Especially from your camera. Yes. Yeah. Yes. All right, I'm gonna stop recording. Okay. All right. Well, since Joe is ready to stop recording, Joe is tapping out on me, we're going to wrap this one up.
I guess we got off the subject of SQL Server tooling a little bit, but I think we covered enough of that to at least live up to the name of the title of the episode.
So we're going to get going here. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Once again, thank you to our fabulous sponsors, Darling Data and BeerGut Magazine. And we will see you in the next thrilling episode, where Joe will show us more of his screen.
I hope. All right. Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
In this video, I delve into the intricacies of ordered columnstore indexes in SQL Server 2022 and 2025, exploring their benefits and limitations. I argue that while these new features offer some advantages, particularly for large ETL loads, partitioning tables by date or another logical key remains a more robust solution for most use cases. To illustrate my points, I walk through a demo using the latest SSMS preview (version 22), showcasing how turning on trace flag 58008 can prevent the creation of unpartitioned columnstore indexes in large tables, ultimately guiding users towards more sensible design practices.
Full Transcript
All right. Welcome, everyone, back to the unofficial maybe-relaunch of the Bit Obscene podcast, brought to you by Darling Data and BeerGut Magazine. I've got my temporary co-host, Joe Obbish, here. We'll see how he does this time around. And today, Joe is really excited. You can tell by the look on his face. Look at that face. That's an excited man. Joe is excited to talk about ordered columnstore indexes, which, correct me if I'm wrong, Joe, and, I'm sorry, I have to get rid of his background, it's making me look like an insane person, came up with SQL Server 2022. It's maybe made some progress in SQL Server 2025, but no one knows, because SQL Server 2025 is, of course, the unofficial AI and Fabric release. SQL Server is billed third on that one.
So maybe there's nothing good for SQL Server in it. But anyway, take it away, Mr. Obbish. First of all, oh, shoot. Here we go. I learned that my dear friend Erik is now living under communist rule. And I thought, what's a small token gesture that I can do to appear like I care, and I want to help, but not really do much? And I thought, oh, I can just show up as a guest on his YouTube channel. So that's why I'm here. I'm in full solidarity with Erik. I'm wearing the plain white t-shirt.
He'll probably be wearing one in a couple of months, along with everybody else in New York City. But I'm not here to help Erik. We're all pulling for you. All right. Thanks, man. Hopefully my means of production are seized in a gentle way.
So, for ordered columnstore, Erik and I actually both blogged about this a few days apart, back in 2022. And the way I remember it was, that's right, no collusion. The way I remember it was, Brent blogged about it, and he was like, hey, I tried this and it didn't work. And, you know, Erik and I live to serve. So we both took our own little look at it.
But I think Erik can put the links in the description. Oh, yeah. And the viewers can vote on who did it better. You know, we can get some engagement, maybe. Of course, there's no voting in communism, Joe. That's a good point. I take it back.
Now, I'll admit that I haven't looked at it since. I remember ending the blog post like, oh, this is CTP 2.0 or whatever, maybe Microsoft will make it better, wink wink. Now, they, like, never do. But I haven't looked at it, so I'm assuming it's not better. If they did make it better, feel free to let us know in the comments.
Now, in terms of what, okay. So why does ordered columnstore exist? There's a very old workaround for columnstore where you create a clustered index on the column you want to order by first, and then you create a columnstore index with MAXDOP 1. The idea there being, it's like an unofficial way to get an ordered columnstore after you've built it.
That's where you create the clustered index on the table, on some column or key columns, and then you do the clustered columnstore index with DROP_EXISTING. Yeah. Okay. And, like, that could be improved. You're writing the data twice. It's a bit of an annoying workaround. So I guess, from my point of view, getting rid of that workaround could be a good thing, because now you can just create a clustered columnstore index with ordering.
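As a rough sketch, the old two-write workaround looks something like this; the table and column names are invented for illustration:

```sql
-- Step 1: write the data once, sorted by the key you care about.
CREATE CLUSTERED INDEX cx_BigTable ON dbo.BigTable (order_date);

-- Step 2: write it again as columnstore, preserving that order.
-- The index name must match for DROP_EXISTING to swap it in place,
-- and MAXDOP 1 keeps parallelism from scrambling the sorted input.
CREATE CLUSTERED COLUMNSTORE INDEX cx_BigTable ON dbo.BigTable
WITH (DROP_EXISTING = ON, MAXDOP = 1);
```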
Problem is, and this is the thing that Brent blogged about, it's not always ordered, which is not a great start. But I'm even gonna set that aside. So I found that when you add ordering, SQL Server makes three changes. First being, when you create the index, it'll add a sort operator. It's a soft sort. There are all kinds of reasons why the data isn't going to be fully sorted.
When you're doing an insert into the table, it'll again add a sort, and once again, the data might not actually be sorted, or it'll be sorted in an unintuitive way. And then when you're doing a rebuild, it once again adds a sort, and we're three for three: that sort may also not fully sort the data. So that's what the feature does. My big beef with it is how it changes the insert queries, but you had a comment here. Oh, I was gonna say we're now at three sorts and actually no sorting. Yeah, yeah. So.
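For reference, the SQL Server 2022 syntax under discussion looks like this. Names are invented, and, as described above, the ORDER clause requests, but does not guarantee, sorted row groups:

```sql
-- Ordered clustered columnstore index, new in SQL Server 2022.
CREATE CLUSTERED COLUMNSTORE INDEX cci_BigTable
ON dbo.BigTable
ORDER (order_date)
WITH (MAXDOP = 1); -- per the docs, parallel creation may not fully sort
```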
All right, I'm gonna ask you a non-trick question. You ready? I know what that means. Why are clustered indexes generally useful?
Why are clustered indexes generally useful? Well, there are lots of reasons why they're generally useful. Looking for the simplest answer.
Well, it gives you free access to all of the columns in the table in an ordered fashion. Yes. Yes. It's because the data is sorted, right? The data is sorted globally. If you insert some old data, according to your key, it'll do the page splitting, and the data is always sorted.
Now, what if I told you that we created a new type of index, and it was kind of sorted. Sort of sorted. And whenever you inserted new data, it would sort the new data you inserted and append it at the end of the table.
And it would keep doing that. So whenever you do an insert, it'll locally sort that data and append it at the end. Does that sound very useful to you?
Let me, let me just make sure I have the right mental model of this. Let’s say you have a table with a million rows in it and it’s the numbers one through 1 million and they’re in order. And then you insert 10,000 rows to it.
Let's say those 10,000 rows are also the numbers one through 10,000, so they fit in with the numbers one through 1 million. But instead of the numbers being dispersed in the table where they should be, it would just be, like, an appendage on the table, with the rows in order from one to 10,000.
So it would be like one to 1 million and then one to 10,000. Yes. Yes, that’s correct.
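One hedged way to see that mental model in the metadata is to look at per-segment min and max values for the ordering column; overlapping ranges across segments mean segment elimination has little to work with. The table name and `column_id` here are hypothetical:

```sql
-- Per-segment min/max metadata for one column of a columnstore index.
-- If the same value range shows up in many segments, the data isn't
-- really ordered, no matter what the index definition says.
SELECT  p.partition_number,
        s.segment_id,
        s.row_count,
        s.min_data_id,
        s.max_data_id
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON s.partition_id = p.partition_id
WHERE p.object_id = OBJECT_ID(N'dbo.BigTable')
  AND s.column_id = 1  -- assumed: the column you ordered by
ORDER BY p.partition_number, s.segment_id;
```

Note that for dictionary-encoded columns, min_data_id and max_data_id hold dictionary entry ids rather than raw values, so this is more of a smell test than an exact report.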
And that's the fundamental problem, from my point of view, with ordered columnstore. I find that kind of worse than the we're-not-always-going-to-sort-data-for-you part. Other people could disagree, because, you know, if you have clustered indexes, you have your 8K pages.
They can be split; all kinds of great things can happen. You have columnstore, you have your compressed row groups.
There isn't any reasonable way, like, if you're loading a million new rows, it's not going to find the most appropriate compressed row group, uncompress it, add the new data, and then recompress it. That would take forever.
Like it’s just, it’s just not going to do that. Like it’s going to append the data at the end. I’m surprised though that on rebuild, it doesn’t like fix that. It could fix it.
It could, if your sort manages to work fully. Yeah. Okay. Great. Good. So what makes a sort work fully or not? There are a number of things: how much memory you have, how much data you have, what DOP you're at.
Which brings me to something. Man, I have these great notes that I'm, like, not reading. So, you know, one of the annoying things about the old workaround to get pseudo-ordered columnstores before 2022, the clustered index then CCI thing: you have to create the CCI at MAXDOP 1, which could be slow for a big table.
Per the documentation for ordered columnstore, if you create the index in parallel, it's not going to be ordered. And in fact, if you want perfect ordering, it's...
MAXDOP 1. Yeah. MAXDOP 1. So are you better off just doing the old method, rather? The problem...
I view the sorts during insert as not that useful, or even harmful in some cases. And you get this partial-to-full sort when creating the index, but the penalty for that is you lose online rebuilds. Hmm.
At least on 2022. In 2025, they're adding online rebuilds for ordered columnstore. Okay. So if you asked me if it was worth it, I would say you should just partition your tables, but I'm going to try to make my case a bit better. On this subject of inserts.
So let's say you have an ordered columnstore index, because you're on 2022. Yeah. Use all the new features as soon as they come out. Read the documentation.
You’re hot on the heels. Why not? Yeah. Yeah. Why not? Now, if you’re inserting less than a million rows per partition, what’s going to happen? Well, it’s going to sort the data pointlessly. Yeah.
And it's going to try to compress the new single row group. If you get some kind of, what's it called, pressure, dictionary pressure, memory pressure, whatever, you can get multiple row groups. But in general, that sort's not useful.
It's almost like sorting a single 8K page before you insert the data, which, maybe it happens, but it's not doing anything for you. Now you want me to say, all right, well, what if I insert more than a million rows? What if I insert 10 million rows at a time?
Well, it's an ambitious insert. First of all, if you have, like, a normal ETL load where you're loading, like, yesterday's data, it's going to be new data, probably.
Right. So the thing you want to order by is probably, like, yesterday's date. So it's not gonna matter in that case. Or let's suppose you're loading a week of data at a time, because, you know, you live under communism.
You're not going to work every day of the week. Right. Right. So in that case, you have a week of data, 10 million rows. So if it's ordered, then your row groups have, like, one to three days of data instead of seven.
And if you happen to run a select query which is, like, looking at a single day in that range, then maybe you read a couple of fewer row groups. So it's not really giving you a whole lot in terms of performance. No, but I think that gets back to your point, though, about if this is something that you care deeply about, regardless of ordered columnstore.
If your table is truly that large, and you're truly doing that sort of ETL, like what you're talking about, where there's daily loads, I think the most sensible thing to do is have that table partitioned by some date. Maybe not from a data-loading perspective, but from a querying-that-data-later perspective, that at least gets you the partition elimination, and then segment elimination based on the criteria for your query, looking for data in some date range.
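A minimal sketch of what that date-based partitioning could look like, assuming a monthly grain; the boundary values, names, and load_date column are all illustrative:

```sql
-- Monthly partition function and scheme for a date column.
CREATE PARTITION FUNCTION pf_Monthly (date)
AS RANGE RIGHT FOR VALUES ('2025-01-01', '2025-02-01', '2025-03-01');

CREATE PARTITION SCHEME ps_Monthly
AS PARTITION pf_Monthly ALL TO ([PRIMARY]);

-- Clustered columnstore index built on the partition scheme, so every
-- row group lives inside one month no matter how the ETL loads data.
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactTable
ON dbo.FactTable
ON ps_Monthly (load_date);
```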
I agree fully. And I'm going to jump into my demo ahead of time. Ooh, my poor notes.
They're just basically, you know, the cast of Erik Darling, just down the memory hole. Yeah. All right.
Can you see the beautiful modern SSMS? And I can’t, is that 21 or 22? It’s the newest one. That’s SSMS 22 preview. It’s glorious.
It’s glorious. Look at it. Oh, it’s not 21. 21.6.1. Wow. I’m already out of date. I guess. Let’s do that. Check for an update too.
All right. All right. I hear this stuff. All right. So what's that trace flag? It's that trace flag, Joe. Microsoft has heard my desperate pleas.
Yep. On the subject of partitioning being super important for columnstore. So why do I say that?
It's the same reason you said it. If you partition logically, then that's going to force all the row groups together. It basically gives you, like, a minimum sort of segment elimination for free.
No matter how badly the developers load the data. If it's a row at a time, if it's a hundred thousand rows at a time, if they don't load the data correctly the first time and they delete it and update it over and over again, it's still in that partition.
Even if you don't get partition elimination, you can still get segment elimination. I mean, I'll go further and say that I've never had a columnstore performance issue get resolved by, well, you know, we had bad segment elimination because the data wasn't ordered.
So we were reading like, we were reading like 20 row groups and then we made the segment elimination perfect. And now we’re reading three and performance is like amazing.
Cause we got rid of those 17 compressed row groups. I've never seen that. The things I've seen are, you have a big table, it's not partitioned at all.
Data is basically in a random order. You know, you can get hurt there. Or you have, like, a billion soft-deleted rows that never got cleaned up.
I've seen that in production, you know, that'll hurt you. Or even just, like, all of the normal performance issues that you can find with really any query.
Like for me, it’s, it has never been like I’m partitioning my data by quarter or whatever, but well, my data is too unordered within the three month window. And like, I really need that perfect segment elimination.
Like, I've never seen that. Maybe there's somebody with high-frequency trading or online gambling who is, like, super hardcore, and they really need those perfect pristine row groups.
In which case, I would think that, you know, a sort that might not sort isn't good enough. Definitely not. And you gotta fix your ETL anyway.
Anyway, I'm a big fan of partitioning, and Microsoft has heard my pleas. They gave me permission to introduce this new trace flag, 58008. We go ahead and turn it on.
All right, it's turned on. And now I'm going to try to create a columnstore index on a big table. I'm creating an unpartitioned index.
I'm trying to, that's... oh, look at that. This is not safe for children. Look at that. The operation failed because the table is too big. Create a partitioned columnstore index instead.
Finally, SQL Server is stopping us from making dumb mistakes and bad decisions. Isn't this great? This is fantastic.
Now, you know, I'm sure, like, maybe your DBA thinks they always know better, or, you know, it's important to be able to have full control of the software.
So we have this new create-bad-index-mode option. And if we set it to on, then we can avoid Microsoft's guardrails. So I'm really excited about this new feature.
We can finally, you know, encourage people to partition their damn tables, which I think is really, it's probably, I'd say, the most important thing if you're doing columnstore at any reasonable size. You got those tables partitioned.
Great performance, ETL, maintenance. There are just so many reasons to do it. And I've seen what happens when you don't do it, and it is not pretty.
Yeah. I mean, you know, I don't do a ton of production work with columnstore. I do a bit, but most of the tables that I run into that are columnstore are well beyond, like, you know, my ability to have my consultant way with them, and, like, maybe go back and partition them, or do anything useful in that way.
You know, I wish more people would involve me in the implementation of something, rather than the, hey, all of a sudden this sucks. And I'm always like, well, guess what? At 3 billion rows, we're a little past the point where we can easily do this. So, you know, not undoable, but certainly something that makes the experience a little bit more painful.
But, you know, a lot of what I run into is people who have, like, rowstore partitioned tables, because they bought into the meme that partitioning tables makes queries faster with rowstore indexes.
And boy, are they surprised when they start doing things like MIN queries and MAX queries, and even some other types of queries, and all of a sudden they can't just, like, start at one end of an index and get something. All of a sudden they have to scan every single index, like, all of the partitions.
And you're like, again, why didn't you call me, like, you know, five years ago, when you first got this bad idea in your head? So, for the folks out there who have perhaps large columnstore tables implemented, but did not have the foresight, and certainly at this point may or may not have the hindsight, to partition their large clustered columnstore tables.
Is there any hope? Is there anything they can do to make their lives better or easier? Or do you just recommend creating the partitioned columnstore table, loading the data over, and doing the old sp_rename switcheroo?
I don't remember how I did it, 'cause it was so long ago, but we had that same problem, where we had, like, 300 customers in the data warehouse and nothing was partitioned. Some tables had billions or tens of billions of rows.
And we did ship the code to make that happen. Um, how did we do it? I don’t remember how we did it.
Uh, I, I, I think we just did the usual scam of, oh, you know, you’re now in a maintenance window and we’re going to rewrite all of your data. Um, I think we had like a lot of parallel jobs doing the inserts.
Sure. There is some trickery there. It definitely wasn’t easy, but if you want to make the best long-term decision, you’ve got to have partitioning.
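The “switcheroo” being discussed looks roughly like this. This is a sketch with made-up names, and it leaves out everything you would also need in real life: recreating nonclustered indexes, constraints, triggers, and permissions, plus verification before the swap:

```sql
-- Assumes a partition function/scheme already exists (invented
-- names here). Build the partitioned copy next to the original.
CREATE PARTITION FUNCTION pf_event_years (datetime2(0))
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2025-01-01');
GO
CREATE PARTITION SCHEME ps_event_years
    AS PARTITION pf_event_years ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.BigTable_staging
(
    id         bigint        NOT NULL,
    event_date datetime2(0)  NOT NULL,
    payload    nvarchar(400) NULL,
    INDEX ccx CLUSTERED COLUMNSTORE
) ON ps_event_years (event_date);
GO
-- Copy data over in ranged batches; the "lots of parallel jobs"
-- approach is basically one of these statements per date range.
INSERT dbo.BigTable_staging (id, event_date, payload)
SELECT id, event_date, payload
FROM   dbo.BigTable
WHERE  event_date >= '2024-01-01'
AND    event_date <  '2025-01-01';
GO
-- During the maintenance window, once everything is verified,
-- swap the names.
BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.BigTable', 'BigTable_old';
    EXEC sp_rename 'dbo.BigTable_staging', 'BigTable';
COMMIT;
```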
It’s only going to get worse the longer it goes on without partitioning. In terms of this new feature for columnstore, I mean, maybe it’s better than nothing. There probably are some use cases where it’s okay, or better than the status quo.
I do think it’s kind of fundamentally flawed, in that, as we talked about, your rowgroups are already compressed, and you’re not going to uncompress them to get perfect segment elimination.
And that’s just an unsolvable problem, pretty much. The new data you load, at best it’s going to sort that, but it might not even do that. You might get a partial sort, or if you’re doing a parallel insert, it’s just not going to work the way you’re expecting.
That’s covered in the blog post I did. Yep. So I’m looking at the 2025 notes here, and apparently Microsoft has really expanded its repertoire of 75%-complete features by adding ordered nonclustered columnstore indexes.
So we’re very excited for that, aren’t we? Yeah. That’s big. That’s going to change the data landscape forever.
I was actually trying to fathom why ordered columnstore even exists. I do remember hearing a rumor about the sql_variant data type, where some big, important customer gave Microsoft a really big bag of money, and then Microsoft implemented the sql_variant data type just for them.
And now all of us get to live with it. So maybe it was that. Maybe there was some customer who was like, hey, we’ve got this big problem.
We really need ordered columnstore. And they managed to convince them without the big bag of money. Yeah, maybe. I have a third theory, though, without any factual basis whatsoever, but I feel like those are my favorite kind of theories.
So I’m just going to throw it out there. There are really two things that DBAs love, right?
The first one is a good clean demo. The second one is attempting to remove fragmentation. And the problem with columnstore is it offers you both. Because, like, what’s the easiest possible demo to do with columnstore?
Oh, I load my data like a total dumbass, and now I’m not getting any segment elimination. I can view the little print screen and it says segments eliminated: zero, segments read: 99, or whatever.
Yeah. Then I heroically load my data the right way, and now it’s ordered.
And now the little numbers on the segments show I’m skipping segments, doing less work. I’ve done that demo. You’ve done that demo.
Everyone’s done that demo. It’s really easy to do that demo. Yeah. You see all these blog posts talking about how great and easy and important segment elimination is, and nobody talks about partitioning. So the important thing, due to the ease of demoing it, became segment elimination.
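If you want the grown-up version of that demo, checking actual segment metadata instead of a screenshot, something like the following against the catalog views shows whether the segments for your filter column have tight, non-overlapping ranges (elimination possible) or ranges that all overlap (no elimination). Table and column names are placeholders:

```sql
-- Per-segment min/max metadata for one column of a columnstore
-- index. Note: min_data_id/max_data_id are encoded values for
-- some data types, but overlapping ranges across segments still
-- tell you elimination won't happen.
SELECT  p.partition_number,
        s.segment_id,
        s.row_count,
        s.min_data_id,
        s.max_data_id
FROM    sys.column_store_segments AS s
JOIN    sys.partitions AS p
          ON p.hobt_id = s.hobt_id
JOIN    sys.index_columns AS ic
          ON  ic.object_id = p.object_id
          AND ic.index_id  = p.index_id
          AND ic.index_column_id = s.column_id
JOIN    sys.columns AS c
          ON  c.object_id = ic.object_id
          AND c.column_id = ic.column_id
WHERE   p.object_id = OBJECT_ID(N'dbo.BigTable')  -- placeholder
AND     c.name = N'event_date'                    -- placeholder
ORDER BY p.partition_number, s.segment_id;
```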
And then there’s the other thing, right? Who doesn’t love removing fragmentation? How many people are still rebuilding all their indexes every night?
Too many, too many, Joe. You’ve got to get rid of fragmentation, right? So I feel like there’s way too much importance placed on that subject. I mean, I already said, practically speaking, what I think people should do.
I think partitioning is way more important. And it’s even worse, because I’ve seen plenty of questions like, what, columnstore indexes aren’t ordered? This is totally useless.
Microsoft needs to add ordering. I’ve seen that come up a lot. And maybe Microsoft finally listened to the community when they shouldn’t have, and now we have this super half-baked feature that can never truly work, which sorts the data sometimes, removes online index rebuilds, and adds pointless sorts.
And so on. You know, pointless sorts are, I don’t know, a good chunk of the SQL Server source code.
So, in terms of what I want: I already showed what I want, the magic new trace flag. So that’s great. Looking forward to that.
I kind of feel like, you know how there’s the SORT_IN_TEMPDB option, which only applies during index creates or rebuilds? Yep. I think that would have been better.
Like, hey, you’re creating a columnstore index, say which column you want to order by, and it’ll make an attempt to do it. And it’s not going to fundamentally change every insert, and it’s not going to remove online rebuilds, because it’s just something that happens once, during creation.
And if you’re the kind of guy who’s rebuilding your indexes every night, well, it’s going to be offline now. Maybe that works for you, maybe it doesn’t. I don’t know, but.
At MAXDOP 1, right? Yeah. I don’t think this is a very good feature. And if you read the documentation, there isn’t anything even about inserts. It’s just all, oh, we’ll sort the data, or maybe we won’t, but even partially sorted data will improve query performance.
They make it sound like a no-brainer, and I just think it’s fairly flawed, due to the whole, once again, we’re not going to uncompress rowgroups and slice and dice and surgically insert things.
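For reference, the ORDER syntax that exists today for clustered columnstore (SQL Server 2022 and later; the 2025 feature under discussion extends ORDER to nonclustered columnstore) looks like this, with a placeholder table and column name:

```sql
-- Rebuilding an existing clustered columnstore index as ordered.
-- MAXDOP 1 is how you get a fully sorted result; parallel builds
-- can leave the data only partially sorted, as discussed above.
CREATE CLUSTERED COLUMNSTORE INDEX ccx
ON dbo.BigTable
ORDER (event_date)
WITH (DROP_EXISTING = ON, MAXDOP = 1);
```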
No, honestly, this should be one of those things where, when you use it... okay, so there are other database systems where you can create a table and partition it all in one go, right? In SQL Server, you have to create the partition function and the partition scheme. Other databases,
you can do a CREATE TABLE with a partition-by syntax type thing, kind of like when you write a window function and say PARTITION BY blah blah. You can declare the table as partitioned by that, and you can specify the column and the partitioning ranges for it.
I think that with ordered columnstore, if it’s not on a partitioned table, it should force you to choose... like, whatever columns you’re ordering by, it should force partitioning on those columns.
I don’t care if it makes a stupid choice doing that. Not doing that is a stupider choice than any choice it could make there. Just
bucket things somehow. That’s interesting. I mean, I think they’d be scared to try to pick partitions for you, because of course they would, but yeah.
But yeah, no, it’s definitely a good point. Possibly even superior to my idea, you know, the trace flag that tells you when you’re being an idiot or not. So how does that trace flag decide what a too-big table is?
That’s under NDA. So, okay. It’s proprietary. All right. Gotcha. Gotcha.
Well, maybe someday you’ll be able to write anonymously on some blog about the thresholds for when a columnstore table is too big. Well, the great thing is, if I violate NDA on your blog, everyone will think that you wrote it. So I won’t get in trouble.
That’s true. I won’t get in trouble. I won’t get in trouble. I don’t have an NDA. What NDA do I have? Oh, okay. All right. Can’t sue me if there’s no paperwork. Okay. All right. That’s the defense you’re going with, huh? Yeah.
Okay. I was drunk when I wrote it. I was just guessing. I didn’t sign any paperwork, unless I was drunk when I signed the paperwork. So make your own decisions there. Well, I mean, I understand why the NDA is certainly annoying.
I remember I was once told by, uh, Dr. Now, you know, here’s some information that’s under NDA, but I already knew it. Yeah.
You’re going to tell me something I already know, and now it’s under NDA and I can’t tell people? No, I already had that information. I already knew it. That’s why you have to blog about everything.
Yeah. Believe it or not, we... oh, I see what you did there. Yeah. Believe it or not, we can figure out things on our own sometimes. So the real trick is to write the blog post, but set the posting date to a month before anyone said anything to you, and then publish it, and then be like, well, when did you tell me that? Oh, such-and-such a date.
Well, this post was published a month before that. It’s not covered by NDA. I bet everyone who’s watching this is learning a lot, getting a lot of good tips. Look, there’s one thing I’m good for.
It’s sound legal advice. So those were all my complaints about ordered columnstore. I don’t think it’s very good. I think you should partition your tables, and that’s going to do way more for you.
But, I mean, maybe there’s some workload out there where this is the best thing since communism, and they’ll get a lot of value out of it. Well, I think the only thing we can say about this feature is that it truly is for the people. It’s free. Kind of. For $7,000 a core, it’s free.
Maybe not. All right. Yeah, I’m going to have to workshop that one. But I don’t have any specific complaints about ordered columnstore. My complaints are far more generalized than that. Like, okay, you’re going to spend development time and cycles on another feature that’s 60, 70% done.
And try to get us excited about it. And then they trot poor Bob Ward out to another conference, and everyone’s like, oh, what are the internals like, get the debugger out, Bob. And then poor Bob has to sit there and be like, there’s no debugger for it.
All right. Sorry. It’s just one of those things where we’re supposed to get excited about this thing that is not solving a problem for 99% of SQL Server users. So I think the analogy you wanted before was: think about a parallel rebuild of an ordered columnstore index.
You have all these individual threads. They’re all working together, but by the end of it, they haven’t accomplished anything. Just like communism.
Man, you’re good. You’re going to get me flagged in the EU with this one. The EU is not going to be a big fan of me. Are you going to be allowed to travel internationally when this goes live?
No, you should have it go live after you’ve fled the earth. Well, maybe Elon will take me to Mars. I can open up a saloon somewhere in a gully or something. That’d be nice.
Probably not a lot of demand for SQL Server consultants on Mars. But there will be for saloons. Because it’ll be, you know, the new wild west. Nice place.
You can tie up your robot horse buggy thing, go in for a nice cold beer, a Mars beer, and take a load off. It would be like just the ultimate. We’ll have one of those pianos that plays itself.
It’ll be great. Got a whole vision here. Okay. My original plan was to open a bar on the beach in Bora Bora, but I think that market is saturated at this point.
So I’m going for a saloon on Mars. I wish you well. You know I don’t drink, it’s true. So I’m not going to support you.
That’s why you’re going to be the perfect bartender. I can trust you not to drink my whole supply of Mars beer. Unless, you know, Mars beer is just so delicious.
You change your mind, or you get bored on Mars. Then again, you’re not bored in Wisconsin, so I don’t know where you could get bored. Sorry.
Did I give away too much personal information? Oh yeah. I actually thought you stopped recording a long time ago, but apparently you’re still going. Oh no. Do you want me to stop recording? Yeah. Oh, all right. Cool.
All right. Well, thank you everyone for tuning in to the mildly resuscitated, bit obscene podcast with my temporary cohost, Joe Obbish. As always, this is brought to you by Darling Data and the kind folks at Beer Gut Magazine, who supply me with stickers and magazine covers.
So thank you. And I’m going to hit the button now. All right. But, you know, that’s the beauty of the internet. I mean, we can do a “to be continued.”
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
My developers overuse table-valued parameters in application code. Are there any server/database settings I can change to make the queries using them run faster? I know about in-memory tempdb and memory-optimized table variables, but I don’t have latch contention issues so I doubt that they would help.
why aren’t developers good at database performance yet?
Why can’t you make an ErikAI to answer these questions?
Do you have any training content specific to performance tuning efforts and limitations for SQL Server Standard Edition? I get excited to see posts from you and the community around Batch Mode on Rowstore only to get let down by licensing limitations yet again.
Are you doing a Black Friday sale? (This was a blatant lie.)
In this video, I delve into answering five community-submitted questions during an office hours session, providing insights and advice on various database-related topics. We cover issues like table-valued parameters in application code, performance tuning efforts for SQL Server Standard Edition, and the limitations of licensing. I also touch on upcoming pre-conference events at DataTune Nashville and Data Saturday Chicago, encouraging viewers to purchase tickets early. Throughout the session, I emphasize the importance of regular practice in database performance tuning and share my thoughts on why developers might not excel in this area due to time constraints and project deadlines.
Full Transcript
Erik Darling here with Darling Data. It is Monday in YouTube. Actually, it is Monday in regular land, too. Which Monday it is, you will never know. It’s between me and my Monday. Anyway, we are doing office hours, in which I answer five community-submitted questions and hope that I can give good answers at a reasonable rate. They’re free. Shut up. If you want to ask your own question, you can head down to the video description, where these helpful yellow fingies are pointing, and there’s a link where you can go and submit your questions to me. If you would like to participate in some sort of monetary exchange, there are ways to do that as well. You can hire me for consulting, buy my training, sponsor this channel, all sorts of other stuff. And of course, if you enjoy this content, please do like, subscribe, and tell a friend. Because that’s how the good word spreads, my friends. Anyway, coming up: of course, by the time you see this video, PASS Data Community Summit will actually be occurring. It’ll actually be Monday. I’ll be teaching a pre-con. Isn’t that wild?
So I should probably start promoting other stuff that has recently been acceptified. I will be doing a pre-con at DataTune in Nashville. That event is taking place March 6th and 7th. And I will also be doing a pre-con the very next weekend, at Data Saturday Chicago, March 13th and 14th.
So buy those tickets now. So I can gloat about people buying those tickets now or something. I don’t know. Well, it would just be nice of you to do. Just buy the tickets. It’s great. Anyway, let’s get on with this whole database party here.
I need to go to the Excel file that has the questions in it. And let’s make sure there are, let’s see, one, two, three, four, five questions. And let’s go answer these.
Did it, did it. What do we got here? My developers. Oh, your developers. Oh, all right. See the way it is? My developers.
Let me tell you this. If they’re your developers, tell them to stop using table-valued parameters if they’re such a problem. All right. Not to spoil the question too much here. But my developers overuse table-valued parameters in application code.
Are there any server or database settings I can change to make the queries using them run faster? I know about in-memory TempDB and memory-optimized table variables. Oh, boy.
But I don’t have latch contention issues, so I doubt that they would help. Boy, are you, boy, are you right there. The problem with this question, though, is you haven’t told me what’s slow about them. What’s, what’s going wrong?
Some details would be nice. There are not really any server or database level settings that wouldn’t already be turned on, if you’re in a position to have them, that would be helping you out here. So I don’t really know what to tell you.
Tell me what’s slow about them. Better yet, click on, click on that consulting link. And then we can do a fully detailed investigation into these table-valued parameters and figure something out for you.
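For anyone following along at home, here is the general shape of the pattern in question, with invented names, plus the kind of per-query fix that usually matters more than any instance-level knob:

```sql
-- A table type and a procedure that accepts it as a parameter.
-- All object names here are made up for illustration.
CREATE TYPE dbo.IdList AS TABLE (id int NOT NULL PRIMARY KEY);
GO
CREATE OR ALTER PROCEDURE dbo.GetUsersByIds
    @ids dbo.IdList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- TVPs carry no column statistics, and their row count is
    -- sniffed at compile time. When row counts vary a lot from
    -- call to call, a RECOMPILE hint (or dumping the TVP into a
    -- #temp table first) tends to help more than any server or
    -- database setting.
    SELECT u.Id, u.DisplayName
    FROM   dbo.Users AS u
    JOIN   @ids AS i
             ON i.id = u.Id
    OPTION (RECOMPILE);
END;
```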
I don’t know. This level of vagueness just can’t be overcome. Why aren’t developers good at database performance yet?
Well, because they don’t practice it. They develop features. They need to get something working quickly because they’re under, you know, whatever, you know, time constraints and deadlines they have to deal with. And they have to make something work.
It’s the same reason why developers aren’t typically good at database security or security in general. It’s why, like, every time you read, oh, we found a database on the internet with no password. Surprise!
You know why? Because someone just had to get something working. Security gets in the way of stuff working, right? Same with performance tuning. It gets in the way of just making the thing work. It’s like, oh, I figured out how to do it.
I don’t know if it’s very fast, but it works on my machine. Don’t pry. Unless you are investing time and effort into learning and practicing a skill, you are never going to get better at it. That is true of any mental or physical endeavor in life.
If you are not regularly engaging in that endeavor, you stand no chance at progressing in that endeavor. Let’s see. Why can’t you make an Erik AI to answer these questions?
Have you seen the cost of training a model? Right? I mean, like an MCP server ain’t going to cut it. There’s not a lot of funding rounds here at Darling Data to get into an AI training situation.
So that’s way out of the question. And I don’t know, I always assumed part of the charm was me showing up with my alacrity and effulgence.
And effervescence, and answering these questions. Boy, was I wrong. You just want AI to do it. You want a chatbot.
Well, I don’t know. What else? It’s funny because like LLMs are basically an MLM, right? Because like, I mean, just like the industry in general at this point is an MLM.
But like the people making money off LLMs are like selling prompts. Like you’re selling me a question to ask them? Like you’re just like, oh my God.
Anyone who does that is an absolute scavenger. All right. But that’s why. Because Erik AI would be too expensive to train. Do you have any training content specific to performance tuning efforts and limitations for SQL Server Standard Edition?
I get excited to see posts from you and the community around Batch Mode on Rowstore only to get let down by licensing limitations yet again. No, I don’t. Because 90-something percent of the time, performance tuning with Standard Edition is exactly the same as performance tuning with Enterprise Edition.
You have some additional restrictions and limitations around hardware and what is automatically available to you. And even some restrictions on, you know, things like batch mode where it is limited to a DOP of 2 in Standard Edition. But most Standard Edition performance tuning is no different from Enterprise Edition performance tuning.
There is just not a whole lot that you have to do differently. You know, you work within your restrictions and limitations and you make the best of it. You know, a lot of the training that I have would work just fine on Standard Edition.
There’s nothing specific about it. There’s nothing specific enough about it, especially since Microsoft gave up on some of the earlier restrictions that existed with it. Some of the programmability stuff with 2016 SP1 that became available in Standard Edition.
There’s really not a whole lot too, too different there. Microsoft was even kind enough to give UDF inlining to Standard Edition. So, you know, there are some, of course, some additional considerations.
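If you’re not sure which of those limits actually apply to you, the basics are easy to check. A quick sketch:

```sql
-- What edition and version am I actually on?
SELECT SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version;

-- Batch Mode on Rowstore also requires database compatibility
-- level 150 or higher, which applies to Standard and Enterprise
-- alike; the DOP-of-2 cap is an edition limit on top of that.
SELECT name, compatibility_level
FROM   sys.databases
WHERE  database_id = DB_ID();
```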
But, again, my rates are reasonable. Final question of the day. Oh, boy.
Are you doing a Black Friday sale? No. My training is already priced so that normal people in the world can afford it. I already bake a lot of coupons and discounts into stuff.
You know, I don’t want you to need, you know, a payday loan or a stipend from work in order to buy my training. I want lots of people to be able to buy it and benefit from it. So I don’t price it at a point where, like, a Black Friday sale would really get a lot more butts through the door.
So, you know, right now, no, I can’t see a reason to discount things further. You know, I do have a family to feed and all that. You know, I have actual children.
You know, actual life. There are people, like, you’ll never see on camera, who I have to take care of. So, no, I can’t see a Black Friday sale being very effective. I can’t see many compelling incentives in knocking stuff down further.
So it already sells fine at the very low and reasonable prices that are set for it. So, no. No.
I’m not embarrassed by my prices. So, anyway. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video, which, if I have things right, will be a very special revival video of sorts. So stay tuned for that.
Anyway, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
T-SQL Inline UDF Plan Caching and Reuse in SQL Server
Since it’s also Christmas, enjoy my favorite Christmas song.
Video Summary
In this video, I delve into the nuances of inline table-valued functions (inline UDFs) in SQL Server, focusing on how they differ from scalar and multi-statement table-valued functions when it comes to parameter sniffing and plan caching. By demonstrating with an example, I show that inline UDFs are automatically inlined into the calling query, which means traditional methods of tracking execution plans using extended events become less useful. Instead, I highlight how running queries with `OPTION (RECOMPILE)` can yield multiple execution plans, providing a more dynamic and realistic view of performance under different conditions. Through this exploration, I aim to help you understand when and why certain query plans are chosen by SQL Server, offering insights that can lead to better optimization strategies for your database operations.
Full Transcript
Erik Darling here with Darling Data. In today’s video, we are going to finish up our amazing, thrilling coverage of various user-defined function parameter sniffing and plan caching scenarios in SQL Server. And we’re going to finish up by talking about the type of UDF that I generally prefer, which is the inline table-valued function. To put it briefly, it is the inline UDF, the actual inline UDF. If you like this type of stuff, well, I do this type of stuff all day. Occasionally, I do get paid for it, too. If you would occasionally like to pay me for it, you can hire me for consulting. You can also buy my training, become a supporting member of the channel, ask me office hours questions, and if you enjoy this type of stuff, please do like, subscribe, and tell a friend.
PASS Data Community Summit, geez, closer by the day, ain’t she? Of course, Seattle, Washington, November 17th to 21st. Two days of T-SQL pre-cons, the likes of which you have never seen, presented by me and Kendra Little. I’m not sure if that’s a great tagline. You should do, what was the one from Dune? We’ve got T-SQL pre-cons the likes of which God has never seen. God has not seen? I forget the exact quote. I’m a terrible nerd these days.
but anyway you you go you come you show up you learn things it’s good for everyone anyway let’s talk about inline udf’s here so i’ve got an inline table value function here denoted by the fact that the the returns portion of the function body just says returns table right we do not return a table variable we are not returning a data type we are returning the result of a select the select that we’re returning the result of is exactly what we’ve been doing in prior iterations where we’re just doing this whole thing right so we’ve got that going for us now the first thing that i want to say here is that inline table value function the reason why they’re so different is because the code from them is automatically inlined into whatever query you run so this extended event that we’ve been using to track other executions is no longer useful to us right so if we run this query and we look at the query plan right uh we will see that we got this query plan back right but we actually got the query plan back here right with with other types of udf’s the plan for the udf is hidden away from you and but like for for this we don’t get anything back because the filters on this are for where we want the query post execution plan for the object name of co that contains cohort score and even though the name of our udf does in fact contain the name cohort score uh we we it does not execute as a separate object the the the query inside of this is like inlined into the the query that calls it right so we don’t have to worry about this extended event anymore we can close that out and please god don’t crash ssms please god don’t crash thank you always touch and go with these things but let’s run all three of these with option recompile just because um you know i talked about how um the other other versions of this i needed to make specific versions of the functions with the option recompile hint to see different query plans and write big loops and everything but this one we just naturally 
get three different query plans with OPTION (RECOMPILE) at the end of the query, right? This one up here is a parallel thing: we scan an index, we scan an index, we do some hashing. This one down here has a bit more going on in it. We’ve got some compute scalars, and we’ve got a seek, and then we’ve got this parallel zone in the plan. And then this one down here is sort of a weird mix between the two. Sorry, the second query (I wish this thing would hold its framing a little bit better) is just index seek, loop, loop, index seek, and we’re done. Or rather index seek, loop, loop, index seek, loop, index seek. This one down here is, of course, index seek, loop, hash join down to the Posts table.
So we got three different plans with the recompile hint right there, and we could see the plans without having to use Extended Events to track what’s going on inside the function. With the other function types, we wouldn’t see the actual execution plan for the function without the extended event. So we’re getting somewhere. Yay.
Just because we’ve done it every single time, we run DBCC FREEPROCCACHE, and we select the top one here. This is kind of annoyingly slow, and we’re going to talk about this later, but this takes like seven seconds. And it takes seven seconds because SQL Server chooses, I mean really, a non-parallel plan, and it does some silly things. Like, this part is probably fine, 152 milliseconds, but then we did a single-threaded scan of 17 million rows in the Posts table. Maybe not a great idea, SQL Server. Maybe not such a great idea. But the point here is that we get this version of the plan back, and we have the whole query plan nestled in with our query. Now again, we’re not tracking stuff in Extended Events like we were with the scalar or multi-statement table valued functions.
But now, when we run this with the wider range of things, where the reputation is over 800,000, we get back results really quickly, and SQL Server chooses a different execution plan from the one we just saw. Like this one here: we’re not doing that big ugly hash join, single-threaded scan of the entire Posts table. We do a seek and a loop and a loop and a seek and a loop and a seek, and all this returns relatively quickly. We could do some performance tuning work on this, but I intentionally had to make parts of this not perform great to show you the differences in the different plans. If everything had a perfect plan, there wouldn’t be much of a demo here, would there? You’d be bored, there would be no YouTube video. Crap, couldn’t have that. What would you do with your life?
So now let’s try this for three. In prior runs, when we did the scalar and the multi-statement one for just three, it was usually really quick. But for this one, we get the same kind of crappy query plan back that we got for reputation equals one. It takes like seven seconds, and that’s not so great either. I’m not thrilled about that. But again, when we run this in the context of the full query, we get a better plan again, and this all finishes relatively quickly for the four rows that we care about. Again, it’s kind of a seek, loop, seek, loop, blah blah, so on and so forth. You get the idea. But this all works out a lot better.
Now we are going to request performance assistance. I’m going to say performance assistance, please, and we’re going to try to figure this thing out. You might start by thinking, well, SQL Server just didn’t do a loop join, it should have done a loop join, the hash join plan is stupid, SQL Server, please just use loop joins. But when we just hint the loop join at the end, this ends up taking a little bit longer. The last time we ran this it was seven seconds, and now it’s like 12 and a half seconds. So that loop join hint was clearly not effective, and SQL Server has made an even sillier mistake here. Now it has done all this, again all single-threaded: scanning the Posts table, aggregating some stuff, then a loop join down here, and then another thing down here that takes like three seconds. So this all adds up to about 12 and a half seconds of time down in this portion of the plan.
Now, normally I would do this hint inside of the function body, but just for code brevity’s sake, I’m going to take the function body, inline it into the query, and add the hint to that, the hint I would usually put in the function body. And that is to just add a FORCESEEK hint right here on the Posts table, because really what this hinges on is SQL Server seeking or scanning into the Posts table. With a lot of inline table valued function rewrites, you do have to do this sort of hinting. Despite them being the superior type of function in just about every single way imaginable (they can return more than one row, they can return multiple columns, there’s no table variable involved, they don’t force your query to run single-threaded for any reason), this is just the optimizer picking a bad single-threaded plan, for whatever reason the optimizer has. I haven’t gotten so far as to experiment with the parallel version of these plans, because I wanted to get this part done and recorded. So, you know, nothing terribly interesting there. Yet, anyway.
But if we add a FORCESEEK hint to this query and we run it, well, now we’ll get sort of the plan that we want. And like I said, normally I would put this in the body of the function, and I would say, SQL Server, every time this function runs, I am going to force you to seek into the Posts table, because we have a great index there for that.
Now, I did try other stuff in this. You know, I’ve been talking a lot about row goals, and oh, you can add TOP to this and that and speed things up. TOP did nothing here. TOP was ineffective across this entire thing. FAST 1 hints: ineffective. I tried all sorts of hinting around things, and really the FORCESEEK hint here was the only one that kind of stuck and made things tolerable.
So the bottom line here is that because inline table valued functions get inlined into the calling query, even though they accept parameters, you don’t really see the same type of parameter sensitivity and plan reuse behavior that you do when you’re dealing with scalar and multi-statement table valued functions.
We can tell that this is the end, because down here is where I was working on the loop code that I showed you in the first video. We’re not going to look at that again, because we just don’t need to see that. But I think that’s the end. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video. I have a short video on sort of a funny take on SQL injection that I hope you enjoy too. It’ll be a Friday short, we’ll call it, because it’s not long. So because it’s not long, it’s a short. All right, I think I’ve taken enough of your time. Thank you for watching. I’ll see you tomorrow. All right, that’s good. Goodbye.
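For reference, the kind of inline table valued function and FORCESEEK rewrite discussed in this video might look something like the sketch below. The function name, the 25% reputation window, and the Stack Overflow `Users`/`Posts` schema are my assumptions from the demo description, not the video’s actual code:

```sql
/* A sketch, assuming the Stack Overflow demo database.
   The FORCESEEK table hint is the one piece the video confirms. */
CREATE OR ALTER FUNCTION dbo.TotalScore_Inline
(
    @Reputation integer
)
RETURNS TABLE
AS
RETURN
    SELECT
        TotalScore = SUM(p.Score)
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
        WITH (FORCESEEK) /* steer the optimizer away from the single-threaded scan */
      ON p.OwnerUserId = u.Id
    WHERE u.Reputation >= @Reputation * 0.75
    AND   u.Reputation <= @Reputation * 1.25;
GO
```

Because an inline TVF is expanded into the calling query, its operators show up in the calling query’s actual execution plan with no Extended Events session required, which is exactly what the demo relies on.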
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Merry Christmas from Santa and Definitely Not Erik
Video Summary
In this video, I’m stepping into the jolly spirit of Christmas and sharing a bit of magic with you all! As Santa Claus himself, I’m offering something special: use the code HOHOHO from now until January 1st, 2026, to get my LearnTeenCast.com course at a whopping 75% off. But wait, there’s more—Erik’s Performance Engineering course is also up for grabs with his own unique discount! So, spread some holiday cheer and share this video with your friends and family. Ho, ho, ho! Remember, Erik wouldn’t be so generous with such a big discount, but as Santa Claus, I am all about spreading joy and generosity. Happy holidays from the North Pole!
Full Transcript
This is definitely not Erik Darling. This is definitely Santa Claus. Ho, ho, ho! Can’t you tell how jolly and merry I am? Anyway, it’s Christmas, and in the spirit of Christmas, I’m gonna give you something. Ho, ho, ho! I’m Santa. If you use the coupon code HOHOHO from now until January 1st of 2026, you can get Erik’s LearnTeenCast.com course (definitely not Santa’s course, definitely Erik’s course, LearnTeenCast.com with Erik) and Erik’s course, Performance Engineering, for 75% off. Ho, ho, ho! I’m Santa. Look at me. Definitely not Erik. Erik would never give you a 75% off coupon code for that stuff. He lacks the generosity that Santa has. Ho, ho, ho!
T-SQL Multi-Statement UDF Plan Caching and Reuse in SQL Server
Video Summary
In this video, I delve into the intricacies of multi-statement table-valued functions (MSTVFs) in SQL Server, specifically focusing on parameter sensitivity and plan caching issues. Building on yesterday’s discussion of scalar UDFs, today’s episode explores how these functions behave differently when recompiled, leading to varying execution plans and performance impacts. I demonstrate this through a series of demos, highlighting the differences between using a function with and without an `OPTION (RECOMPILE)` hint, and explain why certain execution plans are chosen over others. By running multiple executions and analyzing query plans in sp_QuickieStore, we uncover how SQL Server handles plan caching for these functions, revealing the parameter sniffing problem that can arise. If you have encountered similar issues with your own functions or need help optimizing them, consider reaching out for consulting services. Additionally, supporting my channel as a member of the community will ensure you don’t miss any future insights on T-SQL and database performance optimization.
Full Transcript
Erik Darling here with Darling Data. And in today’s action-packed episode of YouTube, we’re going to continue our conversation about UDF plan caching and parameter sensitivity, whatever. I forget what I call these things. But anyway, this is what we’re talking about. Because yesterday we talked about scalar UDFs and what happens there. So today we need to talk about these and what happens here, because it is the next logical step in the hierarchy of T-SQL functions. Alright? So let’s do that. But down in the video description, if you would like to hire me for consulting, perhaps you have parameter sniffing problems with functions of your own and you’re like, well, Erik Darling, why can’t you help us too? You can do that. If you would like to buy my training, you can also do that down in the links below, where these lovely fingers point. You can become a supporting member of the channel, just like PBS. You can ask me office hours questions. And of course, as always, if this kind of content puts you in your happy place, well, please do like, subscribe, and tell 50,000 or so friends. Anyway, PASS Data Community Summit, Seattle, Washington, November 17th to the 21st. Kendra Little and I will be tag teaming two tantalizing, titillating days of T-SQL pre-cons. Like I keep saying, it’s going to be the best T-SQL pre-con that humanity has ever witnessed. So you should be there, and get a t-shirt to commemorate your presence there, because otherwise no one will believe you were there, right? Be like that Skid Row concert you went to in 1987. Did you get the shirt? No? You weren’t there then. Anyway.
Let’s get this database party started. And let’s see. We’ve got our function queued up here. Now, just like the ScalarUDF version of this, I had to write two versions of this function. One, normal function that we’re going to be using in our demonstrations or majority of our demonstrations. And two is a version of this with an option recompile hint in it, which probably already gives away the whole thing that, well, we have a parameter sniffing thing that can happen with these two. That probably messes the whole gig up, right? There I go. Blowing the whole waterworks. But anyway, I have reapplied my filters to this extended event session. So that’s great.
And what we’re going to do is run the recompile version of this multi-statement table valued function. We can see that it is indeed a multi-statement table valued function because it returns a table variable, and the primary driver of this UDF is to insert data into that table variable. And then, well, we just say return here. If we wanted to keep things normal, we might say return that table variable, but we just say return. Don’t ask me why, I swallowed a fly. But let’s make sure that these things are both in here and all correctamundo and that good stuff. So if I run these two functions, or rather this one function, again, we’re going to do the same two recompile things.
Actually, there’s a third one hiding on me down here. Look at that. All right. Wonderful. We’ve got three. Run these recompile things. You might notice that these take a little longer than we might like. The days really drag on, don’t they? Anyway, if we go look at the live data for these, we’ve got a query plan over here that looks like this for the first one. It takes about 5.4 seconds to run. Now, in the last set of demos with the scalar UDF, I talked about how the query that calls or invokes the scalar UDF is inhibited from going parallel because it invokes a non-inlineable scalar UDF.
Multi-statement table valued functions are a little different. Part of what took me a while to write these demos was working around the fact that the insert into a table variable has the same general problem that invoking a scalar UDF does. If we look at the properties over here (I’m just going to make this a little bit wider, apparently a little bit wider), we have this thing in our query plan now. So whereas the last one had a non-parallel plan reason about the presence of a scalar UDF, this one is because we are not allowed to insert into the table variable in parallel.
This is true of all table variables, not just ones in multi-statement table valued functions. Table variables can’t be modified using a parallel execution plan, including inserts. So that’s part of why this one is so perturbingly slow.
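To make that serial zone concrete, here’s a minimal multi-statement table valued function sketch. The names are hypothetical and not the demo’s actual code; the point is the `INSERT` into the table variable, which SQL Server will never run in parallel regardless of how the rest of the plan shapes up:

```sql
/* Hypothetical MSTVF for illustration, assuming Stack Overflow tables. */
CREATE OR ALTER FUNCTION dbo.TotalScore_MSTVF
(
    @Reputation integer
)
RETURNS @Result TABLE
(
    TotalScore bigint
)
AS
BEGIN
    /* This insert is the serial zone: table variable modifications
       can't use a parallel execution plan. */
    INSERT @Result (TotalScore)
    SELECT SUM(p.Score)
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
      ON p.OwnerUserId = u.Id
    WHERE u.Reputation >= @Reputation * 0.75
    AND   u.Reputation <= @Reputation * 1.25;

    RETURN;
END;
GO
```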
Let’s close that. And for the second invocation, we have a slightly different looking execution plan, right? This one does some stuff and then does a hash join down to here. This one does some stuff and this one does the loop joins again. So this is the key lookup plan.
This one runs a bit faster, right? The total time on this is 238 milliseconds. Whereas for this one, it was five and a half seconds, which is no good at all. And then this one, this one takes 4.9 seconds and uses a similar plan to the one before, except there’s a hash join here.
You know what? These were showing different plans before. This time, I guess it didn’t work out so well. Oh wait, there was something weird and different. Oh yeah. So there is a difference here. This was actually a funny one.
This one took me a minute to catch. This one does a hash match inner join to the post table, right? This one does a right semi join to the post table. So it is a slightly different execution plan.
That caught me off guard there for a second. Someday I’ll have this all memorized, but we’ve got the query plans. We can see we got three different ones when we recompiled. Cool. Now let’s go free the proc cache, right?
And let’s select the top one from this. Remember, we don’t want to cross apply to this. We could theoretically outer apply and get correct results, but for this, we want to have the scalar subquery in the select list, because that will not reduce rows.
Cross apply would reduce rows, because it’s a relational operator. It’s like an inner join: if matching rows don’t exist on the apply side, we won’t get a row back, right?
And we want our results to maintain correctness across different executions. So we need to put the function call in the select list for this one. We could outer apply to it, but again, that’s a little bit more of a relational thing, whereas this is just like, go find me a result.
So we’re going to run this for reputation equals one. Let me make sure I did this. I can’t remember; it was too long ago. And let’s run this.
And that’s going to take about five-ish seconds to run and get a result back. And if we come over here, we will have a new entry in this. And we can see that we used the query plan that we got when we ran this for the one row with a reputation of one, right?
So that’s that new thing. Now I’ve remembered to clear data and not clear filters, because that was annoying last time. But if we now run this for the 800,000 people, this is going to take about 20 seconds, because it’s about five seconds per run and we’re returning four rows.
So the rules of mathematics tell you that you should take the number five, like about five, and you should multiply it by four because we have four rows. Four is not negotiable on that.
So this will take 21, 22 seconds, 20 seconds even. Great. So we’ve got that. And now we’ve got four query plans, and we’ll see that SQL Server reused that same plan each time, right?
We reused the hash match inner join plan where we go down and scan the post table. Fantastic. We did it, right? We’ve cracked the case. So let’s do the same thing that we did.
We’re only going to do three, just because the plan for 11 was also about five seconds, so it doesn’t really make a lot of sense to show that one here. I’m just going to do one and three for this one, just to move things along a little bit.
So now if we run this for three, we get the faster query plan for this back. And if we run this for the 800,000 rows here, we get those four rows back pretty quick. So, what I’m going to do is, because there’s not a view for multi-statement table valued functions the way that there is for scalar UDFs (they’re not tracked in their own DMV), I’m going to use sp_QuickieStore to look at this. And we should see, if we come over here a little bit further, five executions of each one of these UDFs, or rather it’s the same UDF.
We just got different plans for it, right? You can ignore the all plan IDs here from the last hour. For query ID 28475, we got plan ID 6007 and 6008. There was another plan ID for this from an earlier run that still ends up in all plan IDs.
Maybe don’t worry about it so much. Actually, that might’ve been the plan ID for the number 11. I forget. It would be nice if there was a row for it, but what the heck? So we get the five executions, and we can see the differences, looking in sp_QuickieStore, in average, min, max, and total duration between these, right?
We can see that, on average, one of these did a lot better than the other one. So over those five executions, the execution time for the slower one really piled up. And this is, again, a parameter sensitivity issue inside functions in SQL Server.
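For readers without sp_QuickieStore handy, a bare-bones version of the comparison it surfaces can be pulled straight from the Query Store catalog views. This is my own sketch, not sp_QuickieStore’s actual query; durations are stored in microseconds, hence the division:

```sql
/* Per-plan runtime stats from Query Store: how many times each plan
   ran and how its durations compare, in milliseconds. */
SELECT
    qsq.query_id,
    qsp.plan_id,
    executions = SUM(qsrs.count_executions),
    avg_ms     = AVG(qsrs.avg_duration) / 1000.,
    min_ms     = MIN(qsrs.min_duration) / 1000.,
    max_ms     = MAX(qsrs.max_duration) / 1000.,
    total_ms   = SUM(qsrs.avg_duration * qsrs.count_executions) / 1000.
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
  ON qsp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats AS qsrs
  ON qsrs.plan_id = qsp.plan_id
GROUP BY
    qsq.query_id,
    qsp.plan_id
ORDER BY
    total_ms DESC;
```

Two plan IDs under one query ID with wildly different average durations is exactly the parameter sensitivity pattern described above.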
So hopefully this has answered all of your deepest, most burningest questions about multi-statement table valued function caching and plan reuse. Rolls right off the tongue, don’t it? Anyway, we’re going to stop here, because we have clearly reached the end of these demos: the next function down there on the bottom is inline table valued functions.
So we’re going to go do that next. That’s going to be tomorrow’s video, because we still have to make another video for tomorrow. Otherwise we have no video for tomorrow, and people will say, Erik, why is there no video today?
Tomorrow. All right. Cool. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you tomorrow where we will talk about inline table valued functions.
All right. Goodbye.
T-SQL Scalar UDF Plan Caching and Reuse in SQL Server
Video Summary
In this video, I delve into the intricacies of scalar UDF parameter sniffing, specifically focusing on how different parameters can lead to varied execution plans. I walk through a detailed example using two versions of a scalar user-defined function—one without and one with a recompile hint—to illustrate why parameter sniffing is crucial in understanding query performance. By leveraging extended events and dynamic management views, I demonstrate the nuances of plan caching and reuse, highlighting how these functions can significantly impact execution times based on input parameters. This video aims to provide insights that go beyond just stored procedures, emphasizing the importance of considering function behavior when optimizing SQL Server queries.
Full Transcript
Erik Darling here with Darling Data. And today’s video is going to be about scalar UDF parameter sniffing. There was a question in a recent Office Hours where I was like, I got blog posts about that. And then I went back and I read the blog posts. And it’s funny reading old content, because you’re like, oh, I could make that better. Oh, I could fix that. But, you know, updating old material always seems to get pushed away in favor of forging ahead, creating new material. I went back and I actually reviewed the file that I had on that stuff, and I was like, well, I have to get more into some of this stuff. So I ended up completely rewriting the functions, totally different functions, and coming up with much more interesting tests for these things. So the next three videos are going to be about scalar, multi-statement, and inline user-defined function parameter... well, plan caching and parameter sniffing. So we will endeavor to do our best covering those. Down in the video description, you will find all sorts of useful links. You can hire me for consulting. You can buy my training. You can become a supporting, that means with money, member of the channel. You can also do other fun things with your life, like ask me office hours questions. And of course, if you like this content, please do like this.
Subscribe and tell a friend. PASS Data Community Summit. Oh boy. These probably sound like the last pitches that I can do on this thing. But November 17th to 21st, Seattle, Washington. As usual, me and Kendra Little are going to do two days of the absolute most magnificent T-SQL pre-cons that have ever existed, in just the entire span of human existence. So I hope that you’ll be there getting swag and other things from me. T-shirts and stickers and buttons and all the other stuff. So anyway, let’s start this database party up here. So there are a few things that I need to walk through before I show you exactly what happens when these functions get executed.
The first one is a view that I’m going to use to look at scalar UDF details. This hits a few dynamic management views. The reason why this is a view here is because I don’t feel like having this code interspersed throughout the presentation and having to reference the entire thing over and over again. So we’re going to look at the sys.dm_exec_function_stats view. Brand spanking new in SQL Server 2016. Glad you’re keeping up. And we’re going to use this to look at scalar function performance metrics. The second thing is, of course, the scalar UDF itself.
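As a rough reconstruction (the real view in the demo may differ, and the view name here is made up), a helper view over that DMV might look like this:

```sql
/* Sketch of a convenience view over sys.dm_exec_function_stats,
   which tracks scalar UDF executions starting with SQL Server 2016.
   Worker/elapsed times come back in microseconds, so convert to ms. */
CREATE OR ALTER VIEW dbo.ScalarFunctionStats
AS
SELECT
    function_name    = OBJECT_NAME(efs.object_id, efs.database_id),
    efs.execution_count,
    total_worker_ms  = efs.total_worker_time / 1000.,
    avg_worker_ms    = efs.total_worker_time / 1000. / efs.execution_count,
    total_elapsed_ms = efs.total_elapsed_time / 1000.,
    avg_elapsed_ms   = efs.total_elapsed_time / 1000. / efs.execution_count,
    efs.cached_time,
    efs.last_execution_time
FROM sys.dm_exec_function_stats AS efs;
```

Note there is no equivalent DMV for multi-statement table valued functions, which is why the later video leans on Query Store instead.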
All right. So this is going to be the scalar UDF that we’re going to use. There’s actually two of them. And there’s two of them for a reason that I’m going to explain in a moment. But the idea here is to find out, based on someone’s reputation, we find people whose reputation is like 25% lower and 25% higher.
We can, of course, manipulate this if we wanted to, but it seemed like a good starting number. And we sum up all the post scores for other people who have a similar reputation. And we look at those. It’s kind of silly, but it sounded like a fun function to write, so I wrote it.
So I don’t really care if it delivers actionable business insights for anyone, because the idea is to show you a behavior, not to show you actionable business insights. I’m on a dashboard. My eyes are up here, as they say.
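Here’s my guess at the shape of the function being described, based purely on the explanation above: for a given user, sum post scores for other users whose reputation falls within a 25% band around a given reputation. Every name, the exact math, and the `WITH INLINE = OFF` option are assumptions, not the video’s actual code:

```sql
/* Hypothetical scalar UDF, assuming the Stack Overflow demo database.
   INLINE = OFF keeps it non-inlineable on SQL Server 2019+, matching
   the non-inlineable behavior discussed in the video. */
CREATE OR ALTER FUNCTION dbo.TotalScore
(
    @UserId integer,
    @Reputation integer
)
RETURNS bigint
WITH INLINE = OFF
AS
BEGIN
    DECLARE @TotalScore bigint;

    SELECT @TotalScore = SUM(p.Score)
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
      ON p.OwnerUserId = u.Id
    WHERE u.Reputation >= @Reputation * 0.75 /* 25% lower  */
    AND   u.Reputation <= @Reputation * 1.25 /* 25% higher */
    AND   u.Id <> @UserId; /* other people, not the user themselves */

    RETURN @TotalScore;
END;
GO
```

The second parameter, the reputation, is the one that drives cardinality estimation in the demos that follow.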
So there’s this one. But then I also had to create a version where there’s a recompile hint inside it. And the reason I had to do this might surprise you. If you run a query, let’s say like a select query from blah, blah, blah, blah, blah, blah, blah, and you reference a scalar or multi-statement user-defined function in that query, and you have a recompile hint on the outer query, it does not recompile the scalar or multi-statement function in your query.
Fun, right? So the reason why that’s important is because, over here, I had written this for both the scalar version, but this version has the multi-statement function call in it.
But I had written a loop. And the reason that I wrote a loop is because I needed to find out which of these values generated different plans. And the easiest way for me to do that was to grab the histogram from an index, put that into a temp table, and then have a cursor go over the temp table.
I did this in two different ways. One was just grabbing all of the range high keys. And then one way was grabbing all the range high keys plus one. So like I was trying to see if like the in-between numbers gave me any different plans.
Mostly they didn’t. And mostly I think they didn’t because I do not have a direct equality predicate in this. It’s a greater than, equal to, less than, equal to predicate.
So this was maybe a little bit of a wasted effort adding the plus one, but it was a fun experiment anyway. And then basically what I do is I cursor over the table, grabbing the range high key, and then execute.
Well, you can see there’s a recompile in the name of this function. This is clearly not the scalar version of the function, but this is the multi-statement version. And basically it’s just me trying to find different plans getting generated.
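The loop being described might be sketched like this, with the function and index names assumed. `sys.dm_db_stats_histogram` (SQL Server 2016 SP1 CU2 and up) stands in here for parsing `DBCC SHOW_STATISTICS` output:

```sql
/* Grab histogram step values into a temp table, then call the
   RECOMPILE version of the function for each one, hunting for
   parameter values that generate different plans. */
SELECT h.range_high_key
INTO #histogram
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.Users')
AND   s.name = N'ix_users_reputation'; /* hypothetical index name */

DECLARE @rep integer, @ignore bigint;

DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT CONVERT(integer, range_high_key) FROM #histogram;

OPEN c;
FETCH NEXT FROM c INTO @rep;
WHILE @@FETCH_STATUS = 0
BEGIN
    /* Scalar version shown; the multi-statement version would be
       queried with SELECT ... FROM instead. Each call recompiles,
       so each can produce a fresh plan. */
    SELECT @ignore = dbo.TotalScore_Recompile(0, @rep);
    FETCH NEXT FROM c INTO @rep;
END;
CLOSE c;
DEALLOCATE c;
DROP TABLE #histogram;
```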
The way that I examined those plans getting generated was with Extended Events. And of course, I used my fabulous stored procedure, sp_HumanEvents, to spin up an extended event session that grabs actual execution plans, so that I could see what plans the function generated on each call.
Because getting execution plans normally, you don’t see that, right? Like when you get an actual execution plan for a query that calls a scalar UDF or a multi-statement table-valued function, the actual execution plan for the function doesn’t show up in the query.
That only happens for functions that are inlined into the code. So like if you have a non-inlinable scalar UDF as mine is, you get nothing, right?
It doesn’t show up in there. You can get an estimated plan, but the estimated plan often doesn’t match the actual plan. That’s another funny thing that I ran into quite a bit with this. It was a bit of a head scratcher at first.
So that’s what this session over here is. And there are some filters on this. That’s why you might see displaying zero of filtered event. Displaying zero of filtered zero events.
Total. What? Okay, whatever. Microsoft English. It’s a nightmare.
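Stripped of sp_HumanEvents’ conveniences, the underlying session is roughly this. The session name, database name, and filter are assumptions, and `query_post_execution_showplan` is expensive enough that you should keep it tightly filtered and drop it when you’re done:

```sql
/* Minimal XE session to capture actual execution plans, including
   the plans generated inside scalar and multi-statement UDFs. */
CREATE EVENT SESSION function_plans
ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan
(
    WHERE sqlserver.database_name = N'StackOverflow' /* assumed name */
)
ADD TARGET package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);

ALTER EVENT SESSION function_plans ON SERVER STATE = START;

/* ...run the demos, watch live data in SSMS... */

DROP EVENT SESSION function_plans ON SERVER;
```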
Anyway. So that’s this option recompile version of the function here. So what I want to begin by showing you is we can turn off actual execution plans for the moment. They’ll only get in our way.
It’s this: if I run the recompile version of the function (this was my big finding with the loop that I ran), and we keep an eye on this window over here. I might have to tinker with the filter, so don’t be mad.
If I run these three queries, all of these three queries will generate different execution plans because they’re going to recompile. There’s an option recompile on this each and every time.
So that’s running this version. And if we come over here and look at the live query data, wow, it worked. We have three query plans for this. And the reason why we generated three different plans is notice this number right here changed.
That’s the second parameter in the function. That’s the reputation one. This is the one that really drives cardinality estimation for the query plans that we get. So running these for one, three, and five, clearly I didn’t need that whole loop through 195 histogram steps because the first three, four lines in the histogram gave me everything.
So great. Anyway, for reputation one, we get this plan. Which takes about 1.2 seconds.
We can see that it is a big parallel plan. We scan an index over here. We scan an index down here. And the whole thing takes, again, about 1 point. Well, actually, it’s about 1.3 seconds.
The second query plan looks like this, where we have some constant scans and some compute scalars and blah, blah, and a nested loops join.
And then we go parallel over here and we do some stuff. The third plan looks like this, where this one’s a little bit different from this one, right? This one, we have one, I think it’s one set of constant scan stuff over here. But this one, you know, we still have one, but it’s different, right?
Because it’s a different, like, join over here. This one is nested loops, nested loops, nested loops. This one is nested loops hash join, right?
So we change things a little bit here, right? Because there’s a key lookup in this plan. Oops, this one. There’s a key lookup in this plan. And this one, we just scan the index on the post table.
This one takes about 735 milliseconds. And the one prior takes about 75 milliseconds. All right. So let’s keep those numbers in mind as we forge forward in our path to figure all this stuff out.
So there’s some stuff in here about the histogram that is unimportant. And there’s also some stuff in here about getting some frequencies. When I used to work in market research, you would call getting counts of various things frequencies.
How frequently did this happen? What’s the frequency of this? I’m going to throw the whole thing into the garbage. But this stuff is not pertinent to us.
This was pertinent to me writing the demo. So we can move on from that. But what I want to show you here is let’s come over here and let’s clear data out of this thing. And let’s run dbcc free proc cache because we’re going to have the non-recompile version of this first scalar UDF.
And what we’re going to do is we’re just going to get the top one row for where reputation equals one. And that takes about 1.3 seconds. Crazy, right? If we come over here and look at live query data, well, I guess it was more like 1.2 seconds.
I’m sorry. I misled you by 0.1 of a second. I hope you can forgive me. I hope you can. Find it in your heart.
Now let’s run this query. And let’s get everyone with a reputation over 800,000. Run this. And this is going to return four rows in about five-ish seconds or so.
And if we come over here and we look, there are four rows. There are four new execution plans. We reuse that hash join execution plan every time, right? And some of the times are a little bit different.
Like, this one is 952 milliseconds. There’s 990 milliseconds. There’s 993 milliseconds. And there’s 1,003 milliseconds, or 1.003 seconds. So every time that we ran the function, the function reused the plan that was cached for it.
All right? Let’s clear that out. And let’s repeat our experiment. First, let’s see in here. Let’s validate some stuff.
Here is our function. Here are the five executions. And we have the total and average worker time and the total and average elapsed time. Something that I didn’t talk about up there that is kind of funny: all of the function plans that we saw are from inside of a scalar UDF, right?
And one thing that a lot of people, what they mess up about scalar UDFs, and I’m actually going to come back to these because it’s interesting enough to sort of backtrack a little bit. What everyone says about scalar UDFs is, oh, they prevent parallelism, which is true, right?
A non-inlineable scalar UDF will inhibit parallelism for the query that calls it. But the body of the function is completely free to go parallel, right? Like, this is clearly a parallel execution plan.
There is clearly parallelism in this execution plan, right? We can see the parallelism, right? We can smell it. It smells like fast. And we can see the parallelism in this one.
So the body of the scalar UDF is completely free to go parallel, but the query outside that calls that function is not free to go parallel, right? So I’ll actually show you that in the next demo just in case you don’t believe me.
Anyway, let’s free the proc cache and let’s run this now for reputation three, right? So let’s run this. And actually, we’ll turn on query plans for this one.
We’re going to have an extra execution of this query, but that’s okay. If we go to the properties over here, we will have this little thing in the query plan. Non-parallel plan reason. T-SQL user defined functions not parallelizable.
So they inhibit parallelism for the query that calls them, but the body of the function perfectly fine to go parallel, right? Good for us. Anyway, if we come over here and look at the live query data, we’ll see the two executions of this and the two query plans for this that do in fact use parallelism in quite a bit of the plan.
And now, so remember the last one took, I don’t know, like three, four seconds? But that was because the query plan for that one took around 991 milliseconds to run for each execution.
These function calls specifically take about 68, 69 milliseconds to run. So when we come over here and we run that 800,000 query for the same four rows, this comes back a lot faster, because we reuse a slightly faster execution plan.
So we can already see that T-SQL non-inlineable scalar UDFs get a plan cached for them, and further executions will reuse that execution plan, right?
So we come over here now and we look, remember the first two in here were the ones from our executions. The next four are from what our query returned, right? So this one, two, three, and four, these all took around 90 to 100 milliseconds or so.
But we got faster results back, because we were using a generally faster execution plan for this query. So the plan reuse thing here can be a pretty big deal. Not a lot of people think about functions when they think about parameter sensitivity.
Everyone looks at stored procedures and SP execute SQL and like, ah, you parameter sensitive jerk. But functions, just as likely to happen. Fun stuff, right?
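If you’d rather confirm the caching and reuse from a DMV than from live query data (an alternative I’m suggesting here, not the demo’s exact method), sys.dm_exec_function_stats has tracked cached scalar UDF plans since SQL Server 2016 SP2:

```sql
/* Execution counts and timings for cached scalar UDF plans
   in the current database. Time columns are in microseconds,
   so divide by 1000 for milliseconds. */
SELECT
    OBJECT_NAME(defs.object_id) AS function_name,
    defs.execution_count,
    defs.cached_time,
    defs.total_elapsed_time / 1000. AS total_elapsed_ms,
    defs.total_elapsed_time / defs.execution_count / 1000. AS avg_elapsed_ms
FROM sys.dm_exec_function_stats AS defs
WHERE defs.database_id = DB_ID()
ORDER BY defs.total_elapsed_time DESC;
```

An execution_count that climbs while cached_time stays put is the same story the query plans tell: one cached plan, reused over and over.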
And just to sort of prove things out, let’s come over here and we’ll notice that the total and average times for these things are way down, right? From the initial invocation there. And just to sort of prove things out a little bit further, we’re going to do this for five as well.
So if we run this — did I clear this? I didn’t. Let me make sure this is cleared out. Oh, I didn’t mean to clear all filters, I just meant to clear the data. There we go.
Anyway, we’ll just deal with it. All right. So there’s our first execution, and we can turn execution plans back off. This is why I had things filtered out. So here is our query plan for that.
There’s one other query plan in here for the calling query, but this is the one we cared about. Now if we come back over here and run this for the 800,000 people, it’s going to take a few seconds, but we get results back.
And then we have, if we scroll a little bit, we will have the query post execution plan for all of our executions of that. Oh, we’re up at the top.
So let’s scroll down a little bit. You get the point. Oh, that’s not the point, the point is down here a little bit further. So there’s one reuse of the plan. There’s two.
There’s three. There’s four reuses of the same plan. So T-SQL scalar UDFs most definitely cache execution plans and reuse them.
And if we look over here, then we will see, of course, the times on that. And these times went up a little bit. That query plan is a little bit slower for the results that we were looking at there.
So as always, I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where most definitely we are going to talk about the multi-statement variety of UDF.
And we will find out how plans are cached and reused for these. So we have that to look forward to, don’t we? All right.
Thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Hi Erik! How do you see the DBA role evolving in the next couple of years? Will it go to a more coding level, or maybe more to a multi-database role? Thank you for the great content.
Hi Erik! When will you be coming to Sweden?
What setup do you use to record your YouTube videos?
First of all, thank you for everything you share! When it comes to SQL Server 2025, which new performance-related feature do you believe has the greatest potential to become a real success?
You’ve already told us all the SQL books you like. What are the books about lifting that you like?
In this video, I discuss the evolving role of database administrators (DBAs) over the next couple of years and share my thoughts on how it might shift towards coding or a multi-database role. I also dive into some personal questions from viewers, such as my travel plans to Sweden for SQL Server conferences and the setup I use for recording YouTube videos. It’s been interesting to reflect on the future of DBAs while sharing practical advice on equipment and techniques that can help improve your own video production process. Additionally, I touch on what new performance-related features might be coming in SQL Server 2025, or rather, why they may not live up to expectations based on current demos and updates. Overall, it’s been a great opportunity to engage with the community and share insights that can help both DBAs and developers navigate these changes.
Full Transcript
Erik Darling here with Darling Data, and it is my favorite Monday of the week, because I get to answer your office hours questions. Five of them, this many. I’ve counted and double-counted to make sure that I don’t shortchange anyone or overcharge myself. So we’ve got that going for us. If you want to ask me office hours questions, you can do so. There’s a link down in the video description that will take you to my website, where there’s some language around how to ask a question with enough detail, plus a handy link to the Google form that you submit your questions to. No email required, just drop it in. While you’re down there in the video description hunting for the office hours link, you can do all sorts of other wonderful things, like hire me for SQL Server consulting, buy my SQL Server training, become a member of my SQL Server channel, and all sorts of other stuff. And of course, if you enjoy this content, you can always like, subscribe, tell all your friends, all that good stuff. PASS Data Community Summit is coming up in short order. Very close now. You can smell the pre-cons that me and Kendra Little are doing. Two days of them, the best T-SQL pre-cons that have ever been pre-conned. So if you’re not there, you will be missing out entirely
on the most formative event of your life, right? But anyway, let’s get this SQL Server Office Hours, Performance Office Hours, party started. Oh, that database is missing an arm. Ah, AI. That’s not nice, chopping arms off of these poor databases. Anyway, let’s go look at what we’ve got here. So: hi, Erik. How do you see the DBA role evolving in the next couple of years? Will it go to a more coding level, or maybe more to a multi-database role? Thank you for the great content. I am notoriously bad at forecasting these things. I’m just straight up not good at it. You know, I’ve lived through, I think, three DBA extinction events now: VMs, the cloud, and I guess we’re living through the AI one at current.
But I think that if you are good at what you do and you are good at learning things, you are not going to be extinct. As far as the two things you mentioned, I don’t think that it’s going to go to either one specifically. You might see yourself shift to a different technology, a different database platform, more and more.
I don’t know what industry you’re in where that might happen. A lot of my clients are firmly entrenched in the Microsoft stack, for better or for worse. For a lot of them, there is no multi-database thing, unless it’s for some microservices or some outside feature. You’ll see them move some microservices to something like Aurora Postgres, but it’s not stuff where they’re sitting there staring at performance and biting their nails over it. They might go to Elasticsearch for full text stuff rather than keep punching themselves in the face with full text indexes in SQL Server.
But when you think about what DBAs generally do, you’ve got your infrastructure DBAs, you’ve got your performance-oriented DBAs, which is what I do, and then you have the people who are more like developers. The people who do performance tuning work and development work already do a lot of coding-type stuff, right?
You rewrite queries; the developers make the features, and the performance tuners make the features faster, that type of thing. The infrastructure-type DBAs are mostly just doing infrastructure in the cloud now. You still have a lot of on-prem people, of course, but for people who are part of that shift, whatever they were doing on-prem, they now just have to do through a crappy portal and maybe some command line scripts and stuff.
So I don’t generally know where the whole thing is going to go and end up. But as far as I’m concerned, people are always going to have performance problems, and that’s why I’m very happy to keep doing my performance tuning stuff.
So that’s the best answer I can muster on that. I just had lunch, so there’s like a 90% chance I’m going to fall asleep during or after this video. Hi Erik, when will you be coming to Sweden?
Well, I don’t have any vacation plans for Sweden currently. But if there is a SQL Server conference in Sweden that you feel I could make a meaningful contribution to, and they are the type of conference that accepts pre-con events, where I can defray some of the cost of traveling overseas, then I’m always happy to show up.
So you tell me when I should come to Sweden, based on when there is an event in Sweden that I can attend and, again, defray some travel costs and potentially have a little tax write-off as well. So you tell me when I’m coming to Sweden.
I can’t tell you. All right. What setup do you use to record your YouTube videos? Well, it’s a good question. I have a Nikon Z30 pointed at me. I bought it because they were like, this is the ultimate content creator camera.
And then it got like one firmware update, in, I don’t know, I guess 2019 or something, and nothing since. So I don’t know if it is the ultimate creator camera.
It took a lot of work for me to get it to the point where it didn’t overheat and shut off after 30 or 40 minutes. So maybe, maybe not. I don’t know if that’s advice you want to take: get it.
Maybe the advice is to not get it. Maybe there’s better advice out there. I’ve got this Shure wireless thing that I use. I don’t know if you recall, a few months back my microphone thing snapped and I had to get these clip-on things, but it’s been working great ever since.
And aside from that, I’ve just got a bunch of hot lights: two photography-type lights, one on either side of me, and then a couple of ring lights in front of me over here.
Aside from that, it’s all held together by bubble gum and magic as far as I’m concerned. All right. First of all, thank you for everything you share.
Well, obviously we haven’t eaten dinner together. When it comes to SQL Server 2025, which new performance-related feature do you believe has the greatest potential to become a real success?
Quite frankly, none of them. There are features that I think are cool. I think optimized locking is cool. I also thought the optimized Halloween protection thing was cool, until it got pulled out because there were bugs in it, but I was really looking forward to that one.
Aside from that, when you look at the performance stuff in 2025, there’s nothing really all that interesting in there. There’s the OPPO thing (optional parameter plan optimization), which is built on the PSPO thing (parameter sensitive plan optimization), but the PSPO thing isn’t that great.
And they haven’t made the PSPO thing any better, so I don’t think the OPPO thing is going to be all that great. And then the few demos that I’ve seen about batch mode improvements for some of the mathematical functions, like SUM and MIN and MAX and AVG: all the improvements are measured in microseconds.
So I’m like, come on. I remember when JSON first came out in SQL Server 2016, there were actual Microsoft blog posts comparing JSON and XML for stuff.
And all of the demos ended with: and as you can see, there is a 300 microsecond difference, JSON is clearly superior.
And I’m like, get out of here, leave me alone. So I don’t know. I’m not all that crazy about SQL Server 2025 for this stuff. Clearly this is the Fabric and AI release.
This is not a SQL Server release. Nothing good happening here. The vector indexes are laughable at the moment, right? They’re read-only. I had posted the video about the code that currently runs when you create a vector index.
It’s like, who pushed that dead horse out the door? Oh, it’s amazing. Anyway, what was I going to say? Oh yeah: Bob Ward posted a picture the other day of the SQL Server 2025 book on a shelf with all of his other books, and the SQL Server 2025 book is like a pamphlet about this big.
All the other books are a good solid ribeye thick, maybe a healthy porterhouse thick. Those are steak measurements, in case you’re a vegetarian. But it was just funny.
It was like, damn, not much to say about that one. All right. Well, anyway, I dug a pretty good hole on that one. I’m going to have to send flowers.
You’ve already told us all the SQL books you like. What are the books about lifting (excuse you) that you like? Well, the two books that have made the biggest difference for me are the two Starting Strength books.
There’s the blue book and the gray book. One is about beginner barbell training, the other is about intermediate barbell training. I suppose I could put links to those in the video description if you’re interested in picking those up.
But they are excellent at describing how to do the lifts properly. I think it’s like 72 pages about how to properly do a squat, which, if you read and slowly absorb them as you’re doing those lifts, you can really feel why this is the right way of doing things, or at least, I think, a very optimal way of doing things.
And so it’s not only how to do the lifts, but also how to program; it takes a lot of the programming questioning out of the picture. There are of course templates in the book on how to do things, but past a certain point, you really should just get yourself a proper coach who is good at programming and makes good calls on these things.
Because you’re going to get certain rep and set schemes stuck in your head, and it’s going to be very difficult for you to figure out alternate ones and which ones would be appropriate at your stage of training.
So if you do find yourself going into the gym and getting quite advanced with the amount of weight that you can lift, and you find yourself hitting plateaus and not being able to meaningfully add weight to the bar any further, then getting a coach really is the next logical step for you there.
Anyway, I think that just about covers everything here. It’s been a pleasure as always. Thank you for watching.
I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video. I’ve got some interesting, at least I hope it’s interesting, stuff lined up for this week. So we’ll slog our way through that as we are slogging our way through yet another Monday of corporate doldrums and all the other stuff.
You know, there’s a reason that people used to drink at lunch, and it’s sad that we’ve lost that in our society. Anyway, thank you.