All About SQL Server Stored Procedures: Error And Situation Handling



Video Summary

In this video, I dive into the world of error handling and situation management within stored procedures, emphasizing their importance in SQL Server environments. Whether you're a seasoned DBA or just starting out, understanding how to handle errors and unexpected situations can significantly improve your stored procedures' robustness and reliability. I explore various scenarios where you might want to implement error handling, such as validating input parameters, managing empty result sets, and dealing with critical processing issues. Additionally, I discuss the nuances of using ROWCOUNT_BIG() for efficient row count checks and how it can be integrated into triggers. For those looking to support my channel or get deeper insights, I provide information on becoming a member, asking questions during office hours, hiring me for consulting services, and attending SQL Saturday New York City in May 2025. Whether you're a beginner or an expert, there's always something new to learn about stored procedures and error handling.

Full Transcript

Erik Darling here with Darling Data. And we have a very interesting topic for today's video, where we're going to continue on learning about stored procedures, because stored procedures are an important part of most DBA, performance tuning, and development work in most SQL Server environments. Granted, even if you are using an ORM, well, golly and gosh, you can call stored procedures from an ORM, successfully, in most of them. So they are worth talking about. Now we've covered a few things, and today we're going to talk a little bit about, well, as soon as I click on the right thing, we're going to talk about error handling. I'm going to tell you ahead of time: we're going to talk about some theoretical stuff around error handling. This is not just going to be specifically error handling, like, oh, I hit an error and now I need to handle it in some way, but also sort of situation handling with parameter values and local variables and stuff, because sometimes you might need to do a little bit of magic on those to make sure that nothing too screwy happens.

So it's not just error handling. It's bad situation handling in stored procedures. If you would like to support this channel, well, man, do I have good news for you. You can become a member of the channel and join about 60 other folks who have so kindly offered me up some money every month in response to the incredibly high quality content that I produce here for the masses.

It's a nice way to say thanks. Think of it as a little tip jar, except not one of those annoying ones where someone asks for a 90% tip for the croissant they put in a piece of tissue paper. With this one, you get better at your job in exchange for a little bit of money. Unless, of course, you've been shaken down by some criminal organization and they've already collected their debts, and you're missing a finger, your wallet, your life savings, your house, your car, all of your wife's jewelry, or whatever.

You can also comment, like, subscribe and all that other stuff. I recently started, or rather resumed, answering questions from the public, not just in comments and whatnot. If you want to ask me a question that will be answered publicly in my office hours episodes, you can do that by clicking on that link, which is again available just about there-ish.

Might be off by a few pixels, but I promise you I'll get you in the right general area. If you need help with your SQL Server, if you are in need of a consultant to help you with the health, performance, well-being, and sort of general attitude of your relational database management system:

You can hire me to do all sorts of things. And as the wonderful editors of BeerGut Magazine have noted for many years running now, I am the best SQL Server consultant in, well, three of the four hemispheres. All right.

I've got world domination in 75% of the hemispheres. At least until China steals my tech. And then we're all doomed.

They'll be DeepSeeked. If you would like to get access to my core training bundle, you can get all of that for 150 US dollars for the rest of your life. Again, it is available in the video description.

And I promise you, it's a wonderful deal. You pay the fee once and you have entry forever. SQL Saturday New York City is coming up May 10th, 2025.

It's taking place in the scenic, now congestion-free Times Square, New York City. So you can show up there, you can learn a whole lot more about data stuff, and you can leave a happier, maybe not healthier, person. Because you're going to drink, you're going to eat a sandwich.

You're going to eat a cookie, and you're going to drink a lot of soda. That is generally the tenor of how these days go. So that's what's going to happen.

I don't know that you're going to be healthier. You're also going to do a lot of sitting. There's not a lot of physical activity. You're probably not going to get your 10,000 steps in, but you can show up and you can learn. So you can break a mental sweat, for whatever that's worth.

With that out of the way though, let's talk about error handling stuff. Now, no talk about error handling would ever be complete without noting that Erland Sommarskog, our dear friend, our dear friend of the summer bog or something, has written a very long article on error handling. Just like Erland, it has three parts and three appendices.

So if you ever meet Erland, you can ask him about his three appendices. Don't ask him about his three parts. It's a private matter.

But if you need to learn the whole ins and outs of error handling, that's the best place to go. That link will be in the video description, just like all the other helpful links I provide to you. So I would go there if you are just completely new and lost on it.

What I want to talk about is the what, when, and why of implementing error handling, and situations where you might want to raise an error yourself because something has just screwed up along the way. It might not be that you hit a primary key violation, a foreign key violation, something like that. Stuff that throws errors on its own, like divide by zero, whatever it is.

There's all sorts of stuff that SQL Server would be like, whoa, that's an error. What I'm talking about are situations that might arise in your stored procedures that you might want to check for and say, hey, this happened. Bail out, right?

Don't go any further. Everything's going to be screwy. So the types of errors and conditions, or rather the outcomes of them that you might care about, are stuff like this. Depending on where the procedure is getting run, you might want to print or log an informational message and keep chugging along.

Right. You might want to say like, hey, like zero rows came out of this thing where like nothing else is going to happen. Let’s stop here.

Or like just log like zero rows came out of this thing. None of these results are going to be good. You could also do something like return that message to the user. If it’s something that they would actually see and be able to take action on.

You might want to keep going. You might want to roll back, or just kind of stop execution. You might want to just log the error somewhere and keep going.

Or you might want to use the type of error that you hit to retry the thing and move on from it. If you're not going to do any of this stuff, right? If you're not going to do anything at all with the error that pops up, there's really no sense in doing BEGIN TRY and catching anything.

Right. There's just not a lot of sense unless you're going to handle the error in some way, or raise it up to something else in some way. Otherwise, there's no sense in error handling.

Like if you say BEGIN TRY, BEGIN TRAN, and then there's just a commit, and nothing substantial happens in the catch block, you're just not doing anything with it. Then don't even bother. Right.

Like if you just say BEGIN TRAN, blah, blah, blah, COMMIT. And there's no what-if for that? There's no alternative. You hit something crappy along the way. Nothing rolls any of your stuff back.

Like you just end up with bad data, right? It’s not a good situation. So unless you’re going to like do something that like offers some kind of like secondary outcome to everything happening just fine. There’s no sense in you implementing any of the logic at all.
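Just to make that concrete, here's a sketch of the kind of pointless handling I mean. The table and column names are made up:

```sql
/* Anti-pattern: a TRY/CATCH that does nothing with the error. */
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.orders /* hypothetical table */
    SET    order_status = 'shipped'
    WHERE  order_id = 138;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    /* Nothing here: no ROLLBACK, no logging, no THROW.
       The error is swallowed, the transaction may be left open,
       and the caller thinks everything worked. */
END CATCH;
```

If the catch block isn't going to at least do something like `IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;` followed by `THROW;` to re-raise the error, skip the ceremony entirely.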

So just to start with a sort of simple example, let's say we have a stored procedure that accepts start and end date parameters. Right. So someone's like, well, I want to start a week ago and I want to end up with whatever happened today.

Right. So you've got seven days worth of stuff in there. But for various reasons, someone might make a typo, or someone might transpose the order of your parameters and put the wrong date in somewhere. All sorts of mental errors could happen.

Like they're working with a dropdown, and they think they picked the seventh of last month and the seventh of this month, but they accidentally picked the seventh of the month before last and the seventh of last month, or something.

Or they just messed stuff up. So the dates wouldn't return anything. You might want to think about helping them out in some way, or letting them know, or just something along those lines.

Now, just as an example here, this is something that I do in all of my analysis procedures. Because with the analysis procedures, I'm running them in Management Studio. I'm not running them from an application. There's no front end to them.

It's not like there's anything cool going on with them. I can make the decision to say, hey, I'm going to try to fix this for you. Right. So in my stored procedures, one thing that I do at the beginning is check for weird conditions with parameters, where things might not go the way people expect based on what got passed in. I usually have a simple check for something like: hey, if the start date is greater than the end date, here's what we can do.

So the first option is you can throw an error, right? Another is to print an informational message and then try to fix the parameters. I'll show you examples of that below.

Right. The other thing you could do is say, hey, why don't you try that over again? Because the start date is greater than the end date, that's not going to return anything.

There's no sense in moving forward. Like, hey, re-enter that stuff in the correct way. Right. So you have two options there. If you wanted to switch things around, you could declare a couple of local variables, store the current start and end date values in there, and then swap them around. Right.

So you could do something like that. Right. That's one option. The other thing you could do is mess with people and just give them random values. I'm not saying you should. It's an option.

Right. The other thing you could do is print an informational message, like: start date was greater than end date, we're going to default to something sane. So just make the start date seven days ago and make the end date a minute from right now.
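Pulling those options together into a sketch (this is not my actual procedure code; the procedure, table, and parameter names here are all invented):

```sql
CREATE OR ALTER PROCEDURE dbo.get_recent_orders /* hypothetical */
    @start_date datetime2(7) = NULL,
    @end_date   datetime2(7) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    /* Option: swap obviously transposed dates, and say so. */
    IF @start_date > @end_date
    BEGIN
        DECLARE @swap datetime2(7) = @start_date;

        SELECT
            @start_date = @end_date,
            @end_date   = @swap;

        PRINT 'start date was after end date; the parameters were swapped.';
    END;

    /* Option: default missing values to something sane (the last 7 days). */
    SELECT
        @start_date = ISNULL(@start_date, DATEADD(DAY, -7, SYSDATETIME())),
        @end_date   = ISNULL(@end_date, SYSDATETIME());

    SELECT
        o.order_id,
        o.order_date
    FROM dbo.orders AS o
    WHERE o.order_date >= @start_date
    AND   o.order_date <  @end_date;
END;
```

If you'd rather make the caller try again, replace the swap with a RAISERROR (or THROW) at severity 11 or higher, followed by a RETURN.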

So you get all that data for that week. You certainly have lots of options in there. There are other times where you might hit some critical processing issue, where maybe data from something else wasn't ready, or you didn't pick anything up.

That's new to do anything with. And you might want to stop there. So let's say there's some part of a query where you're like, okay, I'm going to put data that I need to work with into a temp table and then figure it out from there.

If you hit a situation where nothing comes from this, there's nothing in that temp table, there's no need to keep processing empty sets in empty steps. Right. There's no need to keep going like, oh, well, join to this, join to this, join to this.
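A sketch of that bail-out, with made-up table names. The important detail is that ROWCOUNT_BIG() has to be read immediately after the insert, before any other statement resets it:

```sql
CREATE TABLE #work
(
    id integer PRIMARY KEY,
    payload nvarchar(100)
);

INSERT #work (id, payload)
SELECT s.id, s.payload
FROM dbo.source_table AS s /* hypothetical staging source */
WHERE s.processed = 0;

/* Check right away: almost any other statement resets the count. */
IF ROWCOUNT_BIG() = 0
BEGIN
    RAISERROR('No rows staged; nothing to process. Bailing out.', 10, 1) WITH NOWAIT;
    RETURN;
END;
```

Severity 10 is informational; use 16 or higher if you want the caller's TRY/CATCH to fire.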

Just, it’s just a, it’s just a waste of time. So you would probably want to just throw that out to an error here. Now, there was an article recently on SQL Server Central and it was like, how do you use ADAT row count?

And I was like, what year is this? We should, everyone in the world should be using row count big, not ADAT row count. Row count big is the preferred method for this now.

It’s why Microsoft put it in the product. There’s no use to, no need to use ADAT row count any longer. Row count big, it’s the way to go. That’s also very critical for when you’re using triggers.

Now, this is not an error handling thing. Again, this is a situation handling thing, where at the beginning of every trigger, before you do anything else, just put this check in. If ROWCOUNT_BIG() is zero, bail out.

Right. There's no need to go further from here. No need to go into all the different steps and check inserted and deleted and whatever you do in your triggers. And try to do all this stuff and cursor over everything and audit data and all this other stuff.
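A sketch of that trigger check, on a made-up table. The check has to be the very first statement, because even SET NOCOUNT ON resets the row count:

```sql
CREATE OR ALTER TRIGGER dbo.orders_audit /* hypothetical */
ON dbo.orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    /* First statement, before anything else: updates and deletes that
       touch zero rows still fire the trigger, so get out early. */
    IF ROWCOUNT_BIG() = 0
    BEGIN
        RETURN;
    END;

    SET NOCOUNT ON;

    /* ...the real trigger work against inserted/deleted goes here... */
END;
```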

Doing all that work for zero rows is just useless. Right. It's no good. Now, when it comes to retrying things, there are lots of good reasons to do that. In my life, I've written retry logic into a lot of stuff.

And I'm going to share with you three different situations for doing that. Not completely different ways, just different examples of ways I've done it. The stuff that I usually handle is around either deadlocks or lock timeouts.

sp_getapplock is another one where you can do a lock timeout thing. And those are all stored over here. So here's just a loose sort of skeleton, right?

The outline of how I do this stuff is: you have some values here, and these can be parameters depending on how you're doing stuff, but local variables are fine for these. And what I usually do is say, hey, if my retry count is less than my max retries and success is zero, which it currently is, then I want to try to do this.

Right. So I’ll try to, you know, redo some step. This is just a stupid update query to have something in place here.

If this goes through and does its thing, then I set success to one and I won't re-enter that loop. Right. There's no need to. But if something happens in here, let's say I hit a deadlock, right? These are the error numbers that I typically look for with deadlocks, even though I think one of those is technically a lock timeout, which we'll see below.

These are the things that I typically look for with deadlocks. And if I hit one of those, I'm going to increment my retry count. And one thing that I like doing is this, where I set a delay.

So this is exponential backoff. Right. Every time I do this, I wait a little bit longer to retry, so I'm not just hammering in a loop real quick.

I want to wait a little bit longer, because I might be waiting for something else to finish. Right. So I might want to back off a little bit, exponentially, and then retry in there.

Right. So I'll set up this thing, and every time we enter it, the delay will get a little bit longer. And then we'll do the WAITFOR DELAY thing in here.

If it's not a deadlock or some other lock issue, then we'll just throw the error and be like, hey, I don't know what's going on. This is not going well. Maybe it's a primary key violation.

Maybe it's a foreign key violation. Some other constraint violation. Something else is happening here that is not good. We're not waiting on a locking problem anymore. We're done. And then of course, if success is still zero, we will just raise an error and say, hey, we tried as many times as we're allowed to.

We're done here. You could run this thing some other time. So that's one way for deadlocks. You can also do the same thing with a lock timeout. So again, the same starting thing and the same reasons for entering the loop here, but then we set a lock timeout.
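Here's a loose skeleton of the retry pattern described above, combining the deadlock (1205) and lock timeout (1222) checks with exponential backoff. The update is just a stand-in, and the table and column names are made up:

```sql
DECLARE
    @retry_count integer = 0,
    @max_retries integer = 5,
    @success bit = 0,
    @delay char(8);

WHILE @retry_count < @max_retries
AND   @success = 0
BEGIN
    BEGIN TRY
        SET LOCK_TIMEOUT 5000; /* milliseconds; raises error 1222 on timeout */

        UPDATE dbo.some_table /* stand-in for the real work */
        SET    some_column = 'some value'
        WHERE  some_id = 1;

        SET @success = 1; /* made it; the loop won't re-enter */
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() IN (1205, 1222) /* deadlock victim, lock timeout */
        BEGIN
            SET @retry_count += 1;

            /* Exponential backoff: wait 2^n seconds before the next try,
               instead of hammering the loop. */
            SET @delay =
                CONVERT(char(8), DATEADD(SECOND, POWER(2, @retry_count), 0), 108);

            WAITFOR DELAY @delay;
        END;
        ELSE
        BEGIN
            THROW; /* not a locking problem; re-raise and stop */
        END;
    END CATCH;
END;

IF @success = 0
BEGIN
    RAISERROR('Gave up after %d retries.', 16, 1, @max_retries);
END;
```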

And if we can't get a lock to do this, then this thing throws an error here. If the error number is 1222, then we'll start the retry process again. Now, one really cool way of doing this is with sp_getapplock.

With sp_getapplock, you can set a lock timeout right in there. So if you're not able to get the lock you want with sp_getapplock, which again is like a weird logical locking thing.

It's not a table or an index or any other kind of structure. It's just this in-memory placeholder, where SQL Server puts something in the lock manager that says, hey, go no further.

Like there's an exclusive lock on this thing; you're blocked there. Right? So sp_getapplock is great for that. Now, I want to make it perfectly clear.

sp_getapplock can and will cause blocking. It's right in the name. sp_getapplock. Lock, right? It's supposed to do what it does.

It's written out in the name. This is what it's used for. Take a lock and hold it so nothing else can get it. You can use it to serialize access to code blocks, all sorts of neat things.

If you're going to complain about sp_getapplock, that you tried it and it caused blocking, well then it did it right. Right.

It was doing what it's named after, like right there. It's very explicit. So if you're able to get the lock, the result from this would be zero or greater.

Then you would begin and do your stuff. The way that I typically use sp_getapplock is I make the lock owner a transaction, right?

But that means you have to do a little bit of transaction handling to support that. So if this completes, then you have to do the commit transaction here, right?

There's a begin transaction up here for all this, right? So this transaction is what owns the lock. And if this is successful, we commit the transaction after it completes. If the lock times out, then we have to go in here and do a rollback, because our transaction will still be open up there.

So when we get into here, we have to roll that back, so we don't have another transaction open. But then we can do all the exact same retry stuff down in here.

And we'll do the same thing where, if it's something else, we'll throw the error. And if we exhaust our retry attempts, then we'll note that here. So that's three different ways that I've done retry things, different examples of retry logic that I think have worked out pretty well.
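And a sketch of the sp_getapplock variant, with the transaction as the lock owner. The resource name and the work inside are made up:

```sql
DECLARE @lock_result integer;

BEGIN TRANSACTION;

/* With a transaction owner, committing or rolling back releases the lock. */
EXECUTE @lock_result = sys.sp_getapplock
    @Resource = N'nightly_batch', /* hypothetical resource name */
    @LockMode = N'Exclusive',
    @LockOwner = N'Transaction',
    @LockTimeout = 5000; /* milliseconds; returns -1 on timeout */

IF @lock_result >= 0 /* 0 = granted, 1 = granted after waiting */
BEGIN
    UPDATE dbo.some_table /* stand-in for the serialized work */
    SET    some_column = 'some value'
    WHERE  some_id = 1;

    COMMIT TRANSACTION;
END;
ELSE
BEGIN
    /* The transaction is still open, so close it before retrying or quitting. */
    ROLLBACK TRANSACTION;

    RAISERROR('Could not acquire the application lock; try again later.', 16, 1);
END;
```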

Now, there are a lot of processes that you might write, like especially batch processes that might require robustness in various ways that normal code wouldn’t. Batch processes often need certain error handling and logging methods built in.

You might want to retry certain things like I just showed you. You might need some self-healing, so that if anything fails, you can restart where you left off, rather than starting again from the beginning, failing a bunch of times because stuff's already done, and then moving on to the stuff that didn't get done.

I'm not going to talk much about that, because it can get really complicated depending on what you're doing and how you're doing stuff with the data.

But just as a sort of easy example, let's say you were building a process to migrate data from one really big, wide table to a bunch of normalized tables. You wouldn't want to have to redo a bunch of stuff in there.

You would want to be able to pick up from where you left off. Granted, you can't destroy the data in the base table yet, because people are probably still going to be using it until everything's in the normalized tables.

So there's a lot to figure out in there. Now, sort of recently across social media, I asked a question, because every time you talk about table variables, someone's like, oh, they survive errors and rollbacks.

You can do all this stuff with them. I'm like, oh, okay, fine. So I asked around: hey, everyone talks about this, but how many people have actually done it?

Because I never see anyone using table variables for this. I always see people using table variables for the normal dumb stuff where they should be using temp tables. So I asked the question: hey, you ever done this?

Like, what do you do? And most of what I got back was that people would use them either for extended, verbose logging of steps in a process, like an ETL process or something else, or to capture errors and log those to a table: they can roll back a bunch of work, but maintain what's in the table variable, and then dump the contents of the table variable into a logging table after the rollback.
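A minimal sketch of that pattern (the log table and the failure here are invented): the rollback undoes the real work, but the table variable keeps its rows for the catch block to persist.

```sql
DECLARE @log table
(
    logged_at datetime2(7) NOT NULL,
    step_name nvarchar(100) NOT NULL
);

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT @log (logged_at, step_name)
    VALUES (SYSDATETIME(), N'step 1 starting');
    /* ...real work for step 1... */

    INSERT @log (logged_at, step_name)
    VALUES (SYSDATETIME(), N'step 2 starting');
    /* Pretend step 2 blows up: */
    RAISERROR('Something broke in step 2.', 16, 1);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK TRANSACTION;
    END;

    /* Table variables ignore the rollback, so @log still has both rows. */
    INSERT dbo.error_log (logged_at, step_name, error_message) /* hypothetical */
    SELECT l.logged_at, l.step_name, ERROR_MESSAGE()
    FROM @log AS l;
END CATCH;
```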

The dump happens before the catch block completes, which is fine. One thing that I think would be great with table variables is if you were able to capture bad data with them. But I would never use them as the base of an ETL process.

Because table variables mess up a lot of stuff, right? You can't do a parallel insert into a table variable, which hurts in a lot of ways. If you're really doing ETL, where you're moving a lot of data around, that's bad news.

And of course, there are the bad base cardinality estimates. Prior to the table variable deferred compilation stuff that I've talked about, you'd get either the one-row estimate, or you'd have to use a recompile hint. There's just a lot missing from table variables that would make them good for large-scale ETL processes.

On top of that, you get no distribution histograms for them. There are no statistics on them. So they just fail in a lot of ways. If they had that stuff, well, nothing would distinguish them from temp tables, but all of that would be useful.

But one thing where I get annoyed with table variables is, let's say I wanted to put data from some ETL process into another table, and I also wanted to catch a typical primary key constraint violation, right? I wanted to know which rows messed things up. Let's say I create this table and I do something like this: I declare a table variable called survivor, I insert stuff into my table here, and I output the inserted data into the table variable. And then I can select from the table variable afterwards on the first run.

This is fine, because there's nothing in the table, right? I just created it. But on the second run, we hit an error with a constraint violation, and nothing returns from here.

So what I think would be a really useful use case for table variables, which SQL Server doesn't have as far as I'm aware, is if there were a way to output, but say, errors only. You could get the data that caused a problem into the table variable before the whole thing fails. It would be nice if there were a way to say output errors only into the table variable, so that you could figure out what data caused the problem, and then log that or return it somewhere and be like, hey, this is what screwed up. This is what you have to watch out for.

But of course, we don't have that, and I don't know if we'll ever get it. I don't even know if that's a thing anyone's thought of before, but I think it would be very useful. Anyway, that's all I have to say there. So thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I hope to see you over in the next video. Goodbye for now.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Formatting



Video Summary

In this video, I dive into the world of T-SQL code formatting, sharing my strong opinions on best practices that can significantly improve readability and maintainability. I start by addressing some common issues like tab usage (which I strongly dislike), leading commas, and inconsistent capitalization—both all uppercase and all lowercase. These habits, while seemingly minor, can lead to confusion when trying to understand complex queries or stored procedures. I also emphasize the importance of proper indentation, new lines for logical sections, and grouping related code together to make debugging and maintenance easier. By organizing code in a structured manner, you not only improve your own understanding but also make it much simpler for others who might need to work with your code in the future.

Full Transcript

Erik Darling here with Darling Data. And I've got a big smile on my face today, because me and Bats are going to be talking about one of our favorite subjects, and that is code formatting. Now, code formatting for me, of the T-SQL variety, essentially comes in, well, I guess you could call it three measures. One is the way that you write the code. It's a query, right? The way that you indent things, capitalize things, and so forth. Place your commas, right? And also the way that you organize code, so larger groups of code, which you might associate with something like a stored procedure. You are doing all of the things that sort of belong together, all together. So when you declare variables, you're doing that all in one place. If you're setting those variables equal to something else, you're also doing that as much as possible all in one place. We'll talk about an exception to that. You're creating temp tables, you're doing that all in one place, and so forth. Because one of the worst things that I run into when dealing with client code is stuff that's been added on over the years, where you're reading through a stored procedure and you're like, oh, I've accounted for all these things. And then you get 2,000 lines in and you realize some other knucklehead has added all these other things, and declared stuff, and created tables down through the second half of the procedure.

And now it’s just like this mishmash, right? It’s just like it makes things a lot harder to follow. And when you were trying to figure out like the things that are in play for like what you have to work with, it is much easier and much more convenient to have everything as much as possible together. So that’s what we’re going to be talking about in the formatting episode of our journey through SQL Server Store procedures. But before we do all that, of course, we have to talk about the normal sort of stuff and how you can engage with your dear friend, Erik Darling, more. If you would like to support this channel, helpful little link right in the video description. You click on it, you give me four bucks a month, we’re all happy. You can choose more than four bucks a month if you’d like. But one pre-tax cup of one pre-tax single shot of New York espresso is just about as low as you can go there.

If you are for some reason opposed to caffeinated beverages, you can comment, you can subscribe. And if you would like to ask me a question that shows up in an Office Hours episode, there's a link also in the video description that will take you to this thing, and you can submit your questions. If you are beyond the help of YouTube lessons and basically anything else, and you need help performance tuning your SQL Server, I am available to do all of these things. And as always, my rates are reasonable. Also reasonable are my rates for SQL Server training. I've got about 24-plus hours of it. You can get it for about 150 USD. That belongs to you for the rest of your life. No backsies. Don't want it back. You've probably spit on it.

As far as upcoming events go, SQL Saturday New York City 2025 is taking place May the 10th. You should attend, and if you are a company watching this, you should sponsor. And we have a pre-con on May the 9th, all about performance tuning, with Andreas Wolter. He's a smart fella who used to work at Microsoft. Probably the smartest thing he's ever done is leave Microsoft. But you know, we're all happy to have him back in the real world. So, with that out of the way, let's talk about code formatting here. Now, that was the wrong tab to have up front. So, first, there's a list of things that I have rather strong feelings about.

I strongly dislike tabs strongly dislike tabs I realize that there every time I talk about my dislike for tabs there is a chorus of people talking about the the visually impaired and how when they use tabs they can do things better and and if you are in that situation if you’re in that condition and you and you want to use tabs I don’t blame you I used to work with a fellow who did QA he was in real tough shape and in the eyeball department you would walk by his desk and there would be like a single letter on the screen like I don’t I don’t know how he got anything done but he was he was very good in the QA in the QA department apparently he had he had worked out whatever whatever he had to get whatever he had to work out in order to to to be very proficient in his job but for me when I’m writing t-sql and let’s let’s be honest most of the time I am maintaining this t-sql on my own there are there are a handful of contributors who I am ever grateful for but I find tabs or more specifically when tabs and spaces get mixed to be quite offensive there’s nothing worse than when you’re trying to like format and line up your code and you’re like you know you hit enter and then you like delete two spaces and then you go to delete that third space and it jumps back like a fight two full tab stops for some reason you’re like a lot of stuff going on in there other things that I find offensive are leading commas anyone who says it makes taking things in and out of the select list easier has naked never taken the first thing out of this select list so that that causes problems for the 40 other columns you’re selecting things that I dislike but perhaps feel somewhat less strongly about all uppercase and all lowercase all lowercase of course just looks strangely juvenile and all uppercase looks like Oracle shouting I really dislike when new things are not on new lines so things like from join on where and or order by and group by I guess I guess partition by should be in that 
list, too. I think all of these things are deserving of their own lines, so that you can figure out exactly where things start and end. It becomes very confusing when you're reading a windowing function or something and it's just ROW_NUMBER, OVER, PARTITION BY, ORDER BY all across one line. These things deserve your attention. Or select queries where it's a long select list, FROM table, and then way down here there's the WHERE; how did you get down there? Put stuff where everyone can see it. It's especially helpful because you are generally reading T-SQL in a top-down fashion, and often the list of columns in the select list is not quite as important as figuring out which tables we're selecting from and joining to, things like that. So this stuff really does deserve new lines. A lack of indenting in T-SQL makes things incredibly confusing, and proper indenting really does make it a lot easier to figure out which blocks of things belong together, especially with BEGIN and END blocks or parenthesized contents. It is much easier to figure things out when things are indented and you can see that this block belongs to this thing. Leaving AS out of table aliases is just somewhat irksome. There was actually kind of a funny story here: there's one client I was working with, and we were looking at their code, and we realized they were using the old-style hint, FROM table NOLOCK, right, not WITH (NOLOCK), except NOLOCK was spelled wrong. And they were like, wait, how does that even work? And I was like, because SQL Server is not interpreting NOLOCK with two Ks as a hint; it's interpreting it as a table alias. So that was kind of funny. I strongly prefer the column = expression aliasing, because it lines up all the column names where you can read through them and see where things are. When it's
like, you know, you write out this whole complex long expression and then it's AS something, the context switch to look all the way over here to see a column name, when all the other column names are over there, is not so great. TOP expressions without parentheses should be banned; I hate those. Aligned parentheses tend to work out much better, because it's a lot easier to mentally grasp what belongs in a group of things. When you do stuff like this and give things weird little hugs, it just looks strange; it doesn't look good. There are other things that I will usually tweak and tinker with as I'm formatting code, but that's about the list of the most important stuff for me. When you think about formatting code, though, like I said earlier, it's not merely about the indenting and the grouping and stuff like that. When you're writing code in a procedure, it's really helpful to have everything with a certain action (declaring variables, setting variables, creating temp tables, stuff like that) done all together, so you know which section of code is responsible for it. If there's ever a problem, you know where to go. If you're declaring variables 2,000 lines into a stored procedure, or creating a table 3,000 lines into a stored procedure, and there's ever a problem with it, it's like: okay, well, it's not up with the rest of them, now I gotta go way down here to do stuff. It's annoying, and not a good way to spend your time. One thing I will lend some leniency to in these regards is SELECT INTO, and the reason I will allow some leniency here is because there are certain times when you just want to do a simple SELECT INTO; it makes life easier. Perhaps there is some
other benefit to it. Maybe, for some reason, creating the table and doing the insert is just not getting you a fully parallel insert, even with a TABLOCK hint; SQL Server gets weird sometimes. So sometimes a SELECT INTO is necessary, and if that's way down in a stored procedure, I get it. It's not completely off limits. The other thing that I am lenient with is declaring cursors, and I'm going to show you an example where I was even lenient with myself when it came to declaring a cursor. It is perhaps extra strange for me, because I strongly prefer cursor variables over regular named cursors. The cool thing about cursor variables is you don't need to close or deallocate them. Technically you only need to deallocate, because DEALLOCATE closes too, but when you use cursor variables you don't have to remember to do either one. SQL Server scopes them like a normal variable and closes them out when you're done with the block of code. So let's take a look at my stored procedure sp_LogHunter. I'm using this one because I think it is the shortest of my procedures, and that's good for all of us, because I've already been talking for almost 13 minutes and we haven't looked at any code yet, and I don't want to spend a half hour going through sp_HumanEvents or sp_QuickieStore, where there is an abundance of code. Though I would say that if you ever want good examples of formatting dynamic SQL, there are very good examples in those procedures. So here's sp_LogHunter. We talked about ANSI settings before. Like I said, I put a lot of this stuff at the top because it's stuff people should be aware of and care about when they're doing things with stored procedures. Since I'm never going to be calling this from an application where it might have different ANSI settings, nor am I overly concerned about filtered indexes, computed
columns, or indexed views, it doesn't really matter where I put this. Then there's a little bit of green text, a little bit of information about what's going on in here, and where to go to get support, even though everyone just sends me emails anyway. And then of course we have our parameters, and I like to document the parameters up here where they actually live, in case you're just looking at the script file. But I also like to do that in a little help section. SQL Server has a couple of very helpful views: sys.parameters, sys.all_objects (well, you could do sys.objects), and sys.types. What you can do is actually get the list of parameters for a specific stored procedure and put them in a help section, where you can just say, hey, this is what this means. You can put all sorts of good information in there, so I like to include a description, valid inputs, and default values, so everyone knows where they're going. Then MIT license stuff, because I'm cool like that. And then, moving into grouping stuff where it belongs: before I get into most anything else, what I want to do is make sure that I am conditionally able to execute the procedure that I'm running. Stuff like checking to make sure that we're sa; that's a good way to figure out if I'm going to be able to run this thing. If we're on RDS, I can't read the error log the same way I can when I'm on regular SQL Server. Even Managed Instance does allow you to get the error logs, I think; I'm pretty sure it does. But Azure SQL DB does not. RDS does allow you to get to the error log, but it's through this other procedure that doesn't have the same parameters for searching stuff in there. I am thinking about working out a fix for that, where I would just read in the whole error log and then do the search off that, but I don't know; we'll see. But here's our check for Azure SQL DB, where we will
not attempt to run if that's going on. There's one other check that results in a return, that results in a full error, and that's if someone is searching for a language ID that does not exist. So if you search for language ID 999999 or something, that's not going to happen for you. But then there are fixes to stuff, right? This is where, if parameters are set to odd values, I'll attempt to psychically correct them so that the stored procedure runs without any weird mixed-up stuff going on. There's a bunch of that, and then, down a little bit further, this is where I declare the variables that I need to do different things in the procedure, and this all happens in one place. Since I don't have a whole lot going on in this one, there's a very limited list of variables in here, and I can set those very easily here, where I care about them. Down here is where I create my temp tables, and this all happens in the same place: all the temp tables get created together. And then we go and do some work; this is where we actually start doing things in the procedure, looking at what we have available. I talked about this in another video: this is all on one line, just because it's a debug statement. Maybe not the most consistent thing, but sometimes it is a little more convenient to just stick a debug thing in like this, so that your stored procedure isn't 10,000 lines of perfectly formatted debugs. This isn't too bad, but something I like to do is make sure that when I am debugging things with a temp table, I include the table name in what I'm selecting, so I know what results I'm looking at. Because, like you saw, I had six or eight temp tables up there, and what I don't want is a bunch of selecting from temp tables without knowing where the results are coming from. Especially if
one is empty: what do you do then, right? It's going to be hard to figure out what you're dealing with. Some pretty boring stuff in here. There was something else I wanted to get down to; I believe it was the cursor variable, and that is right here. This is another place where I'm sort of lenient about when things get declared and set up: I declare the cursor variable here, and I set the options for it here. It just makes a little more sense, with stuff like this, to have it declared where it's going to be used. There aren't a lot of other cases where I feel the same way, but for me, declaring cursors is one of those things, because there's not a lot of sense in declaring a cursor way up at the top and then having someone be like, well, wait a minute, when does this cursor happen? So for me, declaring the cursor right around the looping code is the smartest thing to do; that makes the most sense to me. Then there's more helpful debug stuff. This is a multi-line debug; I do have the BEGIN and END and everything on their own lines. Down here there's just a little bit more debug stuff. There was something else I was supposed to be looking for in here. Well, yeah: when we return the results, right, that's sort of the important part. When we're returning the results, again, make sure that the table name is in there, so I know that this is the result of everything that got put into the error log table, all the messages we cared about. This whole big block of stuff way up here, these are all the messages that I want to search for in the error log. This is the important stuff, where it might be something wrong, or might be good informational stuff. Down here is where I select out of that
table. And then, if we hit any errors in the stored procedure, I log those to a temp table; if any of the command statements error out in there, those get caught here, and I return them only if there's anything in there. So, apart from the actual query formatting stuff that I feel strongly about (tabs, commas, new lines for things, uppercase, lowercase, parentheses, indenting, things like that), it really does make your life, and the life of everyone who has to deal with your code, a lot easier if you put everything that performs a specific function all together. Again: declaring variables, setting variables, creating temp tables, table variables, things like that. It really does behoove you to do those things, so that when other people go in to work with your code, they're aware of just how much of these things they're going to have to look at and work with and deal with. It also just helps to be well organized when you're writing larger, complex pieces of code, because there's probably going to be enough complicated query logic without a whole lot of disorganized code logic as well. So anyway, that's how I feel about formatting. I hope you enjoyed yourselves, I hope you learned something, and I hope that you will take some of this to heart: when you're working with queries and stored procedures, you'll start to organize and format and get things looking a little nicer and tidier, and contribute to nice, clean, readable code. You know, when folks out there in the world start talking about how CTEs make things more readable: they don't. They sure don't. Nicely formatted code makes things a whole lot more readable and understandable than all of the CTEs in the entire world. So thank you for watching, and I will see you in the next video, and
goodbye
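To make the preferences above concrete, here is a small sketch. The table and column names are invented for illustration; the point is the layout: new clauses on new lines, aligned parentheses, column = expression aliasing, AS on table aliases, a parenthesized TOP, plus the misspelled-hint gotcha from the client story.

```sql
/* Invented tables; the point is the layout:
   keywords on their own lines, aligned parentheses,
   column = expression aliasing, AS for table aliases. */
SELECT TOP (100)
    user_id = u.id,
    display_name = u.display_name,
    total_score =
        SUM(p.score) OVER
        (
            PARTITION BY
                u.id
        )
FROM dbo.users AS u
JOIN dbo.posts AS p
  ON p.owner_user_id = u.id
WHERE u.reputation > 1000
ORDER BY
    u.reputation DESC;

/* The misspelled-hint gotcha: this is not a locking hint.
   SQL Server reads the misspelled NOLOCK as a table alias,
   which is exactly why AS on aliases is worth keeping. */
SELECT posts = COUNT_BIG(*)
FROM dbo.posts NOLOCKK;
```

Writing `AS` on every table alias makes the second query's mistake jump out immediately, since a bare word after a table name then always looks suspicious.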

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 2



What evils come of vertically partitioning a table into multiple tables? I have bad history with two of the alternatives. The first being long thin tables with foreign keys to what ought to have been their columns and the second being tables with too many columns.
How can I keep our devs happy by avoiding stored procedures and embracing their preferred ORM (GORM in this case)? Not a big fan of ORMs.
I got a table with only 2 columns, bigint and varbinary. For some reason app inserts a row in the table and after that updates the varbinary field with a value (likely a blob). I’m seeing blocking because of this. Any recommendation to mitigate it?
Will you be at SQL Bits this year?

To ask your questions, head over here.

Video Summary

In this video, I discuss some common issues and advice related to SQL Server development. Starting off, I address the topic of vertically partitioning tables into multiple tables, explaining that while it can be beneficial in certain scenarios, overly wide tables or entity attribute value (EAV) tables with too many columns can lead to performance issues due to increased complexity in indexing and query execution. Moving on, I tackle a question about keeping developers happy by avoiding stored procedures and embracing ORMs like GORM. I share my perspective that while ORMs are useful for development, they often hit performance bottlenecks when dealing with complex queries or specific database operations. I suggest allowing developers to use ORMs until they encounter these issues, at which point offering stored procedures as a solution can help maintain their satisfaction and address performance concerns effectively.

Full Transcript

I had forgotten to turn on my microphone. Thankfully, I remembered before I started making any mouth movements, so we're going to just roll with this one. That's okay; I'm not too concerned about that. Anyway, we are going to do our second Office Hours episode, and we're going to change up the format slightly. Rather than read directly from the Google Sheet, because things have been coming in much more frequently now and I don't want new stuff to show up in the middle of a reading and get distracted, I've pasted the chunk of questions that I'm going to answer into an Excel file, and I'm going to read from the Excel file. The other nice thing you can do is become a member of the channel for four bucks a month. If you don't have four bucks a month for various socio-economic reasons, you can comment, you can subscribe. These are all things that are free, aside from the brief spasm of muscle that it takes to click something. If you want to ask questions that show up on these Office Hours episodes, you can do that by going to this link, which is also included in the video description. It's going to be a recurring theme of all of these things, and you can put your questions in there and I will eventually answer them.

Oh, there’s that. If you need help beyond just asking an Office Hours question, if you are just absolutely stumped with some SQL Server thing, you can hire me to do all of this stuff. I am in fact a consultant. I in fact get paid to consult. That is the majority of my income. It is not YouTube.

So, there we are. And as always, my rates are reasonable. If you would like some other reasonable things from me, like training, you can get that for about 150 bucks for the rest of your life. It’s a pretty good deal. Again, fully assembled link, video description. Click it, buy it, be happy.

Upcoming events, we have SQL Saturday, New York City 2025, taking place on May the 10th, with a performance tuning pre-con on May the 9th. I know we’re missing that awful joke from that burned out franchise by doing it on these dates, but I am fine with that.

So, with that out of the way, let’s office our hours here. And we’re going to talk through these questions. And first up is this one. This is a real humdinger. What evils come from vertically partitioning a table into multiple tables? I have bad history. Oh, you ought to clear your browser cache then, pal.

With two of the alternatives, the first being long thin tables. For the folks playing along at home, those are going to be called entity attribute value tables, or EAV, like elephant, apple, victor. With foreign keys to what ought to have been their columns, and the second being tables with too many columns. Well, I think the problem that you would traditionally have would be with the tables that are too wide. The tables that are too wide fall into the category of generally needing to be vertically partitioned. Remember, V's go up and down; that is this way.

Horizontal goes this way, right? Yeah. So, what you would want is a way to move those columns out into a new table and go through a process we in the database world call normalization. You can usually tell you're in a position where you need to normalize tables when you have one of two types of columns in your table. One of them is going to be columns that share a prefix. Let's say you have an orders table and you have mistakenly decided to retain a whole bunch of customer information in your orders table; you might have a bunch of columns that say customer underscore something. I call this tables within tables, right? So when you have that, that is a sign that you need to normalize things out. Overly wide tables cause lots of problems in SQL Server, because the wider your table gets, the wider your indexes need to get to cover the columns that people need, and generally the more indexes you need, because you're going to have people who need to filter on a certain set of customer details and then return a certain set of columns. It just adds too many different combinations of things that can possibly happen in a query from one table to really be able to sufficiently index for. Sure, there's always the clustered or non-clustered columnstore thing, but those aren't always the best idea for an OLTP-type scenario where you would have that kind of information in there. So, I have no problem with the normalization thing. You can run into, or rather I have run into, many schemas in my life where the stuff was overly normalized, and a query to return a few simple facts about a thing took like seven joins. Now, there is a certain nirvana to the, let's call it the perfectly normalized schema, where you will never have to update a column.
You will only have to insert, because any information added to the database will be new information for a new table.
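The tables-within-tables smell and the vertical split can be sketched like this; all of the table and column names here are invented for illustration:

```sql
/* The "tables within tables" smell: customer details
   retained inside the orders table. */
CREATE TABLE dbo.orders_wide
(
    order_id bigint PRIMARY KEY,
    order_date datetime2 NOT NULL,
    customer_name nvarchar(100) NOT NULL,
    customer_phone varchar(20) NULL,
    customer_address nvarchar(200) NULL
);

/* Normalized: the customer_ columns move to their own table,
   and orders keeps only a key pointing at it. */
CREATE TABLE dbo.customers
(
    customer_id bigint PRIMARY KEY,
    customer_name nvarchar(100) NOT NULL,
    customer_phone varchar(20) NULL,
    customer_address nvarchar(200) NULL
);

CREATE TABLE dbo.orders
(
    order_id bigint PRIMARY KEY,
    order_date datetime2 NOT NULL,
    customer_id bigint NOT NULL
        REFERENCES dbo.customers (customer_id)
);
```

With the split version, a change to a customer's phone number is one update in one narrow table, instead of updates scattered across every order row that duplicated it.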

There will be no need to modify existing information in a user table past a certain point. The EAV tables are a separate thing. I am a big fan of EAV, or entity attribute value, tables; not when columns share a similar prefix, but when columns start getting suffixed with things, especially numbers. An easy way to think of it: phone one, phone two, phone three, email one, email two, primary address, secondary address, stuff like that. Because, as you know, that's a great way to end up with a table that's real wide, where most of the columns aren't used by most people. You added a whole new column because one customer wanted to add one field to something; that's a bad time for everyone. I really don't believe in that, and I hate sparse columns in SQL Server. They are a complete waste of everyone's energy, and they don't work with index compression. You cannot page or row compress stuff in a table where sparse columns are involved, and I hate them for that. Whoever from Microsoft decided to make that a point of contention with interoperability between the two features was a real knucklehead. I hope they get dragged out into the street and left there. So I don't have any problems with entity attribute value tables, because what they make real easy is this: if someone needs a new piece of information added, all they have to do is add a row to a table. They don't have to add a whole new column that impacts everyone else. That's a much nicer way of doing stuff. So I don't have any problem with the long thin table, the entity attribute value table, nor do I have any problem with normalizing out data to a certain point.
I'm not saying you have to get to the normalized nirvana where your workload is basically insert-only; that takes a certain level of skill and cunning, and most people are hopeless at getting there. But I think it is kind of a cool goal for some workloads. All right, so hopefully that answers your question. Now, the next question we have is: how can I keep our devs happy by avoiding stored procedures and embracing their preferred ORM, GORM in this case? Not a big fan of ORMs. Well, I think it's funny, because developers will never be happy. Developers are not naturally happy people.

They are a deeply envious, perpetually unhappy crowd. You just can't keep them happy. They're difficult, right? They always want to be working with something newer, something shinier. They always want something different. They always think that whatever is currently in place is a mistake, and that if they had their druthers, they'd do something different and get something different, and everything would be better, right? It's this constant chasing of utopia that developers have, and they are rarely right about it. Now, one thing you can do is let your developers be happy with their ORM until they're unhappy with it. The approach that I usually take is: you know what, you want to use an ORM, you want to write all your queries with it? Cool.

The second you are unhappy with the performance of a query, then we are going to talk about something that's not your ORM, because your ORM is going to be tremendously inflexible and have all sorts of limitations placed on it as to how you can actually run your queries to get them to perform well.

When you think about something like Entity Framework, the number of problems that could be solved with an Entity Framework query if you could just use temp tables naturally with it is tremendous. There are lots of things you can do in the context of a stored procedure that use more advanced, let's just say, SQL generally, because ORMs obviously can be used across a wide variety of databases. There are all sorts of things you can do in the context of, let's just say, a non-ORM query, right?

Whatever protocols your database uses; SQL Server, of course, is big on the stored procedure, so let's just stick with that. There are all sorts of things you can do within a stored procedure to fix the performance problems that you end up with when you're using ORMs that generate those big monolithic queries. A lot of people out there have had a developer come to them like, oh, this query sucks, I don't understand why, and then you look at the SQL that it generates and you see something like SELECT Extent40920, and you're like, whoa, that's a big query.

Yeah, SSMS chokes up and freezes when you try to look at the query text; forget the query plan, that's out of the question. So there is a certain breaking point with ORMs where they are no longer the best approach for things. My advice would be to let your developers use their ORM until they are unhappy with something that is happening with it, and then offer them the alternative of a stored procedure. It may not be your ideal scenario, but the thing is, you're not tasked with doing the development for the app; you are tasked with some sort of administration thing. The developers are tasked with new features, fixing features; they are tasked with, I don't know, other developer crap. I don't know.

The usual stuff. And they need to be able to do their jobs with a certain amount of comfort and happiness, whatever they might actually get out of that. If it makes it easier for them to do their jobs using ORMs, that's great. But at a certain point, when you hit those performance thresholds where the ORM is no longer generating queries that can be handled well, then alternatives like stored procedures are where you can come in and say: hey, I've got some ideas, let me help you out. You can get them to be happy most of the time, and then they can be like, well, you know, that stored procedure really saved our hide, so we'll just be okay with it.

Just one thing I think is important to remember about developers: if they are application developers who do not spend a lot of time in the database, the T-SQL that they would write would be garbage anyway. It would not be good. At this point, let's be honest, it would come from an LLM, and LLMs are just as good as your most mediocre developer. So you're better off letting them develop where they're comfortable, and then, as soon as they hit a breaking point with performance, or something else isn't working the way they want it to (it could be something related to isolation levels, or some query hinting that needs to be done), that's when you can offer up a deeper level of expertise within a stored procedure and say: hey, I can fix this stuff with this. Let's take this one thing out of the ORM, and you'll live with the rest of it. So that's my advice there. Next up we have: I got a table. You got a table? Where did you get this table from?

Where did you got a table? I don't know. Someone else give it to you? You steal it? Where was this table got? It is a bigint and a varbinary. I'm going to assume that's varbinary(max), based on information that comes along later in the question.

For some reason, the app inserts a row into the table and after that updates the varbinary field with a value, likely a blob. That's why I was guessing that it's varbinary(max). I'm seeing blocking because of this. Any recommendations on how to mitigate it? Well, it's certainly an interesting pattern. I don't necessarily have a problem with that pattern off the bat. It is a little bit strange to not just insert everything all at once, but perhaps there was some good reason for that historically; I don't know. Most blocking problems do come down to indexing, though. I'm assuming that this table is either a clustered table, meaning it has a clustered index on the bigint field, or, perhaps somewhat less likely, it has a non-clustered primary key on the bigint column, and the bigint and the varbinary(max) are left in a heap structure. I would say that your better bet there would be the clustered table, because you would only have one object to modify; even though you would still be doing two modifications, you would only have one object to modify, and that might be slightly better. My bigger concern is that there's something more going on here, though. Even in a worst-case scenario, where you are inserting a row, which in this case would be one bigint (or maybe you haven't even reached bigint status yet; maybe you're still inserting ints), let's say that you're inserting one bigint row with a NULL in the varbinary, probably max, field. And then you are updating the varbinary field based on, let's just make it easy, let's just assume that it's an identity value: you've returned SCOPE_IDENTITY() or something, or you output the identity values from something, and then you do that update there.
If you're only doing that to one row, the blocking shouldn't be that profound. It could pile up a bit, but I typically wouldn't expect to see it going on that long unless something else were happening.
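To make the pattern concrete, here is a guess at the shape of it; the table and column names, and the identity-column assumption, are mine, not the asker's. The clustered primary key on the bigint column means both statements touch a single b-tree rather than a heap plus a non-clustered index:

```sql
/* Hypothetical version of the two-column table from the question. */
CREATE TABLE dbo.app_blobs
(
    id bigint IDENTITY PRIMARY KEY CLUSTERED, /* one object to modify */
    payload varbinary(max) NULL
);

BEGIN TRANSACTION;

    /* Step 1: insert the row with an empty payload. */
    INSERT dbo.app_blobs (payload)
    VALUES (NULL);

    DECLARE @id bigint = SCOPE_IDENTITY();

    /* Step 2: fill in the blob afterward. */
    UPDATE b
       SET b.payload = 0xDEADBEEF /* stand-in for the real blob */
    FROM dbo.app_blobs AS b
    WHERE b.id = @id;

/* The commit is the part that must not be delayed:
   a session sleeping before this line is what piles blocking up. */
COMMIT TRANSACTION;
```

Done this way, the locks held are narrow and brief; the blocking described in the question points at something keeping that transaction open longer than this.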

The something else happening would perhaps be transactions not being committed in a timely manner; perhaps sleeping transactions leaving things open for too long, and that's why things pile up. If this were really, really high concurrency, I suppose that you could be having last-page contention issues on the bigint table, but that information is left purely to my imagination. So I would be more concerned that there is something else going on. Not necessarily something going wrong between the begin transaction, insert a row, update a row, commit; I would be concerned that something is not getting to that commit when it should. It could be an application bug, it could be some sort of data-return thing. I don't know.

So the first thing I would look at is indexing. The second thing depends on how you're catching the blocking: if the blocking goes on long enough that the blocked process report catches it, I would want to look for sleeping sessions in there. If the blocking doesn't go on long enough for the blocked process report to catch it, or the blocking is very short and you're catching it with something like sp_WhoIsActive, or maybe a monitoring tool, then I would look for the sleeping-session stuff in there. Because a lot of the time, when I see what looks like a very simple interaction with a table and there's a whole bunch of blocking going on because of it, it's usually because transactions are not committing when they should. So that would be up to you and the app developers to figure out. All right, the final question — we're going to close out the month of January with this one. Will you be at SQLBits this year? And the answer is no. SQLBits has done some interesting experiments with their paid training days the last couple of years. Year before last, they changed the payment structure so that pre-con speakers would max out at, I think it was, 5,000 pounds, which — I don't know what the current exchange rate is, but let's just say that's somewhere between five and six thousand USD.
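If you want to hunt for those sleeping sessions with open transactions yourself, a quick DMV query along these lines is one way to do it — this is a generic sketch, not anything shown in the video:

```sql
/*Find sleeping sessions that still have an open transaction.
  These are the usual suspects behind "simple" modifications blocking.*/
SELECT
    s.session_id,
    s.status,
    s.open_transaction_count,
    s.last_request_end_time,
    s.host_name,
    s.program_name
FROM sys.dm_exec_sessions AS s
WHERE s.status = N'sleeping'
AND   s.open_transaction_count > 0
ORDER BY s.last_request_end_time;
```

A sleeping session with an open transaction that last ran a request minutes ago is exactly the "something not getting to the commit" situation described above.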

Which made it economically not feasible for me to go to London for a week, because I would lose my shirt on air travel and hotels and food and not working here. There's a lot of money that goes out the door to attend a conference. And before anyone chimes in that they can find a ticket from America to London on, like, Question Mark Airlines, sitting in the cargo hold for 300 bucks — that is not how Darling Data flies, okay? Especially when we're going internationally to teach classes and have to be happy and peppy for a week, we get the nice seats. So SQLBits made it impossible for me to go year before last. I didn't even submit this year. They've changed things again — I'm not exactly sure what the payment structure is for pre-cons now, but what they changed is how they do things. Normally pre-cons happen pre the con, right? Before the conference, there are two or three days of paid training, and then the regular conference. SQLBits has changed it so that they're offering basically a four- or five-day conference with training days taking place on every day.

I don't know what they changed the payment structure to, but one of the other things they changed is that they are curating their speakers now. They're having some elite team of technologists and visionaries and luminaries and all that other stuff curate their paid presenters. And this year, sadly, I was not curated.

There was some sort of lapse in the curation team's judgment, and I was left unemailed. So if you're a SQLBits attendee and you would like to see me there, it would be worth your while to get in contact, or fill out whatever survey materials SQLBits sends out, and tell them that you would like to see me speak. I'm not expecting an avalanche of this to happen, but if you care enough to ask and you care enough to get a response, well, that is one way to perhaps affect the steering of the curation committee in the future, so that you can get something beyond the side-one-track-one list of speakers that most conferences pick up. So that's the end of the questions for this office hours episode. Thank you for submitting your questions. I hope you enjoyed the answers, I hope you learned something from them, and I do hope to keep this going.

So, you know, the more the merrier — send in what you've got. And I will see you in the next video, where we'll talk about, oh, I don't know, this and that. Got some more stored procedure stuff to talk about, and I actually have another batch of office hours questions that I'll get to there too. So thank you, and goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Comments



Video Summary

In this video, I dive into the world of commenting in SQL Server stored procedures, highlighting why good comments are crucial for maintaining code quality and understanding over time. I share my frustration with seeing large blocks of commented-out code at the top of modules, which often lack context or detail about what changes were made and when. To address this, I emphasize the importance of using block quotes instead of double dash comments, explaining how they allow for more readable and maintainable code. I also discuss the value of integrating ticketing systems into your comments to track issues, fixes, and enhancements, ensuring that future developers (or even past you) can easily follow the evolution of the code.

Full Transcript

Erik Darling here with Darling Data. Feeling fresh, fresh as a flower today. We're going to do the second video in our traipse into stored procedures, and in this video we're going to talk about the almighty comment. Because there is just not enough actual commenting in the code that I look at. Most of what looks like comments in the code I look at is actually just commented-out code. Which is somewhat strange, because there are no comments about why the code is commented out. There's just nothing. You're left to guess, left dangling in the breeze trying to figure out just what exactly happened to that code that made it no longer necessary. Vestigial code. So anyway, that'll be today's fun. Before we get into that, of course, let's talk about how you and I can get along better. If you would like to support the efforts of this channel to bring you such high quality SQL Server content, well, golly and gosh, there is a link right down in the video description where you can click and offer me — I believe the starter package is $4 a month for this. You can join the 55 or so other folks who have chosen to help me pay off my camera and this microphone and this Adidas shirt and a whole bunch of other stuff around me that you can't see, like lights, electricity, I don't know, stuff. If you don't have $4 a month, that's totally cool. I understand the demands and rigors of the modern world are quite difficult. I'm not going to lecture you about avocado toast and all that, but it is what it is. You can do all sorts of free things to support the glory of Darling Data: you can like, you can comment, you can subscribe. And if you would like to ask questions to feed into my infinite — what do you call it, material loop? I'm actually not even sure what they call it. It's a strange thing.
You can go to this link, which is also in the video description and you can ask me questions and I can answer them here, which is a pretty decent way of getting more information out of me.

If you need help with SQL Server — if your company has some extra money sitting around, it is the beginning of the year, and as far as I know there are all sorts of beginning-of-the-year budgets and whatnot — and if you need consulting help with SQL Server, well, once again, BeerGut Magazine has said that I am the best SQL Server consultant in the world outside of New Zealand. Fifteen years in a row. No, actually, I don't even know. Anyway: health checks, performance analysis, hands-on tuning, performance emergencies, developer training. Hoo boy, I do all sorts of fun stuff. And I look just this good doing it. If you would like some very high quality, very low cost SQL Server training content, you can get all 24-plus hours of mine for 75% off. It's about 150 USD, and that will last for life.

Your life. Your life. My life, I'm not sure — we'll see what happens there. Upcoming events: of course, SQL Saturday New York City is coming your way May the 10th. There is a pre-con as well on May the 9th. And God help you if you don't go to either one. We are no longer friends. I'll, I don't know, ban you from my channel or something. So with that stuff out of the way, let's talk a little bit about commenting code. Now, let's see how many clicks it will take for ZoomIt to zoom. That was three clicks before ZoomIt responded and zoomed. Possibly the least helpful but most humorous way of leaving comments is the big block of green text up at the top of a module. It doesn't matter if it's a stored procedure, a function, a view, whatever else.

Just a big hunk of green text. People just writing stuff away: initials, a little comment. But without anything interspersed within the code to point out where these things changed, what happened, and maybe even what things used to look like. And no linking artifact to more information. Now, I do realize that a lot of SQL code was written before the days of really good bug-tracking systems. There weren't always Jiras and Bitbuckets and whatever else people use for stuff like this. I'm still not sure what a Trello board is, but I still hear people talking about them.

So I understand that not everything is linkable back to the beginning of when these things were created. But for new issues — whether they're performance, logic, bugs, maintenance fixes, or additions to code — it's really helpful to have some record. You don't have to put the stored procedure's life story in the stored procedure, but it is helpful to have some background somewhere else that's a little bit more verbose.

Right. It's good to note that something changed, and to have some way to figure out more about that change. Because a lot of the stuff that I see is someone just writing things like: fixed performance, avoided parameter sniffing, added distinct.

That doesn't tell you which query, because as far as you know, every query has DISTINCT in it. So you're like, okay, well, you added 50 DISTINCTs. I don't know. So if you have a ticketing system — only took ZoomIt four clicks to do anything that time — and there's an issue number, you may want to add that to your code comments, because that makes life a lot easier.
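As a sketch of what a ticket-linked comment might look like — the ticket ID, date, initials, table names, and the parameter are all invented, and this is a fragment as it might appear inside a procedure:

```sql
/*
2024-11-02 | ED | DEV-4187
Rewrote the reporting query below to avoid parameter sniffing;
see the ticket for the old query text and before/after plans.
*/
SELECT
    u.DisplayName,
    c.Score
FROM dbo.Users AS u
JOIN dbo.Comments AS c
  ON c.UserId = u.Id
WHERE u.Reputation > @reputation
OPTION (RECOMPILE); /*DEV-4187: stabilize plan for skewed @reputation values*/
```

The ticket number appears both in the change-log block and on the line it actually changed, so a future reader can find the full story without guessing.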

If there isn't, then you need to figure out some other way to make sure the changes or additions or fixes you make to your code are trackable in some way. Not everyone is going to want to dig through GitHub or git blame or whatever every single time they want to track down what exactly happened when you made a change.

So we're going to talk through a little bit of that stuff — how you can be nicer with those things. But one thing that I'm going to say straight off the bat here is: please use block comments.

Do not use double-dash comments. Double-dash comments are terribly confusing, arduous, and no one likes them. Let's look at a couple of reasons why, right?

So the reason I want you to always use block comments comes down to where code can come from. There are places you can copy and paste code from, or someone might send you code in some way that mangles the line breaks. And the problem with double-dash comments is that once everything lands on one line, the double dash comments out everything after it.

Right? When you use block comments, you can put a comment inline like this, but still have the query be runnable. Of course, this particular query is not runnable.

It's just SELECT *. "Must specify a table to select from," says SQL Server. Thank you. We sure must. We sure, sure must.
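Here's a contrived illustration of the difference — imagine each of these statements getting collapsed onto a single line by a copy/paste or a tool like sp_helptext (the table and column names are hypothetical):

```sql
/*Survives being collapsed onto one line: the block comment is self-terminating.*/
SELECT u.Id /*only need the key here*/ FROM dbo.Users AS u WHERE u.Reputation > 1000;

/*Does not survive: once the line breaks are lost, the double dash
  comments out the rest of the statement, and the query breaks.*/
SELECT u.Id --only need the key here
FROM dbo.Users AS u
WHERE u.Reputation > 1000;
```

Both run fine as written; only the first one still runs after its line breaks disappear.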

But with this commenting system, we get exactly what we want. The other thing I run into a lot is the unformatted-code problem. One of my biggest pet peeves is sp_helptext, because it's really easy and convenient to hit sp_helptext and get some code printed out for you.

But sp_helptext does not do the best with new-line printing and line-wrap stuff. There's a definite limit to how much sp_helptext can print on a single line, and you always end up with one of those.

And I think it handles tabs and spaces really poorly — but we're going to talk about tabs and spaces in the next one — where you end up with partial lines of stuff, and you go to hit F5 to recreate the procedure or something, and there's just this wall of errors because things ended up on the wrong lines.

But the big thing for me with the commenting is, let's say there was a comment like this, right? Where you have part of your query up here — and we could pretend this all came in on one line, and I just moved stuff down so you can actually see it on the screen. But let's say there's a comment like this.

All it says is "removed". All right. And you don't know if the "removed" applies to this, or to this and this. All right.

So you're left with this logic puzzle: which query is actually correct? Whereas if you use block comments and you say something like this, you still see that this part of the WHERE clause is valid. Right.

So another reason comment style makes a big difference is that you can make it very clear which portion of the code the comment applies to. It's not always apparent if you just put it on the line above or below, because unless you're very, very descriptive, no one's going to know what exactly the comment applies to. Now, the comments that I see quite a bit — and I do have to thank an LLM here, because it made it very easy to ask something else to generate crappy comments, so I didn't have to sit there making up updates and initials and typing a bunch of useless stuff.

This is, again, one of the humorous uses of LLMs, which are about the only good use of LLMs: making pictures, having them do dumb tasks like this. LLMs are still terribly unserious for anything serious you have to do. Unserious things.

Beautiful. Keep it going, fellas. Unserious things. So this is the kind of stuff that I always see, right? Bug fixes, performance improvements, made it better, minor tweaks, added this, changed some things. These are the kind of comments I generally see in stored procedures that I work on, and I'm like, well, guess what?

Guess what? I have no clue what any of this means. Where did this happen?

And then you go through the code and there are some crappy comments like this, and it's like, ah, well, thanks. But okay. Coming back up here: did any of these fixes apply to this query, or this query, or this query?

Where did they apply? We don't know. We don't know where anything happened in here. We have absolutely no way of ascertaining where any of the stuff that changed up at the top applied down below.

And it just becomes very confusing. There's nothing readable about that; it's not helping anyone figure anything out.

The way you generally want to do this stuff, if you have a ticketing system, is to put items from that ticketing system in the comment, and to have more information available in the ticketing system, right?

Again, more verbose information there, so that if anyone's looking through this and really needs to figure something out, they can go read that.

And you can put the old code in there, and only keep the new code in the procedure, so that you don't have big blocks of commented-out code everywhere. So this is a much better way of doing things. And when you get down into the details, you can even put things in here like this.

Like all the changes that apply to a specific section, right? This stuff — query one, we did these things to it — makes a lot of sense.
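A sketch of that section-level style — again, the ticket IDs, dates, index name, and query are invented for illustration:

```sql
/*
Query 1: total score per user
DEV-5102 | 2025-01-14 | added covering index IX_Comments_UserId_Score (definition in ticket)
DEV-5147 | 2025-01-21 | fixed score calculation to exclude deleted comments
*/
SELECT
    c.UserId,
    total_score = SUM(c.Score)
FROM dbo.Comments AS c
WHERE c.IsDeleted = 0 /*DEV-5147*/
GROUP BY c.UserId;
```

Each fix sits directly above the query it changed, with the ticket carrying the verbose before/after detail.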

We added a covering index. Yay. We fixed a calculation. We "optimized join order". Okay — well, this is where LLMs do get a little cracked, right? Because anyone who says stuff like that is full of malarkey, right?

The optimizer is going to arm-wrestle you on join order no matter what. So LLMs — again, still terribly unserious beings in the world, but funny anyway, right? We can all get a good chuckle out of that.

But stuff like this is much, much more helpful, because again, people can plug these things into whatever ticketing system you use. They can read up about exactly what happened and when — assuming that whoever made the change wasn't just doing something randomly at four in the morning without saying anything about it. Just, "damn it, I'm fixing this and going back to bed."

So this is a much, much better way of doing stuff, right? You get the best of all worlds: you see exactly which fixes apply to which query, and then when you go to the ticket item, you can see more detail about what those fixes entailed.

So that's a much, much better way of doing stuff. If your code commenting looks like the previous example — just some dash-dash stuff with a little bit of typing mixed in, again: fixed bugs, improved performance, parameter sniffing, whatever it is — this is a much better way of doing things.

It's going to help not only you understand the code better when you come back to it months or weeks or maybe even years later; it'll help anyone else who comes along and needs to work on this crap later deal with the gigantic, incredibly complicated code base you've generated over the years for yourselves. So these are the things that I look for in good commenting, right?

Again, block comments: very important. Double-dash comments are awful. Block comments are way better.

Some kind of way to figure out which fixes from the big block of change-log stuff up at the top apply to which queries, or which parts of the stored procedure or view or function or whatever it is. And then some way to get more information — more background on exactly what the problem was and what the change was. And maybe even the previous state of the query before your changes went in, so that if someone's like, "oh no, I think it should be like this"...

...they might go look at that previous version of the query and say, oh, it used to be like that, and that's what didn't work, right? So there's all sorts of good stuff you can convey with the more-information ticketing-system stuff that you probably don't want to just stick in the stored procedure, because you don't want 4,000-line stored procedures that are 3,000 lines of comments and 1,000 lines of code.

That's not a very good situation either. So anyway, that's my take on it. I'm not a DevOps guy.

I'm not a CI/CD guy. I am not a professional developer in that sense. But I do see pretty clearly when people have embraced good commenting and ticketing systems and when they have not, because when they've done a good job, my job gets easier.

When they’ve done a bad job, my job gets harder. So anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that you will comment responsibly in the future. And I will see you in the next video, where we're going to talk about code formatting. And some of you will become best friends, and some of you will become enemies.

Lifelong enemies. And that’s fine with me. We can’t, we can’t, we can’t all be friends forever. It just doesn’t work that way.

And code formatting is the easiest way to gain and lose friends. Right? Build allegiances and make sworn enemies. So that’s what we’re going to do.

Anyway, thank you for watching.


All About SQL Server Stored Procedures: ANSI Settings



Video Summary

In this video, I dive into the nuances of ANSI settings within SQL Server stored procedures, highlighting why these settings are crucial for maintaining database integrity and performance. I discuss how setting up these options correctly can prevent errors when working with index views, computed columns, and filtered indexes, ensuring that your queries match the correct expressions in the optimizer’s plan cache. Additionally, I explore other important settings like implicit transactions, row count suppression, and exact abort, providing practical advice on when to use them and why they are beneficial for analysis procedures.

Full Transcript

Erik Darling here with Darling Data. I was talking to my friend Batts about what we should do next. Aside from Batts' fantastic idea to take some office-hours-style questions, Batts also said: you know what, Erik, you spend a lot of time fixing stored procedures that have all sorts of problems. Not just performance problems — structural problems, integrity problems. The things that people do within stored procedures are completely unhinged. I find developers untrustworthy to the point where I sometimes just wish they had used an ORM instead. It's that bad. People are not kind to their databases. It's amazing — amazing, some of the things you see when you get into the world of consulting. If you are the type of DBA or developer who has only ever had full-time employee positions at relatively respectable organizations, you might see some of the things I talk about and think: that's the most unrealistic, preposterous thing I have ever heard. No one would actually do that. Lo and behold, every time I think it can't get worse, it gets worse. So in this video, we're going to cover the very beginning of the stored procedure: the ANSI settings. ANSI settings, of course, have a couple of interesting outcomes in SQL Server. A lot of the ANSI settings are funny to me, because even though they still have these impacts, many of them don't actually change anything about the queries. Some of them, like ARITHABORT or NUMERIC_ROUNDABORT, don't even really do anything anymore. You just end up with different plan cache entries, and you get errors if you do stuff with indexed views, computed columns, or filtered indexes. You try to make modifications to a table with the wrong ANSI settings in effect: you get errors.

If you have the wrong ANSI settings and you query a table that participates in the indexed view, computed column, filtered index thing, you don't get matching for them. And you get different plan cache entries for different ANSI settings, which makes reproducing performance problems a lot harder. But some of them just don't do anything, and some of them only do stuff if you have the ANSI_WARNINGS setting set to something specific.

So a lot of this stuff is just stupid and there for backwards compatibility, but this is the kind of thing that builds up when your code base is as wretchedly old and decrepit as SQL Server's. Not that I don't love it for all its intricacies. But anyway, this is where we're at. Now Batts is saying it's time to make some money.

That's not what Batts actually sounds like. So if you like this content, if you like this channel — or perhaps you just hit mute and stare at me while I wander around on the screen a little bit — and you want to support my work here to bring you this extra-high-quality SQL Server content, you can sign up for a membership. Something that needs to be said is that the video description contains helpful links for everything I talk about.

Whether it's signing up for a membership or asking me questions during office hours. Of course, liking, commenting, and subscribing are all separate button functions in the YouTube interface, which you are free to click. Just make sure you only click the subscribe button once, because if you hit the subscribe button more than once, I think you unsubscribe. We don't want that.

We want more subscribers. The channel recently passed the 6,000 subscriber mark, and also fairly recently passed the 50-member mark. So thank you to all — I think it's 53 or so people — for your continued support of four-plus dollars a month to my antics. If you would like to pay me more money because you need more help with SQL Server, you can do that.

I am a consultant. That is how I make the majority of my income. I do not make it from YouTube, thankfully.

Otherwise, we would be in quite a destitute situation here at Darling Data. But if you need any of this stuff done to your SQL Server, health checks, performance analysis, hands-on tuning, performance emergencies, or developer training in order to prevent further SQL Server performance emergencies, I am the best consultant in the world outside of New Zealand for all of these matters.

Do not get suckered into paying anyone else to do this who will make things worse, probably. If you want some very high quality, very low cost SQL Server training content, boy do I have a lot of streaming stuff. You can get 75% off of it.

That brings the total cost down to just about 150 US dollars. And again, helpfully assembled links are in the video description. I suggest if you haven’t yet clicked on that, you click on it now.

If you’ve already clicked on it, send it to a friend. You might have friends who need this help too. And by gosh, the more of us who get better at this, the better off we all are. Upcoming events.

Saturday, May 10th, 2025. SQL Saturday, New York City is back in action. It’ll be taking place at the Microsoft offices in Times Square. I think there’s a Dave and Buster’s nearby, so if you get bored, you can just duck out.

Or you can go make friends at Port Authority, bus station, whatever. Whatever floats your boat. There’s vibrant communities in both places.

Also recently announced, I haven’t added the information here yet, but I will. There is a performance tuning pre-con on May the 9th by Andreas Volter, I think. I think that’s how you pronounce it.

I’m just going to say it like that because it’s pleasing to me to say it like that. But tickets are on sale for that as well. I’ll get the necessary background information added in here for the next video. So I messed up and didn’t do it for this video.

So sorry about that, but the information is now out there in the world. But with that out of the way, let's talk about stored procedure stuff. Now, this is the proposed list of topics that I'm going to be covering during these videos.

I’m not going to read them all because, by God, we have things to talk about. But this is the stuff that I’m going to be covering. Most of it is already written.

Some of it’s going to need some additional tinkering and tweaking. But what the hell? I don’t… It’s not like I have to do… It’s not like I do this live. Because there are a lot of times when I’m recording stuff and something happens.

And I don’t know, I have to stop and curse at things for a while or, I don’t know, weird noises sometimes. Sometimes there are ghosts. But this is all the stuff that I’m going to be talking about in there.

Some of these topics individually I have talked about in other videos. But it’s sometimes helpful to have them as part of a larger series of things. So that people can have everything all kind of grouped together in a helpful thematic playlist.

If you’re going to make a playlist, it should have a theme to it, right? You don’t want random jumble of stuff in there. Not a good way for a playlist to be handled.

So we can get rid of this thing now, and let's talk a little bit about the ANSI SET options. Now, like I said before, a lot of these things don't really do much of anything, right? But there's a specific way you want to set up your ANSI SET options for a stored procedure, to make sure all the correct options are in place when that procedure runs.

ORM connection strings often are not set the way they should be to be compatible with indexed views, computed columns, and filtered indexes. I've mentioned that a few times.

But there are a few settings that are stored with the procedure definition, and then there are a bunch that are not. So when I'm setting ANSI settings for a procedure, I need to make sure all of these things are set this way, so that I don't get errors when I modify tables that have indexed views, computed columns, or filtered indexes.

And so that, when I query tables that have any of those things, the optimizer is able to match the expressions in my queries to those different facilities.
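The required SET options for indexed views, computed columns, and filtered indexes are documented by Microsoft; a typical module header looks something like this (the procedure name is a placeholder):

```sql
/*These SET options must be in effect for the optimizer to match
  indexed views, computed columns, and filtered indexes,
  and for modifications against tables that use them to succeed.*/
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF; /*the only one that must be OFF*/
GO

CREATE OR ALTER PROCEDURE dbo.example_procedure /*hypothetical name*/
AS
BEGIN
    SET NOCOUNT ON;
    /*...procedure body...*/
END;
GO
```

Note that only ANSI_NULLS and QUOTED_IDENTIFIER are persisted with the module; the rest are runtime settings, which is why connection strings matter.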

There are a few other things I like to set up in stored procedures as well. Setting IMPLICIT_TRANSACTIONS to OFF is a little bit of a weird one, but it comes in handy if you are calling stored procedures from the JDBC driver or from pyODBC, the Python driver, connecting to SQL Server, because both of those use implicit transactions by default.

And you often don't want that to happen. Now, granted, a lot of the stored procedures I write from scratch are my analysis-type stored procedures where I go in and run stuff, so I don't often have to worry about this, but I do try to set them up to be good examples.

Within my stored procedures — purely for me, because there are times when I need to performance-troubleshoot or debug or work with my stored procedures in different ways — I usually put these in there.

You may not care about either of these for your situation; that's totally fine. They're worth talking about because I tend to use them. And then within a stored procedure, I usually set both of these — NOCOUNT and XACT_ABORT — to ON in one go, because I don't really want row counts coming back from every query that runs.

And I want to make sure that if one of my queries hits some sort of issue and fails, all of the remaining queries stop happening. Now, XACT_ABORT ON is actually something I don't use in my analysis procedures, because it's not critical whether they run through to the end if there's an error or not.
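As a sketch of what those two look like at the top of a procedure body — the procedure name is made up:

```sql
CREATE OR ALTER PROCEDURE dbo.important_production_procedure /*hypothetical name*/
AS
BEGIN
    SET NOCOUNT ON;    /*suppress "(n rows affected)" messages from every statement*/
    SET XACT_ABORT ON; /*on most runtime errors, abort the batch and roll back the transaction*/

    BEGIN TRANSACTION;
        /*...modifications you want to be all-or-nothing...*/
    COMMIT TRANSACTION;
END;
GO
```

For production code that modifies data, XACT_ABORT ON is the safer default; for read-only analysis procedures, leaving it off lets later result sets come back even when an earlier query fails.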

There are often transient errors that might happen at one point or another, and I don't want to lose every result set just because I hit an error during one point in processing. So for me, here's a silly example.

In my sp_HumanEventsBlockViewer procedure, sometimes a query will fail trying to look up Agent job information or something like that, but the rest of the results are fine.

And the agent job information isn’t like super critical to me getting like, like to me figuring other stuff out. So I’ll just like, like, I don’t care if that fails.

I still want to get all the rest of the stuff back. So exact abort, you know, if you’re writing important production quality code, you most likely going to want that one on. But if you’re just writing some, like, you know, if it’s just a single select query or something, you probably don’t need that.

So before we go any further, let's talk about the three settings that are stored — persisted — with the procedure in SQL Server. Since I already have all this stuff set the way I want it, when you look at the sys.sql_modules catalog view in SQL Server, you'll see uses_ansi_nulls, uses_quoted_identifier, and is_recompiled.

Now, is_recompiled is only set when you have a stored procedure-level recompile up here. If you just have OPTION (RECOMPILE) on a query, like down here, it's not going to say that the stored procedure was made WITH RECOMPILE.

That's only when you create it WITH RECOMPILE at the CREATE OR ALTER level. Just adding OPTION (RECOMPILE), even to one query in there, will not get the stored procedure marked that way.

Because you could, you know, take that out, comment it out, I don't know. I mean, granted, you could comment this out too, but then it would just disappear from down here.

So these settings are stored with the procedure definition, so these are the ones you tend to worry about a bit less. But then, within the stored procedure —

so if you, say, just set these for the connection and then create or alter your procedure, those get stored in sys.sql_modules. SQL Server says: this was created with these settings.
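You can check those persisted settings for yourself with a query like this; the procedure name is a placeholder:

```sql
/* The three creation-time settings persisted with a module.
   dbo.example_procedure is a placeholder name. */
SELECT
    procedure_name = OBJECT_NAME(sm.object_id),
    sm.uses_ansi_nulls,
    sm.uses_quoted_identifier,
    sm.is_recompiled
FROM sys.sql_modules AS sm
WHERE sm.object_id = OBJECT_ID(N'dbo.example_procedure');
```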

Then there are the runtime settings that we care about for making sure that indexed views, filtered indexes, and computed columns get used correctly, and so that we don't hit errors

if we try to modify tables that are attached to any of those things. So ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL — all of those have to be ON, and NUMERIC_ROUNDABORT has to be set to OFF.
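Spelled out in full, the documented session settings required for working with indexed views, filtered indexes, and indexes on computed columns look like this:

```sql
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
```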

And then, like I said, the JDBC driver and the Python driver, pyodbc, both use implicit transactions by default. So if you're using either of those drivers and you're calling stored procedures, you probably want to set that OFF in the stored procedure.

As for the next ones — well, honestly, these are just for me and my analysis procedures.

There's really no reason for you to put these in your stored procedures, especially because at some point you might turn on query plans and want to get the actual execution plans for your procedure, which is completely fine.

A normal thing to have to do. For me, the reason I put these in my analysis procedures is that sometimes I'll leave actual execution plans turned on by accident and go to run my procedure, and I'll be like: why is this taking so long?

And, you know, that's not a good time. So there's a good reason for that for me; there may not be a good reason for these for you. And then of course, SET NOCOUNT ON is a fun one, right?

We do it because we don't want row counts from every single thing that runs. And XACT_ABORT is a good one to turn on. Say you have a stored procedure that runs, and it does an insert, and then an update, and then a delete. And let's say the insert and the update run fine, but the delete doesn't.

So the delete hits an error, and you don't want the insert and the update to have completed and committed, leaving that data in there. You want those to roll back too, because something that you should have deleted will still be lingering around.

So you want to make sure that all three of those roll back, so that you don't have dangling data sitting around in your table. So, that's all that.
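A minimal sketch of that insert/update/delete scenario, with hypothetical table names, showing why XACT_ABORT ON matters:

```sql
/* With XACT_ABORT ON, a runtime error in any statement dooms the
   whole transaction, so the insert and update roll back if the
   delete fails. Table names are hypothetical. */
SET XACT_ABORT ON;

BEGIN TRANSACTION;

    INSERT dbo.some_table (id) VALUES (1);

    UPDATE dbo.some_table SET id = 2 WHERE id = 1;

    /* if this errors, everything above rolls back too */
    DELETE dbo.some_other_table WHERE id = 2;

COMMIT TRANSACTION;
```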

Now, when I was writing this, I came across kind of a funny thing with the ANSI_PADDING setting that caught me off guard and was not in the documentation either —

I didn't see it documented anywhere. And that is that ANSI_PADDING behaves differently with table variables and temp tables: table variables do not respect your ANSI_PADDING setting at all.

So what I've done is I've actually stolen some code from the documentation, because I was sanity checking myself on all this and thinking: this is too weird.

This can't actually be this way. So, just to compare what happens with temp tables and with table variables, with the ANSI_PADDING setting changed to ON or OFF —

I'm just gonna run this whole thing at once. I don't think there are any errors involved, which is nice. And so this is what happens. Now, with the table variable, you can see up here that no matter what I do, the results — oops — the results are precisely the same between these, right?

And this is testing with ANSI_PADDING ON and OFF for the table variable. ANSI_PADDING behaves in the exact same way for both of these. If you look down here at the temp table one, you'll notice that with ANSI_PADDING OFF, things do look different, right?

Just to compare the differences down here: with ANSI_PADDING ON, I get trailing blanks when I should,

and it pads the zeros in, rather than leaving them out, for the varbinary column. But when I set ANSI_PADDING OFF for the temp table, look what happens.

Much different, right? This row here looks a lot different from this row here. So ANSI_PADDING does have an effect on temp tables.

ANSI_PADDING does not have an effect on table variables. Now, this probably doesn't make that much of a difference in the world, because ANSI_PADDING is a fairly rare thing to see people messing with, but I thought it was a little bit of interesting SQL Server trivia.
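If you want to reproduce the difference yourself, a stripped-down sketch (not the documentation's code) looks something like this:

```sql
/* Minimal repro sketch: table variables appear to ignore the session
   ANSI_PADDING setting, while temp tables respect the setting that
   was in effect when they were created. */
SET ANSI_PADDING OFF;

CREATE TABLE #temp_table (c varchar(10));
DECLARE @table_variable table (c varchar(10));

INSERT #temp_table     VALUES ('a   '); /* value with trailing blanks */
INSERT @table_variable VALUES ('a   ');

SELECT source = 'temp table',     stored_length = DATALENGTH(c) FROM #temp_table
UNION ALL
SELECT source = 'table variable', stored_length = DATALENGTH(c) FROM @table_variable;

/* per the demo in the video: the temp table trims the trailing blanks
   under ANSI_PADDING OFF; the table variable keeps them. */
```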

I don't have a good reason why this is the case — why table variables don't respect ANSI_PADDING when temp tables do. The closest I got to an explanation was that table variables were added in SQL Server 2000 and did not come with backwards compatibility for the ANSI_PADDING setting, which was already deprecated by then.

So that's probably why, but that's as far as I got with it. Anyway, I think that brings us to about as much as we can talk about for the ANSI settings in SQL Server. We're going to talk about — I don't know — whatever the next topic was, next.

I think it was commenting or code formatting, one of those two, but we're going to talk about that next. So, I hope you enjoyed yourselves. I hope you learned something. I hope that you will be careful and judicious with your ANSI settings when you're creating and executing stored procedures. And, yeah, I don't know.

I think that's about the end of that. Cool. Yeah. All right. Well, we're going to close this one out. We're going to say goodbye. Actually, you know what?

I realized I forgot to tell you the reason why I set these at runtime here. There's probably an important factor with this, and it comes back to what executes a stored procedure.

If it's an agent job, or if it's an ORM that doesn't use the correct connection string settings, you can correct them within the stored procedure and override whatever the connection's context was from the ORM.

So that's why I do that there. In case I didn't say that before, I'm saying it now. Anyway. All right, cool. I'm going to leave now. That's enough. All right.

Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Some Thoughts On The Future Of Query Optimization

Some Thoughts On The Future Of Query Optimization


Video Summary

In this video, I delve into my thoughts on query optimizers, specifically focusing on those found in database engines such as SQL Server and Oracle. I start by clarifying that this discussion is not about the human query tuners who optimize queries or my personal approach to optimization; instead, it centers around the software within a database engine that generates executable query plans based on cost models. I highlight how these optimizers make assumptions, often assuming a cold buffer cache and using generalized cost models that may not fully account for modern hardware capabilities or specific system configurations. Throughout the video, I explore various feedback mechanisms like memory grant feedback and adaptive joins, discussing their limitations and potential improvements.

Full Transcript

Erik Darling here with Darling Data. And in today's video, we're going to talk about some of my thoughts on query optimizers. Now, I don't want you to get this confused, so I'm going to make this very clear at the outset. I am not talking about the people who optimize queries — query tuners, people like me and maybe like you, if that's your thing too, who rewrite queries, fix indexes, change things around so that they magically run faster. Nor is this my thoughts on query optimization generally. This is not going to be about my approach to optimizing queries, or what I look for, or what I do. This is more about the piece of software inside of a database engine, like SQL Server, that does the costing and the modeling and gives you an executable query plan — and sort of how those can be improved, the future of those things. Because, I don't know, I think what I think is interesting. So I'm going to share what I think with you. If you like this sort of content, and you want to support me recording these videos — I've got to pay off this camera somehow — you can sign up for a membership via the link in the video description below for as few as $4 a month. If you also have a camera to pay off, and you don't have four extra bucks a month, you can like, you can comment, you can subscribe, and gosh, seeing those numbers go up — well, it doesn't help me pay off a camera, but it does do my sad heart some good. If you want to ask me questions about SQL Server performance, you can go to this link, which is also in the video description below, and you can insert as many questions as you'd like. Just please try to stay on topic. It's all I require of you.
Now, if you need help with SQL Server from a consulting point of view — well, gosh darn it, I am still the reigning champion of SQL Server consultants. If you need help with SQL Server: health checks, performance analysis, hands-on tuning, query optimization. If you're having a performance emergency, or if your developers just need some training because you're having too many performance emergencies — well, guess what I do at a reasonable rate? That stuff. You're in luck.

If you would like some high-quality, low-cost SQL Server training content — again, all about query optimization — you can get all of mine for about 150 USD. Just go to that link and enter that coupon code, and all will be applied. There is a fully formed link for you down in the video description as well, just in case. SQL Saturday — that's what it's called — New York City is taking place this year, 2025, on May the 10th. I highly encourage you to attend. It's going to be good. All you have to do is drag your behind to Times Square on Saturday, May 10th. It'll be good for all of us. With that out of the way, let's talk a little bit about this stuff. Now, query optimization. You have a query optimizer. I'm not talking about a query planner like Postgres has. I'm talking about a query optimizer like SQL Server has, like Oracle has, like many other good database engines have.

They make a lot of assumptions about stuff. If you've spent any time thinking about or reading about query optimizers, one of the first things SQL Server's assumes is that you're starting with a cold buffer cache. No data is in memory; any read you do is going to be a read from disk. And it has all sorts of costing things built into that. Some are fixed costs: what a random I/O costs, what a scan I/O costs, how much it costs to compute a scalar. A lot of things have fixed costs. Other things have sort of dynamic costs based on how many rows there are, the size of those rows, stuff like that. Right.

And the thing that you run into with those sorts of cost models is that they are very, very highly generalized. SQL Server's is no exception. SQL Server's query optimizer does not know anything, really, about your system. There are some things that SQL Server itself is aware of: when it starts up, it looks at your CPUs to figure out what kind of SIMD support you have; it knows how much memory is in there so that it can work out memory grants, stuff like that. But as far as what kind of storage you have, SQL Server has no idea. SQL Server doesn't interrogate storage to figure out what's underlying there. So I/O costs are very, very highly generalized.

I/O costs are also ancient, right? SQL Server still costs I/O like you're on a spinning disk that has a huge penalty for random I/O. That's why things like key lookups are costed very highly — because of the random I/O. Now, Microsoft has been getting better about some aspects of the query optimization model by providing feedback to the model after a query executes, or while it executes.

So there are three stages, right? There's stuff that happens before a query executes. You can think of that as stuff like table variable deferred compilation. Let's just say you're in a stored procedure. You declare a table variable — you're not using a table-valued parameter; it's a table variable — and you insert some rows into it. SQL Server will stop query optimization at that point, ask how many rows are in this table variable, and come up with a query plan based on table cardinality.
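Sketched out, the scenario being described looks like this (table variable deferred compilation, available from compatibility level 150, is what makes the row count visible at compile time):

```sql
/* Under table variable deferred compilation (compat level 150+),
   the final SELECT is compiled after the insert runs, so the optimizer
   knows the real row count -- but still has no column-level statistics. */
DECLARE @ids table (id integer NOT NULL);

INSERT @ids (id)
SELECT TOP (100000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns AS ac
CROSS JOIN sys.all_columns AS ac2;

SELECT COUNT_BIG(*) FROM @ids AS i; /* plan choice can reflect ~100,000 rows */
```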

Again, you still don't get column-level histogram information about what's in the table variable, but you do know how many rows are in there. Perchance SQL Server might choose a different query plan if there are 10 rows in your table variable versus a million rows, right? That's the hope, anyway. Then there are some runtime feedback mechanisms, like adaptive joins, where SQL Server says: well, I'm going to do this loop join,

and if I get enough rows out of this loop join, I'm going to switch to a hash join, right? So that's during execution — intra-execution. And then there's post-execution stuff. Memory grant feedback is the one that I think has been around probably the longest, but there's cardinality estimation feedback in the product now too, where after a query executes, SQL Server says: wait a minute,

I was way off on this thing. I should probably figure something else out. Other examples — like parameter sensitive plan optimization — that's a pre-execution sort of mechanism to make things a little bit less ride-or-die for the optimizer.

And I think these are very cool things, because one thing that a lot of people sort of miss about models — and I don't mean sexy runway models, I mean modeling things — right? Probably the most common thing that people will argue about models with is the weather, climate, stuff like that.

One thing that a good model will take into account is feedback, right? Let's say you have a query optimizer and you assign all these cost models to it. If your cost models don't get any feedback on how well they did when a query ran, your model is just going to keep making the same mistakes over and over again.

That’s why, you know, for a long time, you know, SQL Server would choose a query plan. And if that query plan was good, you were real lucky. If that query plan was bad, you were real unlucky because SQL Server would keep using that query plan and you would keep getting the same bad plan.

And, you know, it would be up to you to figure out some scheme of rewrites, indexing, breaking stuff up into temp tables, whatever, in order to improve that. So I really like that Microsoft is actually providing more and more feedback mechanisms in the product. But at the same time, a lot of those feedback mechanisms are very, very conservative in how they're applied.

So any sort of automation for that stuff — SQL Server has to collect a lot of metrics about how things are going generally: whether it's right about cardinality estimation, whether it's right about memory grants, all that stuff. And then it has to say, OK, well, most of the time I'm right about this.

Sometimes I'm wrong about this. Or: boy, I was just wrong right off the bat on this. But I think one thing that is going to have to start happening is rethinking alternative plan costing.

It's cool if you figure out that cardinality was wrong for something in a query, and that maybe costing and plan choices after that were incorrect. But just improving the cardinality estimate may not improve the overall plan that you get. You're going to have to start thinking about how that cardinality affects other plan choices,

and whether perhaps a completely different query plan would be more suitable. Especially zooming in on this part — and I'm going to say this right off the bat: if you look at the world of query optimizers out there, I think Microsoft probably has the best one across the board.

There may be specific things that other query optimizers do better, or they have different ways of handling things that are better. But just as a general cost model, I do think that SQL Server has the best one, regardless of how much time I spend complaining about it. But when you think about the quality of the assumptions you start with, Microsoft could probably do better with I/O costing.

And for that, the quality of the data you start with is mostly up to you. Right? Like how up-to-date your statistics are, whether they're full scan versus the default sampling, things like that. And then making sure that you're gathering the right runtime metrics about the query when it runs, so the engine can say: wait a minute,

these are all the places I was wrong. This is where stuff took a long time. I need to recompile this branch of the query plan, because this was completely bonkers. I shouldn't have done a loop join here at all.

I should have done something way different. And these are things that have been improving as server hardware improved. Right? If you think about when query optimization first started, computer hardware was very, very limited.

You just didn't have anything extra — there was no extra gas in the tank for stuff like this. Your hardware was dedicated to that workload. And, you know, I hate to say it, but the cloud does allow for a lot more flexibility with these things, because you have other hardware you could do this on.

Right — copies of the data, things like that. And you can start considering all sorts of alternative plan stuff, like early aggregations. Maybe I should have used batch mode on rowstore here.

I was wrong about this thing. Maybe I should have used a hash join instead of a merge join, things like that. There are lots of alternative costing scenarios that need to start getting out there.

There are some query optimizers, I've learned, that will come up with a set of candidate plans and then run all of them, and whichever one comes back first, that's just the winner. That's the one that gets used in the future.

So that's kind of a cool thing, but I don't expect Microsoft to do that, because who wants to run three copies of the same query on the same database? Say one finishes in 30 minutes and the other two are still running. Are you just going to let that one run for 30 minutes, kill the other two off, and then start thinking about the one that ran for 30 minutes — OK, well, what can I fix in there?

Now, what was I going to say? Oh, man, I lost it. I was on a roll there and then I completely lost it. I talked about this already: how all optimizer model assumptions are generalized to work across lots of different hardware.

Basically, that means if your server had four cores and 12 terabytes of memory, or if it had 128 cores and four gigs of memory, you would get the same execution plan either way, because SQL Server would have costed all that stuff equally, regardless of the hardware. There are also lots of things that are sort of out-of-model constructs. Probably the simplest example of that is when you pass in a blob — it could be XML, it could be JSON.

It doesn’t really matter which one. Like SQL Server has no idea what’s in that blob. Even when you parse that blob out, it’s not doing any statistical analysis of what actually came out of the blob.

You very much get fixed cardinality estimates and fixed costs, based just on SQL Server knowing that it has to do something with XML or JSON. It doesn't stop to look at what was in there. That's why a lot of stuff with string splitting, or runtime XML and JSON parsing — where you're joining to the result of it, or it's in a WHERE clause or something —

gives you terrible plans and cardinality estimates: because SQL Server is not pausing to do that analysis. Is it possible to do that? Yes.

You could have a rule where any XML that you parse, or JSON that you parse, or string that you split would do some — well, I mean, with XML and JSON, you are pretty explicit with the types. You can be wrong about the types, right?

You can parse an integer out of JSON as a varchar(max) if you're stupid, but SQL Server could put that into a temp table, look at what's actually in there, and then do some work with that. But it just doesn't do that yet, or may never do that. I don't know.
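The kind of workaround being hinted at — materializing the parsed JSON yourself — might look like this sketch, with invented data:

```sql
/* Sketch: dumping parsed JSON into a temp table gives later queries
   real statistics to work with, instead of fixed blob-parsing guesses.
   The JSON contents here are invented for illustration. */
DECLARE @json nvarchar(max) = N'[{"id":1},{"id":2},{"id":3}]';

SELECT j.id
INTO #parsed
FROM OPENJSON(@json)
WITH (id integer '$.id') AS j;

/* joins and filters against #parsed now get statistics-based estimates */
SELECT p.id
FROM #parsed AS p;
```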

There are also what I'd call impossible-to-model constructs. The XML and JSON stuff maybe falls into that, because you'd have to trust whoever parsed or pulled the stuff out of there to make the right choices, which is tough to do on a good day. But, you know, that would certainly fall into difficult — maybe not impossible.

I think truly impossible-to-model stuff is where there are no good statistics on things, and there will probably never be any good statistics on things. Table variables are probably the easiest example of that, which I've talked about a billion times on this channel. There are ways to get table-level cardinality — to know whether there are 1,000, 10,000, or 50,000 rows in a table variable.

But there's no good way to figure out what actually makes up those 10, 50, or 100,000 rows, because there are no statistics on them. So at some point, I think the optimizer is going to be less about improving the cost models. There are just going to be more rules that get applied, more optimization choices with feedback mechanisms, and the model is going to get more and more feedback about how it did when things ran.

Because that’s the only real way to improve things. Right. Like you can only model costs for so many scenarios.

At some point, your cost model needs to get some feedback and use that to adjust costs, so that when queries run, SQL Server has a much better idea of what to expect. Like: when I run this query, this choice always turns out bad.

I need to make a different choice here. Right. I need to do something different.

So, I don't know. That's about the end of that. But I think there's a lot of room for adding in mechanisms like this. There's probably a lot of room for adding in additional rules that you could apply to this sort of feedback.

And there are probably a lot of rules that Microsoft could include — or a lot of plan exploration spaces and heuristics that Microsoft could include — to do things a bit smarter. You know, my favorite example is when you join on an OR clause.

SQL Server should be able to unroll that into a UNION or UNION ALL of both sides of the OR clause, with the correct filtering out of things for one side. That's a really big thorn in a lot of people's sides when they start writing these queries. But anyway, I've got to get going.
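The manual version of that rewrite — the one people end up doing by hand today — looks roughly like this, with hypothetical tables and columns:

```sql
/* Hypothetical tables. The OR form often optimizes badly: */
SELECT t1.id
FROM dbo.t1
JOIN dbo.t2
  ON t2.a = t1.a
  OR t2.b = t1.b;

/* Hand-unrolled into UNION ALL, with the second branch filtered so
   rows the first branch already matched aren't duplicated. This
   sketch glosses over NULL handling, which a real rewrite must
   account for. */
SELECT t1.id
FROM dbo.t1
JOIN dbo.t2
  ON t2.a = t1.a

UNION ALL

SELECT t1.id
FROM dbo.t1
JOIN dbo.t2
  ON  t2.b = t1.b
  AND t2.a <> t1.a;
```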

I forget — I have something to do at 2:30. We'll figure out what it is in a minute, I guess. But thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. And I will see you again in the next video. Goodbye. Goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Catching Queries That Fire Triggers

A Little About Catching Queries That Fire Triggers


Video Summary

In this video, I share a simple technique for figuring out which queries are firing triggers in SQL Server. I walk through creating a logging table and a trigger that captures the query text as XML, allowing you to see exactly what statements are causing your triggers to execute. While this method isn’t suitable for legal or compliance purposes due to its batch-level capture of queries, it can be incredibly useful for troubleshooting and understanding which operations might be problematic in your database environment. If you’re interested in more SQL Server content, consider supporting my channel with a membership for just $4 a month, or engage with the community by leaving comments, subscribing, or submitting anonymous questions.

Full Transcript

Erik Darling here with Darling Data. Another beautiful, sunshiny day here. I can't wait for winter to be done. I grow weary of this weather. I don't like being cold. Probably the biggest impact it has on me is that I have to wear full-length sweatpants to the gym instead of my usual very comfortable 5-inch jogging shorts, which allow for a much more comfortable range of motion across the barbell lifts. But I guess the one thing that keeps me hanging on right now is that there's about two weeks until pitchers and catchers report to spring training. And that's… I just wish I didn't have to go to the gym. I just wish I could sleep until then. Anyway, enough about me. Let's talk a little bit about how to figure out which query fired that trigger. Now, the technique that I'm going to show you — and I'm going to say this multiple times — if you need to audit things for legal compliance reasons or anything else, I would absolutely not swear by this technique. But if you just need to get a sense of which queries are firing this trigger off, for some other troubleshooting purposes, this will probably get you where you need to go. Alright? So, for any lawyers out there, or for anyone out there who's going to have something smart to say about what I'm going to show you: make sure we temper this expectation. We set this level appropriately before I even show you a lick of code. Alright? So, just remember that.

If you like this channel, if you think, wow, that Erik Darling sure has useful tips about SQL Server and how to dress to do barbell lifts, you can sign up for a membership. And for as few as $4 a month, you can support my efforts to bring you this amazing SQL Server content. If, for some reason, I don’t know, maybe you spent all your money on something that you thought was cool, like, I don’t know, gym membership. You don’t have $4 a month for SQL Server training. I totally understand.

You can do all sorts of free things that help this channel grow and expand, like your muscles at the gym. You can like, you can comment, you can subscribe. If you are interested in asking me questions, either about stuff that I say on the channel, like, not just leave a comment on the video, but if you’re just interested in asking me questions generally, you can go to this link right here, and you can submit a completely anonymous question that I will answer publicly.

Keep in mind, this is not private consulting advice. This is stuff that I’m going to answer live. Well, not exactly live. I’m going to record answers to it, and then I’m going to put them on YouTube. I’ve recorded my first one, and you’ll see it soon. So, keep that in mind.

All right. If you need SQL Server consulting, I am the best SQL Server consultant in the world outside of New Zealand, especially when it comes to these things. And, of course, my rates are reasonable. Just like my shorts.

If you would like some very high-quality SQL Server training content, by golly and gosh, I’ve got that too. You can get all 24 hours of it for about 50 USD. Again, down in the video description, this is all assembled for your lazy behind.

It’s probably not going to the gym. You’re probably already completely neglecting all of your New Year’s resolutions, and I’m very disappointed in you. You should take your health seriously.

As far as upcoming events go, well, SQL Saturday, New York City. The one, the only, the yearly event is taking place on May the 10th of 2025 at the Microsoft offices in lovely, crime-free Times Square. So, I look forward to seeing you there if you happen to find yourself in the area.

If not, well, I don’t know, I guess we’ll have to meet again some other time. But with that out of the way, let’s talk about how we can accomplish this amazing feat. Now, I’ve got a couple of tables here.

One of them is just to have a table to put a trigger on. We need a table to put a trigger on. Well, I guess we could use a view, too, but then I’d have to create a table and a view, and, well, I don’t know.

For four bucks a month, you only get a table, okay? All right? That’s what you get. And then we have a table that we are going to log some stuff to.

Now, there are lots of other columns you could put in here. There are many things you could put into this table that would give you more contextual information about the user’s stuff that fired the trigger. But I’m just creating a simple three-column table here.

And in this table, I am putting, if, again, zoom it, zoom it, zoom it. It only takes me hitting control and one four times to get it to work these days. That’s a nice change of pace.

At least it’s an even number. That’s what the number four has going for it. But the stuff that I’m going to stick in there is pretty simple. I’m collecting the username.

I’m collecting the time that it happened. And I’m collecting the query text as XML because I like, this is me personally, I like being able to click on a query and have the whole query appear in front of me. I just like the simplicity of it, so I’m using XML for this.
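
A sketch of what such a logging table might look like, using hypothetical object and column names; the XML column is what makes the captured query text clickable in SSMS:

```sql
-- A sketch, hypothetical names: a simple three-column logging table
-- that captures who did it, when it happened, and the query text
-- stored as XML so it opens with a click in SSMS.
CREATE TABLE dbo.trigger_log
(
    user_name sysname NOT NULL
        DEFAULT SUSER_SNAME(),
    log_time datetime2 NOT NULL
        DEFAULT SYSDATETIME(),
    query_text xml NULL
);
```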

All right? And zoom in is going to keep messing with me today. Great. And then I have this trigger that I’m going to be using. And, you know, again, not production ready, not legal compliance ready level of stuff here.

But just to show you how this can work, this is what I’m using. Okay? So I’m going to catch inserts, updates, and deletes with the trigger.

IRL, as the kids call it, at work I would maybe not want to have inserts, updates, and deletes all caught by the same trigger unless I was doing something very, very simple. And in this case, I am.
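
The shape being described might be sketched like this, with hypothetical table and trigger names; one trigger registered for all three DML types:

```sql
-- A sketch, hypothetical names: a single trigger that fires after
-- inserts, updates, and deletes against one table.
CREATE OR ALTER TRIGGER dbo.log_dml
ON dbo.trigger_target
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- logging work goes here
END;
```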

All right? This is simple enough. Now, one thing that I think is critically and crucially important to start every single trigger off with is this little block. All right?

Before you do anything else: if ROWCOUNT_BIG() equals zero, if nothing actually changed, bail out. I like to have BEGIN and END in my IF blocks, even if I’m just doing the one thing after. You only have to type it once, and it makes very clear what one thing you are doing.

All right? Sure, it’s a little verbose. It’s a little extra typing. But the nice thing here is your intentions are perfectly clear.

All right? Even though it would work to just say IF ROWCOUNT_BIG() = 0 RETURN, well, I like to make sure. I like to aim for very clear code as much as possible.
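
The two styles being compared look like this; the terse one-liner works, but the BEGIN/END version makes the intent explicit:

```sql
-- The terse version: legal, and it works.
IF ROWCOUNT_BIG() = 0 RETURN;

-- The explicit version: same behavior, clearer intentions.
IF ROWCOUNT_BIG() = 0
BEGIN
    RETURN;
END;
```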

That may not show in all of the code that I’ve ever written in my life, but the older I get, the more I appreciate it. All that stuff. All that stuff.

And then, of course, we want to SET NOCOUNT ON, just in case. And then all I’m going to do in here is, well, what should have been a relatively simple insert. Should have been.

But it turned out I had to do a little bit of extra typing here. I’m not sure that this is living up to my recently professed standard of code clarity. But, you know, it got me where I was going.

Where I’m going to be inserting. So, like, you can’t put FOR XML PATH out here in the insert. You have to nest it a little bit in here.

So, I’m just taking the event_info column from this. Oh, boy. Where did you go? Ah! From this lovely DMV, sys.dm_exec_input_buffer.

And I’m feeding the current SPID and the current request ID into sys.dm_exec_input_buffer. So, I get a little bit more precision on the statement that I’m executing. And I’m going to put the text in there.
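
A sketch of that insert, against a hypothetical log table; note the FOR XML PATH sitting in a nested subquery rather than directly in the outer select list:

```sql
-- A sketch, hypothetical log table name: capture who, when, and the
-- current input buffer as XML. FOR XML PATH has to live in a nested
-- subquery; it can't be applied directly in the INSERT ... SELECT list.
INSERT dbo.trigger_log (user_name, log_time, query_text)
SELECT
    SUSER_SNAME(),
    SYSDATETIME(),
    (
        SELECT deib.event_info
        FROM sys.dm_exec_input_buffer(@@SPID, CURRENT_REQUEST_ID()) AS deib
        FOR XML PATH('')
    );
```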

And this should all be pretty well set up for us. You can just make double sure there. And then down in this section, I have a begin transaction fully spelled out. And a rollback transaction fully spelled out.

So, and then within those, within the confines of that transaction, I am doing an insert, an update, and a delete. You might notice that I am inserting a row. I am updating a row.

And then I am, what do you call it, rolling that back. All right. Now, that’s about it. You might be wondering why I’m inserting the value 3 there.

Well, it’s because I had already inserted two rows and I don’t want any primary key errors to show up and make me look like a fool in front of my 6,000 dedicated, devoted data darlings out there. And I’m just going to run this real quick. And so, it’s going to run an insert, an update, and a delete.
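
The demo batch might look roughly like this, with hypothetical table names and the value 3 chosen to dodge primary key collisions with the rows already inserted:

```sql
-- A sketch: one of each DML statement inside a transaction, then
-- rolled back. Be aware that the trigger's log insert runs inside
-- the same transaction, so rows logged for rolled-back statements
-- roll back with them.
BEGIN TRANSACTION;

INSERT dbo.trigger_target (id) VALUES (3);
UPDATE dbo.trigger_target SET id = 3 WHERE id = 3;
DELETE dbo.trigger_target WHERE id = 3;

ROLLBACK TRANSACTION;

-- then inspect what the trigger captured
SELECT tl.* FROM dbo.trigger_log AS tl;
```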

And that is going to select from the logging table. And, you know, why I said this is kind of quick and dirty and why I said that it would probably not be great for legal slash compliance reasons is because it does capture, like, a batch of text. Right?

Like, we have the whole batch in here. And if you look at the query, and part of the reason why I like the XML is because I can just click on this and we can see everything. So, like, it has everything in here. So, if you had multiple statements in here that were, like, inserting, updating, or deleting from a table, then you would have, like, the whole thing.

Right? So, like, it captures a block of text. But, like I said, if you just need to kind of figure out where some stuff is coming from, or, like, you know, like, if you have, like, an ORM doing this.

Like, with a stored procedure, you would get a whole lot of text back. But if you have, like, an ORM doing this, a simple CRUD app that’s sending one statement in and that’s the end of it, this is probably going to be good enough to get you what you need to figure out which statements are firing the trigger and maybe causing problems for you in some way. Or if you just want to figure out, like, what actually does something that this trigger cares about?

Well, this is a good way to do it. So, pretty simple, pretty simple logging mechanism there. Again, not great for legal or compliance standards, depending on how rigorous those standards are for you.

But I don’t know. But, yeah, you can totally use this to just, you know, quick and dirty kind of figure out which statements are causing your triggers to go, pow, pow, pow, pow, pow. I don’t know.

I don’t know. Do something that you care about because you shouldn’t log things if you don’t care about them. If you’re never going to look at them, if you’re never going to analyze them, if you’re never going to review the data, I would not recommend logging the data because it is an extra step. And under high concurrency, you would be inserting, you know, you would be doing some extra work in the trigger.

So, you know, just be judicious in all things that you do in your database. The funny thing about databases is there’s just as much to get right in them as there is to get wrong in them. It takes care to thread that needle sometimes.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that you will be judicious in all things that you do in your database. And I will see you in another video, another time, another place.

All right. Cool. Thank you. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 1

SQL Server Performance Office Hours Episode 1


Do you like me? Y/n
Does Erik prefer nested loops or fruit loops?
If Erik was an operator in an execution plan plus, what operator would he be?
Sometimes my sql server quits responding so I reboot it but then it takes forever to come back online. Is there an easier way to corrupt my data? PS I’m running on prem with azure disks so latency is a little high but usually performance is good. Also hi Erik, this was a test to see how many words I could fit into a form.
Can you fix my server performance for free? Here’s my execution plan <?\stmtsimple(@2$?@38,!@/&!;,$-“2(,&~¥’c/&1@,’]+_•
Can I edit my response?
Are you free later
What is the best Champaign to buy someone when you find out they don’t have to use Microsoft Fabric?
Hi Erik. Im having huge problems with a table that stores text files in a varchar(max) column. Can you suggest an alternative to that? Thanks
Where’s the beef?
What do you wish that more DBAs have read? Specific books, blogs, etc.

To ask your questions, head over here.

Video Summary

In this video, I dive into some interesting and varied questions submitted by viewers for our inaugural Office Hours episode. We tackle topics ranging from the nature of serial nested loops in execution plans to more complex issues like performance tuning and data storage strategies. Whether you’re curious about SQL Server’s behavior or looking for practical advice on improving your database management skills, there’s something here for everyone. I also share details on upcoming events like SQL Saturday New York City 2025 and offer insights into how you can support the channel through membership or other means. Join us as we explore these questions together and continue to grow our community of data enthusiasts!

Full Transcript

Erik Darling here with Darling Data. And we’re going to try something a little bit different today. I believe that I have fielded enough high-quality questions to do our first Office Hours episode. So we’re going to try this out. We’re going to see how it goes. So I hope you’re all prepared. It’s been a little while since I’ve done one of these. But here we go. If you like this channel’s content, if you like watching Intel time out looking for drivers, which is always a good time, you can become a member of the channel. You can join 50-some-odd other people who have said, Erik Darling, thanks for all you do. Here’s four bucks a month. And I appreciate every last one of you, individually and as a group. If you don’t have the four bucks a month, I mean, you know, Christmas did just pass, and a lot of people are still struggling with credit card debt. There are all sorts of free things you can do that help me grow this channel, like liking and commenting and subscribing. If you want your question to appear on a future episode of Office Hours, you can go to that link right there, which is also in the video description. It’s amazing how this works. And you can put your high-quality question about SQL Server performance in the question bar, and I will answer it right here. If you need help with your SQL Server, if maybe the answer you got was not quite enough to get you over the proverbial finish line, I am available to do all of these things with SQL Server: health checks, performance analysis, hands-on query and index tuning, dealing with your SQL Server performance emergencies, and training your developers so that you can avoid SQL Server performance emergencies in the future. If you would like to get people trained, if you would like to get yourself trained, you can get all of my performance tuning content for about 150 USD. Again, this whole fully assembled link is available in the video description.
For upcoming events, SQL Saturday New York City 2025 is still taking place on March 10th. May 10th. Good Lord, I can’t even read today. It’s gonna be interesting reading questions, isn’t it? May 10th, at the Times Square building.

So you should go to Manhattan. So you should go there and come learn some more stuff about data. It’s not just SQL Server, it’s data in general. So with that out of the way, let’s go answer some of these questions. Now, these are the questions that I have fielded so far from you, the public. And we’re gonna work our way through these. So the first question we have here is: do you like me? Yes or no? Well, I mean, the obvious answer is no, I don’t like you. I love you. Why would we stop at like? Why would we have such low ambition? So that one’s easy to answer.

As for whether I prefer nested loops or fruit loops: well, nobody loves a serial nested loop on its own. So as long as the loops are parallel, I’m happy with either one of them. They are equal in my eyes. If Erik was an operator in an execution plan, what operator would he be? Well, I think that’s obvious. It would be delete. Let’s see here. Sometimes my SQL Server quits responding, so I reboot it, but then it takes forever to come back online. Well, accelerated database recovery would be a pretty good option for you, assuming that you’re on SQL Server 2019 or up.
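
That suggestion is a one-time database setting; a sketch against a hypothetical database name:

```sql
-- A sketch, assuming a database named YourDatabase: accelerated
-- database recovery (SQL Server 2019+) keeps most undo work out of
-- crash recovery, so restarts spend far less time replaying the
-- transaction log.
ALTER DATABASE YourDatabase
SET ACCELERATED_DATABASE_RECOVERY = ON;
```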

Most of the reboot time of SQL Server comes from all that pesky transaction log stuff, and accelerated database recovery helps you avoid all that. Then there’s another question here at the end. Well, I mean, I guess that first one was more of a statement. Is there an easier way to corrupt my data?

Well, you know, you can always use NOLOCK hints. You can always use DBCC WRITEPAGE. There’s many easy ways to corrupt data. All depends on how far you’re willing to go. Here we have a really good question. Can you fix my SQL Server performance for free? Here’s my execution plan. Well, I believe your execution plan has become corrupted. I’m not sure what is going on there. It looks like it started off strong with a parameter called @2. So this looks to me like perhaps an issue with simple parameterization, or perhaps this is an ORM-generated query, because that is not the type of parameter name that I would expect from a human being right there. And then we have, oh dear, we go up to parameter @38. That’s a big one. 38 parameters. Perhaps you’re dealing with long IN clauses. I don’t know. Something is amok there. All right. Let’s move on a little bit. Let’s see. Can I edit my response? I guess we never got around to testing that. The actual answer is yes, you can edit your response. If you type in your question and then you’re like, wait, did I make a typo? You can edit your response. So that is a big resounding yes there. Are you free later? Well, this is not the type of question that someone would ask if they were confident in their knot work. I would suggest, if you’re wondering whether I’m going to be free later, you should learn how to tie better knots, or buy stronger rope, or do something else that would help you be more confident in whatever scenario you’re envisioning. Now, from someone who really, perhaps, should have edited their response: what is the best champagne to buy someone when you find out they don’t have to use Microsoft Fabric? Well, assuming you meant the champagne that you drink, and not the Champaign that, I think, is how the lake is spelled. I was never quite that good at geography.
But I assume that if you mean the drinkable, potable sparkling wine, I enjoy Krug quite a bit. I enjoy Ruinart quite a bit. They make a very good Blanc de Blancs. So either one of those would be sufficient, but I just want to point out that no one has to use Microsoft Fabric. In fact, I would do my best to dissuade anyone from using Microsoft’s beta version of Databricks. It’s a real joke. It’s probably got about 16 months left until it’s something else. So, you know, don’t get too attached. Whatever bottle of champagne you buy will probably last longer and have a longer shelf life than Microsoft Fabric. All right. So, hi Erik. Hello. Hi. I’m having huge problems with a table that stores text files in a varchar(max) column. Can you suggest an alternative to that? Well, it’s a fairly well-trod path that you should not be storing blob files, or blob data, in your database. It’s going to end up being rather painful for you. The smart thing to do there is, of course, to store a pointer to the file system in a reasonably sized column. It could be a varchar(500), some odd thing. Usually that’s about enough to store a file path, depending on how deep your directories go. And you could just store a pointer to that file on disk, and just access that file on disk, rather than try to store it in your database. FILESTREAM is a real lousy feature. I wouldn’t expect anything out of that, if that’s what you’re using. If you’re just storing the data as, well, I mean, you say you’re storing it as a varchar(max) and not a binary, so it’s not even anything that you could convert. If you’re just storing the text of the text files in there, well, God help you.
I think if you’re storing the text of the text files in the column, which is a different scenario, what you would probably want to do is put that into a different table, maintain the primary key between both tables, and just do lookups to the max column in this sort of lookup table. That’s usually the best solution there. There’s also an sp_tableoption option to store all blob data off-row, which can sometimes help with some things. I’ve messed around with that a bit in the Stack Overflow database, in the Posts table, with the body column, because that can be very long. That’s an nvarchar(max) field. And it certainly does some interesting stuff to performance there. But yeah, in general, depending on what your exact scenario is, and you’re a little bit light on detail here: if you’re storing blob data, like a file is a file in there, like FILESTREAM or something, I would rather store a pointer to the file on disk somewhere. If you’re storing the contents of text files in a column, I would prefer to separate that blob data out to a lookup table and maintain the primary key between the two. And then just make sure that you only join to that table when you actually need that stuff, so that it gets in the way of fewer things.
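
The two alternatives described can be sketched like this, with hypothetical table and column names; the sp_tableoption setting named is the real 'large value types out of row' option:

```sql
-- Option 1, hypothetical names: split the big text column into its
-- own lookup table that shares the primary key, so queries only
-- touch it when they actually need the text.
CREATE TABLE dbo.posts
(
    id integer PRIMARY KEY,
    score integer NOT NULL
);

CREATE TABLE dbo.post_bodies
(
    id integer PRIMARY KEY
        REFERENCES dbo.posts (id),
    body nvarchar(max) NOT NULL
);

-- Option 2: keep the column where it is, but store large values
-- off-row, so scans of the other columns read fewer pages.
EXEC sys.sp_tableoption
    @TableNamePattern = N'dbo.post_bodies',
    @OptionName = 'large value types out of row',
    @OptionValue = 1;
```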

All right, moving on here. Where’s the beef? Hopefully in the fridge. Or if it’s not in the fridge, then hopefully it’s in the process of being cooked. I would recommend a cast iron skillet for that. If you’re cooking beef, a well-seasoned cast iron skillet is probably the best way for you to treat your meat. If you need fuller cooking instructions, I’m happy to provide them for you. All right. And our final question today: what do you wish that more DBAs have read?

Specific books or blogs, et cetera. Well, that’s an interesting question, uh, because I don’t think that the problem is that, uh, there’s a lack of things that DBAs have read. I think there’s a lack of things that DBAs have comprehended. Uh, the problem isn’t often with like putting your eyes on words.

The problem is often actually comprehending what those words mean and how those words translate to action in the database. I suppose it would be helpful if DBAs read some books on economics, so they could understand that really nothing is free. That’s an important concept to learn in databases.

Everything has a trade-off. I suppose Milton Friedman would be useful in that regard. I don’t know. He seemed keen on that concept. But as far as specific SQL Server stuff goes, I mean, selfishly, of course, my blog over at erikdarling.com. From a performance point of view, if you want to start with way-back foundational material, Craig Freedman, F-R-E-E-D-M-A-N, with Craig spelled the normal Craig way, wrote a lot of stuff that is still very applicable to how SQL Server works today. Of course, Paul White still writes to this day. Craig hasn’t written in quite a while, but Paul writes quite a bit. Paul is one where you do need to engage reading comprehension fully, though. That’s good there. I think another thing that’s very important to read is actually the documentation. The documentation for SQL Server is certainly flawed, and it is certainly incomplete in some ways, but I do think that many of the docs articles are at least reasonable attempts to teach you how something was designed, perhaps not how it works, perhaps not what it was intended to work with, but at least to tell you how it was designed and, you know, give you all the syntax for things. And there’s a reasonable amount of stuff in there. As far as books go, I would actually go back to Craig Freedman there. So, Kalen Delaney used to write a lot of books about SQL Server. And back when she was writing these books about SQL Server, a lot of people from Microsoft would contribute, Craig Freedman being one of them. Conor Cunningham got involved with one or two of them, but there were a lot of smart people who were involved.
I think Kevin Farlee was in at least one of them. There’s a whole list of smart people who would contribute to her books on SQL Server: people who worked for Microsoft and had certain insights into how the engine worked that would be beyond what a normal person could reasonably surmise just from using it. So a lot of those books are still very valuable for foundational material. Granted, a lot of the specifics have changed. Slava Oks was in on one of them.

But there’s a lot of good foundational material there that sort of gives you a good idea about how SQL Server works as a database, how it functions, along with a lot of, you know, monitoring, query plans, query execution, storage engine stuff. Granted, some of the details have changed over the years, because a lot of stuff about SQL Server has changed over the years, but it’s a very good way to get a good deep dive into how databases work. If we want to go beyond books and blogs, and you just want good general database knowledge, I think I mentioned this recently in another video, but the Carnegie Mellon University database group, and a really smart guy named Andy Pavlo, puts all his class lectures and notes and slides and class projects on YouTube for free. You can sit there, you can watch them. It’s about an hour of class each. He has spring and fall semesters; there’s advanced, there’s intro to databases. There’s one going on right now about query optimization.

And if you just kind of want to learn a little bit about real database internals stuff, not necessarily specific to SQL Server, but good database internals stuff, that’s a very, very good resource. I can’t think of anyone who has material of the caliber that Andy does, with the up-to-dateness of it, because it’s all current; he’s still teaching. So there’s a lot of great stuff there. So, I mean, I realize that’s somewhat of a roundabout answer, and many of the things in there you probably would have come across on your own already. I think another good resource would be the Database Administrators Stack Exchange site.

A lot of questions that people have about SQL Server, or databases in general, have already been answered there. And it’s not always the easiest to find the exact thing, granted, site search is not a hundred percent there, but if you type in some basic keywords, or you’re looking for kind of a subject, or there’s some specific thing you’re after, you can probably find it there.

So always check the Database Administrators Stack Exchange site; it has a lot of great resources as far as Q and A goes. And it’s not just crappy fly-by-night answers either. There’s a lot of good, detailed responses in there.

So that’s another good place to look when you’re trying to figure out a database problem, or you have a question about how to do something in a database, or how something in a database works. It’s another very good resource. Anyway, that brings us about to the end of this here Q and A. So I’m gonna get going, and wait for some more things to filter in here. Again, if you would like to add anything to the Q and A, if you would like to be part of the Q and A, if you would like me to answer your question, you can go to this link. The link is fully fleshed out down in the video description.

So you can click on that and submit your question, and I will put it up on the screen and answer it. That’s about how it works. So anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in the next video, and hopefully I will be reading your questions on a future Office Hours.

There we go. All right. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little More About Parameter Sensitive Plan Optimizations In SQL Server 2022

A Little More About Parameter Sensitive Plan Optimizations In SQL Server 2022


Video Summary

In this video, I revisit a subject that I explored in an earlier recording—parameter-sensitive plan optimization introduced in SQL Server 2022. Despite initial excitement about potential improvements and changes, my recent re-examination revealed that Microsoft did not make significant advancements to address the issues I highlighted previously. The feature still faces challenges, such as its reliance on a specific skewness threshold for histograms, which can lead to unexpected results. To help viewers understand these nuances better, I also discuss new insights shared by Paul White in his Twitter article, providing additional context and detail that were not covered in my initial video.

Full Transcript

Erik Darling here with Darling Data, and I am revisiting a subject. Of course, Intel likes to revisit checking for drivers, but I’m going to revisit a subject that I recorded a video about. I forget when, but I was very excited, because it was about the parameter sensitive plan optimization that Microsoft introduced in SQL Server 2022. And, you know, I’ve had some general complaints about it. Stuff like the extended events for it not being so great. It not kicking in in situations where I really thought it should. And a lot of the stuff around the way that it decides which query should use which plans, and the way that it detaches the queries from the dynamic SQL. So, usually with parameter sniffing, my go-to for many years has been to use dynamic SQL. When you use dynamic SQL in the context of a stored procedure, that SQL is no longer closely associated with that stored procedure. There’s nothing in the resulting plan XML for the execution of that dynamic SQL that ties it to the stored procedure. That’s why it always helps to put a little comment in there that says, I came from the stored procedure.
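
The comment trick can be sketched like this, with hypothetical procedure, table, and parameter names; since nothing in the plan XML ties the dynamic SQL back to its caller, the comment travels with the query text instead:

```sql
-- A sketch, hypothetical names: tagging dynamic SQL executed from a
-- stored procedure so its plan and query text can be traced back to
-- the calling procedure.
CREATE OR ALTER PROCEDURE dbo.FindPosts
    @OwnerUserId integer
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @sql nvarchar(max) = N'
    /* from dbo.FindPosts */
    SELECT p.Id, p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId;';

    EXEC sys.sp_executesql
        @sql,
        N'@OwnerUserId integer',
        @OwnerUserId;
END;
```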

And, you know, you would think that Microsoft, being in charge of everything involved in it, could do something crazy, like, you know, overcome that. But no, they stuck an object ID in there. But you have to parse out a bunch of XML to get at it. It’s a nightmare. So Microsoft really dropped the ball on that part. But I got excited, because one thing that Microsoft says quite a bit is, well, we’re not fully documenting this because we want to be able to make changes to it in, you know, cumulative updates. They don’t do service packs anymore, because that was too hard, I guess. But they want to be able to make tweaks to things. Not that I’ve ever seen much evidence that they actually make a lot of tweaks to things without having to redocument stuff, or have, like, you know, oh, this documentation applies to SQL Server 2022 up to cumulative update nine, but after that, it’s different. So I understand why they don’t want to do that.

But here I was thinking, wow, hey, maybe they fixed something in a cumulative update that made this better. But they didn’t. It turns out that Erik Darling of Darling Data left some dangling data in one of his tables. And that is why we got the parameter-sensitive plan optimization. I’m also re-recording this because there was some new information that my dear friend Paul White blogged about, or, I don’t know, wrote an article on Twitter about, or whatever you want to call it, and documented a little bit of the internals of it that I wasn’t aware of. So we’ve got that stuff to talk about. So before we do all that, let’s get into our relationship, how you and I can successfully coexist and cohabitate this plane together.

I forgot to, I forgot to move on to that slide, didn’t I? I’m not re-recording all that. Sorry. We’re just going to have to deal with that. If you would like to support my efforts in bringing this incredibly high-quality SQL Server content to you on YouTube, you can become a member of the channel and join the 50-some-odd other folks who are giving me $4 a month, or sometimes more, by going to the video description right down here.

If, you know, you ran out of money, I don’t know what happened, maybe the recent downturn in the stock market has taken its toll on your finances, and you no longer have $4 a month to spare, you can do all sorts of free stuff like like and comment and subscribe. If you would like to ask me questions that I will answer on this YouTube channel, I just started collecting them, so I haven’t done one of these yet, but if you would like to ask me questions, you can go to this link, which is also in the video description, and submit a question, and when I’ve piled up enough of them, I’ve already got a few good ones in there, right?

When I piled up enough of them, I will record answers to them, and I will answer your questions. Just remember that these are being asked privately, but answered publicly, so don’t put anything in there that might make you feel bad at a later date. If you enjoy the things that I talk about here so much, and they hit so close to home for you, that you’re like, wow, we could sure use the kind of consulting this Erik Darling does, well, you can hire me to do any of these things, and as always, my rates are reasonable.

If you would like to get some training content from me, you can do that too. Again, all of these links are way right down there, all you have to do is scroll and click a little bit. You can get all 24 hours of my performance tuning content for about $150 US dollars.

Try beating that. It’s a tough one. Upcoming events, we have SQL Saturday in New York City on May the 10th of 2025. Exciting times ahead.

You can stay in a business class hotel, like a Hampton Inn, or I don’t even know what else is over there. Like a Doubletree? I don’t know. Whatever. There’s lots of hotels.

It’s at the Microsoft thing in Times Square. I don’t know. If you can’t afford a hotel room, you can always sleep in the Port Authority bus station or any subway station. Everyone sleeps in the subways in New York.

It’s very safe, very accommodating. With that out of the way, let’s party. Let’s talk about this thing here. This was the demo that I was really pumped on because it’s one that when I was first looking at the parameter-sensitive plan stuff, I was like, it’s going to be great if we can fix this stored procedure, and it never did.

This one came along, and I was like, nah, not doing this. But then I installed the cumulative update, and all of a sudden, this started working, and I was like, they did it. They listened to me.

They did a thing. They didn’t. They didn’t. Microsoft continues to not listen to anyone except themselves, whatever burned-out, toasted-bud executives are just like, yeah, more AI. Really, really, really doing your job there.

Good job. All right. Okay. So what I had done was I had – there’s a demo that I do where I show a deadlock between a read query and a write query. And I have a reset because, like, one of the things I do is I have to do an update in there.

So I do this update, and then I have another update at the end that’s supposed to reset things back to where they were. The thing is, apparently, I didn’t run that. So here’s what happened.

And it’s kind of funny. And this ties back to some of the stuff that Paul White talked about in his Twixtr article: the skewness threshold that has to be met for the parameter-sensitive plan optimization to kick in, which you can see if you look at the histogram for a table.

So, like, this is the histogram for the votes table. This is a full scan histogram — the kind you get from an update with full scan or from index creation — a quality histogram. The lowest equality-rows value in the histogram, I believe, has to be exceeded 100,000 times over by the highest one.

But in this – in the votes table for the vote type ID column, the highest one is 37 million. Right? That’s what this number is here.

3.7E+07 — sorry, that is 37 million, not 3.7 million. And this here is the lowest one.

We don’t have 100,000 times that. If we had 100,000 times that, we would need something with 73,300,000 in there. So I’m going to show you what happened.
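If you want to sanity-check that math yourself, here’s a rough sketch using `sys.dm_db_stats_histogram`. The table and statistics names are from my Stack Overflow demo database (assumptions; yours will differ), and the 100,000× ratio is the observed behavior being described here, not a documented contract.

```sql
/* Sketch: compare the lowest and highest equality-row counts in a
   histogram to see whether the PSP skewness threshold would be met.
   The statistics name below is an assumption -- use whatever your
   index on VoteTypeId is actually called. */
SELECT
    min_eq = MIN(h.equal_rows),
    max_eq = MAX(h.equal_rows),
    qualifies =
        CASE
            WHEN MAX(h.equal_rows) >= 100000 * MIN(h.equal_rows)
            THEN 'skewness threshold met'
            ELSE 'skewness threshold not met'
        END
FROM sys.dm_db_stats_histogram
(
    OBJECT_ID(N'dbo.Votes'),
    (
        SELECT
            s.stats_id
        FROM sys.stats AS s
        WHERE s.object_id = OBJECT_ID(N'dbo.Votes')
        AND   s.name = N'VoteTypeId' /* hypothetical stats name */
    )
) AS h
WHERE h.equal_rows > 0;
```

With 733 as the lowest equality-row count, the highest would need to be at least 73,300,000 — and 37 million doesn’t get there.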

Right? So when I was – when I was doing my deadlock demo, one of the things that I do is, like, flip the vote type ID column around. And I do that for this one particular ID.

Right? So now if I run this and we look at the statistics, of course, we’re not going to see anything useful. Right? There’s going to be nothing good in here. I’m going to update these statistics.

Now, I’m using a full scan just because I don’t want to leave anything weird to chance. So, like, normally when I finished – when I finish my, you know, read query, write query, deadlock demo, I’ll run this update down here to reset stuff. This is what I forgot to do.

So what happens is when I – when I had this row in here where vote type ID was zero. So I had one row with one equality value sitting in there. So when I created my indexes up here and did all this stuff, what SQL Server saw in the histogram was this.

Oh, I’m updating statistics again. I’m not showing. That was supposed to be DBCC SHOW_STATISTICS.
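For reference, this is roughly the pair of commands in play here — a full scan statistics update, then the histogram itself. Table and statistics names are from the demo database and may not match yours.

```sql
/* Refresh the statistics with a full scan so the histogram is a
   quality one, with nothing left to sampling chance. */
UPDATE STATISTICS dbo.Votes WITH FULLSCAN;

/* Then look at the histogram. The statistics name is an assumption;
   substitute the name of your index on VoteTypeId. */
DBCC SHOW_STATISTICS (N'dbo.Votes', N'VoteTypeId') WITH HISTOGRAM;
```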

Let’s do that. So what I – what happened when I created the index on vote type ID was I ended up with this one row with the range high key of zero with one equality row. You can imagine that it’s a lot easier to get – hit that skewness threshold when you have something with – why did that take two seconds to run?

That’s bizarre. All right. I guess format really does suck.

So we would only need one other value with 100,000 rows in there. Of course, we have lots of that stuff going on, right? If you look at — shut up — if you look at that one equality row, we have lots of stuff in there with over 100,000 rows.

And that’s exactly why it worked. Microsoft didn’t actually improve upon the feature. They didn’t relax anything.

They didn’t, like, do anything better. They’re just letting this thing lie. I don’t know why. But they didn’t actually improve it. They didn’t actually fix anything. They just said, yeah, go ahead. So when you have that one row in there, you do get the parameter-sensitive plan optimization, right?

So, like, when you get the parameter-sensitive plan optimization, you know it because you get this additional stuff at the end of your query that you didn’t put in there. It shows you the minimum and maximum stuff, and which column-and-parameter pair they chose to give you the parameter-sensitive plan optimization for — meaning they’re going to generate multiple plans depending on which bucket that parameter falls into based on the histogram.

And the thing that will come up… Oh, boy. What’s all this?

Oh, we got a lot of stuff in there now. Oh, okay. Apparently, I’ve just been letting this run for a while. That’s my fault. Well, but the original one that we hit into, that I hit, was this skewness threshold not met, right? That’s this thing.

Zoomit is not listening to me once again. So when I don’t have that row, this is the message that I get, right? We did not have enough skew in the histogram in order for the parameter-sensitive plan optimization to be triggered. Do I think this is good?

No. If you have a value with 733 records and a value with 37 million records, you sure as all get-out have skewness in your histogram. You just don’t have 100,000 times the skewness.

You don’t have 73.3 million. You have 37-point-something million. So you’re, like, halfway there? About?

Not, you know, somewhere in that fuzzy approximate count distinct area. You’re about halfway there. But you are not there there. So Microsoft did not fix anything.

It did not give us any joy or love. I’m going to close this because that is just weird. But I just want to show you the difference real quick here. And that’s going to be for this stored procedure, right?

So if we run this and we execute this just for vote type ID 4. I don’t need to show you the whole parameter sniffing thing for it. This is with that additional row in there.

With that additional row in the table for where I accidentally left a vote type ID set to zero, we get the whole parameter sensitivity thing. Kicks right in immediately. Wonderful.

You did a great job. You worked. But, you know, if we come back up here and we did what I should have done when I finished the reader writer deadlock and we set that back to eight and then we update statistics, we’re not going to get that anymore. Because SQL Server is not going to see that one row with the one equality value for vote type ID zero.

It’s just going to see that lowest one for vote type ID 4 that has 733 rows. And we’re not going to get it anymore, right? We’re going to lose that whole thing.

So if I go and I rerun this procedure — I just like to do the CREATE OR ALTER just to make sure that we’re getting fresh plans for everything. But now if I run that for vote type ID 4 and we look at the execution plan, we don’t have all that stuff at the end. We just have me forcing the compat level to 160 so that I would be able to use the parameter-sensitive plan optimization.
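For anyone following along, the knobs involved look roughly like this. Compat level 160 is required for PSP, and there are documented switches to turn it off per-database or per-query if it misbehaves. The table and parameter value here are placeholders from the demo.

```sql
/* PSP requires database compatibility level 160 (SQL Server 2022). */
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 160;

/* It can be disabled for the whole database... */
ALTER DATABASE SCOPED CONFIGURATION
    SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = OFF;

/* ...or for a single query, with a documented USE HINT. */
DECLARE @VoteTypeId integer = 4;

SELECT
    c = COUNT_BIG(*)
FROM dbo.Votes AS v
WHERE v.VoteTypeId = @VoteTypeId
OPTION (USE HINT('DISABLE_PARAMETER_SENSITIVE_PLAN_OPTIMIZATION'));
```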

So first I apologize. One, to you because I got this wrong. I was incorrect about this in my video.

Two, I’d like to apologize to Microsoft for accusing them of getting something right. Big sorry there, all of you. Your record remains untarnished.

And I’d like to thank Paul White for publishing a nice article on Twitter, Twixtr, that I will put a link to in this video description. So you can go and read it with all the great joy that I read it and you can learn more about this stuff. And yeah, I think that’s about it.

All right. It is time to go to the gymnasium so that I can, I don’t know, I think squat and deadlift today. So that’s a fun combo.

Anyone wants to see workout videos, maybe it will be more popular than the SQL videos. Who knows? Right? I don’t know.

If you’re tired of seeing my face and you want to see my butt instead, the squat and deadlift videos are pretty good for that. So anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that I learned something. Don’t count on Microsoft to make stuff better. It’s the first one. And yeah, all right.

We’re going to hit the gym now. Anyway, thank you for watching. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Choosing Index Key Column Order


Video Summary

In this video, I dive into the fascinating world of index key column order in SQL Server indexes. After a bit of an uncharacteristic “big arms” moment that seemed to take over my screen, we get right back on track as I explain how different types of predicates can significantly impact which index is chosen and why. We explore scenarios where equality predicates might not always be the best starting point for indexing, especially when dealing with skewed data like in our example table `post`. By comparing query plans using forced seek hints, I demonstrate the performance differences between seeking on a non-selective column first versus a selective one, highlighting how these choices can dramatically affect application response times. The video also touches on dynamic SQL as a potential solution for handling unpredictable query patterns, making it a comprehensive guide for optimizing your index design based on real-world querying behaviors.

Full Transcript

Erik Darling here with Darling Data. I don’t know why I did the big arms thing. It was just like that. Sometimes you just lose yourself in these moments. You get all worked up. You get all riled up and excited and next thing you know it’s big arms. Whatever. Stop looking at me like that. In today’s video — I guess this is gonna go on the A Little About playlist — we’re gonna talk a little bit about choosing index key column order. I’ve done a lot of videos sort of about how SQL Server stores data in B-tree indexes, and how you can take advantage of that for various things like seeking and sorting and all that other good stuff. But this one is a little bit different, because in this one we’re gonna talk about how different kinds of — shut up, this thing pops up all the time; yeah, just scooted it out of the way with my elbow, pretty good, right? — how different types of predicates can be more selective than others. And how the sort of general advice to index purely for equality predicates first isn’t always the best advice. And then finally, how sometimes you might even need to maintain multiple indexes for different types of queries.

And, you know, going even beyond that, how you might need to use dynamic SQL in order to direct SQL Server to the correct index sometimes. Particularly because, at least as of the SQL Server 2022 iteration of the parameter-sensitive plan optimization, it only works with equality predicates; it does not work with inequality predicates. So if you’re dealing with a parameter sensitivity issue because of, like, a greater-than, greater-than-or-equal-to, less-than, less-than-or-equal-to situation with, like, dates or something, it’s not going to help you at all there. So that’s pretty screwed up. I have another video that I have to do. That can be a follow-up on the last parameter-sensitive plan optimization video that I did. That’s going to be fun too. So yeah, we’re going to talk about that stuff. But before we get into all that, of course, it is time to talk about the health and wellness of this here YouTube channel. If you want to sign up for a membership to support my endeavors in bringing you this — well, let’s call it high-quality SQL Server content, because that’s a nice way to frame it — you can do that right down in the video description. There’s a link to become a member; you can join the 51 other people who have chosen to hand over some small quantity of money to me every month. And I am, I guess, thankful for that.

If you don’t have four bucks a month — I don’t know, maybe you got robbed on the playground — you can comment, you can subscribe. And if you want to ask me a question, like a good question about SQL Server performance, you can go to that link right there, which is also down in the video description. And you can click on that and ask a question. And eventually I will do a round of office hours questions and answer it. And won’t that just be grand? If you need help with SQL Server, if you are ready and willing to bring in a young and handsome consultant with reasonable rates to solve all of your SQL Server performance maladies, you can hire me for any of these things. And once again, BeerGut Magazine rated me the best SQL Server consultant in the world outside of New Zealand.

If you would like to get access to my training, you can get all 24 hours of it for about 150 US dollars. Again, that URL, that discount code, coupon code, whatever you want to call it, right down there. Just click somewhere in this general area.

Upcoming events. Well, well, well, well, SQL Saturday, New York City 2025 will be taking place on May the 10th of 2025. You can go.

Times Square, Microsoft offices. I don’t know. Maybe you can make a friend in the Port Authority bathroom while you’re there. There’s all sorts of, you know, it’s a big, crazy, wild city.

Anything can happen. But with that out of the way, let’s party on, Wayne. So I’ve got my database set to compat level 140.

I forget why I did that. There’s probably a good reason, but it doesn’t matter too much now. And I’ve also got two indexes on the post table. If Zoomit will cooperate.

Aha, there we are. Finally, listen to me. One of them is very descriptively named PSC. It is on post type ID, score, and creation date.

Hence, PSC. The other one that I have is called… Oh, Zoomit’s going to do one of these things again.

All right. All right, Zoomit. The other one is called CSP. Mmm. Very descriptive again. Creation date, score, post type ID. I don’t know why I’ve decided to give them all funny little toupees.

Or maybe they’re eyebrows. I don’t know. Unibrows, I guess. Monterbrows, whatever they’re called. So I’ve got these two indexes. And we’re going to look at how different queries can use these indexes more effectively depending on the selectivity of the sets of predicates that we’re passing in.
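For context, here’s roughly how I’d expect those two indexes to be defined. The column names come from the Stack Overflow sample database, and the index names match what’s on screen.

```sql
/* psc: leads with the equality column, post type ID. */
CREATE INDEX psc
    ON dbo.Posts (PostTypeId, Score, CreationDate);

/* csp: leads with the range column, creation date. */
CREATE INDEX csp
    ON dbo.Posts (CreationDate, Score, PostTypeId);
```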

Now, if both sets of predicates are equally selective, it almost doesn’t matter a lot which index you use. But in certain cases where post type ID is not selective, but the creation date filters are, or the reverse of that where creation date is selective, but post type ID is not selective, which index gets used can make a big difference.

And then we’re going to talk about one other funny little thing that can happen along the way as well. I’m just going to make sure that I have that set up. I’m going to make sure that these indexes are created, but they should be because I feel like I was pretty smart about this beforehand.

And thankfully I was. So this is our starting stored procedure. And we have one instance of the query where we’re going to force SQL Server to seek into the index called PSC that leads with post type ID.

Kind of a funny set of stuff with the FORCESEEK hint. I don’t know if you ever read the documentation for it fully, but with the FORCESEEK hint, not only can you tell SQL Server to do a seek, you can tell it which index to seek into, and then which key columns you want it to seek with.

And it’s wonderful. You can, you can do all sorts of things. So if this syntax looks funny to you, I would highly recommend going and reading the documentation for the force seek hint, because you can do all sorts of neat stuff with it that you maybe didn’t know about, because you didn’t read the documentation for years like me.
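If you haven’t seen the extended form before, the documented grammar is `FORCESEEK [( index_name ( index_column_name [, ...n] ) )]`. A sketch with the demo’s index name and a selective date range filled in:

```sql
/* Force a seek into the psc index, seeking on the PostTypeId key
   column specifically. Index, table, and date values are from the
   demo and are illustrative. */
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p WITH (FORCESEEK (psc (PostTypeId)))
WHERE p.PostTypeId = 2
AND   p.CreationDate >= '20081201'
AND   p.CreationDate <  '20090101';
```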

Ta-da! But here we are. So one is going to fork, force seek, not fork, into the PSC index.

That’s this one here. The other one is going to force seek into the CSP index here. And we’re going to look at the differences in query plans when we have selective and non-selective predicates in play. So we’ve got query plans turned on up here.

Let’s run it for this one first. And what I want you to see is that post type ID 2 is very much not selective. There are 11 million rows that match post type ID 2. But the date range we have here — the start and end dates — is very selective.

It’s just a month, right? So 2008-12-01 through 2009-01-01. If you run this, we’re going to get both queries, but we’re going to see that the query plans are somewhat different in their speed and effectiveness.

So — if SSMS will listen to me. And it finally did. We have our seek into the PSC index. So we sought to all 11 million post type ID rows.

And then we had to do a follow-up on that, a residual predicate after that, to apply the predicate on creation date. This all took 1.4 seconds.

And down here, where we did a seek into the CSP index, in other words, creation date first, and we just found that month of values first, that whole thing took about 19 milliseconds.

So we’ve already got a pretty big disparity here, where seeking to all 11 million post type ID 2s and then applying a residual predicate on creation date, that’s what this one is doing.

Oh, thanks for disappearing on me, tooltip. Very nice of you. So we do seek to the post type IDs that we care about. And then we have this residual predicate on creation date here. Now, this is not because score is in the middle.

Score is the middle key column, but range predicates like this, you know, often do end up as residual predicates anyway.

But notice that there is a difference between these two query plans. When we use the PSC index, and we did a seek to post type, all the post type ID 2s first, we didn’t need to sort the data afterwards.

Now, granted, this is 40,000 rows, so it’s not that big of a deal for us to sort in this case, right? 40,000 row sort is almost never going to be your biggest problem, unless like the next one is like 40 million rows or something, right?

You parameter sniff that initial memory grant and things go terribly. Because memory grant feedback, remember, that’s a post execution feedback loop. That doesn’t help you get more memory while a query is running.

That won’t happen. Memory grant is adjusted after the query finishes. So you might have a query run for like an hour spilling data everywhere. And then when it’s done, SQL Server is like, well, I can fix that for next time.

It makes a little note. So, but this happens because when we do this seek, we have, we do a seek in here, right? And we find the range predicate.

Then we have an equality predicate over here. But the way that the B-tree is set up with creation date first, even with an equality predicate on post type ID of 2, that doesn’t help us with the sorting of score, right?

Because score is a second key column. So it’s sorted by creation date. I did experiment with this with an index on creation date, post type ID, then score in descending order.

And there was still a sort, because even with that setup, you still have the creation date column leading — and I have other videos on how B-tree indexes are stored.

You should watch those if you’re not filled in on it, but within all the individual creation date entries, post type ID is what’s sorted next. So it doesn’t help to have the range and then the equality; you still have to sort this column.

This column is still not going to be sorted the way you want it on the way out. So for the selective predicate on creation date, using the index that leads with creation date was the better choice, right?

By just about a second and a half, right? So we’re looking at a fairly big difference there in application response time. Now, what if we do that backwards, right?

And we say where post type ID equals 4, which is very selective, right? It’s about 700 rows. But then we have — oh, why is this red? This should stay pink.

For this, this date range is wide open. This is literally every date in the table, 2008 to 2014, right? So this is every single date. Now, I could have made this a little bit smaller, but you know, for dramatic effect, I have left the entire range in.

And if we run this, things are going to end up just about backwards, right? In this one, we do a very fast seek into the index that leads with post type ID.

And in this one, well, I mean, look, look at the times, right? The last one was about like 20 milliseconds that was for the faster plan. This one is seven milliseconds. And this one is back up to like a second and a half.

So even though the top one is great for this set of predicates, it was not great for the non-selective post type ID set of predicates.

So which index you use, or how you design the indexes, really does depend a lot on how people query the data. If you have people who are always querying for a narrow range of dates, and you have equality predicates, especially on very skewed data — like the way the post type ID column is skewed in this table — you would probably want to have your index lead on creation date.

But if you don’t know, right? If you look at the distribution of the post type IDs, some of these are selective — like down here, from 25,000 rows down to two rows.

Those are all fairly selective. These two are not very selective. So you probably don’t want SQL Server — you probably don’t want your queries — going to find all this stuff and then applying the range predicate second.

At least, you know, the way the query is written here. There are more verbose ways you could write the query where you could probably get reliable performance from either one, but it would take way more typing and probably a different index that has the ID column in it, so that you could find the post type IDs, find the creation dates, and sort of join that stuff together, or use EXISTS or CROSS APPLY or something.

But anyway, really depending on how people query the data, there might be some other considerations for how you design your indexes.

Now, one thing that you could do if you’re just not sure how people are going to query the data is maintain both indexes and do some work with dynamic SQL in order to have SQL Server go to the right index, depending on some stuff.

Now, the way I’ve set this dynamic SQL up — you don’t have to tell me, I know — I just didn’t feel like making this the longest demo in the world. There are going to be cases where neither of these branches is true, right?

Like there’s going to, there’s probably going to be cases, there could be cases where, you know, post type ID is not in one or two and the start date, end date is still less than two.

And then, you know, and like the reverse of that for the bottom one. So like, I know these don’t catch every single possible outcome. This is just to show you that you can make it work with the set of variables, right, the set of predicates that I’m passing in.

So you could have like another outcome for this where like, maybe you just like say from posts and let SQL Server pick the table, right? Like if you, like, there’s some other outcome to this, you could just say, otherwise just, you know, hit the post table, do whatever you want.

Maybe you could still keep a foreseek hint on there and you could just leave it up to SQL Server, which index to foreseek into. But if we run this query, we will see, make sure that’s created, that for both of our dynamic outcomes here, right, for both of the sets of predicates we pass in, we can direct SQL Server to use the correct index based on how selective various predicates are, right?

So for one of them, like all we care about is, you know, if the post type ID is one or two, like those are not selective. So we want SQL Server to go with creation date.

And then if post type ID is not one or two, then we want SQL Server to use the post type ID indexes. That would be, that would be very selective for all those other ones.
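The routing idea described above can be sketched roughly like this. This is a sketch, not the exact demo code; the procedure name and the "not selective" list are illustrative.

```sql
/* Sketch: pick the FORCESEEK hint based on how selective the incoming
   predicates are likely to be, then run the query with sp_executesql
   so each branch stays parameterized. */
CREATE OR ALTER PROCEDURE dbo.IndexRouter
    @PostTypeId integer,
    @StartDate datetime,
    @EndDate datetime
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE @sql nvarchar(MAX) = N'
    SELECT c = COUNT_BIG(*)
    FROM dbo.Posts AS p ';

    IF @PostTypeId IN (1, 2) /* not selective: lead with dates */
        SET @sql += N'WITH (FORCESEEK (csp (CreationDate))) ';
    ELSE                     /* selective: lead with post type */
        SET @sql += N'WITH (FORCESEEK (psc (PostTypeId))) ';

    SET @sql += N'
    WHERE p.PostTypeId = @PostTypeId
    AND   p.CreationDate >= @StartDate
    AND   p.CreationDate <  @EndDate;';

    EXECUTE sys.sp_executesql
        @sql,
        N'@PostTypeId integer, @StartDate datetime, @EndDate datetime',
        @PostTypeId, @StartDate, @EndDate;
END;
```

A nice side effect of the dynamic SQL is that each branch produces different query text, so each gets its own cached plan, and the selective and non-selective shapes can’t stomp on each other.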

And again, you know, you would just need some fallback for this: else, just go FROM Posts with a FORCESEEK and do whatever you want. So anyway, this is just a little bit about some of the index design stuff that you might have to think about.

Remember that, you know, not every scan touches the whole table. Not every seek is quick and direct to a few rows. If you think about the queries that ran before, we did have seeks, but we had seeks that spanned almost the entire post table, right?

For the creation dates, they did span the entire post table. For post type ID 2, they did a seek, but they did a seek to 11 million out of the 17 or so million rows that are in the table.

That’s generally not what you want, right? A seek that seeks through most of the table or all of the table is not like much of a seek at all, right? Might as well be a scan, right?

Just a scan that they spelled wrong and pronounced wrong and labeled wrong and I don’t know, a bunch of other stuff. So anyway, these are just some of the things that I find it helpful to think about when I am helping people design and tune indexes specifically for queries.

You know, if you’re just looking at missing index hints, stop. Like, if you’re just looking at missing index suggestions either in query plans or in the DMVs, just stop.

It’s not going to help you do this sort of fine-tuned work. SQL Server is not good enough at telling you what indexes would really be appropriate that quickly, like, prior to optimization, right?

It just doesn’t know. Can’t do it. It’s not that smart. Even the Azure automatic index tuning stuff is still crap. Don’t — it’s useless. I don’t know how many hours they buried in that, but, I don’t know.

You gotta keep the summer interns busy, I guess. Well, anyway, I hope you enjoyed yourselves. I hope you learned something. I hope that you have found this video to be enlightening, both from the point of view of index design and FORCESEEK hint usage and dynamic SQL usage and all that other good stuff, because we did cover some ground in here, didn’t we?

We did, we did get through a few things. Anyway, thank you for watching. I’m gonna go take a shower. Bye.
