SQL Server Performance Office Hours Episode 4



If I’m trying to resolve a consistent deadlock by increasing the transaction isolation level, am I doing anything obviously wrong? I thought it was a bad sign when I was stuck between picking Snapshot and Serializable.
You have said that table variables, CTEs, Change Tracking, and Azure Managed Instances all suck. Do you have a full list of “features” to avoid?
My company wants to move to Azure, but Azure DB performance sucks. What do I do?
Any experience with using a UTF8 varchar collation vs an nvarchar data type? Seems like in most circumstances UTF8 is just unambiguously better for performance and storage on net-new development on SQL 2019+
why SQL Server sometimes change it execution plans when maxdop change from n to n+1 (n>1) (eg maxdop=14 and maxdop=15). No data changes, no statistics changes, no confif changes… only maxdop changes between runs

To ask your questions, head over here.

Video Summary

In this video, I delve into some of the questions submitted by our community during Office Hours. Starting off, we tackled the issue of using higher transaction isolation levels to resolve consistent deadlocks, emphasizing that while Snapshot Isolation can be beneficial in certain scenarios, stricter isolation levels generally increase lock contention and are likely to exacerbate deadlock issues rather than solve them. Moving on, I discussed various T-SQL constructs like table variables and Common Table Expressions (CTEs), highlighting their performance pitfalls and the reasons why they might not always be ideal choices for complex queries. The conversation then shifted to Azure SQL Database and Managed Instances, where I expressed my disappointment with these offerings, suggesting that while there are occasional use cases, in most scenarios, Azure’s products fall short compared to alternatives like Amazon RDS.

Throughout the session, we also touched on the nuances of UTF-8 collation versus N-VARCHAR data types, exploring how they might impact performance and storage in SQL Server 2019 and beyond. The discussion underscored that while UTF-8 can offer advantages for certain datasets, its benefits are not universally applicable, and careful consideration is necessary to determine if it’s the right choice for your specific use case.

Full Transcript

Erik Darling here, and gosh, you know what? It’s time for another Office Hours, where I answer your user submitted questions from the bottom of my heart. If you would like to support the things that I do here for you on YouTube, on your tube, you can sign up for a membership. They’re cheap, they’re easy. It’s a fun way to say thank you. About 60 other people have done it. So far, so, you know, you wouldn’t be alone in your gratitude. If you don’t have any spare change, you can’t hit the tip jar, because you’ve already tipped your barista $30,000 this month, you are more than welcome to do other things like like and comment and subscribe. And if you would like to be a part of the Office Hours magic, if you would like to ask me a question and have me answer it, that link also down in the video description. So I save you a lot of work. If you would need help beyond just asking a question on the internet, which many of you do, I am available as a consultant with very reasonable rates for all of these things. Health checks, performance analysis, hands-on query and index, server tuning, you name it. Putting out performance emergencies and training your developers so that you have fewer performance emergencies.

All together. If you would like to get some training from me to sort of reinforce any of the concepts and topics that we talk about here, I have about 24 or 25 hours of training available for about 150 USD when you use that discount code. And that lasts for life. It is not a subscription product. If you die and are reincarnated, you would have to resubscribe. I cannot offer transfers of the training to new souls. That is beyond my capabilities. I am not an omnipotent being. So I’m sorry. You’re just going to have to deal with that. SQL Saturday New York City 2025 is taking place May the 10th, with a pre-con on May the 9th, some performance tuning stuff with Andreas Volter. You can catch those things coming up in about three months. Close enough.

Yeah. Yeah. Today’s November. Sorry. February plus three is May. So that’s five. Good. All right. Finger math. All right. Let’s do these office hours. Okay. We’re going to hit up Excel and we are going to zoom in. We are going to answer this first question right here.

If I’m trying to resolve a consistent deadlock by increasing the transaction isolation level, am I doing anything obviously wrong? I thought it was a bad sign when I was stuck between picking snapshot and serializable. Gosh, you have got yourself all twisted up, my friend. No, higher isolation levels increase the lockiness.

So first, let’s separate snapshot and serializable. They do have some guarantees in common, but the mechanisms by which they provide those guarantees are much different. So if you’re using serializable or repeatable read to try and reduce deadlocks, you’re not going to have a good time. Both of those isolation levels just increase the likelihood of deadlocks because they are far more intrusive.

Snapshot could certainly help with deadlocks between read queries and write queries, and between two write queries, assuming that you are set up to correctly use the snapshot isolation level for both of those, and you are prepared to handle errors when there are conflicts between them. Since you did not provide further detail, you just said snapshot.

There are two optimistic isolation levels with snapshot in the name: read committed snapshot isolation and snapshot isolation. And snapshot isolation, the SI, could certainly help with write-write conflicts, as long as you are prepared to handle those, or conflicting write queries that are deadlocking. Read committed snapshot isolation would not help with that.

It would not help with two write queries, but would certainly help with read queries and write queries. There’s not really much else for detail in this one, so there’s not really more I can say here. But I would urge you to not use stricter isolation levels to reduce deadlocks, because that will have the opposite effect.
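To make the distinction concrete, here’s a minimal T-SQL sketch of turning the two optimistic isolation levels on. The database and table names are made up for illustration; both settings are off by default in boxed-product SQL Server.

```sql
/* Lets sessions opt in to the SNAPSHOT isolation level. */
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

/* Makes the default read committed level use row versioning
   instead of shared locks (RCSI). */
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

/* A session that wants the full SNAPSHOT guarantees still has to ask: */
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;
    /* Reads a consistent, point-in-time view; writers aren't blocked. */
    SELECT COUNT_BIG(*) FROM dbo.SomeTable;
COMMIT TRANSACTION;
```

Under snapshot isolation, conflicting writes can fail with an update conflict error (3960), which is the “prepared to handle errors” part: your code has to catch that and retry.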

So, yeah, careful there. All right, moving on. Our next wonderful question here is, you have said that table variables, CTE, change tracking, and Azure managed instance all suck.

Do you have a full list of features to avoid? Well, we do have to separate these things into some categories. So, table variables and CTE are T-SQL constructs, which require a very carefully informed choice to pursue the usage of.

Table variables have many performance issues when it comes to SQL Server. You can’t insert, update, or delete with a table variable as the target using a parallel execution plan.

So, if you have like a big heavy hitter insert query to find the data you need to put in there, having it run with a parallel query plan is probably going to be a lot better than having it run with a single threaded query plan. Table variables on their best day only get table level cardinality. You don’t get a statistics histogram with a table variable.
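As a sketch of both limitations, assuming a hypothetical big table called `dbo.Votes`: the insert into the table variable below is forced to run serially, and the optimizer gets no statistics histogram for it, while the temp table version gets both a shot at parallelism and column statistics.

```sql
/* Table variable: the INSERT below cannot use a parallel plan,
   and later queries see no statistics histogram for @t. */
DECLARE @t table (id integer PRIMARY KEY);

INSERT @t (id)
SELECT v.Id
FROM dbo.Votes AS v        /* hypothetical big table */
WHERE v.VoteTypeId = 2;    /* runs serially: table variable is the target */

/* Temp table: the same INSERT is eligible for a parallel plan,
   and #t gets column statistics the optimizer can actually use. */
CREATE TABLE #t (id integer PRIMARY KEY);

INSERT #t (id)
SELECT v.Id
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 2;
```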

So, two fairly big downsides for the serious use of table variables in queries that have high performance demands. They can be very good for high frequency execution queries where the ultimate plan shape doesn’t affect query performance because all the stuff SQL Server does to temp tables to make them useful for performance reasons with more demanding queries doesn’t get done to table variables.

So, the caching, creating statistics, stuff like that. Change tracking, I mean, yeah, I tend to hate change tracking. I like change data capture a whole lot better.

And Azure managed instance is, it’s just a, it’s such a sad story with Azure managed instance. I really had high hopes for it and boy did that let me down. But these are all quite different categories of things.

I don’t really have a full list of things that you need to strictly avoid. I would, I would put Azure SQL DB and Azure managed instance on that list. But we’ll probably talk about that a little bit more when we get down to the next question.

Azure managed instance isn’t really a feature. It’s just a crappy way to host SQL Server. So, no, I don’t really have a full list of stuff.

But I do try to, you know, put it out there that, you know, all of these things are in the product for a reason. There are often reasonable uses for things. My main goal as a consultant though is performance tuning.

My job is to make queries go as fast as possible. And my disdain for a lot of the things that I talk about comes from the fact that they often don’t have good performance repercussions when used. So, like, CTEs in SQL Server: there’s no way to materialize them.

Unless you just dump them into a temp table on your own. Pound sign temp table. When you re-reference them, the whole query inside them executes over and over again.
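A quick sketch of the difference, using a hypothetical `dbo.Users` table: re-referencing the CTE re-runs the query inside it, while dumping it into a temp table runs it once.

```sql
/* Re-referencing a CTE executes the query inside it once per reference: */
WITH expensive AS
(
    SELECT u.Id, u.Reputation
    FROM dbo.Users AS u            /* hypothetical table */
    WHERE u.Reputation > 100000
)
SELECT *
FROM expensive AS e1
JOIN expensive AS e2               /* the Users access happens again here */
  ON e1.Id = e2.Id;

/* Materializing it yourself runs the expensive part once: */
SELECT u.Id, u.Reputation
INTO #expensive
FROM dbo.Users AS u
WHERE u.Reputation > 100000;

SELECT *
FROM #expensive AS e1
JOIN #expensive AS e2
  ON e1.Id = e2.Id;
```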

That’s often not the behavior that a lot of people are after. So, like, my experience with these things is generally just from a performance point of view. There are perfectly good reasons to use them when performance either doesn’t matter, or when they’re not the reason for a performance problem.

Change tracking bugs me because it adds overhead to every single insert, update, and delete. You have to not only maintain the table, whatever triggers are on there, whatever indexed views might have that table included in them.

Whatever foreign keys might be evaluated when that table gets modified. But, like all the indexes on that table. But now you also have to maintain the change tracking tables.

And like that all just sort of, like it just adds up and adds up and adds up. You’re just not doing yourselves any favors by adding more overhead to every transaction. That’s why I like change data capture better: it’s asynchronous.
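Change data capture, the option I prefer here, hangs off the log reader rather than the writing transaction. A minimal sketch of enabling it, with a hypothetical table name (both procedures are the real documented ones):

```sql
/* CDC has to be enabled at the database level first: */
EXEC sys.sp_cdc_enable_db;

/* Then per table; the log reader picks up changes asynchronously,
   so the writing transaction doesn't pay for the tracking work inline. */
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'SomeTable',   /* hypothetical table */
    @role_name     = NULL;           /* no gating role */
```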

It works off the transaction log after things get done. So, like, you know, for like the T-SQL stuff, you know, again, if you’re going to use whatever, however you’re going to write your query, make sure that you are completely informed about the choices you’re making and the way you form the query.

Whenever you want to add in, like, a new thing, whenever you want to make SQL Server start doing a new thing on top of all the other stuff it’s doing, just remember that that’s not going to be free, right? Like, absolutely nothing is free.

Like, auditing stuff isn’t free, whether it’s triggers, like a built-in audit, you know, monitoring data changes, whether change tracking or change data capture, neither of those are free. Like, even change data capture, which I prefer, it’s a lot more data to hold usually, right?

Because you have this, like, shadow change copy of the table living in the CDC schema. Like, nothing is free. There are times when you can use these things and they will not have any awful performance repercussions on your server.

But, like, at some point, like, everything, with the exception of managed instance. And, like, I don’t know, if you have, like, a useless toy 5 GB database workload, maybe general purpose managed instance will do okay for you.

If you have actual needs, you’re gonna have to pay out the nose to the back teeth, up, up, up the backside, money-wise, to get the business critical version, which just isn’t worth it. Like, it’s just not worth it.

It’s too much money for what you get. And adding the fact that Microsoft support is, like, just a bunch of blind rats, like, running around, bumping into each other. Like, there’s no point in using Azure for anything.

So, but, you know, we’re getting a little bit ahead of ourselves. But, like, these are, like, you know, the way you write the query, that’s one thing. This is, like, you wanting to add something into the workload with, like, temporal tables or something.

Like, all these things have overhead and all these things have drawbacks. But let’s actually answer this next question. My company wants to move to Azure.

Lo siento. Microsoft must have offered them a sweet deal if they want to move to Azure. But Azure DB performance sucks. What do I do?

Like, really, it depends on how sweet of a deal they’re getting. Microsoft is not above bribing people to use their services. Like, cutting them deals, offering them money for, like, consulting, like, fake consulting bucks that they can, like, pay people, like, Microsoft partners to, like, do work with.

There’s all sorts of stuff that Microsoft will underhandedly do to get people to start using Azure so that they can have a bad time once they’re there. You know, and, like, really, the only alternative cloud provider that I feel like does things right is Amazon. RDS is a much better product.

Not perfect, but I think a much better product than Azure SQL DB or Azure Managed Instance. And, you know, like, the VMs in Azure are, I mean, they’re probably about fine. But I think what kills me is every time I think about it, like, Microsoft owns SQL Server.

They could have just run away with an awesome hosted managed product for SQL Server. They could have run the board with it. But Amazon just did it so much better.

I don’t know how that happened. I don’t have an explanation for you. But, yeah, you’re right. Azure SQL DB and Azure Managed Instance are pointless, pointless products. I wish they’d disappear.

The only thing that you could possibly do is make, like, either a feature or performance comparison case with Amazon RDS, which I assume would be commensurate to what your company is looking at in Azure. But if Microsoft is just throwing money at people or throwing service at people, you might not be able to avoid it.

You just might be stuck. Where your company is going to be pissed is when, after a while, they realize that either the Microsoft money runs out and the bill goes up to normal, or they find out that the money is not going to go as far, because where you landed in Azure wasn’t nearly what you actually needed in Azure. And all of a sudden the price goes way up, right?

Like, that’s where people start getting pissed. And then, you know, it costs an arm and a leg to move from one cloud to another. So, like, the only thing you can do is, like, get way ahead of the curve and offer, like, a price performance feature comparison and just pray that Microsoft hasn’t bought them already. That’s all you can do.

Next question here. Any experience… There we go.

Any experience with using a UTF-8 varchar collation versus an nvarchar data type? Seems like in most circumstances UTF-8 is just unambiguously better for performance and storage on net new development for SQL 2019 plus. Yeah, so the new UTF-8 collations are interesting because they’re still varchar, but they have, like, the extended character sets.

Not the special characters, but, like, they can have a lot of the Unicode characters in them. I am not a big collation person, so I actually, like, doodled around the internet a little bit trying to find, you know, some information that could be useful to pass on to you.

And what I got over here is… I’ll put these in the video description. There’s a very good article on collation and Unicode support on the Microsoft Docs site.

There’s also a pretty good question on the DBA Stack Exchange website with a couple pretty good answers. One of them by a fellow named Solomon Rutsky, who is a big, big, big collation person, where they go through… Where they both talk about some stuff.

What I found really interesting in this answer was this chunk of stuff. And it probably helps to have a little bit of context first.

So, the only scenario that truly benefits from UTF-8 encoding is one in which all of the following conditions are true. And these are the following conditions: data is mostly standard ASCII values, but has a small amount or varying range of Unicode characters. The column is currently, or would otherwise need to be, nvarchar(max), meaning it wouldn’t fit into nvarchar(4000).

There’s a lot of data for this column or set of columns: it’s one gig or more when stored in nvarchar. So, I don’t know.

Once you start getting into some of the longer stuff with a little bit more nuance, I don’t know. Unambiguously better, I don’t know about that. But, you know, you might check enough of the boxes here that you might have a better time with it. So, potentially yes, but I don’t know for sure.
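A quick way to see the storage side of the trade-off, using a real SQL Server 2019 UTF-8 collation name (`Latin1_General_100_CI_AS_SC_UTF8`) and `DATALENGTH`, which reports bytes rather than characters:

```sql
DECLARE
    @ascii_utf8  varchar(100)  COLLATE Latin1_General_100_CI_AS_SC_UTF8 = 'Hello',
    @ascii_utf16 nvarchar(100) = N'Hello';

/* Mostly-ASCII data: UTF-8 stores 1 byte per character, UTF-16 stores 2. */
SELECT
    DATALENGTH(@ascii_utf8)  AS utf8_bytes,   /* 5  */
    DATALENGTH(@ascii_utf16) AS utf16_bytes;  /* 10 */
```

The catch is that non-ASCII characters cost two or three bytes each in UTF-8, so for data that isn’t mostly ASCII, the UTF-8 column can come out the same size or larger than nvarchar, which is why “unambiguously better” is a stretch.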

Alright. Let’s come back over here. And let’s try to answer this one. This is an interesting one.

Why does SQL Server sometimes change its execution plans when MAXDOP changes from n to n plus 1, n greater than 1? E.g. MAXDOP equals 14 and MAXDOP equals 15. No data changes, no statistics changes, no confif changes.

I’m gonna assume that’s config. I’m gonna take a wild guess at that one. Only MAXDOP changes between runs. So, I’m trying to parse this a little bit.

What I’m a little unclear on is if you’re asking me why the DOP of a query might change if a plan changes, or if the execution plan changes if you specify different DOPs. The MAXDOP thing: if you specify different DOPs for a query, you can certainly see different query plans. I actually have a blog post about that.

And I’ll put this in the show notes as well. And this was something that sort of surprised me a little bit. What I found out of it was that, like, the execution plan changed drastically when going from DOP 1 to DOP 2, and then above DOP 2.

The reason why I found it interesting is because, you know, like, when you’re learning about parallel execution plan stuff, like, and you know, you start sort of like learning the process for it. What happens is someone will inevitably tell you that all queries start as serial execution plans, which is true. And if the cost of that serial execution plan is greater than your cost threshold for parallelism, and there’s nothing like a scalar UDF or an insert or modification where the table variable is the target that would prevent parallelism for the entire plan, then SQL Server will explore parallel execution plans.

If the parallel version of the serial plan is cheaper than the serial plan, then SQL Server will choose the parallel plan. Now, what kind of gets lost in there is that SQL Server, when it starts exploring parallel plans, will explore different parallel plans from the serial plan. It’s not just like, if we add parallelism to the serial plan, like, is it cheaper?

Like, though that is a choice, SQL Server will also explore other parallel plans. So the query plan for this thing at DOP 1 looks like this, right? You have a couple scans, merge join, top, scan, another merge join.

And then at DOP 2, the query plan looks like this. You get a couple scans, you get a hash join, you get a top, you get another top. And then you get over here, you get a nested loops join and a seek.

And then at DOPs higher than 2, the query plan changes again, where it almost looks like a parallel version of the original serial plan. Remember, this is the original serial plan, where it’s scan, scan, scan, merge join, top, merge join, right? And then the parallel version of this plan from DOP 3 to 8 is scan, scan, merge join, top, scan, merge join, but now just with parallelism operators in it, right?

So if you change the DOP of your query, if you say, you know, run it at MAXDOP 14, now run it at MAXDOP 15 or whatever, SQL Server might choose a different plan, because with more threads involved, a different shape of plan might make sense. And remember, you’re still going through all the cost-based optimization stuff at these different DOPs. And if you’re using different DOPs, you’re essentially telling SQL Server to make a new plan.

And so SQL Server is like, well, I’ve got this extra thread, and this other plan might be better with this other thread. I can do this more efficiently. Like, so, you know, it’s like you’re still going through all the cost-based optimization stuff.

The different DOPs just give SQL Server, like, another thing to think about, something it might do some different work with. Now, you say that, like, data hasn’t changed, statistics haven’t changed. A lot of people say that, and it’s not exactly true, but I’ll go ahead and believe you on this one.

Most likely it’s just going to come down to SQL Server compiling a new plan and going through cost-based optimization, right? Because you’re changing the allowed DOP, SQL Server isn’t going to reuse a plan. It’s like sending a new query in, effectively.

And when SQL Server goes and evaluates different plans with, like, even just one more DOP, right? Like, this is DOP 2, right? Where SQL Server makes these choices.

And then at DOP 3 to 8, SQL Server makes these choices. So you might be hitting, you know, one of SQL Server’s famous tipping points, where going from DOP 14 to DOP 15 changes SQL Server’s mind about what would be the cheapest plan during cost-based optimization. So that’s what I got there.
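If you want to see this yourself, one sketch is a pair of hinted runs against a hypothetical big table. Because the query text differs (the hint is part of it), each run compiles fresh, and the optimizer is free to pick a different plan shape for each:

```sql
/* dbo.Posts is a stand-in name; any large table works. */
SELECT p.OwnerUserId, COUNT_BIG(*) AS records
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (MAXDOP 14);

SELECT p.OwnerUserId, COUNT_BIG(*) AS records
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (MAXDOP 15);
```

Comparing the actual plans from the two runs is the quickest way to spot one of those tipping points.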

But anyway, that brings us to the end of these five questions for this Office Hours episode. I’d like to thank you for watching. I’d like to think that you enjoyed yourselves.

I’d like to think that you had a good time. And I’d like to think that I’ll see you again in the next video. But I’ve been wrong before. I’ve been misled. Led astray.

Lied to. Let down. It’s a sad, sad life. It’s a sad life. Anyway. I’m going to delete these and try to figure out what to talk about next.

Find some nice uplifting SQL Server messages for everyone. But anyway. Thank you for watching.

Goodbye, my friend. Bye-bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Isolation Levels



Video Summary

In this video, I delve into one of SQL Server’s most misunderstood concepts: isolation levels. As always, I aim to clear up common misconceptions and provide practical insights based on my extensive experience working with these settings. I start by explaining why read committed snapshot isolation (RCSI) is often the best default setting for most workloads, offering consistent results without the blocking issues that can arise from the default read committed isolation level. Throughout the video, I demonstrate how different queries behave under various isolation levels using a simple example in the `isolation level test` table. By showing you real-world scenarios and their outcomes, I help you understand when to use or avoid certain isolation levels based on your specific needs and workloads.

Full Transcript

Erik Darling here with Erik Darling Data, subsidiary of Batsmaroo, Inc. And in today’s video, we are going to discuss one of my favorite, least favorite topics when it comes to SQL Server stored procedures, and that is isolation levels. The reason why it’s my favorite topic is because I am right about them. The reason why it’s my least favorite topic is because everyone else is wrong. I spend a lot of my life trying to impose correctness on people where they are currently incorrect and struggling with just how incorrect they have been for their entire lives when it comes to isolation levels. Now, one thing that’s important to say up front is that everything I’m talking about here, of course, is specific to SQL Server. Nothing that I say here necessarily carries over to other database platforms; they all handle this stuff in their own ways, so keep in mind that this is all SQL Server stuff. But before we talk too much about isolation levels, we need to talk about our current financial arrangement. Mostly, I believe you should give me money. If you would like to become a paid member of this channel and support these videos, you can do that using the link in the video description. And for as little as $4 a month, you can maybe get me a better haircut eventually. If you like this content, but you’ve spent all your money on hair products, perms, bleaches, whatever it is you people do with your hair out there, you can do all sorts of free stuff like like and comment and subscribe, which also won’t get me a better haircut. But I don’t know, it’ll maybe blow my hair back. That would be cool. If you want to ask me a question privately that I will answer publicly, you can go to that link, which is also noted in the video description, and you can submit a question. And I will answer it here on a YouTube video to the best of my ability.

I am primarily focused on SQL Server performance questions. But I get a nice smattering of other stuff in there too. So you know, whatever works. If you need more help than a YouTube Q&A or video can assist you with when it comes to the performance and general well-being of your SQL Server, I do all of these things for money at a reasonable rate. I promise you, you will not be disappointed.

I guarantee your satisfaction. Another thing that I will guarantee your satisfaction with mostly is all of my training content. I have 24 hours of it available. And with that discount code, the final price tag is about 150 USD. And you have that for the rest of your life. There is no subscription required. But of course, if you want to keep giving me money, I will not argue with you.

One thing that I have learned not to do is say no when someone wants to give you money. Strangers, whoever it is, say yes. Say yes to the dress money. Anyway, SQL Saturday, New York City is coming up May the 10th, taking place at the Microsoft offices in Times Square. There is a performance pre-con by the lovely and talented Andreas Volter taking place on May the 9th. That is a full day session. And you will see me on both days, assuming that my health and well-being maintains through May the 9th. So far, so good. Knock on that wood. But yeah, it would be lovely to see you there. We can give an appropriate amount of affection to each other, whether it’s a handshake or high five or fist bump or elbow bump or whatever you’re into. Whatever it is. Just no tongue stuff, because that would be cheating. Anyway, let’s get on with the party here and let’s talk about isolation levels. Now, I’ve talked about isolation levels a lot on this channel because there is a lot to talk about with them. There’s a full playlist called Everything You Know About Isolation Levels Is Wrong, where I talk about many things at great length. If you would like that, you can go watch that for free. You know, finish watching this thing first. This is a nice video, but you can do that.

For most workloads that I see and work with in SQL Server, read committed snapshot isolation at the database level is generally the way that things run best. With no interference, the majority of read queries and write queries will stop blocking and deadlocking and fighting with each other.

And you get more consistent results than you would if you were to take the coward’s way out and just throw NOLOCK hints everywhere. Because it gives you what most developers would expect from a database: read queries won’t block write queries. That can happen under read committed, the default non-row versioning isolation level in SQL Server. And read queries, of course, get blocked by write queries under that one. And of course, read queries and write queries will deadlock with each other. Another nice thing that you get out of read committed snapshot isolation is consistent results. Like I said before, if you take the coward’s way out and you use NOLOCK hints, you can get very inconsistent results. We’re going to look at a lot of this stuff in the demo portion of this video. You only get one, and this is really important, you only get one of those things from read committed, the default isolation level. But you get all of them with read committed snapshot isolation. Read committed, the default isolation level, does not make any very strict guarantees. And of course, your read queries and write queries get into all sorts of terrible blocking and deadlocking entanglements. So the thing that I want to get at a lot of today is sort of this: when you’re writing queries that read data, you should have a pretty clear idea in your head of what results you would expect those queries to return when they’re running in concert with modification queries, right? You need to understand what an acceptable level of fault tolerance is for this query to return results quickly, and what results you want to get back. Like, what results are you okay with this query getting, right? That’s an important thing. So I’m going to use the crap database for this one.
I’m not going to use Stack Overflow, because a lot of the times when I make a database settings change like this one, I forget to reset it. And then I go to do something else. And I’m like, SQL Server changed. So we’re not going to do that. I’ve got a stored procedure that just drops and recreates a table and populates it with some data. I’m just going to use that for convenience down here. Right? So we can run this, and it will reset this table called isolation level test. And it’s going to put some rows in there with some numbers, right? So not a big deal. It’s not anything that we need to care about too, too much.

But what I want to show you is different ways that different queries behave under different isolation levels, because that’s important stuff. So over in this window, I’ve got this quoted out for… actually, you know, we don’t need this one. We’re not going to get too far into that. But actually, you know what, I do want that for another thing. So we’re going to leave that there. Actually, I changed my mind. What I want to show you first is just that, like, this select query running with NOLOCK, nothing else is happening. So I can run this a million times, it’s always going to return 84,000. Where you have to start asking questions about NOLOCK is: at what point in queries doing stuff do you care about seeing, like, uncommitted transactions, right? So, if we come over here, and I start this transaction to update data where ID equals nine, the NOLOCK query isn’t going to get blocked, but it’s going to see that transaction uncommitted, right? So this number changed from 84,000 to 75,000.

Because we set some amount to zero for this one. Then, all in the same transaction, if I delete ID 15 from the table and I rerun this, it can immediately see the effect of that delete from the uncommitted transaction. And then if I run this final insert here and I stick a row in there, again, this transaction still hasn’t committed, this is still in progress, and NOLOCK can see all of that stuff, right up to when that commit happens. So that commit finally happens, and I can now run this and get 60,001 over and over and over again. That might not be what you want to happen. If you think that the correct results are that you should only see the end result of this unit of work, NOLOCK slash read uncommitted are not for you, right? If this unit of work signals that you have completed work on the table in a way where now SQL Server can see the results, NOLOCK is not a good choice, because you can see all the intermediate results from within this transaction.
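The dirty-read setup in this demo boils down to two sessions. A sketch, with the table and column names approximated from the demo:

```sql
/* Session 1: an open, uncommitted transaction. */
BEGIN TRANSACTION;
    UPDATE dbo.isolation_level_test SET amount = 0 WHERE id = 9;
    DELETE dbo.isolation_level_test WHERE id = 15;
    /* ...no COMMIT yet... */

/* Session 2: NOLOCK doesn't get blocked, but it reads every
   intermediate, uncommitted state of the transaction above. */
SELECT SUM(t.amount) AS total
FROM dbo.isolation_level_test AS t WITH (NOLOCK);
```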

That’s generally not a good thing. If anything important is going to happen in the world based on what those query results are, NOLOCK is not a good choice. So let’s reset the table, and now let’s look at read committed. This database has read committed snapshot isolation enabled, so if I want SQL Server to obey the semantics of the default read committed isolation level, I need to add the READCOMMITTEDLOCK hint to both of these queries. Now both of them are going to return the same result of 84,000, and that’s fine. But let’s start that transaction over again and run the update. If I run this query now, it gets blocked. Meanwhile, in the transaction, I can run the delete, and I can run the insert, and then commit.

Both of these queries are now going to return 60,001, which — if you wanted the whole thing treated as one unit of work — is cool: you got the final result after that unit of work completed in both queries. Where things get weird with read committed, though, is when things start happening in different orders: you get blocked at one point, but then data changes in the table — data you’ve already read, or data you haven’t read yet. So this time I’m going to do something a little different. I’m going to begin the transaction, but instead of running the update first, I’m going to run the delete. I hit F5 here, and this query gets blocked — but rather than being blocked on ID nine, now it’s blocked on ID 15.

So if I update ID nine here, and then insert ID 16 here, and then commit, these two queries are going to return slightly different results, because one query had already read data from the table and said, “I have these results.” It read that data, calculated a result from it, got blocked, and then data it had already read changed underneath it. So read committed has very weak guarantees, and it’s probably not something you want to rely on. Of course, now that everything is done, I can run these and they both get the same results. So let’s make sure that’s fully committed, and let’s reset the table. With other isolation levels, like serializable and repeatable read, you’d get results similar to READCOMMITTEDLOCK — except if you change data those queries had already read. If I do the update first and then change the other stuff, both repeatable read and serializable will get blocked on ID nine, where the update happens, and then read results after everything changed, just like the initial run under READCOMMITTEDLOCK.

Where it would be different is if I tried to change data that either of those had already read — then you would probably get a deadlock. Those two isolation levels are much more strict, but they’re also much more likely to cause deadlocks when you start intermixing reads and writes, because reads are now holding locks. Read committed doesn’t hold read locks; it takes them and releases them very quickly. That’s why data can jump around in your table even under read committed. Now, if I take out both of these locking hints and run this, we’re back to 84,000, because I reset the table — and I actually don’t need the second query for this one. I’m getting 84,000 back from this query right now, and because the database default is read committed snapshot isolation, I’m using an optimistic, row-versioning isolation level. So if I come over here — I could repeat any of these in any order and it would be the same, but I’ll use the order from the other demos — and do the BEGIN TRAN and update, this query won’t get blocked, but it won’t see data changes from that unit of work either.
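For reference, the two settings being toggled in this demo look roughly like this (the table name is a hypothetical stand-in for the demo table):

```sql
-- Workload-wide optimistic reads: read committed queries now read the last
-- committed row versions instead of taking shared locks.
ALTER DATABASE CURRENT
    SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- With RCSI on, a query that still needs classic locking read committed
-- behavior has to opt back in per table:
SELECT
    SUM(a.Amount) AS total_amount
FROM dbo.Amounts AS a WITH (READCOMMITTEDLOCK);
```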

So the question you have to ask yourself is: are you okay with this query returning results from how the data looked prior to this unit of work happening? Because I ran the delete, and I ran the update, and we’re still getting 84,000 back — the same number as when the data was put fresh into the table. So, when this happens, do I want my query to get blocked and wait for that unit of work to finish?

Or do I want my query to return results as they looked before that unit of work began? Now if I come over here and commit this — it’s fully committed now — and I run the query again, I get the updated result: 60,000 instead of 84,000. So the question to ask yourself is: are you okay with queries seeing data that’s in flight within a unit of work?

If you’re not — if you need data processing to finish before you report changes from that unit of work — NOLOCK and READ UNCOMMITTED are not for you. If you want your queries to wait and return results from after the unit of work is complete, then you want a locking isolation level: READCOMMITTEDLOCK, which is read committed, the default database isolation level for SQL Server.

But again, the guarantees there are very weak: data can move all around the table while that happens, because those read locks are taken very briefly and then released. So if you’re okay with data in the table changing all around your query, read committed might be okay for you.

If you’re not, you would probably need either repeatable read or serializable, but you’re going to see a lot more deadlocks and a lot more locking activity with those, because those read locks are held onto — SQL Server doesn’t release them immediately the way it does with read committed.
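As a sketch of why those levels behave differently: under repeatable read and serializable, shared locks taken by reads are held until the transaction ends, rather than being released statement by statement (the table name is hypothetical again):

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

-- The shared (and, under serializable, range) locks taken by this read
-- are held...
SELECT
    SUM(a.Amount) AS total_amount
FROM dbo.Amounts AS a;

-- ...all the way until here, blocking concurrent writers the whole time,
-- which is what makes these levels so much more deadlock-prone.
COMMIT TRANSACTION;
```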

If you’re okay with your query seeing the last known good set of data in the table until that unit of work completes — your read queries finish immediately, like someone just sitting there refreshing, and when the unit of work finishes they get the updated result — then that’s where optimistic isolation levels are okay for you.

That’s where you would choose read committed snapshot isolation, if you want it for the workload as a whole, or snapshot isolation, if you want to choose specific queries to behave that way. So rather than thinking theoretically about which isolation level is okay for you across the board, think about the way you want most of your queries to function.
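The two optimistic options are enabled a bit differently: READ_COMMITTED_SNAPSHOT changes the whole workload’s default, while snapshot isolation is opt-in per session. A minimal sketch of the opt-in version (hypothetical table name):

```sql
-- One-time database setting (snapshot isolation is off by default):
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then only the sessions or queries that want versioned reads opt in:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

-- Reads a consistent view as of the start of the transaction,
-- without blocking or being blocked by writers.
SELECT
    SUM(a.Amount) AS total_amount
FROM dbo.Amounts AS a;

COMMIT TRANSACTION;
```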

And then, when you’re writing a specific query, think about what you’re okay with that query being able to see while other units of work — other queries, in other words — are doing their thing in the workload. You need to think more about expected outcomes, and what is and isn’t a correct result, than about which specific isolation level works everywhere, because no isolation level works everywhere.

What I’m saying is that optimistic isolation levels work great across the majority of queries and workloads that I see. There are only some very specific queries — specific parts of a workload — with stricter needs, where an optimistic isolation level might not be the best choice. So think about that stuff when you’re deciding what’s appropriate for your query to run under.

Most of the time it is not going to be NOLOCK or READ UNCOMMITTED. Most of the time it is going to be either what you’d see under an optimistic isolation level, or the end result you’d get from read committed, repeatable read, or serializable. So anyway, I hope you enjoyed yourselves.

I hope you learned something, and I hope you’ll watch my full playlist, Everything You Know About Isolation Levels Is Wrong, because I go into a lot more depth on these topics there — much more than I can fit into this one video. It’s a good place to start learning more.

Start, or progress upon, your long journey to being correct about isolation levels. It took me a while too, but eventually, with enough trial, tribulation, practice, and reading, I got things mostly figured out, I think.

So we have that going for us. Anyway, thank you for watching.

All right. Well, that wraps this one up. I’m going to get demos and everything prepped for the next video, which hopefully you will stay tuned for. And with all that, once again, thank you for watching. Goodbye.

Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Some Notes On Locking Hints



Video Summary

In this video, I delve into the fascinating world of locking hints in SQL Server queries, addressing some common misconceptions and highlighting their practical applications. I start by discussing how even with a good index in place, locking hints are merely suggestions to SQL Server, which can escalate locks based on the situation. This leads us to explore useful scenarios where locking hints are particularly beneficial, such as preventing race conditions under read-committed isolation levels and ensuring fully parallel inserts into temporary tables. I also cover some potential pitfalls, like how clustered indexes or identity columns can interfere with parallelism, and demonstrate practical examples using T-SQL code to illustrate these points.

Full Transcript

Erik Darling here with Darling Data, with a, I don’t know, mildly educational video about some stuff with locking hints and SQL Server queries. This is by no means a deep dive into every single locking hint — most of them are fairly easy to figure out — but I do want to show you some misconceptions about locking hints, then some situationally useful locking hints, and then some potential gotchas with a couple of others. All right, so that’s what we’re going to talk about here. I probably should have progressed that slide before I started yakking about locking hints. But anyway, before we get into all that nonsense, a quick word about supporting the channel.

If you would like to support my efforts to bring you these interesting, educational SQL Server videos, you can sign up for a membership — the link is in the video description — for as little as four dollars a month, pre-tax. Before it comes to me, YouTube takes 30% out of that, so I’m looking at basically three one-dollar scratch tickets and a loosie from the bodega for every paying member. If you can do that, great. If you’ve already blown your financial load this month on scratch tickets and loosies at the bodega, well, you’re a person after my own heart, and you can do something a bit less expensive to show your undying loyalty to being a Data Darling: you can like, you can comment, you can subscribe. And if you’d like to ask a question for my Office Hours episodes — where you ask semi-privately and I answer very publicly — that link is also down in the video description. Submit a question and I will answer it, assuming it’s legible and vaguely understandable.

If you need help that goes beyond asking strangers random questions on the internet, I am of course available for SQL Server consulting: health checks, performance analysis, hands-on tuning, responding to performance emergencies, and of course making sure your developers smarten up and stop causing performance emergencies. If you’d like recorded versions of this beautiful face teaching you about SQL Server, you can get all 24 or 25 hours of my performance tuning content for about 150 US dollars, and it will last you the rest of your life. Good for you.

Upcoming events: we are still looking forward to SQL Saturday New York City, taking place May the 10th at the Microsoft offices in Times Square — beautiful, crime-free, clean, friendly Times Square — where you can learn all about all sorts of data stuff. That’s a happy time for everyone. You can get lunch, served by me — not made by me, but I’ll hand you a bag. Maybe I’ll just let you take a bag. If you’re really nice, I’ll let you take two, but you’ve got to be nice.

Anyway, let’s talk about some locking hint stuff, because it is fun and interesting. What I want to show you here is that even with a good index in place, and even when limiting the number of rows we plan to modify, the ROWLOCK hint is merely a suggestion to SQL Server. If we run the code here, SQL Server does maybe start taking row locks, but it escalates those row locks up to an object-level lock. If we change this to a much higher number, like 90,000, and rerun it, you’ll see that in this case SQL Server does actually take key locks — here are all our key locks, along with some page locks. The IX locks on pages and objects aren’t the real exclusive locks, just intent-exclusive; the actual X locks are only taken at the row level. But when SQL Server had to sift through more rows — back when that TOP (5000) query had its range set to 30,000 — it decided that was a lot of key locks, a lot of row locks, and escalated to a table-level lock. Remember that locks escalate from row to object, or from page to object; they do not go from row to page to object. So even the ROWLOCK hint can be ignored by SQL Server.

Where I see people forgetting that they need locking hints is covered in another great post by my dear friend and Canadian, Michael J. Swart — I look forward to you joining me as an American in the 51st state, my friend; passport-free friendship, see you soon. I do see people messing this up a lot: when you use MERGE and that MERGE has multiple actions — typically, update plus insert — you need to either set the transaction isolation level for that MERGE statement to serializable, or hint the serializable isolation level on the table you’re merging into. Granted, there are some interesting use cases for single-action MERGE with inserts, but for the usual combo MERGE, this is important stuff.

A place where I love using locking hints is where people did not know they were subject to potential race conditions in their code. I’ve talked about this a bunch in some recent videos, so I’m not going to spend too long on it here, because we have other stuff to cover. But if you are doing anything that could be negatively affected by concurrency while your query is running — remember, read committed is not a very strict isolation level, allows all sorts of other activity, and is not a snapshot of your data — and you need to make sure you’re actually getting the current top row to do something with, then you need to make sure nothing else can mess with your data while you are in the process of finding it. You need locking hints to prevent race conditions even under read committed, the default pessimistic isolation level. Race conditions aren’t something you only find in optimistic isolation level land; they can happen to you anytime, anywhere.

Another place where I end up using locking hints a lot is when I am putting data into temp tables. Table variables — without some real crazy tricks — can’t get a parallel insert at all. But when I’m dumping a fairly large amount of data into a temp table, I usually want a fully parallel insert while that’s happening. There are things that will prevent that: a clustered index on the table (a clustered primary key, say), really any nonclustered index, or an identity column will mess up SQL Server’s ability to do a fully parallel insert. It’s worth noting that if you need something to behave like an identity column in your temp table, you can remove the identity attribute from that column and use ROW_NUMBER in your select list to generate the auto-incrementing number instead — ROW_NUMBER does not have this limitation, so you can get around a lot of problems that way.

So let’s run this query. There is not a tremendously big performance difference for what I’m doing here, but the execution plan is a good enough example, and there is some benefit you can see. In this query plan, note that we do have a parallel scan of the clustered index — a little parallelism operator here — but then we gather streams almost immediately afterwards, and the insert into the temp table is done outside the parallel zone, single-threaded. This all takes 1.4 seconds; we can see that right there. Now, we can get around that by adding a TABLOCK hint to the insert — again: no clustered index, no identity column, no primary key, no nonclustered indexes; it’s got to be a heap for this. If we run this, we get a fully parallel insert, which shaves a bit of time off: notice the Gather Streams operator is now after the insert, the insert has our little racing stripes on it, and rather than taking about a second and a half, it takes about 800 milliseconds. This isn’t for a ton of rows, obviously; the more rows you get involved, the better this hint gets.

You can get very similar behavior with SELECT INTO. A little SELECT INTO magic here, and we see just about the same execution metrics: still about 800 milliseconds, with the table insert inside the parallel zone — there are our little parallel racing stripes, and the parallel zone ends here. I do get questions sometimes: is it better to create your temp table, or just do SELECT INTO? My answer is that it’s somewhat situational. I am generally okay with SELECT INTO, but you should be careful with the way SQL Server interprets expressions with it. Remember, the beauty of creating a table is that you fully control the data types, the lengths of strings, the nullability of columns — you get to pick all that when you create the table. You don’t really get that with SELECT INTO: if you have CASE expressions, or you start concatenating strings together, or you’re doing any math in your select list, there’s all sorts of stuff where SQL Server might implicitly infer or convert your expression to some other data type. So if you are getting weird, inconsistent results with SELECT INTO in that regard, you may want to think about just creating the table with the types you want. And if you’re doing SELECT INTO and getting the parallel insert, but your plan slows down when you switch to CREATE TABLE plus INSERT, you most likely can get around that by adding a TABLOCK hint to the insert, so you get the fully parallel insert commensurate with the SELECT INTO behavior. I like this a lot.

Now, one thing I always get questions about — because a big part of my consultancy is helping people get away from bad habits, and NOLOCK is a fairly well-known bad habit — is this: while I’m talking through various ways to get rid of NOLOCK, and saying optimistic isolation levels are generally a much better idea, people say, “but dirty reads!” And I say: but wrong, because optimistic isolation levels don’t allow dirty reads. NOLOCK and READ UNCOMMITTED allow dirty reads. With optimistic isolation levels you can have stale reads — if you’re reading an older version from the version chain than you would expect, you’re potentially not reading the most up-to-date version of the row — but you’re not getting dirty reads. Where the conversation always seems to go is: “well, if I take out the NOLOCK hint, I get blocked; if I use NOLOCK, I can get a row back.” Yes — but what’s in that row? And then ultimately someone asks: what about READPAST? What does READPAST do? READPAST is situationally a very cool and interesting hint — if you truly don’t care about rows that are locked, and you’re okay with skipping over them.

So what I’m going to do here is run this simple update for one row in the Users table: BEGIN TRAN, update, increment Reputation by one where the ID equals 22656. We run that, the transaction is left open, and over in this window I have three queries. The top one — give me the top one user by reputation, descending — will get blocked, because that transaction is open and doing something: the query with no locking hint gets blocked. If I use READPAST, notice that I get a row back. Before I go on to NOLOCK, I want to show you what happens when I roll that transaction back: if I run the READPAST query now, I get Jon Skeet with a reputation of 1,047,863 — the top reputation in the table. Coming back over here, if I run the update and leave the transaction open and running, READPAST skips over Jon Skeet and I actually get the second-highest reputation in the table. If you’re okay with skipping over locked rows and returning only the rows that are not locked, READPAST might be okay — but that may not be okay with your users; it might not be what they’re quite after when they’re looking for stuff like this.

And of course, with NOLOCK: if we run this now, we do get Jon Skeet again, but we get him with his reputation incremented by one, from the uncommitted update. Is that a big deal here? Hmm. Under an optimistic isolation level, what you would get back is his original reputation. So: is NOLOCK a big deal here? Probably not. Is READPAST a good fit here? Probably not. Is this query getting blocked annoying? Very much so. But you would get around any sort of questionable discrepancy by using an optimistic isolation level here: under both read committed snapshot isolation and snapshot isolation, you would still see Jon Skeet, and you would still see him with the high reputation that ends in three rather than the one that ends in four — his reputation prior to the update. Is that always going to be correct? Maybe, maybe not.

The important thing to understand here is that not every query needs the same locking guarantees and granularity. There may be times when you can get away with NOLOCK, times when you can get away with read committed, times when you can get away with a READPAST hint. I find that most workloads work best under read committed snapshot isolation, with specific queries targeted to use a slightly different isolation level or locking mechanism when they’re reading through data. That part is up to you. But if your database is under the default read committed isolation level and the only way for you to get queries to return anything, ever, is to use NOLOCK hints, you are much, much better off with an optimistic isolation level than you are continuing with the NOLOCK hints.

So that’s about it here — that is the end of the demo file; you can tell, because there are no more line numbers after this. We have run out of things to talk about. Thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I will see you in another video, another time, when we will talk about some more stored procedure stuff. Thank you for watching. Goodbye.
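Two of the hints demonstrated above, condensed into a sketch. The temp table shape is a guess at the demo’s; the Users table with Id and Reputation columns is the one referenced in the video:

```sql
-- Fully parallel temp table insert: the target must be a heap — no clustered
-- index, no nonclustered indexes, no IDENTITY column — and needs TABLOCK.
CREATE TABLE #users
(
    Id integer NOT NULL,
    Reputation integer NOT NULL,
    n bigint NOT NULL -- ROW_NUMBER stands in for an IDENTITY column
);

INSERT #users WITH (TABLOCK)
    (Id, Reputation, n)
SELECT
    u.Id,
    u.Reputation,
    ROW_NUMBER() OVER (ORDER BY u.Id) AS n
FROM dbo.Users AS u;

-- READPAST: skip locked rows instead of blocking on them. If another session
-- holds an exclusive lock on the top-reputation row, this returns the
-- next-highest unlocked row instead.
SELECT TOP (1)
    u.Id,
    u.Reputation
FROM dbo.Users AS u WITH (READPAST)
ORDER BY u.Reputation DESC;
```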


All About SQL Server Stored Procedures: Transactions



Video Summary

In this video, I dive into the world of transactions within SQL Server stored procedures, exploring their use and limitations. Erik Darling from Darling Data shares his insights, emphasizing that while transactions are crucial for error handling and ensuring data integrity, they come with significant locking ramifications and don’t offer the snapshot isolation many might expect under read committed. He also delves into practical considerations like nested transactions, save points, and the importance of appropriate locking hints to maintain data consistency. Throughout the video, Erik highlights common pitfalls and provides tips on when and how to effectively use transactions in your stored procedures for optimal performance and reliability.

Full Transcript

Erik Darling here with Darling Data. Look how happy we all are. Look how thrilled everyone is to be here. It’s a grand day, isn’t it? In today’s video we are going to continue talking about our good friend, our lord and savior, the stored procedure. We’re going to talk about the use of transactions in stored procedures: what they’re good for, what they’re not good for, and what people think a transaction may guarantee that it definitely does not guarantee. So we have that to look forward to. But of course, before we dive into those titillating, exhilarating topics, let’s talk a little bit about your money and me. If you would like to support this channel and the wonderful content that I create, you can sign up for a membership down in the video description — it’s the link that says “become a member” — and for as little as $4 a month, support a starving consultant. If you enjoy this channel but for some reason cannot scrape together $4 a month, you can like, you can comment, you can subscribe. And if you would like to participate in the grandness that is me answering questions off the screen, you can click on the Office Hours questions link — also available in the video description — and submit your question. If you’re at all curious whether you made some typos, or you want to fix something, you can edit your question after you submit it; I would highly suggest you all do some proofreading, because some of the questions that come in are a bit malformed. If you need help beyond what is available here on YouTube, or what you can do on your own, I am available to help as a consultant.
I am a SQL Server consulting partner, and I can do all sorts of wonderful things for you and your server: health checks, performance analysis, hands-on tuning, responding to performance emergencies, and training your developers so that the performance emergencies stop happening. A wonderful set of gifts I have, and I wish to give them to you in exchange for money at a very reasonable rate. If you would like some training on SQL Server — to watch some videos and get a little smarter on your own — you can get all 24-plus hours of mine for about 150 USD.

Once you apply the coupon code at checkout, it will magically reduce the cost of your training, and that training will be available to you for life. Lucky you. SQL Saturday 2025 is coming at you, live and in person, on May the 10th, 2025, in New York City — it’s a Saturday, the 10th. And of course, on May the 9th, we have a full-day performance tuning pre-con.

So you’ll have a wonderful time with that. And with that out of the way, let’s talk about these transaction things. We need to go to SSMS for this. Now, transactions and error handling usually go together hand in glove. The ebb and flow of the last few videos has been around error and situation handling, batching, retry logic, debugging, and stuff like that — all things that tie in together when you’re trying to build a really robust solution for something in SQL Server.

A lot of stored procedures aren’t just running a query; they’re building a process to do something in SQL Server. So what you end up with is stored procedures that have to do a lot of things stored procedures don’t always have to do: again, retrying work, self-healing, debugging, and making sure that if something messes up, you’re able to get your data back into a normal state.

All of these things tie in together when it comes to building a good process in SQL Server. The main things you really need to understand about transactions in SQL Server are the locking ramifications and the things they don’t guarantee. The first thing to understand is that there is absolutely no respect for nested transactions.

It doesn’t matter how many transactions you nest. Say you nested five transactions for some reason: if you ran ROLLBACK once, all five would roll back. But if you have five nested transactions and you need to commit them, you do have to commit all five — committing only decrements the transaction count by one.

You have to do one, two, three, four, five commits, but one rollback rolls them all back. SQL Server does not do nested transactions in any meaningful way.
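That asymmetry is easy to demonstrate with @@TRANCOUNT — a minimal sketch you can run in SSMS (the temp table is just for illustration):

```sql
CREATE TABLE #t (id integer);

BEGIN TRANSACTION;            -- @@TRANCOUNT = 1
BEGIN TRANSACTION;            -- @@TRANCOUNT = 2 (not a real nested transaction)
    INSERT #t (id) VALUES (1);
COMMIT TRANSACTION;           -- only decrements: @@TRANCOUNT = 1, nothing is durable yet
ROLLBACK TRANSACTION;         -- one rollback undoes everything: @@TRANCOUNT = 0

SELECT COUNT(*) AS row_count FROM #t;  -- 0: the insert was rolled back
```

The inner COMMIT does nothing except tick @@TRANCOUNT down by one; only the outermost commit would have made the insert permanent, and the single ROLLBACK discards all of it.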

The second thing is that the default isolation level in SQL Server, read committed, does not give you a snapshot, point-in-time view of your data. Aside from applying locking hints to a select query, or doing something else that locks other sessions out of whatever data you’re touching, or using something like sp_getapplock to serialize access to the code you’re running — under read committed, you don’t get a snapshot view of the data.
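As a hedged sketch of the sp_getapplock approach mentioned here — the resource name is made up, and the pattern serializes access to a section of code rather than locking any particular rows:

```sql
BEGIN TRANSACTION;

DECLARE @lock_result integer;

-- Take an exclusive application lock; other sessions requesting the same
-- resource name will wait here until this transaction commits or rolls back.
EXEC @lock_result = sp_getapplock
    @Resource    = N'promote_user_process',  -- arbitrary name of your choosing
    @LockMode    = N'Exclusive',
    @LockOwner   = N'Transaction',
    @LockTimeout = 5000;                     -- wait up to 5 seconds

IF @lock_result < 0
BEGIN
    ROLLBACK TRANSACTION;  -- timed out, canceled, or chosen as a deadlock victim
    RETURN;
END;

/* ... the queries you need serialized go here ... */

COMMIT TRANSACTION;  -- the app lock is released along with the transaction
```

Because the lock owner is the transaction, you don’t have to remember to call sp_releaseapplock — commit or rollback releases it for you.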

So you have to think carefully about how data flows in your transaction, from one query to the next. If you’re doing a transaction with a single query in it, there’s not a lot of point, because SQL Server already has autocommit transactions — any insert, update, or delete is already happening in the context of a transaction.

It’s really having multiple queries inside an explicit begin transaction that gives you any ability to, you know, acidify a group of queries. The other thing is that not everything needs to be in a transaction. Again, single statements don’t really benefit from it at all.

The lack of nested transactions has been well documented over the years. We do have savepoints, but that’s not really the same thing. Now, for the remaining stuff in here:
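Savepoints, for reference, let you roll back part of a transaction without abandoning the whole thing — a minimal sketch (the table and savepoint names are invented):

```sql
CREATE TABLE #orders (id integer);

BEGIN TRANSACTION;
    INSERT #orders (id) VALUES (1);

    SAVE TRANSACTION before_risky_part;  -- mark a point we can return to

    INSERT #orders (id) VALUES (2);

    -- Undo only the work done since the savepoint; the transaction stays
    -- open and @@TRANCOUNT is still 1.
    ROLLBACK TRANSACTION before_risky_part;
COMMIT TRANSACTION;

SELECT id FROM #orders;  -- only the first row (id = 1) survives
```

Unlike a full rollback, rolling back to a savepoint doesn’t release the locks taken before the savepoint, which is part of why it’s not a substitute for real nested transactions.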

Let’s talk a little more about what happens during a transaction specifically. Blocking is probably the biggest thing that transactions will cause on your server. Unless your transactions are really, really fast, you increase your chances of having locking and deadlocking problems.

Also, when you have multiple queries in a transaction, it can make the blocked process and deadlock XML reports really confusing, because you’ll see multiple queries in them.

If there are triggers involved, or anything else weird, you make it a lot harder to unravel exactly what happened. So just be careful with that.

Remember that under read committed — the default, crappy isolation level in SQL Server — read queries can block and be blocked by modification queries. Read queries can also deadlock with modification queries.

Specific plan shapes and such will do that. So when you’re using explicit transactions, make sure, A, that it’s absolutely necessary for you to do it — that you need it from an error handling or data consistency perspective.

Because if you don’t need it, don’t do it to yourself. Say you have two updates that run really fast, and then a delete runs — but the delete takes five seconds.

Say you don’t have a good supporting index for that delete. The locks that those two fast updates took are going to hang on until that delete finishes.

So you don’t just have two quick updates that do something and get out. You have two quick updates that are now beholden to a delete that takes five seconds. You’ve increased the surface area of your transaction in duration.

And the bigger that surface area gets, the worse your problems get. As this transaction overlaps with other queries trying to do stuff, all those surface areas line up in ways that either exacerbate locking problems or lead to deadlocks — just because you had some updates over here and a five-second delete over there.

Then something else wants to do something, and you end up clashing with the updates. Those locks hold on and start interacting with other queries.

And you end up with a problem — not just for one of the updates, but for all three queries in that transaction, because when they deadlock, it all rolls back. So just be really careful with that.
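The shape being described looks something like this — two quick updates whose locks are held hostage by a slow delete (all table and column names here are invented for illustration):

```sql
BEGIN TRANSACTION;

    UPDATE dbo.users      -- fast: exclusive key lock taken here...
    SET    last_seen = SYSDATETIME()
    WHERE  id = 1;

    UPDATE dbo.settings   -- fast: another exclusive lock taken here...
    SET    theme = N'dark'
    WHERE  user_id = 1;

    DELETE dbo.audit_log  -- slow: no supporting index, scans for 5 seconds,
    WHERE  user_id = 1    -- while BOTH update locks above are still held
    AND    logged < DATEADD(DAY, -90, SYSDATETIME());

COMMIT TRANSACTION;       -- only now do all three queries' locks release
```

Any session that touches the rows the updates locked has to wait the full five seconds, even though the updates themselves finished almost instantly.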

Make sure you use appropriate locking hints to preserve data integrity if you’re reading data in order to modify it later. Make sure you’re handling errors, rollbacks, and commits appropriately. And make sure you have tuned everything that happens in that user transaction as much as possible.

Because everything you do between begin transaction and commit — or rollback, whatever happens — those locks have to hold on until the transaction makes a decision. You can’t just say, halfway through, “release the locks now.”

It just doesn’t happen that way. So be really careful with all that stuff. You also have to understand how isolation levels work in the context of transactions.

Not just read committed. I know most of you out there are so desperate to get away from read committed that you’re using NOLOCK or read uncommitted. But if you’re using the optimistic isolation levels — whether it’s read committed snapshot isolation or snapshot isolation — both of those behave somewhat differently in the context of transactions, too.

I cover that in my video series, Everything You Know About Isolation Levels Is Wrong. There’s a section on read committed snapshot, and a video about read committed snapshot isolation versus snapshot isolation. If you have questions about that, go watch it — there’s a lot of good information in there.

But let’s say you’re doing something like this — granted, not my favorite thing in the world, declaring a local variable and setting it equal to a value.

In this query, we’re looking for the ID of a user in the Users table whose reputation has not gone above one, but who has a post.

They’ve asked a question, but that question has a score of zero. No one voted on it. Poor thing. You poor dear. And we want to give them a mercy vote — a pity vote.

Of course, you might bail out if no one meets that qualification. So if the user-to-promote variable is still null, you might just say: no, we’re out of here, we’re done.

Return. Stop doing work. But otherwise, you go and find a post to update based on who that user was, and you bump that post’s score up by one.

Just add one to that score. Under read committed, unless you take out locking hints — here, here, and here — you could end up with some weird stuff happening.

Leaving aside the obvious stuff like local variables, which no self-respecting performance tuner would use: you get whoever that first user is, and you only get their most recently committed data.

You’re not locking that data so that it can’t change between when you read it and when you do something with it. Nothing ensures the data stays the same between the read and the write.

Data can change around you up here, because there’s no locking happening in the select query. SQL Server does not have a SELECT FOR UPDATE. You have to add locking hints in the FROM clause — UPDLOCK, SERIALIZABLE, whatever you need — to prevent changes to that data.

That leaves you wide open to data changing around you. Data can just change under concurrency. You could end up with something as wild as that user deleting their profile.

They could say: I hate this site. No one ever upvotes my questions or answers. It’s terrible. Everyone just makes fun of me and downvotes me. They could get sick of Stack Overflow and say, I’m out of here. Or someone could go vote on their question.

After you find them, someone comes along and says: wow, this is a great question — upvote. Now, all of a sudden, that question has an upvote and it no longer qualifies for what our query is doing.

They could also delete their question. They could say: you know what, now that I read it, I understand why no one upvoted this. I was being a silly little ninny. Delete. So nothing protects that data after you’ve read it and before you update it and do anything with it.

The transaction doesn’t do that unless you take out specific locking hints. Read committed, for the billionth time, is not a snapshot-consistent view of your data during a transaction.

If you were using snapshot isolation, it would be. If you were using read committed snapshot isolation, it would be a lot closer. But read committed, the default pessimistic isolation level, doesn’t give you that.

All read committed guarantees is that at the moment you read a row, that’s the way it was committed in the table. And that’s a very flimsy guarantee, if you think about it — or if you watch my series on isolation levels, where I explain all of this in great detail.

So if you want to avoid those changes, you need to add the UPDLOCK and SERIALIZABLE hints to your select query, to the EXISTS subquery — basically to any select you run before your modifications.
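A hedged reconstruction of what that might look like — the table and column names follow the Stack Overflow demo database used elsewhere, but this is a sketch of the pattern, not the exact code from the video:

```sql
BEGIN TRANSACTION;

DECLARE @user_to_promote integer;

-- UPDLOCK + SERIALIZABLE: lock the rows we read, and lock the key ranges so
-- qualifying rows can't appear, change, or vanish before we commit.
SELECT TOP (1)
       @user_to_promote = u.Id
FROM   dbo.Users AS u WITH (UPDLOCK, SERIALIZABLE)
WHERE  u.Reputation = 1
AND    EXISTS
(
    SELECT 1/0
    FROM   dbo.Posts AS p WITH (UPDLOCK, SERIALIZABLE)
    WHERE  p.OwnerUserId = u.Id
    AND    p.Score = 0
)
ORDER BY u.CreationDate;

IF @user_to_promote IS NULL
BEGIN
    ROLLBACK TRANSACTION;  -- no one qualifies; stop doing work
    RETURN;
END;

UPDATE p
SET    p.Score += 1        -- the pity vote
FROM   dbo.Posts AS p
WHERE  p.OwnerUserId = @user_to_promote
AND    p.Score = 0;

COMMIT TRANSACTION;
```

With the hints in place, a concurrent profile deletion, upvote, or question deletion has to wait until this transaction commits, so the row the update touches is still the row the select qualified.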

Even if you were to do all that work in a single query rather than two, you can run into the exact same issues without locking hints to protect the data you’re reading, because read committed locks do not hold on.

They’re taken, they happen very quickly, and then they let go. So be very careful with that. Everything in SQL Server, whether you specify a cursor or not, is kind of cursor-driven under the covers anyway.

But once those read cursors find the rows they care about, anything goes until exclusive locks start being taken. Anything else can come along and say: well, I got it now, sucker. So, some more pseudocode.

This is a pattern I’ve noticed at more than a few clients now. They do something like this: we’re going to do an insert into a table, find some user stuff, and cross apply some stuff.

And then we want to avoid redoing data — we don’t want to reinsert any rows. So this is sort of a broken-up MERGE.

Don’t get me started on MERGE, for various reasons. But if you had a MERGE doing an insert-slash-update action — really, any MERGE statement with multiple actions — you would still need SERIALIZABLE hints on it.

So let’s just say you’re doing this, where you insert any new data — any posts that don’t exist, you insert those — and then for any existing data, you do an update.

All sorts of weird stuff can happen in here, because you’re doing reads from multiple tables where, again, under read committed, even in a transaction, data can change all around you.

You can run into all sorts of strange stuff — especially if you’re not locking data out here, not preventing changes to it. You can end up with weird race conditions in queries like this.

The update is fairly safe, except that we’re just doing a read from over here. So there’s a lot to think about. The idea of the code is to insert any new users into the high question score table.
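A common safe shape for this kind of broken-up upsert is to hold a range lock across the existence check — a sketch under assumed names (dbo.HighQuestionScores is invented to match the description; Users and Posts follow the Stack Overflow schema):

```sql
BEGIN TRANSACTION;

-- UPDLOCK + SERIALIZABLE on the existence check: the key range is locked,
-- so a concurrent session can't sneak the same user in between our check
-- and our insert -- the classic upsert race condition.
INSERT dbo.HighQuestionScores (UserId, Score)
SELECT u.Id, s.Score
FROM   dbo.Users AS u
CROSS APPLY
(
    SELECT MAX(p.Score) AS Score
    FROM   dbo.Posts AS p
    WHERE  p.OwnerUserId = u.Id
) AS s
WHERE  NOT EXISTS
(
    SELECT 1/0
    FROM   dbo.HighQuestionScores AS hqs WITH (UPDLOCK, SERIALIZABLE)
    WHERE  hqs.UserId = u.Id
);

-- Update existing rows to the current high score.
UPDATE hqs
SET    hqs.Score = s.Score
FROM   dbo.HighQuestionScores AS hqs
CROSS APPLY
(
    SELECT MAX(p.Score) AS Score
    FROM   dbo.Posts AS p
    WHERE  p.OwnerUserId = hqs.UserId
) AS s;

COMMIT TRANSACTION;
```

Without the hints on the NOT EXISTS check, two sessions can both see “no row for this user” and both insert, which is exactly the race condition the transcript warns about.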

And let’s just say there’s another query that might come along and update the high score for existing users. The problem becomes fairly obvious, especially under concurrency.

There’s absolutely no reason to roll back new users inserted into the table just because the update query fails for some reason. Let’s say the Badges table was recently modified to accommodate a new, longer badge name, but the high question scores table wasn’t.

We get a truncation error. That statement fails. Something weird happens. But that truncation error shouldn’t invalidate the new users that got inserted. Likewise, say our invisible score-updating query produces a number bigger than an integer column can hold — some really awesome question came along and racked up 2.2 billion points, which is more than an int can store. You’re in bigint territory now. You’d get an error there, too.

So you have to be very careful with all this stuff. Transactions should really only encapsulate queries where data correctness would be affected by one of them failing.

You have to be very careful when you’re using transactions to, A, make sure that you need them; B, make sure they’re as fast as possible; and C, make sure you’re doing the right thing as far as rolling back, error handling, all that stuff.

And make sure the queries you’re grouping together actually go together — that they form the Voltron of result correctness for the whole set of them. If you don’t need queries to ride or die together, they don’t need to be in a transaction together.

So be very careful with transactions. Like I said, you can end up exacerbating any locking and deadlocking problems you already have in your system.

They can also make the blocked process and deadlock XML reports very confusing to read through and figure out what happened. Various other things, like triggers and foreign keys, can make life even more difficult. That’s also intra-transactional stuff: triggers run in an implicit transaction, and foreign keys — especially with cascading actions — promote to serializable internally. You just end up with lots of weird problems. So be very careful with explicit transactions, because if you start mindlessly wrapping entire stored procedures, or queries that just don’t need to be grouped together, in a transaction, you can end up with some really painful problems.

So, thank you for watching. I hope you enjoyed yourselves and learned something. I’ll see you in the next video, which will be just as pertinent to stored procedures as the rest of these have been.

I look forward to speaking more about that. Anyway, I’m out of here. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 3

SQL Server Performance Office Hours Episode 3


What’s the best SQL-related book of the past ten years? I’m struggling to beat Database Reliability Engineering by Campbell & Majors.
How excited are you to attend the Microsoft Fabric conference this year?
Everyone I follow on YT and Twitter is a SQL Server wizard. Please reassure a rookie that not every DBA has the level of knowledge that you and Brent have.
What kind of scenarios have you encountered in regards to Dynamics GP systems? Were there any common problems you resolved as well?
Hi Erik 👋 I saw on the engine documentation that it can be expanded. Is there any physical operator or rule that MS is sleeping on?

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five questions submitted by you, the viewer. We start off with a discussion on the best SQL Server books from the past decade, leaning towards “Database Reliability Engineering” by Campbell and Majors as the top pick for practical database management knowledge. Then, we delve into my thoughts on Microsoft’s Fabric Conference, expressing disappointment in its development and implementation. Following that, I share insights on common issues encountered with Dynamics GP systems, focusing on blocking problems and the challenges of optimizing queries within this Microsoft product. Lastly, I address a question about potential improvements to SQL Server’s query optimizer, suggesting unrolling joins with OR clauses into UNION or UNION ALL sets as an area for enhancement. Join me in these office hours for more candid discussions and valuable insights!

Full Transcript

Erik Darling here with Darling Data, and we’re going to have a thrilling episode of Office Hours, in which I answer five questions that you, the user, submit. How do you submit those questions? Well, that is a great question for you to have submitted right now. There’s a link that appears here, and it also appears down in the video description. You can’t click the one on the screen — as devious as YouTube is, I haven’t quite figured out if that’s even a possibility yet — but that’s how you ask questions. If you want to sign up for a membership to support this channel, you can do that for as few as $4 a month. There’s a link for that in the video description, too. Fifty-something, sixty people do that already, so you would be in great company if you did. If for some reason you don’t have four bucks, or you think I suck and you just watch this out of spite and want me to have some false sense of hope about the future of this channel, you can do other things to lie to me. You can like — even though it’s a hate-like. You can hate-comment. And you can subscribe, which is the most hateful thing you can do. The most hateful thing of all is to get notified, so you can build your rage every time I put a video up. If you would like to hate me live, you can hire me as a consultant. I do health checks, performance analysis, hands-on performance tuning — queries, indexes, settings, all sorts of other neat stuff. I deal with performance emergencies: if your server is burning down on fire and you can’t place an order or whatever, I can usually fix this stuff. And I can also train your developers so that you have fewer of those problems.
If you would like to get some training — I mean, it’s going to shock you that a consultant has something to sell — well, I have about 24, 25 hours of performance tuning training. It’s all SQL Server: beginner, intermediate, advanced. If you use that coupon code, it brings the price to about 150 USD, and you have that for the rest of your life. It’s not a subscription; it’s yours forever. You’re welcome. We have SQL Saturday New York City coming up May 10th of 2025. Beautiful Times Square — from what I hear, it’s still there. Not in its former glory.

It’s sort of cleaned up. It’s a weird place these days. But you’re just going to go inside the Microsoft building and hang out for the day, so you don’t have to worry about the outside stuff, or the New York-centric view of Times Square. Anyway, let’s get through these Office Hours questions and answer some of the burning things my adoring fans have on their minds. Here’s the first question — and we’re going to do a little spell check on this one for you, my friend.

What’s the best SQL-related book of the past ten years? That brings us back to the year 2015 at this point. “I’m struggling to beat Database Reliability Engineering by Campbell and Majors.” Well, if we’re talking about the last ten years, yeah, I generally think you’re right.

Unless you’re looking for academic books about building databases, or about database-space problems that haven’t been solved — query optimizer stuff, CAP theorem stuff, bigger database problems that aren’t just about you as a data person working with the database and keeping it up, healthy, and fast — you’re probably not going to do a lot better than that. All of the books that I really liked are older than ten years at this point.

I think the last good SQL Server book came out in 2012 or 2013. Since then, there hasn’t been a lot of good fundamentals-and-internals literature about SQL Server. Like I’ve said before, the Kalen Delaney books — where she had people from Microsoft co-authoring — covered optimizer internals, isolation levels, locking, all sorts of good stuff.

Good books about SQL Server have been basically non-existent since. They’re mostly the same rehash of things. Even the books written by Microsoft employees that have come out since then were — let’s just call them unproductive in many ways.

It’s a lot of Microsoft company-line material, with very little admission of where SQL Server as a product is somewhat lacking. So yeah, Database Reliability Engineering is probably the best working-with-a-database book. If you have other interests with databases — more academic, theoretical, relational-algebra stuff — there are probably better books out there.

But yeah, that’s a good one. Oddly enough, a lot of the examples in that book, if I’m remembering correctly, were MySQL-related, so it’s not even specifically SQL Server stuff.

Anyway: how excited are you to attend the Microsoft Fabric Conference this year? Shockingly, I was not invited to the Microsoft Fabric Conference this year. Can you believe that?

What I’m probably most excited about is not attending whatever it’s named next year. That’s going to be fun — whatever the next squirrel Microsoft sees is. I don’t want to spend too much time on it, but man, Microsoft has lost a lot of goodwill with their inability to stay focused on a product and see it through.

Everything that has led up to Fabric so far has just been annoying to everyone except a few groups of people: A, Microsoft employees, who will get fired if they don’t pretend they love it; B, consultants, who have to keep re-implementing the same thing over and over again under a different name;

and C, MVPs, who are so terrified of getting kicked out of the MVP program that they won’t say boo about a crap product. Those are the weirdest people — and the consultants who deal with this stuff and the MVPs are a closely overlapping Venn diagram.

That’s almost a circle right there. But, man, you want to talk about an unserious, unfinished product? What? Why? And I’m not ranking Databricks and Snowflake against each other here.

I’m saying you have Databricks and Snowflake up here. And then you have Fabric down here.

And there’s nothing else in between. That’s how bad it is. I have a bunch of clients who have been forced to use it by C-levels, because Microsoft either gave them huge price breaks on it or actually fed them money to start using it. And they are just miserable.

So I don’t know how to help there. I don’t know what else to say. All right.

Next one: everyone I follow on YouTube and Twitter is a SQL Server wizard. Well, good — you should follow people you consider to be smart about a thing you care about. Please reassure a rookie DBA that not every DBA has that level of knowledge.

Well, of course not. If everyone had that level of knowledge, there would be no point in me doing this. I’d just be stating the obvious to everyone.

So, no, not everyone has that level of knowledge. A lot of my clients have full DBA teams of people who maybe don’t have as much knowledge as I do. And there are all sorts of wonderful areas of knowledge out there for SQL Server.

I know nothing about SQL Server outside of the query and index tuning stuff. I mean, I know some stuff, but if you wanted me to troubleshoot an AG outage, or set up a perfect availability group for you, that would be way outside the realm of what I’m comfortable with, because it’s not something I do, study, and work with every day and care about.

I have friends who do stuff like that. My friend Sean Gallardy is wonderful with high availability stuff — knows it inside and out. I just know nothing about it.

And he knows nothing about query optimizer stuff. So there’s all sorts of space for people to know a lot about one thing and be ignorant of the rest. So, yeah: no.

And in a sort of funny way, I miss being a rookie with this stuff, because there was just so much to do. The world was this wide-open place, full of mystery and magic. There were so many things to test and try, and so much knowledge to eventually soak in.

It was great. These days, you kind of build up a callus, and it becomes harder and harder to be impressed or surprised by something.

It just keeps taking more and more to get anything out of it. But first off, I appreciate the kind words. I do appreciate you following me places and thinking that I’m wizard-like in some way.

It is a learning process, and we don’t all start off like this. The only person I think who started off knowing everything is Paul White. So there’s just that.

All right. Let’s see. Our next question: what kind of scenarios have you encountered in regards to Dynamics GP systems? Were there any common problems you resolved as well?

Yeah. So Dynamics GP is a Microsoft product — I mean, I know they didn’t start it; I think they bought it or something.

It used to be something else. They bought it, whatever. It’s one of those unfortunate things — another Microsoft product that’s built to use SQL Server, built to have a SQL Server database backend,

but the people who did that work never spent one second talking to the people at Microsoft who built the database product. So it’s a real bad time.

It’s real unfortunate. The way Dynamics GP works with the database is, for what it does, about the worst I’ve seen. With a pathetically small data set, you’re fine; as soon as that database gets to a meaningful size, you’re in real trouble.

As far as problems I typically solve with Dynamics GP: there’s a lot of blocking. There are a lot of selects with UPDLOCK hints. You can fix some of the blocking problems with indexes and with an optimistic isolation level, but not all of them, because the second one of those selects comes along with an UPDLOCK hint,

you’re back to having a bad time. So RCSI is a good choice for it, along with finding your crappy queries and seeing whether the indexes line up at all with what they’re trying to do.

And then the final thing you can really do — because all the Dynamics queries come from inside the application; they’re not stored procedures or anything you can tinker with — your last resort for a lot of things is either plan guides or forcing plans in Query Store, because there’s not much else you can do.

Even if you have a great index in place and SQL Server isn’t using it — because of cardinality estimation, or because it doesn’t want to do a key lookup, or whatever — you don’t want to add a super-wide index, because some of those tables are massive, width-wise. You don’t want to add 40-column includes.

That’s not a good time. So sometimes you do have to resort to plan guides or forced plans in Query Store to get your queries to actually use the indexes you already have, or whatever indexes you end up creating based on your worst Dynamics GP queries.

I have done a lot of consulting on Dynamics GP systems over the years, and every time, I wish I had charged more. All right.

So, the last question here is somewhat strangely worded, because it almost sounds like they were asking a different question, but I understand it now. It says: hi, Erik. Hi, how are you?

I saw in the engine documentation that it can be expanded. It was a little unclear what "it" was at first, but I think I worked it out: it is the engine. Is there any physical operator or rule that MS is sleeping on?

Well, if there were one specific space where I wish the optimizer did more work, it would be unrolling joins with OR clauses into UNION or UNION ALL sets. That's where I would go with it.
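A hedged sketch of that transformation, using made-up table and column names: a join with an OR in the predicate, unrolled by hand into two single-predicate joins.

```sql
/* Before: the OR in the join predicate often produces an awful plan. */
SELECT
    o.Id
FROM dbo.Orders AS o
JOIN dbo.Users AS u
  ON u.Id = o.BuyerId
  OR u.Id = o.SellerId;

/* After: unrolled into two joins. UNION (not UNION ALL) removes the
   duplicates you'd get for rows that match both predicates. */
SELECT
    o.Id
FROM dbo.Orders AS o
JOIN dbo.Users AS u
  ON u.Id = o.BuyerId

UNION

SELECT
    o.Id
FROM dbo.Orders AS o
JOIN dbo.Users AS u
  ON u.Id = o.SellerId;
```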

I think... well, I guess they have already done some work on that. I've heard rumors that Microsoft is working on handling subqueries a little better, because the current implementation is somewhat incomplete. But I haven't actually seen the results of that yet.

I do know that in Azure SQL DB, I found a couple of new optimizer rules that help when you're doing the left join to find rows that don't exist. So: select from table A, left join table B on the primary key columns, where the table B primary key column is null. In other words, rows that are in table A that aren't in table B.

They have a couple of new optimizer rules to change that query form into a NOT EXISTS query form, either a left or right anti semi join. Those are two new optimizer rules that I have a lot of hope for, and I do hope they make it into SQL Server 2025.
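The two query shapes look like this. This is a hedged sketch with placeholder tables; the point is that both forms express the same anti semi join.

```sql
/* Left-join form: rows in TableA with no match in TableB. */
SELECT
    a.Id
FROM dbo.TableA AS a
LEFT JOIN dbo.TableB AS b
    ON b.Id = a.Id
WHERE b.Id IS NULL;

/* NOT EXISTS form: says the same thing directly,
   and has always gotten the anti semi join shape. */
SELECT
    a.Id
FROM dbo.TableA AS a
WHERE NOT EXISTS
(
    SELECT 1/0 /* EXISTS ignores the select list entirely */
    FROM dbo.TableB AS b
    WHERE b.Id = a.Id
);
```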

They're cool. They're good things, right? They're good optimizer progress. But you can spend a lot of time coming up with stuff like that. Honestly, the join-with-an-OR transformation would be a great one, right?

That would be fantastic to see. But ultimately, you have the query optimizer over here and the query execution engine over there, and those two things need to get a lot closer together. You can spend all the time you want on rules and cardinality and heuristics on one side and still come up with a God-awful plan on the other.

Unless those two things have a really clear line of communication, all that optimizer work is just spinning wheels after a certain point, because once the query starts running, it eats it anyway. Right?

The optimizer says: it's the perfect plan, I've got it, it's right here, nothing can go wrong. And then it runs, and it was wrong, and wrong in a lot of places. So query optimization and query execution need to become much closer friends.

I think that's necessary for Microsoft to maintain its query optimizer championship belt, and for any major progress to get made. Because forever and ever, the optimizer just came up with a plan, and as long as that plan was in the cache, we'd reuse it.

There was no communication at all. With the intelligent query processing set of features, there's been more and more feedback and communication, and those are great steps. But ultimately, what you need is way more runtime flexibility for queries, so they don't get stuck the way they do now.

Aside from adaptive joins, which are not horrible, there's a lot of other stuff left on the table that could be made more adaptive at runtime but isn't. One of the things that sticks out to me is memory grants: if you get a way-under memory grant, at runtime you're hosed.

And I understand there's a lot of cost and complexity, and a lot of stuff that can go wrong, if you were to just start dragging memory in for a running query. There are other things that have sort of dynamic memory, like index creation and, I think, some columnstore inserts. But for query runtime memory, you can only get that feedback after the query has run.

And if you had a query that ran for an hour because it was spilling out to disk, I don't know precisely how valuable that feedback is after an hour, or after however long that query runs for. There's stuff I wish were way more runtime-adaptive, to prevent the really bad query holdups.

But I don't know. I assume that smart people have looked at this and either disregarded it or decided to choose a completely different path. So I don't think I have anything particularly new or enlightening to provide them.

But there's a lot of stuff I look at in query plans, when I'm looking at performance problems, where I think: well, it'd be nice if SQL Server had a little bit of a failsafe for this. Actually, two areas where I think the optimizer could use a lot more work are costing for random I/O, and the value of early aggregations before joins.

Because lookups and nested loops and stuff are all still costed like they're on crappy old spinning-disk hardware. And a lot of the time, you have to manually rewrite queries to use early aggregations, because the optimizer just won't.

Even if you have a ton of duplicates in the two tables on the columns you're joining on, you have to physically rewrite one or both to do a GROUP BY or something before the join. Otherwise, SQL Server just doesn't choose to do that, and you end up doing way more work, dragging way more stuff along.
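A hedged sketch of that manual rewrite, with invented tables: aggregate the many-duplicates side in a derived table before the join.

```sql
/* Before: every duplicate Posts row is dragged through the join,
   then aggregated at the end. */
SELECT
    u.Id,
    TotalScore = SUM(p.Score)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
GROUP BY u.Id;

/* After: aggregate Posts first, so the join sees one row per user. */
SELECT
    u.Id,
    p.TotalScore
FROM dbo.Users AS u
JOIN
(
    SELECT
        p.OwnerUserId,
        TotalScore = SUM(p.Score)
    FROM dbo.Posts AS p
    GROUP BY p.OwnerUserId
) AS p
    ON p.OwnerUserId = u.Id;
```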

It's just aggravating. So those are some areas where I think the optimizer could use some work. Anyway, that brings us to the end of these five questions. I will answer five other questions in another video.

So you have that to look forward to. Thank you for watching. I hope you enjoyed yourselves, I hope you learned something, and I will see you in the next video, whatever magic mystery tour we go on.

All right. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Debugging Logic and Performance




Debug examples.

Video Summary

In this video, I delve into the crucial topic of debugging stored procedures in SQL Server, emphasizing the importance of adding steps that allow for easy identification and resolution of issues. Whether you’re running a procedure directly from Management Studio or through an agent job, understanding where to direct your output is key. I also share practical tips on what aspects of debugging to focus on—logic, performance, and dynamic SQL—and provide examples of how to implement these in your procedures. Additionally, the video includes links for those interested in supporting my work, such as joining a membership program or seeking professional assistance with their SQL Server needs. Whether you’re looking to improve your stored procedure hygiene or need detailed guidance on debugging complex scripts, this content is designed to be both informative and actionable.

Full Transcript

Erik Darling here with Darling Data, and today Bats and I are going to be talking about something that I think is very, very important to the effective building and maintaining of stored procedures in SQL Server, and that is, of course, debugging. The longer and more complicated your stored procedures get, and especially if they involve dynamic SQL of any real consequence, the more important it becomes to add in steps so that people can find issues, errors, and insights along the way. But equally as important is understanding the context that people doing the debugging will be executing the stored procedures in so that you target the output to the correct venue.

You know, if the debugging is going to take place in Management Studio, and someone's just going to hit F5, run it, and go off on their merry way: cool. You can PRINT and SELECT and RAISERROR all you want. It'll do just fine. But if you're going to run this in a headless way, via an agent job or some other scheduled process, or even just an end user doing stuff, you have to direct that output somewhere else.

You have to put it in a table or something, and it's up to you to figure out what needs to be included in that output for it to be useful to anyone who comes along to debug it later. Typically, I find that debugging should be verbose. Start more verbose than you think you need and pare down what you don't need, rather than having to keep adding things in later.

But before we do that, let’s talk about some very important things here. Let’s talk about how you, Wary Traveler, can support my efforts to keep producing this content. I offer memberships to the channel. You don’t have to sign up for one, but I appreciate it if you do.

If you do not feel like signing up for a membership by clicking on the link in the video description right around in here somewhere, you can do other things that help me sort of get some, you know, joy out of life. Liking, commenting, subscribing. And of course, if you would like to ask questions privately that I will answer publicly on my Office Hours episodes, you can go to that link. Well, you can go to the clickable version of that link down below and ask all your questions.

If you need more help with SQL Server than asking a question can possibly provide: well, there are a lot of folks like that out there in the world, and I work with them quite frequently to make their SQL Servers go much, much faster than they are today.

If you need a health check, performance analysis, hands-on query or index tuning (or both, probably), if you're having a performance crisis, or if you want to get your developers trained up to avoid needing all the other stuff: well, I am a consultant, and as always, my rates are reasonable.

If you would like to get some training on SQL Server performance tuning, you can do that from me. For about 150 USD, you can get, for the rest of your life, all 24 or 25 hours of my performance tuning content. Again, there are links down below where you can have a wonderful time clicking on things.

If you would like to see me live and in person, I will be attending, as an organizer, SQL Saturday New York City 2025. It's taking place on May 10th, with a performance tuning pre-con by Andreas Walter on May the 9th. They'll both be great times.

So, if you’re looking for it, looking for something to do in May, in the merry month of May. You’ve gotten sick of strolling through parks. Maybe, maybe you could show up to the Microsoft offices in Times Square and do a whole lot of learning about a whole lot of data stuff.

But let’s move on. Let’s talk about debugging. Now, this is a script that I have put up on my GitHub page. There will be a link to it down in the video description.

There are links to everything in the video description. But here's some important stuff to consider when it comes to debugging. There are essentially three things you might care about debugging in the context of a stored procedure, depending on what it does:

logic, performance, and dynamic SQL. Logic would be control-flow stuff; dynamic SQL would be things like printing out the dynamic SQL; performance would be getting query plans from the queries that matter in the stored procedure.

You typically don't want to turn on query plans for everything and have SSMS crash because you tried to put 5 billion query plans in the messages and returned results. And sometimes you may want to generate, but not execute, dynamic SQL so that you can see what it looks like and fix syntax errors.

It's usually a lot easier to see it printed out, copy and paste it into a regular window, and then figure out where something went wrong: where that incorrect-syntax error, like a stray single apostrophe, is.

A lot of the time, the culprit is silent truncation of your strings, so be very careful with that. And, not really related to debugging, just good stored procedure hygiene: SET NOCOUNT and SET XACT_ABORT.

Depending on what the stored procedure is doing, you may not want XACT_ABORT on, because that will force the whole thing to stop running: the whole batch terminates rather than just a single statement.

So depending on what you're doing, that might be good or bad. NOCOUNT you usually do want on, though, because that "rows affected" message is annoying. Stuff I usually put in there: things for debugging dynamic SQL, things for debugging loops, and other ways to collect runtime, contextual, environmental information.
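As a hedged sketch of those hygiene bits plus a debug switch: the procedure and parameter names here are mine, not from the script in the video.

```sql
CREATE OR ALTER PROCEDURE dbo.DoTheThing
(
    @debug bit = 0 /* flip to 1 for verbose output */
)
AS
BEGIN
    SET NOCOUNT ON;    /* suppress the "N rows affected" chatter */
    SET XACT_ABORT ON; /* most errors terminate the whole batch,
                          not just one statement -- omit this if you
                          want statement-level behavior instead */

    /* ...procedure body goes here... */
END;
```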

A lot of the scripts that I write are for SQL Server analysis, right? So sometimes I have to do things like collect version and edition information, so I know which columns and which DMVs are going to be there, what I'm allowed to touch, and what I might have access to depending on that.

So there are some important debugging things you might want to collect so that, when you look at your final results, you understand why there was a problem with something.

And then down here, stuff like debug logic. When you want to debug logic, one thing you're probably going to want to do is print a message naming the step you're going into before you go do it. If this were dynamic SQL, you could just print out the dynamic SQL, right?

That would be fine; you could see what it got to. But if you're just running a stored procedure and you want to figure out which step you're in and what query you're about to run, and then to know when you're finished with that query, a simple message like this, informational only, does the job.

And again, if you're running this in SSMS and you're fine with reading through the results (which is what most of my work is; I'm not orchestrating things from other places), this might be totally fine.

If not, you'd want to log this to a table somewhere. Though if you were doing this all in one big transaction, a rollback would take your logging rows with it, so keep that in mind. Otherwise, you can put stuff in a regular logging table and see the steps from along the way.

When you're dealing with things like loops, or you want to figure out whether results get weird because of too many or too few rows somewhere, you can log the number of rows that a statement touched. The assignment here is sort of like a silent SET NOCOUNT OFF, right? The count doesn't get immediately returned to you;

it gets assigned to this variable. And then, if you're doing the debug logic stuff in here, you can print that as a message and say: hey, this is how many rows you got from sys.databases from that query. So the output would say: selecting from sys.databases,

this is how many rows are in there. Cool. And then, if you were going to select from that table, this is what I like to do so that I know if a result was empty, because I don't always do the row count thing.

Especially for the analysis procedures, there's not a lot of point. So what I usually do, toward the end of the procedure, is put in something like this for all the temp tables I use, so that I can see which ones were empty. If there's something in the temp table, we select the table name, with the right table name here,

and all the results from it. But if there's nothing in that table, you'd just get an empty result set with whatever columns are in the #temp table. Sometimes you might want to say something explicit instead, like: hey, that table is empty.
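Both patterns sketched together, hedged: the @debug parameter and the #results temp table are assumed names, not the ones from the video's script.

```sql
/* Capture a row count quietly via assignment (no result set returned),
   then surface it only in debug mode. */
DECLARE
    @rc bigint = 0,
    @msg nvarchar(200) = N'';

SELECT @rc = COUNT_BIG(*)
FROM sys.databases AS d;

IF @debug = 1
BEGIN
    SET @msg = N'selecting from sys.databases, rows: ' + CONVERT(nvarchar(20), @rc);
    RAISERROR(@msg, 0, 1) WITH NOWAIT;
END;

/* At the end of the procedure: dump each temp table,
   or say explicitly that it came up empty. */
IF EXISTS (SELECT 1/0 FROM #results AS r)
BEGIN
    SELECT table_name = '#results', r.* FROM #results AS r;
END
ELSE
BEGIN
    SELECT table_name = '#results is empty';
END;
```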

That way you know exactly where along the line things either went wrong or you might have to change stuff. Dynamic SQL, like I said before, is fickle. It'll truncate for bizarre reasons. Even with all your Unicode N prefixes in place, you concatenate one slightly smaller string onto a much larger string

that's already an nvarchar(max), where it should just get slapped right on and keep its max-ness, and it doesn't always work out. So sometimes it's a very good idea to print out your dynamic SQL. One quick-and-dirty consideration: PRINT has a limit to how many characters it can spit out and show you, 8,000 bytes or 4,000 Unicode characters. So if your dynamic SQL block is longer than that, you might need a loop that iterates over your dynamic SQL string and prints it out in manageable chunks, so you can see the whole thing.
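A quick-and-dirty chunked printer, hedged: @sql here stands in for your real nvarchar(max) dynamic SQL, and naive chunking can split a line mid-token, so this is for eyeballing only.

```sql
DECLARE
    @sql nvarchar(max) = N'SELECT 1 AS x;', /* stand-in for your dynamic SQL */
    @pos bigint = 1,
    @chunk integer = 4000; /* PRINT shows at most 4,000 Unicode characters */

WHILE @pos <= LEN(@sql)
BEGIN
    PRINT SUBSTRING(@sql, @pos, @chunk);
    SET @pos += @chunk;
END;
```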

And then other stuff that's neat to do if you're using loops: aside from having debug logic to tell me when I'm entering the loop, when I've done certain things within it, and when I've finished it, I always find it very useful to know where I am progress-wise. And RAISERROR is way better than PRINT for this, because RAISERROR lets you add the NOWAIT option.

With PRINT, you're stuck waiting for buffers to fill up before it's forced to emit a message; with RAISERROR ... WITH NOWAIT, you can skip a lot of that. You could put all of this into one RAISERROR; I like to have it in three

so it's easy to read through. But, you know: you're starting the loop with a low ID of this number and a high ID of that number, and the total entries are there. Then I have my cursor, and "entering my cursor loop", and some information about where I am in the cursor.

I always find the fetch-next and incrementing messages very useful, because they help me catch logical bugs where I maybe didn't do things in the right place. And of course, when you're finished, it's great to print a finished message, so you know you're done with the loop and can move on. As far as performance debugging goes, what I'll usually do is something like this:
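A hedged sketch of those progress messages, with a plain WHILE loop standing in for the cursor; the counters and message wording are invented.

```sql
DECLARE
    @low_id integer = 1,
    @high_id integer = 3,
    @current_id integer,
    @msg nvarchar(200);

SET @msg = N'starting loop, low id: ' + CONVERT(nvarchar(11), @low_id)
         + N', high id: '             + CONVERT(nvarchar(11), @high_id);
RAISERROR(@msg, 0, 1) WITH NOWAIT; /* NOWAIT flushes immediately; PRINT waits on buffers */

SET @current_id = @low_id;

WHILE @current_id <= @high_id
BEGIN
    /* ...the real work for @current_id goes here... */

    SET @msg = N'finished id: ' + CONVERT(nvarchar(11), @current_id);
    RAISERROR(@msg, 0, 1) WITH NOWAIT;

    SET @current_id += 1;
END;

RAISERROR(N'finished loop', 0, 1) WITH NOWAIT;
```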

I'll select a little bit of information about what's happening, and then I'll use SET STATISTICS XML ON. If we're debugging performance, I want to get some information here.

I want to turn on query plans for one specific query; I don't want query plans for absolutely everything in the procedure. Once I've got the query plans for that stuff, I'm going to get some finishing information: say that this query finished, this was the end time,

and this was the time difference. Then we turn query plans back off right after, so we don't collect a plan for anything else, basically. So there's all sorts of good stuff you can do to get performance information too, right?
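Sketched out, with @debug as an assumed parameter, that scoping looks something like:

```sql
IF @debug = 1
BEGIN
    SELECT note = 'starting the important query', start_time = SYSDATETIME();
    SET STATISTICS XML ON; /* capture actual plans from here... */
END;

SELECT COUNT_BIG(*) AS records
FROM sys.objects AS o; /* stand-in for the query that matters */

IF @debug = 1
BEGIN
    SET STATISTICS XML OFF; /* ...and stop before plans pile up for everything else */
    SELECT note = 'finished the important query', end_time = SYSDATETIME();
END;
```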

That's often a critical step: when you have a performance problem, focus on the queries where performance is either important or an issue, so you don't have to sift through 10 billion variable assignments and tiny little query plans that don't add up to anything. It just gets annoying and frustrating.

And sometimes query plans get so long (from stored procedures that do a lot of stuff) that you try to scroll down to the bottom and there's just a big empty space in SSMS. And it's like: okay, what happened to the query plan? It can't render it.

Then guess what? You have to open the plan as XML, edit out a whole bunch of the statement stuff, and reopen it. It's a nightmare.

So don't do that. And then, of course, error handling and TRY/CATCH stuff, all very important when it comes to debugging. But one thing I find invaluable is, at the end of the procedure, a full accounting of the parameter values that got passed in.

Especially if I end up doing any parameter value manipulation, I want to know what changed. I also want to know what variable assignments happened along the way, so that if anything in the flow of logic in the procedure is unexpected, I know what values I ended up with. There's all sorts of good stuff you can do with a procedure like this to give you some sense of where you are in it.
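A hedged sketch of that final accounting, with invented parameter names; everything is converted to one string type so it fits a two-column result.

```sql
IF @debug = 1
BEGIN
    SELECT
        parameter_or_variable = v.name,
        value = v.current_value
    FROM
    (
        VALUES
            (N'@debug',       CONVERT(nvarchar(100), @debug)),
            (N'@database_id', CONVERT(nvarchar(100), @database_id)),
            (N'@rows_moved',  CONVERT(nvarchar(100), @rows_moved))
    ) AS v (name, current_value);
END;
```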

Like how you're making progress, and again, printing dynamic SQL, things like that. A link to this script is available in the video description, because it's useful for people to see these kinds of examples so they can incorporate them into their own stored procedures. Debugging is a very, very important step in building maintainable and, you know, triageable procedures, right?

Because if it's just a big, long procedure with some crappy comments in it, and you all of a sudden start having problems (it's either erroring out or miserably slow), you're going to have a hard time figuring a lot of stuff out. And let's face it: code is a lot like data. It never tends to shrink, right?

You never end up working with less data, or with fewer lines of code. Everything just keeps getting bigger and messier and more complicated and harder to use. So you should really put as much debugging as you can into things from the beginning,

so that, A, you're not caught flat-footed when you start having a problem, and B, because it takes a long time to work all the debugging stuff in later without introducing more errors with the debugging stuff itself. So, yeah.

Anyway, I hope you enjoyed yourselves, and I hope you learned something. I will see you in the next video in the series about stored procedures. I forget what the topic is, but I promise it'll be equally exhilarating. Anyway, thank you for watching.


Replacing Subqueries With Conditional Aggregates In SQL Server



Video Summary

In this video, I delve into the world of optimizing SQL Server queries by replacing multiple subqueries with conditional aggregates. You’ll learn how to transform complex and repetitive subquery logic into more efficient join operations, significantly improving query performance without altering your database schema. Alongside practical examples, I walk you through a real-world scenario where we compare the execution plans and results before and after optimization, demonstrating not only speed improvements but also ensuring correct results. Whether you’re looking to enhance your SQL Server skills or just curious about how to make your queries run faster, this video offers valuable insights and tips that can help you become a more efficient database consultant.

Full Transcript

Erik Darling here with Darling Data. We're going to take a little break from talking about stored procedures in this video, though you might see this pattern in stored procedures. Bats was sick of hearing me go on and on about stored procedures, and he told me in no uncertain terms that we should talk about something else today, because it is Friday: time to talk about something slightly different. So today we're going to talk about replacing subqueries. Notice that "subqueries" there is pluralized. I'm not talking about replacing a single subquery, because SQL Server's optimizer does convert subqueries to joins, but multiple subqueries, where you're calculating almost the same thing over and over again, with conditional aggregates. Rather than making multiple trips to a set of tables with the same query, or very similar queries, you're going to replace that with some joinery.

But before we do that, we're going to talk about how you could replace money in my bank account with money from your bank account. You can do that in a number of very sweet, kind, tender ways. If you would like to support my endeavors to bring you this fantastic SQL Server content: right in the video description (I want to make very clear that the video description holds the answers to all of your link-clicking questions), you can sign up for a channel membership, and for as few as $4 a month, you too could feed a starving SQL Server consultant. If you like my content but you don't have $4 a month (I don't know, I guess tax season is upon us), you can like, you can comment, you can subscribe. And if you think, hmm, this Erik Darling seems to know a thing or two about SQL Server, and I have a question about SQL Server: you can follow this link, which, again, is in the video description.

And you can get right to where you ask me the questions. If you need more help with SQL Server: once again, I am the best SQL Server consultant in the world, with the exception of one small island state in the Pacific, which I'm tired of naming by name. Health checks, performance analysis, hands-on query, index, server, everything tuning. If you're having a performance emergency, or if your developers need training because you keep having performance emergencies: well, golly and gosh, my rates are reasonable.

If you would like to buy some training from me, you can also get that for a very reasonable rate: 150 US dollars, for life, for 24 hours of it. Again, a fully assembled link for you to make that purchase is down in the video description. God, I make this so easy for you. I don't understand why you act like it's difficult.

Upcoming events: Bats is very excited about this one, because Bats is actually going to get some Pez put back in him for it (he's terribly empty right now). We have SQL Saturday 2025 taking place in New York City, the New York City version of SQL Saturday, on May the 10th, with a pre-con given by Andreas Volter on May the 9th.

You should obviously attend both, because it will do you a lot of good. Get out into the world to visit maybe a new place, maybe a familiar place. Maybe see some familiar faces, maybe make some new friends. I mean, who knows?

You take public transit in New York City and you could meet all sorts of fascinating, interesting creatures. But let's go talk about replacing subqueries, multiple subqueries, with conditional aggregates. I've been seeing this pattern in a lot of client code over the years.

And for some reason it just never dawned on me to talk about it. I don't know why. It's a weird thing, because it's something I've spent a lot of time rewriting in various situations, in various ways.

And, I don't know, maybe I'm just finally sick of doing it. But here we're going to talk about what it looks like and how you can fix it. So the first thing I'm going to do is just clean up this temp table stuff.

And then I'm going to dump some rows into the temp table. You see we get 13,265 rows. And what I see a lot is this: maybe, for some reason, you had all these subqueries in the initial insert,

and you were like: oh, this insert's taking forever. Oh, I hate this. Oh, God, this is all messed up.

So, for some reason, someone thinks it's a great idea to do an initial insert to get some primary key values, and then come back and either update the table and fill in all the remaining columns,

or, depending on how tricky the logic is, maybe spread that out over multiple updates. So, just as an example, the logic for... come on, zoom it. Five times I hit the zoom keys before this thing responded.

Stuff like this can be kind of tricky for some people to figure out. So sometimes I've seen aggregates like this get pushed into separate updates, so you're not just updating the table once after you insert some rows.

It's multiple, multiple update steps, which just drags on and on and on, especially with the bigger temp tables, right? So, I'm assuming the update is the real problem here.

Uh, the insert for this did not take very long at all. Just a hundred and seventy something milliseconds. Uh, not bad, right? It’s fine.

But then we're going to do the update, and sanity check it by looking at Jon Skeet's stats, because Jon Skeet's stats are a great way to sanity check most things. So, let's run this.

And we're going to wait for this update to finish, because that's just what we have to do. We don't have a lot of choices. We must wait for the update to complete so that we can judge its efficacy, don't we? All right.

So, this is the query plan for that, and it's not really a great kind of query plan. Granted, we could probably do some indexing work on this to make things a little bit faster. But let's just say maybe we want this thing to be faster without creating indexes.

Maybe we just want to rewrite the query and have things be magically faster without changing the structure of the database.

Maybe we're not allowed to, for some reason. Anyway, this whole thing takes about 9.1 seconds. Not a good situation.

Nine seconds. We are unhappy with nine seconds. But what I want to do, before I forget, is grab Jon Skeet's stuff from up here, and drop it into a little comment.

And if you'll bear with me for just a moment, I'm going to line some of these columns up with the magic of tabs, so that we have something we can look at quickly a little bit later. Now, what you're usually better off doing is something like this, right?

So let's clear the data out of that table. What we're going to do is insert everything into the table all at once, but handle the aggregates with this sort of conditional CASE logic.

For the top question score, we say: get the max score where the post type ID equals one. Same thing for the top answer score. For the total question score, we do that with SUM instead of MAX.

And then for the question and answer counts, we do the same thing with SUM, but rather than using the score column, we just use the number one. So every time the post type ID is one, we add one to whatever we're summing up.

And the same thing when the post type ID equals two; otherwise, we add zero to the total. And when we do this, we only need one join to the Users table.
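Put together, the rewrite looks roughly like this. It's a sketch against a Stack Overflow-style schema (dbo.Users and dbo.Posts, where PostTypeId 1 is a question and 2 is an answer); the temp table and column names are illustrative, not the exact ones from the demo:

```sql
/* One pass over Posts, with CASE expressions inside the aggregates
   doing the work that the separate subqueries used to do. */
INSERT #u (UserId, TopQuestionScore, TopAnswerScore, TotalQuestionScore, Questions, Answers)
SELECT
    u.Id,
    MAX(CASE WHEN p.PostTypeId = 1 THEN p.Score END),        /* top question score   */
    MAX(CASE WHEN p.PostTypeId = 2 THEN p.Score END),        /* top answer score     */
    SUM(CASE WHEN p.PostTypeId = 1 THEN p.Score ELSE 0 END), /* total question score */
    SUM(CASE WHEN p.PostTypeId = 1 THEN 1 ELSE 0 END),       /* question count       */
    SUM(CASE WHEN p.PostTypeId = 2 THEN 1 ELSE 0 END)        /* answer count         */
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 10000
GROUP BY u.Id;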

The thing is, this does have to be a left join, because an inner join would restrict rows where people don't have posts in there. And there might be people who, for some reason, have a reputation of over 10,000, right?

That's a five digit number, so the comma goes there. And this might actually filter people out; maybe they have a high reputation for something else. I don't know, maybe they're test accounts or something.

I don't know what to tell you there. But when I tested this with an inner join, fewer rows ended up in the temp table.

And I said: that's not good, Erik. You can't show that to people. You can't go out in public like that, not wearing pants. So yes, this does have to be a left join, because remember, subqueries do not eliminate rows.

So when you do a subquery like we did for these up here, they all end up as left joins, because they're not filtering queries.

They're not allowed to filter rows out of the results; they just find the columns in the temp table and do an update based on what's in there.

They don't remove rows from the temp table; nothing gets restricted. So, I think I already truncated this, but let's do it once more just in case. And remember, that whole thing above took about nine seconds.

9.1 seconds, plus the 170 or so milliseconds to insert the data into the temp table initially. So let's run this.

This should run in about a second and a half or so. And I want to grab this from up here and drop it down here, so we can do a comparison to make sure things are correct.

Let's make a little note: original result. Now let's go look at the execution plan. And of course, even without doing any index tuning, we bring this down to about a second and a half.

1.6 seconds, down from about 9.2 seconds, just by eliminating the multiple hits against the Posts table and doing a single join with that conditional logic.

Now, if we go visit the results and say Copy With Headers... oops, for some reason I hit Control and E rather than Control and R, and I don't need to execute that thing all over again. That would be no fun.

But let's line this stuff up and make sure our results match, because that's a pretty good sign that we tuned the query and got things correct.

And if you look across here, everything lines up just perfect. So this is not only faster, but yields correct results: two very important things when it comes to query tuning. Some people might tell you that they tuned a query, but then you look at the results and they're completely different.

Of course, you do need to be a little bit careful before you tell someone the results are different, because all sorts of things can make results look wrong when they're actually correct. Maybe that's worth a different video, but: different query plans, serial versus parallel, even running the same parallel plan over and over again.

If you're not ordering by something deterministic, the results are going to come back in all different orders. If you're ordering by a non-deterministic column, or you just don't have an ORDER BY on the comparison query at all, the results might look all different.

You might have different IDs coming back in different orders, but the results for those IDs are all correct across the board, and you actually do have the same set of IDs on both sides, just not in the same order.

So whenever you're comparing things like that, INTERSECT and EXCEPT are very useful for that sort of exercise. But also make sure you're ordering the final results when you compare them, so that other things aren't interfering with the result order and making correct results look wrong.
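As a quick sketch of that comparison, assuming the two result sets live in temp tables called #before and #after (illustrative names): if both of these queries return zero rows, the two sets match exactly, regardless of row order.

```sql
/* Rows in the original results that are missing from the rewrite */
SELECT * FROM #before
EXCEPT
SELECT * FROM #after;

/* Rows in the rewrite that were not in the original results */
SELECT * FROM #after
EXCEPT
SELECT * FROM #before;
```

A nice property here is that EXCEPT treats NULLs as equal for comparison purposes, so nullable columns don't produce false mismatches the way a join-based comparison can.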

But anyway, this is a great way to replace multiple subqueries with conditional aggregates: make your SQL Server queries go faster, get correct results, and look like a big hero. Go from a 9.2 second query to a 1.6 second query.

Gosh, you'll look real good in your next performance review, won't you?

Sparkling. Anyway, thank you for watching. I hope you enjoyed yourselves, and I hope you learned something. Like I said, this video drops on Friday, so I do hope you have a great weekend.

And I will see you Monday with an office hours video. Adios.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Batching Modifications



Video Summary

In this video, I delve into the topic of batching modifications in SQL Server stored procedures, a practice aimed at managing large-scale data changes more efficiently. Batching helps mitigate issues like locking and blocking, as well as potential server resource overloads, by breaking down big tasks into smaller, manageable chunks. Throughout the video, I explore various scenarios where batching can be particularly useful, such as when dealing with updates that might require rolling back if an error occurs. I also discuss practical techniques for implementing batch modifications, including using primary keys to efficiently filter and process data in batches, setting up external control tables to manage loop execution, and considering isolation levels to handle potential concurrency issues. If you’re interested in supporting the channel or need consulting help with SQL Server performance tuning, there are links provided to get started. Additionally, I cover some rare but important considerations like preserving previous data states for rollback purposes, which can be crucial in certain situations.

Full Transcript

Erik Darling here with Darling Data, and we are back to talking about stored procedures in SQL Server. In this video, we're going to talk a bit about batching modifications, because there are some things that come along with batching that not a lot of people think about as they get along in their process. There are some important things to consider, depending on what kind of batch modification you're doing, and how you need to handle various situations. Tying back into the last video about error handling: what's going to happen if, under these circumstances, I hit an error, or something weird happens? What do I do? So we're going to talk a bit about that. If you like this channel and the content I produce, and you think, wow, Erik Darling is worth four bucks a month, there's a link down in the video description where you can become a member of this fine channel. But don't let four dollars a month be the limit; you can go higher than that if you want. I'm not gonna complain. If instead of buying me one espresso you want to buy me two espressos, that'd be cool, too. If you're more of a tea person and you just don't want your money going towards the evil coffee bean, you can do all sorts of other things to support the channel: you can like, you can comment, you can subscribe. And if you want to ask questions that I will answer publicly in my office hours videos, there's a link down there for that too, to make it really easy for clicking. If you need consulting help with SQL Server, if you're having performance problems and you think it's high time you hired someone to come look at these things, well, guess what? My rates are reasonable. I do all this stuff, and I'm great at it. Thanks for watching. If you want some SQL Server training, I have that too, also at a reasonable rate: 150 US dollars per month once you apply that there discount code. You sit, and you watch it, and you learn, and you get better at stuff.

You actually have to watch it for that to happen; it does not happen through osmosis. You cannot just headbutt the screen and receive information. SQL Saturday New York City 2025 is happening May the 10th, with a performance tuning pre-con on May the 9th by Andreas Walter, and we will have a great time on both of those days. I will be at both, serving you lunch. I'm gonna be dressed up like an old school lunch lady: hairnet, apron, weird baggy rubber gloves. I got the works. It's gonna be fun.

Anyway, let's talk a little bit about batching modifications. The goal of batching a modification is to accomplish a large number of row modifications without causing a big issue on the server. If you have a very large amount of data that needs to change in a database, batching is a great way to lessen the impact by splitting the work into smaller batches of changes: from the perspective of locking and blocking, the transaction log, and physical problems with the server, like your log file filling up, hitting an error, and the whole thing rolling back.

Hitting an error in one batch is a whole lot less devastating than hitting an error when you're 99% of the way through a really big modification. There's a lot of stuff that batching modifications can help with. One thing it may not help with is the overall time it takes to get that amount of work done.

Because you still have to do the full unit of work, right? Let's say you have to do something to 10 or 20 million rows, and doing it all in one chunk would take an hour. Batching up the modifications might still take an hour, but the point is, during that hour you're biting off smaller chunks rather than trying to do the whole thing at once.

That's the goal of batching modifications. The canonical post on batching modifications, for me, is Michael J. Swart's, originally published back in 2014.

I still reference it whenever I need to start something like this, just to make sure I'm mentally in the right place for it. You might have some other requirements as well. A while back I wrote a post called Batching Modification Queries Is Great Until..., and that one is about updates.

So I do want to walk over to that code for a second, because there is some stuff in here that bears repeating. If you're inserting data into a table from something else, batching the inserts is fine.

You don't have to worry too much, because you still have the source data over here; you're just pulling data from here to there. And when that whole thing is done, with all your data there safe and sound, you can truncate the source if you want.

You can also keep it for reference; I don't care. If you're deleting data, and it's stuff you no longer care about, or stuff you legally have to get rid of and can't keep around anymore, well, again: if a batch errors out for some reason and you move on to the next batch, you just have to figure out at some point what went wrong there.

But if you're updating data, either something goes wrong, or you're in a situation where you say: look, I have to update these million rows, and I'm going to do it a thousand rows at a time.

But if any one of these batches errors out, I have to change the data back to the way it was before; I can't keep going, because there's a problem with the data somehow. One handy way of dealing with that is to do the update with an OUTPUT clause, and output the data that changed into another table, so that you can roll that stuff back.
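A rough sketch of what that looks like; the table, column, and logging-table names here are all made up for illustration:

```sql
/* Change a batch of rows, and stash the pre-update values
   in a side table in case the change has to be undone. */
UPDATE TOP (1000) t
SET    t.SomeColumn = N'new value'
OUTPUT deleted.Id,          /* key of the changed row      */
       deleted.SomeColumn,  /* the value before the update */
       SYSDATETIME()        /* when we changed it          */
INTO   dbo.UndoLog (Id, OldValue, ChangedAt)
FROM   dbo.BigTable AS t
WHERE  t.NeedsFixing = 1;
```

The deleted pseudo-table in the OUTPUT clause is what carries the before-image of each row, which is exactly the data you need if you ever have to walk the change back.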

Because if you're doing all these batch modifications, you don't want to do them all within the scope of one transaction; then you have most of the same problems you'd have if you were trying to do everything in one big update anyway.

And if you do it in smaller transactions, you can either do a transaction for every update, or (we're going to talk later about this) sort of batch up the batches with transactions.

But once you commit a transaction, that's done. You can't roll it back; you're onto the next transaction once that starts.

So for some stuff, you may have to preserve the data as it was before you made the changes, in order to roll updates back to something else. Doing the update with OUTPUT to capture that data can be a pretty useful thing in those cases.

Those cases are pretty rare, though; they're not things I run into very often. And there are two things about that. One, it's not a requirement that I run into very often.

Most people are just like: well, if one batch hit a problem, we can figure out what went wrong with that batch. We don't need to roll back the 90 million rows we just changed because there was a problem with row 10 million.

It's not a very common requirement, but when it is, there are ways to preserve the previous data and then batch back through and undo all the work you did.

It's a pain, but there are ways to do it. It would be nice if SQL Server had something like Oracle's flashback tables.

The temporal tables SQL Server has probably wouldn't be the greatest thing for this, but you could use them too. Temporal tables just have too many problems for me to recommend them as a great source of truth.

A couple of things are worth talking about in here. First: how you find where you want to start. Typically you want to use a primary key column because, A, it's not nullable, which is great.

And B, since it's a primary key, it's indexed, so it's very easy to page through it: you filter on where the primary key is greater than the last value you processed, and then order by the primary key.
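Sketched out, that pattern looks something like this, in the spirit of Michael J. Swart's post; all the object names are illustrative:

```sql
DECLARE @LastId bigint = 0,  /* highest key processed so far */
        @rc     int    = 1;
DECLARE @done table (Id bigint NOT NULL);

WHILE @rc > 0
BEGIN
    DELETE FROM @done;

    /* Grab the next slice of rows in primary key order */
    WITH batch AS
    (
        SELECT TOP (5000) b.Id, b.SomeColumn
        FROM dbo.BigTable AS b
        WHERE b.Id > @LastId
        ORDER BY b.Id
    )
    UPDATE batch
    SET    SomeColumn = N'new value'
    OUTPUT inserted.Id INTO @done;

    SET @rc = @@ROWCOUNT;

    /* Remember where this batch ended so the next one starts after it */
    SELECT @LastId = MAX(d.Id) FROM @done AS d;
END;
```

Updating through the CTE lets the TOP and ORDER BY control which rows each batch touches, and the loop naturally ends when a batch comes back empty.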

It just makes life a lot easier. If you don't have a primary key on the table, you're stuck with some weird cursor options.

There are some very strange ways to page through unordered data with a cursor. We're not going to talk about that here, though, because it's a little too exotic for general consumption.

But there are two ways to do this: you can either declare some parameters or variables, or use some parameterized stuff.

You can find the rows in the table that you want to affect in one batch, and then go through. Actually, one thing worth talking about here: whenever I need to do something like this, I like to set up sort of an external control table.

Because what happens is, let's say this batch runs for some amount of time. You can build a timer into this, where you say: if this has been running for six hours, it needs to stop.

Even Ola Hallengren's scripts have stoppers like that in them: after the job has run for this long, don't process any more commands. So you could totally do something like that here.

And you could do something like that with a control table. One thing I like to do is have an external control table.

And I like to capture some information about what's going on in there. If I have multiple jobs that have to do something like this, I'll put in the job name, and the name of the loop I'm currently in, if there are multiple steps to it.

Also the name of the step I'm in, whether it's the current step that's executing or just the next thing, and the last completion time. Usually I'll add a computed column where the start time and completion time get run through DATEDIFF, out to minutes or hours or seconds.

That way I can very easily gauge things if I need to fix anything in any of these loops. Like: how long does this take?

Wow, there's something terrible in there; maybe it needs an index or something. But at minimum, I like to have a table that helps me control whether the loop should keep going. Because inevitably, if you don't have anything built in to stop it, eventually you're going to say: okay, this has been running for a long time.

It's getting into business hours; I'm just going to hit cancel. And if you're in the middle of a batch, that's kind of annoying, because you want that batch to be able to finish. So what I like to do is set up what I call a loop controller table.

I'll just have a flag that says: should I continue? Yes. And then in the loop, I'll say: while the row count is greater than zero.

In other words, while I still have rows to process, and I'm allowed to continue, keep running this loop. That way, if I decide the thing has been running too long, I can just update the loop controller table to false.

And the loop will stop the next time around, before it starts another batch of modifications, so I can at least finish the last thing I'm doing cleanly before bailing out.

And again, if you have a loop controller table with job name, step name, and so on, you just have to write your where clause so you're ending the loop for the specific thing you care about stopping.
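Here's a minimal sketch of that kind of controller table and how the loop checks it; all the names and values are made up for illustration:

```sql
CREATE TABLE dbo.LoopControl
(
    JobName         sysname   NOT NULL,
    StepName        sysname   NOT NULL,
    ShouldContinue  bit       NOT NULL DEFAULT 1,
    StartTime       datetime2 NULL,
    EndTime         datetime2 NULL,
    /* computed duration makes slow loops easy to spot */
    DurationMinutes AS DATEDIFF(MINUTE, StartTime, EndTime),
    CONSTRAINT pk_LoopControl PRIMARY KEY (JobName, StepName)
);

/* Inside the batching procedure: */
DECLARE @rc int = 1;

WHILE @rc > 0
AND EXISTS
(
    SELECT 1
    FROM dbo.LoopControl AS lc
    WHERE lc.JobName  = N'nightly cleanup'
    AND   lc.StepName = N'delete old rows'
    AND   lc.ShouldContinue = 1
)
BEGIN
    /* ... one batch of modifications goes here ... */
    SET @rc = @@ROWCOUNT;
END;

/* From any other session, to stop cleanly after the current batch: */
UPDATE dbo.LoopControl
SET    ShouldContinue = 0
WHERE  JobName  = N'nightly cleanup'
AND    StepName = N'delete old rows';
```

Because the flag is only checked between batches, flipping it never interrupts a modification mid-flight; the loop finishes its current batch and then exits.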

Another interesting question that comes up with these sorts of batch modifications is around isolation levels. Again, this is not a terribly common requirement, but it's worth noting that no matter which way you do this, you are subject to potential concurrency issues.

So if you care about that, you need to think about, A, transactions, and B, locking hints for when you find data. There are essentially two ways of looking for the data, and both of them are potentially subject to concurrency problems; not always.

And you might not care about them. But if you're not okay with them, you do have to think about this. Let's say you're using read committed, the crappy isolation level.

If you're unfamiliar with why I think read committed is a crappy isolation level, I highly suggest you watch my series, Everything You Know About Isolation Levels Is Wrong. Very good stuff in there, all about the problems with read committed.

But generally: let's say you do something here and find this data, and then something happens in another transaction that alters the correctness of the data you found; maybe it invalidates something you found in here.

Say there was some additional where clause, like: where the key is greater than or equal to the largest key I just processed, and the last modification date is older than a week or a month or a year.

And then something happens that changes that last modification date, and all of a sudden you'd be doing something to a row that you shouldn't be touching anymore. That could be a problem, right?

And that could happen if you did the update like this, too. Because in this scope you would take a lock here, but in this scope you would not have any locks taken that would prevent data from being read, or prevent data you've already read from moving around.

So something could happen, even with a subquery like this, that invalidates the stuff you'd be working on in the update. You do have to be careful. In those cases, depending on what you care about and what you need to happen, you could add in locking hints.

You could use a static cursor, which copies the table data you're going to be working on. If you don't want that, you can use a dynamic cursor, which works directly off the table data.

A keyset cursor would also technically preserve the key values, but not any other table data, so the static cursor might be the best way to go, depending on what you want. But if you do care about isolating the work the batch is doing, and not allowing any other queries to mess with that data while you're finding it or modifying it, you might have to use a transaction, and you might have to add in... oops, that should have been UPDLOCK, not HOLDLOCK.

HOLDLOCK is a synonym for SERIALIZABLE, which means I probably messed that up somewhere else too. Anyway, this actually came up in the YouTube comments recently: one thing I like about some other database platforms, like Postgres and MySQL (I think even Oracle has it), is the SELECT FOR UPDATE clause.

Instead of being surprised by locking hints in your FROM clause, when you write the query, you say SELECT FOR UPDATE up top.

And that turns the select into a locking select. You could do that here: take locks on the rows you find in this batch, and then pass them along.

But again, you would need this down here as well. If you're finding the next batch's max key using a select before you do the update, you'd have to put the hint in there.

And if you're finding the thousand rows in a where clause, you could do that down here, but you would need the locking hint if you care about nothing else messing with that data as you make your way through.
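As a sketch, the locking version of "find a batch, then update it" might look like this; the names are illustrative, and the UPDLOCK hint on the find query is what makes it behave roughly like SELECT FOR UPDATE:

```sql
DECLARE @LastId bigint = 0,
        @MaxId  bigint;

BEGIN TRANSACTION;

/* Find the batch boundary, taking update locks on the rows as we
   read them so no one else can change them before our UPDATE runs. */
SELECT @MaxId = MAX(x.Id)
FROM
(
    SELECT TOP (1000) b.Id
    FROM dbo.BigTable AS b WITH (UPDLOCK)
    WHERE b.Id > @LastId
    ORDER BY b.Id
) AS x;

UPDATE b
SET    b.SomeColumn = N'new value'
FROM   dbo.BigTable AS b
WHERE  b.Id > @LastId
AND    b.Id <= @MaxId;

COMMIT TRANSACTION;
```

The key detail is that UPDLOCK locks are held until the end of the transaction, so the rows found by the select stay protected through the update, at the cost of more blocking for anyone else who wants them.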

Kind of going back to some of the stuff we talked about in the previous video about error and situation handling: writing batches like this is a great place to have retry logic built in, because there are going to be times when your batch gets either deadlocked or blocked, and you don't want to keep piling onto a blocking chain that might already exist.

You just want things to keep going and make progress. Depending on the purpose of the batch, you might not care, to the point where you could put a READPAST hint in your query as a locking hint, so that you skip over any rows that are locked.

But that's more of a queue table situation. What we're talking about here is a situation where, say, you keep hitting deadlocks when you try to do your modification, or you don't want your process colliding with anything else, so you might stick in a lock timeout, or retry logic for when you hit a deadlock.

And this comes back to how you'd incorporate some of the retry stuff I talked about in the last video into a batching process: if you hit a problem within your loop, you just go back and start the loop over.

You just retry that batch the next time around. That would be one way of doing it. So anyway, there's a lot of interesting stuff to think about when it comes to batching modifications.
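A rough sketch of deadlock retry wrapped around a batch: error number 1205 is SQL Server's deadlock victim error, and the batch body, retry cap, and delay here are all illustrative choices:

```sql
DECLARE @rc      int = 1,
        @retries int = 0;

WHILE @rc > 0
BEGIN
    BEGIN TRY
        DELETE TOP (5000)
        FROM dbo.BigTable
        WHERE NeedsDeleting = 1;

        SET @rc = @@ROWCOUNT;
        SET @retries = 0;  /* a clean batch resets the retry counter */
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205 AND @retries < 5
        BEGIN
            SET @retries += 1;
            WAITFOR DELAY '00:00:05';  /* back off, then retry the batch */
        END
        ELSE
        BEGIN
            THROW;  /* anything else, or too many retries: give up loudly */
        END;
    END CATCH;
END;
```

Because a deadlocked batch never changed anything (the victim's work is rolled back), looping back around and re-running the same batch is safe, which is exactly why batching and retry logic pair so well.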

A: is this something I can batch? Is it even allowable to split this up into chunks?

There are some situations where batching is not the best way to write the work. Again, that Michael J. Swart blog post is probably my favorite; it's a great starting point and a great explanation, with a lot of good technical detail in there.

If it's an update loop and you might need to roll that stuff back, that's one consideration. I guess that could go for deletes too, but that depends on how much you care about the data.

And then, of course, all sorts of other concurrency stuff should at least be on your mind when you're writing this. You might decide immediately that you don't care, but at least have the mental conversation with yourself about how much protection your batching process needs.

And always, always, always give yourself a way to bail out of the loop. Have that external control table, or a job logging table, where you can control exactly how long a loop can run for, or bail out on the next pass, because that gives you a lot of flexibility for controlling the loop.

That beats hitting cancel in the middle of it and wondering when things will finish. So, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you in the next video, where... well, I haven't quite organized the schedule yet, so the next video might be a surprise.

We’ll see. We’ll see how it goes. Anyway, thank you for watching.


All About SQL Server Stored Procedures: Error And Situation Handling



Video Summary

In this video, I dive into the world of error handling and situation management within stored procedures, emphasizing their importance in SQL Server environments. Whether you’re a seasoned DBA or just starting out, understanding how to handle errors and unexpected situations can significantly improve your stored procedure’s robustness and reliability. I explore various scenarios where you might want to implement error handling, such as validating input parameters, managing empty result sets, and dealing with critical processing issues. Additionally, I discuss the nuances of using row count big for efficient data retrieval and how it can be integrated into triggers. For those looking to support my channel or get deeper insights, I provide information on becoming a member, asking questions during office hours, hiring me for consulting services, and attending SQL Saturday New York City in May 2025. Whether you’re a beginner or an expert, there’s always something new to learn about stored procedures and error handling.

Full Transcript

Erik Darling here with Darling Data. And we have a very interesting topic for today’s video, which we’re going to continue on learning about stored procedures because stored procedures are an important part of most, you know, DBA performance tuning development work in most SQL Server environments. Granted, even if you are using an ORM, well, golly and gosh, you can call stored procedures from an ORM. Successfully in most of them. So they are worth talking about. Now we’ve covered a few things and today we’re going to talk a little bit about, well, as soon as I click on the right thing, we’re going to talk about error handling. I’m going to tell you ahead of time. We’re going to talk about some theoretical stuff around error handling. This is not just going to be like specifically error handling, like, oh, I hit an error and now I need to handle it in some way, but also sort of like situation handling with like parameter values and like local variables and stuff because sometimes you might need to do a little bit of magic on those to make sure that nothing too screwy happens.

So it’s not just error handling. It’s like bad situation handling in stored procedures. If you would like to support this channel, well, man, do I have good news for you. You can become a member of the channel and join about 60 other folks who have so kindly offered me up some money every month in response to the incredibly high quality content that I produce here for the masses.

It’s a nice way to say thanks. Think of a little tip jar, except not one of those annoying ones where someone’s like, give me a 90% tip for the croissants I put in a piece of tissue paper. It’s like this: you get better at your job in exchange for a little bit of money. Or maybe you’ve been shaken down by some criminal organization and they’ve collected their debts: you’re missing a finger, your wallet, your life savings, your house, your car, all of your wife’s jewelry or whatever.

You can also comment, like, subscribe and all that other stuff. I recently started or rather resumed answering questions from the public, not just in comments and whatnot. If you want to ask me a question that will be answered publicly in my office hours episodes, you can do that by clicking on that link, which is again available just about there ish.

Might be off by a few pixels, but I promise you I’ll get you in the right general area. If you need help with your SQL Server, you are in need of a consultant to help you with the health, performance, well-being and sort of general attitude of your database system. RDBMS.

You can hire me to do all sorts of things. And as the wonderful editors of BeerGut Magazine have noted for many years running now, I am the best SQL Server in the world outside, well, in like three of the four hemispheres. All right.

I’ve got world domination in 75% of the hemispheres. At least until, at least until China steals, steals my tech. And then, then we’re all, then we’re all doomed.

They’ll be DeepSeeked. If you would like to get access to my core training bundle, you can get all of that for about 150 US dollars for the rest of your life. Again, in the video description, it is available there.

And I promise you, it’s a wonderful deal. You pay the fee once and you have entry forever. SQL Saturday, New York City is coming up this May 10th in 2025.

It’s taking place in the scenic, now congestion-free Times Square, New York City. So you can, you can show up there. You can learn a whole lot more about data stuff and you can leave a happier, maybe not healthier person because you’re going to drink, you’re going to eat a sandwich.

You’re going to eat a, eat a cookie and you’re going to drink a lot of soda. That is, that is generally the tenor of how these days go. So that’s, that’s, that’s going to be what happens.

I don’t know that you’re going to be healthier. You’re also going to do a lot of sitting. There’s not a lot of physical activity. You’re probably not going to get your 10,000 steps in, but you can show up and you can learn. So you can at least break a mental sweat, for whatever that’s worth.

With that out of the way though, let’s talk about error handling stuff. Now, no talk about error handling would ever be complete without noting that Erland Sommarskog, our dear friend, our dear friend of the summer, summer bog or something, has written a very long article on error handling. Just like Erland, it has three parts and three appendices.

So if you ever meet Erland, you can ask him about his three appendices. Don’t, don’t ask him about his three parts. It’s a private matter.

But if you need to learn like the whole in and out of error handling, that’s the best place to go. That link will be in the video description, just like all the other helpful links I provide to you. So I would, I would go there if you are just completely new and lost on it.

What I want to talk about is sort of like what, when and why you would want to implement error handling and situations where you would maybe want to do something that raises an error because someone has like, like something has just screwed up along the way. It might not be like you hit, like you did like a primary key violation, foreign key violation, something like that. Like stuff that throws errors, like divide by zero, whatever it is.

There’s all sorts of stuff that SQL Server would be like, whoa, that’s an error. What I’m talking about are situations that might arise in your stored procedures that you might want to check for and say, hey, this happened. Bail out, right?

Don’t go no further. Everything’s going to be screwy. So the type of like certain errors and conditions you might want to like, or rather the outcomes of them that you might care about are stuff like this. So you might want to like, depending on like where the, where the procedure is getting run, you might want to print or log an informational message and keep chugging along.

Right. You might want to say like, hey, like zero rows came out of this thing where like nothing else is going to happen. Let’s stop here.

Or like just log like zero rows came out of this thing. None of these results are going to be good. You could also do something like return that message to the user. If it’s something that they would actually see and be able to take action on.

You might want to either keep going. You might want to like roll back or like, you know, or, you know, just kind of stop execution. You might want to like just log the error somewhere and keep going.

Or you might want to use the type of error that you hit to retry the thing and then move on from it. If you’re not going to do any of this stuff, right? If you’re not going to do anything at all with the error that pops up, there’s really no sense in writing BEGIN TRY and END CATCH at all.

Right. There’s just not a lot of sense unless you’re going to handle the error in some way, or raise it up to something else in some way. Otherwise, there’s no sense in error handling.

Like if you say BEGIN TRY, BEGIN TRAN, and then there’s just a COMMIT, and nothing substantial happens in the CATCH block, you’re just not doing anything with it. Then don’t even bother. Right.

Like if you just say BEGIN TRAN, blah blah blah, COMMIT, and there’s no “what if” for that. There’s no alternative. You hit something crappy along the way, and nothing rolls any of your stuff back.

You just end up with bad data, right? It’s not a good situation. So unless you’re going to do something that offers some kind of secondary outcome to everything happening just fine, there’s no sense in implementing any of the logic at all.
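As a sketch of that difference, here’s a pointless wrapper next to a meaningful one. The table, logging table, and variable names (dbo.Orders, dbo.ErrorLog, @OrderId) are made up for illustration:

```sql
DECLARE @OrderId integer = 1;

/* Pointless: the CATCH block swallows the error and does nothing with it.
   The error vanishes, and you may be left with an open transaction. */
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Orders
    SET Status = N'Shipped'
    WHERE OrderId = @OrderId;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    /* nothing here; this is worse than no TRY/CATCH at all */
END CATCH;

/* Meaningful: roll back, log, and re-raise so the caller knows it failed. */
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Orders
    SET Status = N'Shipped'
    WHERE OrderId = @OrderId;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    INSERT dbo.ErrorLog (error_number, error_message, logged_at)
    VALUES (ERROR_NUMBER(), ERROR_MESSAGE(), SYSDATETIME());

    THROW; /* re-raise the original error to the caller */
END CATCH;
```

The second version gives you a secondary outcome: the transaction is cleaned up, the failure is recorded, and the caller still finds out something went wrong.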

So just to start with a sort of simple example, let’s say we have a stored procedure that accepts start and end date parameters. Right. So someone’s like, well, I want to start a week ago and I want to end up with whatever happened today.

Right. So you’ve got seven days worth of stuff in there. You could, if, if, you know, for various reasons, someone might make a typo or someone might like transpose the order of your parameters and put like the wrong date in somewhere. Like all sorts of this, you know, errors could like mental errors could happen.

Like they’re working with a dropdown. They’re like, oh, like, you know, oh, I thought I picked like the seventh of last month and the seventh of this month. But I accidentally picked the seventh of like, like the last month.

And then the seventh of the month before or something. So, or like they just messed stuff up. So like the dates wouldn’t return anything. You might want to think about helping them out in some way or letting them know or just something along those lines.

Now, just as an example here, and this is, this is something that I do in all of my analysis procedures, because with the analysis procedures, you know, like I’m running them in management studio. I’m not like running them from an application. There’s no front end to them.

There’s nothing really cool going on with them, so I can make the decision to say, hey, I’m going to try to fix this for you. Right. So in my stored procedures, one thing that I do at the beginning is check for weird conditions with parameters, where things might not go the way people expect based on what got passed in. So I usually have a simple check for something like: hey, if the start date is greater than the end date, here’s what we can do.

So the first option is you can throw an error, right? This is just an informational message and then try to like fix the parameters. And I’ll show you examples of that below.

Right. The other thing you could do is say, Hey, why don’t you try that over again? Because you like the start date is greater than the end date. That’s not going to return anything.

There’s no sense in moving forward. Like, like, Hey, like reenter that stuff in the correct way. Right. So you have two options there. If you wanted to switch things around, you could do something like, uh, declare, um, a couple local variables, store the current start and end date values in there and then swap them around. Right.
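A minimal sketch of that check, with made-up procedure and parameter names, showing both the throw-an-error option and the swap option:

```sql
CREATE OR ALTER PROCEDURE dbo.report_by_date_range
    @start_date datetime2,
    @end_date datetime2
AS
BEGIN
    SET NOCOUNT ON;

    IF @start_date > @end_date
    BEGIN
        /* Option 1: stop and make the caller re-enter the dates.
        THROW 50000, N'@start_date is after @end_date; please re-enter them.', 1;
        */

        /* Option 2: print an informational message and swap the values. */
        RAISERROR(N'Start date was after end date; swapping them around.', 0, 1) WITH NOWAIT;

        DECLARE @swap datetime2 = @start_date;
        SET @start_date = @end_date;
        SET @end_date = @swap;
    END;

    /* Option 3: default to something sane, e.g. the last seven days:
       SET @start_date = DATEADD(DAY, -7, SYSDATETIME());
       SET @end_date = SYSDATETIME();
    */

    /* ... the rest of the procedure uses the corrected dates ... */
    SELECT @start_date, @end_date;
END;
```

Severity 0 in RAISERROR makes it purely informational, so execution keeps going after the swap.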

So you could do something like that. Right. So that’s, that’s one option. Uh, the other thing you could do is mess with people and just like give them random values. Right. I’m not saying you should, it’s an option.

Right. Uh, the other thing you could do is say like, you know, like I would print an informational message and be like, start date was greater than end date. We’re going to default to something sane. So just, you know, make the start date seven days ago and make the end date like, you know, a minute from like right now.

So you get like all that, that, all that data for that week. You certainly have lots of options in there. There are other times where you might hit some like critical processing issue where maybe like data from something else wasn’t ready or like you didn’t pick anything up.

That’s like new to do anything with. And you might want to stop there. So like, let’s say you had some, like there’s some part of a query where you’re like, okay, I’m going to put data that I need to work with into a temp table and then figure it out from there.

If you hit a situation where like nothing comes from this, there’s nothing in that temp table, there’s no need to keep processing like empty set, empty steps. Right. There’s no need to keep going like, oh, well, join to this, join to this, join to this, join to this, join to this.

Just, it’s just a waste of time. So you would probably want to just throw that out to an error here. Now, there was an article recently on SQL Server Central and it was like, how do you use @@ROWCOUNT?

And I was like, what year is this? Everyone in the world should be using ROWCOUNT_BIG(), not @@ROWCOUNT. ROWCOUNT_BIG() is the preferred method for this now.

It’s why Microsoft put it in the product. There’s no need to use @@ROWCOUNT any longer. ROWCOUNT_BIG() is the way to go. That’s also very critical for when you’re using triggers.

Now, this is not an error handling thing. Again, this is a situation handling thing, where at the beginning of every trigger, before you do anything else, just put this check in: if ROWCOUNT_BIG() is zero, bail out.

Right. There’s no need to go further from here. No need to go into all the different steps and check inserted and deleted and whatever else you do in your triggers, and try to do all this stuff and cursor over everything and audit data and all this other stuff, when zero rows actually changed.
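That guard might look like this in a hypothetical audit trigger. Note that the check has to come before SET NOCOUNT ON, because SET statements reset the row count:

```sql
CREATE OR ALTER TRIGGER dbo.orders_audit
ON dbo.Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    /* Zero rows affected (say, an UPDATE that matched nothing): bail out
       before doing any work against inserted/deleted. */
    IF ROWCOUNT_BIG() = 0
        RETURN;

    SET NOCOUNT ON;

    /* ... the real trigger work against inserted/deleted goes here ... */
END;
```

With that in place, statements that touch zero rows pay almost nothing for the trigger.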

Now, when it comes to retrying things, there are lots of good reasons to do that. I’ve, you know, in my life, I’ve written retry logic into a lot of stuff.

And I’m going to share with you like three different situations for doing that. Three different kind of, not completely different ways, but just different times, like different examples of ways I’ve done it. So stuff that I usually do is like around like either deadlocks or lock timeouts.

So sp_getapplock is another one where you can do a lock timeout thing. And those are all stored over here. So just sort of a loose skeleton, right?

Like the sort of like outline of how I do this stuff is, you know, you have some, I mean, these can be parameters depending on how you’re doing stuff. But you can like local variables are fine for these. And what I usually do is say like, hey, if my retry count is less than my max retries and success is zero, which it currently is, then I want to try to do this.

Right. So I’ll try to, you know, redo some step. This is just a stupid update query to have something in place here.

If this is, if this goes through and does its thing, then I set success to one and I don’t, I won’t reenter that loop. Right. There’s no need to, but if something happens in here and let’s say I hit a deadlock, right? These are the, these are the things that I typically look for with deadlocks, even though I think one of those is technically locked timeout, which we’ll see below.

These are the things that I typically look for with deadlocks. And if I hit one of those, I’m going to increment my retry count. And one thing that I like doing is this, where I set a delay.

So this is, this is like, like exponential back off. Right. So every time I do this, I wait a little bit longer to, to retry. So I’m not just like hammering in a loop, like real quick.

I want to like wait a little bit longer because I might be waiting for something else to finish. Right. So I might want to like back off a little bit. I’m going to, might want to back off exponentially and then retry in there.

Right. So I’ll set up this thing, and every time we enter this, the delay will get a little bit longer. And then we’ll do the WAITFOR DELAY thing in here.

If it’s not a deadlock or some other lock issue, then we’ll just throw the error and be like, Hey, I don’t know what’s going on. This is, this is not, this is not going well. Maybe it’s a primary key violation.

Maybe it’s a foreign key violation, some other constraint violation. Something else is happening here that is not good. We’re not waiting on a locking problem anymore. We’re done. And then of course, if success is still zero, we will just raise an error and say, hey, we tried as many times as we’re allowed to.
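A loose skeleton of that retry loop, with made-up names and a stand-in update. Error 1205 is the deadlock victim message; 1222 is the lock timeout one:

```sql
DECLARE
    @retry_count integer = 0,
    @max_retries integer = 5,
    @success bit = 0,
    @delay char(8);

WHILE @retry_count < @max_retries
AND   @success = 0
BEGIN
    BEGIN TRY
        /* The step we want to retry; a stand-in update. */
        UPDATE dbo.SomeTable
        SET SomeColumn = N'new value'
        WHERE SomeId = 1;

        SET @success = 1; /* made it through; leave the loop */
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() NOT IN (1205, 1222) /* deadlock victim, lock timeout */
        BEGIN
            THROW; /* key violations and friends are not retryable; re-raise */
        END;

        SET @retry_count += 1;

        /* Exponential backoff: wait 2, 4, 8... seconds between attempts
           instead of hammering the server in a tight loop. */
        SET @delay = CONVERT(char(8), DATEADD(SECOND, POWER(2, @retry_count), 0), 108);
        WAITFOR DELAY @delay;
    END CATCH;
END;

IF @success = 0
BEGIN
    RAISERROR(N'Retries exhausted; run this again some other time.', 16, 1);
END;
```

Style 108 formats the computed wait as hh:mi:ss, which is the shape WAITFOR DELAY expects.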

We’re done here. You could run this thing some other time. So that’s one way, for deadlocks. You can also do the same thing with a lock timeout. So again, the same starting thing and the same reasons for entering the loop here, but then we set LOCK_TIMEOUT.
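That variant might be sketched like this (just the inner attempt; the surrounding loop and backoff work the same way as the deadlock version described in this section, and the names are made up):

```sql
DECLARE @retry_count integer = 0;

SET LOCK_TIMEOUT 5000; /* milliseconds: stop waiting for locks after 5 seconds */

BEGIN TRY
    UPDATE dbo.SomeTable
    SET SomeColumn = N'new value'
    WHERE SomeId = 1;
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() <> 1222 /* lock request time out period exceeded */
    BEGIN
        THROW; /* some other problem; re-raise it */
    END;

    SET @retry_count += 1; /* then back off and re-enter the retry loop */
END CATCH;
```
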

And if we can’t get a lock to do this, then this thing throws an error here. And if the error number is 1222 here, then we’ll start the retry process again. Now, one really cool way of doing this is with sp_getapplock.

You can set a lock timeout right in there. So if you’re not able to get the lock you want with sp_getapplock, which again is like a weird, logical locking thing.

It’s not a table or an index or any other kind of structure. It’s just this in-memory placeholder, where if there’s a lock that exists here, SQL Server puts something in the lock manager that says, hey, go no further.

Like there’s an exclusive lock on this thing; you’re blocked there. Right? So sp_getapplock is great for that. Now, I want to make it perfectly clear:

sp_getapplock can and will cause blocking. It’s right in the name. It’s sp_get-app-LOCK, right? Like it’s supposed to do what it does.

It’s written out in the name. This is what it’s used for: take a lock and hold it, so nothing else can get in. You can use it to serialize access to code blocks, all sorts of neat things.

If you’re going to complain about sp_getapplock because you tried it and it caused blocking, well, then it did it right. Right.

It was doing what it’s named after, right there. It’s very explicit. So if you’re able to get the lock, meaning the return value from sp_getapplock is zero or greater.

Then you would begin and do your stuff. The way that I typically use sp_getapplock is I make the lock owner a transaction, right?

But that means you have to do a little bit of transaction handling to support that. So if this, if this completes, right, then you have to do like the commit transaction here, right?

There’s like a, there’s a begin transaction up here for all this, right? So this transaction is what owns this. And then if, if this is successful, then we commit the transaction after that completes. If the lock times out, then we have to go in here and we have to roll, we have to do a rollback because our transaction will still be open up there.

So when we, we get into here, we have to roll that back. So we’re not, we’re not like, we don’t have another transaction open, but then we can do all the exact same retry stuff down in here.

And we’ll do the same thing where, if it’s something else, then we’ll throw the error. And if we exhaust our retry attempts, then we’ll note that here. So that’s like three different ways that I’ve done retry things, different retry logic in different ways that I think have worked out pretty well.
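The sp_getapplock pattern described above might be sketched like this, with a made-up resource name and timeout:

```sql
DECLARE @lock_result integer;

BEGIN TRANSACTION;

EXECUTE @lock_result = sys.sp_getapplock
    @Resource = N'nightly_batch', /* made-up resource name */
    @LockMode = N'Exclusive',
    @LockOwner = N'Transaction', /* the transaction owns the lock */
    @LockTimeout = 5000; /* milliseconds to wait before giving up */

IF @lock_result >= 0 /* 0 = granted immediately, 1 = granted after waiting */
BEGIN
    /* ... the serialized work happens here ... */

    COMMIT TRANSACTION; /* committing releases the app lock */
END
ELSE
BEGIN
    ROLLBACK TRANSACTION; /* no lock; don't leave the transaction open */

    /* retry with backoff here, or raise an error, just like the other loops */
    RAISERROR(N'Could not acquire the application lock.', 16, 1);
END;
```

Because the transaction owns the lock, the commit or rollback is also what releases it; that is the bit of transaction handling the pattern requires.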

Now, there are a lot of processes that you might write, like especially batch processes that might require robustness in various ways that normal code wouldn’t. Batch processes often need certain error handling and logging methods built in.

You might want to retry certain things like I just showed you. You might need some self-healing, so if anything fails, you can restart where you left off rather than start again from the beginning and fail a bunch of times because stuff’s already done, and then move on to the stuff that didn’t get done.

I’m not going to talk much about that, because it can get really complicated depending on what you’re doing and how you’re doing stuff with the data.

But just as like, like a, like a sort of easy example, let’s say you were building a process to migrate data from like one really big wide table to a bunch of normalized tables. You wouldn’t want to like, you wouldn’t want to have to like redo a bunch of stuff in there.

You would want to be able to pick up from where you started from. Granted, there are like, you know, you can’t destroy the data in the base table yet because like people are probably still going to be using it until everything’s in the normalized table.

So there’s a lot to sort of figure out in there. Now, sort of recently across social media, I asked a question, because every time you talk about table variables, someone’s like, oh, they survive errors and rollbacks.

You can do all this stuff with them. I’m like, oh, okay, fine. So I asked around, I was like, Hey, like, like everyone talks about this. How many people have actually done it?

Cause like, I never see anyone using table variables for this. I always see people use people using table variables for the normal dumb stuff where they should be using temp tables. So like I asked the question, like, Hey, you ever done this?

Like what do you do? And most of what I got back was that people would use it for like either extended verbose logging of like, um, steps in a process, like an ETL process or something else, or they would use it to like capture errors and log those to a table because they would be able to roll back a bunch of stuff, but maintain what’s in the table variable and then dump the contents of the table variable into a logging table after the rollback.

But before the catch block completed, which is fine. One thing that I think would be great with table variables is if you were able to capture bad data with them. But I would never use them for the base of an ETL process.

Cause table variables mess up a lot of stuff, right? Like you can’t do a parallel insert into a table variable, which hurts in a lot of ways. Like if you’re like really doing ETL where you’re moving a lot of data around, that’s a, that’s bad news.

And of course, the bad base cardinality estimates: prior to the table variable deferred compilation stuff that I’ve talked about, you get either the one-row estimate or you use a recompile hint. There’s just a lot missing from table variables that would make them good for large scale ETL processes.

On top of that, you get no distribution histograms for them. There are no statistics on them. So they just fail in a lot of ways. If they had that stuff, I mean, nothing would distinguish them from temp tables, but all that stuff might be useful.
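The rollback-survival logging pattern people described can be sketched like this. The ETL body and the dbo.ProcessErrorLog table are made up; the point is that the table variable keeps its rows through the rollback:

```sql
DECLARE @error_log table
(
    error_number integer NOT NULL,
    error_message nvarchar(4000) NOT NULL,
    logged_at datetime2 NOT NULL DEFAULT SYSDATETIME()
);

BEGIN TRY
    BEGIN TRANSACTION;

    /* ... ETL steps that might fail go here ... */

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    /* Capture the details first, while we still have them. */
    INSERT @error_log (error_number, error_message)
    VALUES (ERROR_NUMBER(), ERROR_MESSAGE());

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION; /* the table variable keeps its rows through this */

    /* Now persist what survived the rollback. */
    INSERT dbo.ProcessErrorLog (error_number, error_message, logged_at)
    SELECT error_number, error_message, logged_at
    FROM @error_log;
END CATCH;
```

A temp table would lose its changes in that rollback, which is the one place the table variable behavior genuinely pays off.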

But one thing where I get annoyed with table variables is, let’s say I wanted to do something, right? I wanted to put data from some ETL process into another table, and I also wanted to figure out a typical primary key constraint violation, something like that. I wanted to know which rows messed things up. Let’s say I create this table and I do something like this: I declare a table variable called survivor, I insert stuff into my table here, and then I output the inserted data into the table variable, right? And then I can select from the table variable afterwards on the first run.

This is fine because there’s nothing in the table, right? I just created it. But on the second run, we just hit an error with a constraint violation, and nothing returns from here.

So what I think would be a really useful use case for table variables, which SQL Server doesn’t have as far as I’m aware, no one has this: I think it would be really useful if there were a way to OUTPUT, but say errors only, so that you could get the data that caused a problem into the table variable before the whole thing fails. It’d be nice if there were a way to say OUTPUT errors only into the table variable, so that you could figure out what data caused the problem, and then log that or return it somewhere and be like, hey, this is what screwed up. This is what you have to watch out for.

But of course we don’t have that, and I don’t know if we’ll ever get that. I don’t even know if that’s a thing anyone’s thought of before, but I think it would be very useful. Anyway, that’s all I have to say there. So thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I hope to see you over in the next video. Goodbye for now.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

All About SQL Server Stored Procedures: Formatting



Video Summary

In this video, I dive into the world of T-SQL code formatting, sharing my strong opinions on best practices that can significantly improve readability and maintainability. I start by addressing some common issues like tab usage (which I strongly dislike), leading commas, and inconsistent capitalization—both all uppercase and all lowercase. These habits, while seemingly minor, can lead to confusion when trying to understand complex queries or stored procedures. I also emphasize the importance of proper indentation, new lines for logical sections, and grouping related code together to make debugging and maintenance easier. By organizing code in a structured manner, you not only improve your own understanding but also make it much simpler for others who might need to work with your code in the future.

Full Transcript

Erik Darling here with Darling Data. And I’ve got a big smile on my face today because me and Bats are going to be talking about one of our favorite subjects, and that is code formatting. Now, code formatting for me of the T-SQL variety essentially comes in, well, I mean, I guess you could call it three measures. One is the way that you write the code. It’s a query, right? The way that you indent things, capitalize things, so forth. Place your commas, right? And also the way that you organize code, so larger groups of code, which you might associate with something like a stored procedure. You are doing all of the things that sort of belong together, all together. So when you declare variables, you’re doing that all in one place. If you’re setting those variables equal to something else, you’re also doing that as much as possible all in one place. We’ll talk about an exception for that. You’re creating temp tables, you’re doing that all in one place and so forth because one of the worst things that I run into when dealing with client code is stuff that’s been added on over the years where, you know, you’re reading through a stored procedure and you’re like, oh, I’ve accounted for all these things. And then you get 2000 lines in and you realize some other knucklehead has, you know, added all these other things and declared stuff and created tables like down through the second half of the procedure.

And now it’s just like this mishmash, right? It’s just like it makes things a lot harder to follow. And when you’re trying to figure out the things that are in play, what you have to work with, it is much easier and much more convenient to have everything as much as possible together. So that’s what we’re going to be talking about in the formatting episode of our journey through SQL Server stored procedures. But before we do all that, of course, we have to talk about the normal sort of stuff and how you can engage with your dear friend, Erik Darling, more. If you would like to support this channel, helpful little link right in the video description. You click on it, you give me four bucks a month, we’re all happy. You can choose more than four bucks a month if you’d like. But one pre-tax single shot of New York espresso is just about as low as you can go there.

If you are for some reason opposed to caffeinated beverages, you can comment, you can like, you can subscribe. And if you would like to ask me a question that shows up in an Office Hours episode, there’s a link also in the video description that will take you to this thing and you can submit your questions. If you are beyond the help of YouTube lessons and basically anything else, and you need help performance tuning your SQL Server, I am available to do all of these things. And as always, my rates are reasonable. Also reasonable are my rates for SQL Server training. I’ve got about 24-plus hours of it. You can get it for about 150 USD. That belongs to you for the rest of your life. No backsies. Don’t want it back. You’ve probably spit on it.

As far as upcoming events go, SQL Saturday New York City 2025 is taking place May the 10th. You should attend; if you are a company watching this, you should sponsor. And we have a pre-con on May the 9th, all about performance tuning, with Andreas Wolter. So, he’s a smart fella who used to work at Microsoft. Probably the smartest thing he’s ever done is leave Microsoft. But you know, we’re all happy to have him back in the real world. So, with that out of the way, let’s talk about code formatting here. Now, that was the wrong tab to have up front. So, first, there are a list of things that I have rather strong feelings about.

I strongly dislike tabs strongly dislike tabs I realize that there every time I talk about my dislike for tabs there is a chorus of people talking about the the visually impaired and how when they use tabs they can do things better and and if you are in that situation if you’re in that condition and you and you want to use tabs I don’t blame you I used to work with a fellow who did QA he was in real tough shape and in the eyeball department you would walk by his desk and there would be like a single letter on the screen like I don’t I don’t know how he got anything done but he was he was very good in the QA in the QA department apparently he had he had worked out whatever whatever he had to get whatever he had to work out in order to to to be very proficient in his job but for me when I’m writing t-sql and let’s let’s be honest most of the time I am maintaining this t-sql on my own there are there are a handful of contributors who I am ever grateful for but I find tabs or more specifically when tabs and spaces get mixed to be quite offensive there’s nothing worse than when you’re trying to like format and line up your code and you’re like you know you hit enter and then you like delete two spaces and then you go to delete that third space and it jumps back like a fight two full tab stops for some reason you’re like a lot of stuff going on in there other things that I find offensive are leading commas anyone who says it makes taking things in and out of the select list easier has naked never taken the first thing out of this select list so that that causes problems for the 40 other columns you’re selecting things that I dislike but perhaps feel somewhat less strongly about all uppercase and all lowercase all lowercase of course just looks strangely juvenile and all uppercase looks like Oracle shouting I really dislike when new things are not on new lines so things like from join on where and or order by and group by I guess I guess partition by should be in that 
list too I think all of these things are deserving of their own lines so that you can figure out exactly where things start and end it becomes very confusing when you know just like it was like you’re reading a windowing function or something and it’s just like row number over partition by order like all across on one line these things deserve your attention select queries where it’s like select long select list from table and then like down here it’s like where how did you get that like put stuff where everyone can see it it’s especially helpful because you know you are generally reading t SQL in a very like sort of top-down fashion and being like often like the list of columns in the select list is maybe not quite as important as figuring out like oh we’re selecting from these tables join to these tables you know things like that so this stuff really does deserve new lines a lack of indenting and t SQL makes things incredibly confusing and proper indenting really does lead to it being a lot easier for you to figure out which blocks of things belong together especially if you are doing the in and end blocks if you have parentheses contents it makes it very much much easier to figure things out when things are indented and you can see this block belongs here belongs to this thing leaving as out of table aliases it’s just somewhat irksome there was actually kind of a funny story here is there’s one client who I was working with and we were looking at their code and we realized that they like they were using the old style like from table no lock hint right not with no lock but no lock was spelled wrong and they were like wait how does that even work and I was like because SQL Server is not interpreting no lock with two K’s is a hint it’s it’s it’s interpreting it as a table alias so that was kind of funny I strongly prefer the column equals expression aliasing because it lines up all the column names where you can read through them and see where things are when it’s 
When you write out a whole complex, long expression and then alias it AS something at the end, there's a context switch: you have to look all the way over to the right to see the column name, while all the other column names are somewhere else. Not so great.

TOP expressions without parentheses should be banned. I hate those.

Aligned parentheses tend to work out much better, because it's a lot easier to mentally grasp what belongs in a group of things. When you give things weird little hugs instead, it just looks strange; it doesn't look good. There are other things I'll usually tweak and tinker with as I'm formatting code, but that's about the list of the most important stuff for me.

When you think about formatting code, though, like I said earlier, it's not merely about the indenting and the grouping. When you're writing code in a procedure, it's really helpful to keep everything that performs a certain action together: declaring variables, setting variables, creating temp tables, and so on. It makes a lot more sense to do all of that in one place, so you know which section of code is responsible for it, and if there's ever a problem, you know where to go. If you're declaring variables 2,000 lines into a stored procedure, or creating a table 3,000 lines in, and there's ever a problem with it, it's not up with the rest of them; now you have to go way down there to deal with it. That's annoying, and not a good way to spend your time.

The things I'll lend some leniency to in this regard are SELECT ... INTO and cursor declarations. I'll allow some leniency with SELECT ... INTO because there are certain times when you just want to do a simple SELECT ... INTO, and it makes life easier.
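As a sketch of the grouping idea, a procedure body might keep each kind of setup together in clearly separated sections (the procedure and object names here are made up):

```sql
CREATE OR ALTER PROCEDURE dbo.hypothetical_proc
AS
BEGIN
    SET NOCOUNT ON;

    /* Section 1: validate and fix up parameters, all in one place */

    /* Section 2: declare and set variables, all in one place */
    DECLARE
        @start_time datetime2(7) = SYSDATETIME(),
        @rows bigint = 0;

    /* Section 3: create temp tables, all in one place */
    CREATE TABLE #work
    (
        id integer NOT NULL PRIMARY KEY
    );

    /* Section 4: the actual work starts here */
END;
```

If something goes wrong with a variable or a temp table, you always know which section to jump to.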
Perhaps there's some other benefit to it: maybe, for some reason, creating the table first and doing the insert just isn't getting you a fully parallel insert, even with a TABLOCK hint. SQL Server gets weird sometimes. So sometimes a SELECT ... INTO is necessary, and if that's way down in a stored procedure, I get it; it's not completely off limits.

The other thing I'm lenient with is declaring cursors, and I'm going to show you an example where I was even lenient with myself on that. It's perhaps extra strange for me, because I strongly prefer cursor variables over regular named cursors. The cool thing about cursor variables is that you don't need to close or deallocate them. Technically, with a named cursor, you only need to DEALLOCATE, because DEALLOCATE closes too, but with cursor variables you don't have to remember to do either one. SQL Server scopes them like a normal variable and closes them out when you're done with the block of code.

So let's take a look at my stored procedure, sp_LogHunter. I'm using this one because I think it's the shortest of my procedures, and that's good for all of us, because I've already been talking for almost 13 minutes and we haven't looked at any code yet. I don't want to spend a half hour going through sp_HumanEvents or sp_QuickieStore, where there's an abundance of code, though I will say that if you ever want good examples of formatting dynamic SQL, there are very good ones in those procedures.

So, here's sp_LogHunter. We talked about ANSI settings before. I put a lot of this stuff at the top because it's what people should be aware of and care about when they're doing things with stored procedures. I'm never going to call this from an application that might have different ANSI settings, nor am I overly concerned here about filtered indexes, computed columns, or indexed views.
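The cursor-variable point can be sketched like this; note that for a named cursor, DEALLOCATE alone is enough, since it closes the cursor too:

```sql
/* Named cursor: you have to remember to clean it up yourself.
   DEALLOCATE also closes, so the CLOSE step is technically optional: */
DECLARE n CURSOR LOCAL STATIC FOR
    SELECT d.name FROM sys.databases AS d;
OPEN n;
/* ... FETCH loop ... */
DEALLOCATE n;

/* Cursor variable: scoped like a normal variable, so SQL Server
   closes and deallocates it when it goes out of scope: */
DECLARE @c CURSOR;
SET @c = CURSOR LOCAL STATIC FOR
    SELECT d.name FROM sys.databases AS d;
OPEN @c;
/* ... FETCH loop ... no CLOSE or DEALLOCATE needed */
```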
It doesn't matter much where I put those settings, but up top is where people will see them. Then there's a little bit of green text, a little information about what's going on in here, and where to go for support, even though everyone just sends me emails anyway. And then, of course, we have our parameters. I like to document the parameters right where they live, in case you're just looking at the script file, but I also like to do that in a help section.

SQL Server has a couple of very helpful views for this: sys.parameters, sys.objects (or sys.all_objects), and sys.types. With those, you can get the list of parameters for a specific stored procedure and put them in a help section that explains what each one means. You can put all sorts of good information in there; I like to include a description, valid inputs, and default values for each parameter, so everyone knows where they're going. Then there's the MIT license stuff, because I'm cool like that.

Moving into grouping stuff where it belongs: before I get into most anything else, I want to make sure I'm conditionally able to execute the procedure at all. That means checks like making sure we're sa, which is a good way to figure out whether I'll be able to run this thing. If we're on RDS, I can't read the error log the same way I can on regular SQL Server. Managed Instance does let you get at the error logs, I'm pretty sure, but Azure SQL DB does not. RDS does let you get to the error log, but through a different procedure that doesn't have the same parameters or the same way of searching. I'm thinking about working out a fix where I'd just read in the whole error log and then run the search against that, but I don't know; we'll see.

So here's our check for Azure SQL DB, where we won't attempt to run if that's what we're on.
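A help section can be driven off the catalog views mentioned above. Something along these lines (the procedure name is just an example) returns a stored procedure's parameter list:

```sql
SELECT
    parameter_name = p.name,
    data_type      = t.name,
    max_length     = p.max_length,
    is_output      = p.is_output
FROM sys.parameters AS p
JOIN sys.types AS t
  ON  p.system_type_id = t.system_type_id
  AND p.user_type_id   = t.user_type_id
WHERE p.object_id = OBJECT_ID(N'dbo.sp_LogHunter')
ORDER BY
    p.parameter_id;
```

Descriptions, valid inputs, and defaults aren't in the catalog views, so those get joined in from a hand-maintained list inside the help section.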
There's one other check that results in a return, with a full error: if someone searches for a language_id that doesn't exist. If you search for language_id 999999 or something, that's not going to happen for you.

Then there are fixes to things. If parameters are set to odd values, I'll attempt to psychically correct them, so the stored procedure runs without any weird, mixed-up stuff going on. There's a bunch of that, and then, a little further down, this is where I declare the variables I need to do different things in the procedure, and it all happens in one place. Since there isn't a whole lot going on in this one, there's a very limited list of variables, and I can set them easily, right here where I care about them. Down here is where I create my temp tables, and that also all happens in the same place: all the temp tables get created, and then we go and do some work; this is where we actually start doing things in the procedure.

I talked about this in another video: this debug statement is all on one line. Maybe not the most consistent thing, but sometimes it's just a little more convenient to stick a debug statement in like that, so your stored procedure isn't 10,000 lines of perfectly formatted debugs. This isn't too bad. One thing I do like to do when I'm debugging with temp tables is include the table name in what I'm selecting, so I know which results I'm looking at. You saw I had something like six or eight temp tables up there; what I don't want is a bunch of selects from temp tables with no idea where the results are coming from, especially if one of them is empty.
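Putting the temp table's name into the debug select is cheap insurance when several result sets come back at once. A minimal sketch, with made-up table and parameter names:

```sql
IF @debug = 1
BEGIN
    /* The literal column tells you which temp table each
       result set came from, even when one of them is empty: */
    SELECT table_name = '#error_search', es.* FROM #error_search AS es;
    SELECT table_name = '#error_log',    el.* FROM #error_log    AS el;
END;
```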
If one is empty, what do you even do? It's going to be hard to figure out what you're dealing with. Anyway, some pretty boring stuff in here.

There was something else I wanted to get down to, and I believe it was the cursor variable; that's right here. This is another place where I'm lenient about when things get declared and set up: I declare the cursor variable here, and I set the options for it here, because with stuff like this it just makes more sense to have it declared where it's going to be used. There aren't a lot of other cases where I feel that way, but for me, declaring cursors is one of them. There's not a lot of sense in declaring a cursor way up at the top and then having someone wonder, "Wait a minute, when does this cursor happen?" Declaring the cursor right around the looping code is the smartest thing to do; that makes the most sense to me.

Then there's more helpful debug stuff. This one is a multi-line debug; I do have the BEGIN and END and everything on its own line. And down here there's just a little bit more debug stuff.

There was something else I was supposed to be looking for in here. Oh, right: returning the results, which is sort of the important stuff. When we're returning the results, again, make sure the table name is in there, so I know this is the result of everything that got put into the error log table, the messages we cared about. That whole big block of stuff way up top is all the messages I want to search for in the error log; it's the important stuff, where it might be something wrong or might be good informational stuff. Down below is where I select out of that table.
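Declaring the cursor right next to the loop that consumes it might look like this (a minimal sketch that just walks the database list):

```sql
DECLARE
    @database_name sysname,
    @c CURSOR;

/* Declared and opened right where the loop lives,
   so nobody has to scroll up to find it: */
SET @c = CURSOR LOCAL STATIC FOR
    SELECT d.name FROM sys.databases AS d;

OPEN @c;
FETCH NEXT FROM @c INTO @database_name;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @database_name;
    FETCH NEXT FROM @c INTO @database_name;
END;
```

Because @c is a cursor variable, there's no CLOSE or DEALLOCATE to forget at the end.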
Then, if we hit any errors in the stored procedure, I log those to a temp table. If any of the command statements in there error out, those get caught, and I return them, but only if there's actually anything in the table.

So, apart from the actual query formatting stuff that I feel strongly about (tabs, commas, new lines for things, uppercase and lowercase, parentheses, indenting, things like that), it really does make your life, and the life of everyone who has to deal with your code, a lot easier if you put everything that performs a specific function together: declaring variables, setting variables, creating temp tables and table variables, things like that. It really does behoove you to do this, so that when other people go in to work with your code, they're aware of just how much of these things they're going to have to look at, work with, and deal with. It also just helps to be well organized when you're writing larger, complex pieces of code, because there's probably going to be enough complicated query logic without a whole lot of disorganized code logic on top of it.

Anyway, that's how I feel about formatting. I hope you enjoyed yourselves, I hope you learned something, and I hope you'll take some of this to heart. When you're working with queries and stored procedures, start organizing and formatting, get things looking a little nicer and tidier, and contribute to nice, clean, readable code. When folks out there in the world start talking about how CTEs make things more readable: they don't. They sure don't. Nicely formatted code makes things a whole lot more readable and understandable than all the CTEs in the entire world.

Thank you for watching, and I'll see you in the next video. Goodbye!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.