SQL Server Performance Office Hours Episode 4
| If I’m trying to resolve a consistent deadlock by increasing the transaction isolation level, am I doing anything obviously wrong? I thought it was a bad sign when I was stuck between picking Snapshot and Serializable. |
| You have said that table variables, CTEs, Change Tracking, and Azure Managed Instances all suck. Do you have a full list of “features” to avoid? |
| My company wants to move to Azure, but Azure DB performance sucks. What do I do? |
| Any experience with using a UTF8 varchar collation vs an nvarchar data type? Seems like in most circumstances UTF8 is just unambiguously better for for performance and storage on net-new development on SQL 2019+ |
| why SQL Server sometimes change it execution plans when maxdop change from n to n+1 (n>1) (eg maxdop=14 and maxdop=15). No data changes, no statistics changes, no confif changes… only maxdop changes between runs |
To ask your questions, head over here.
Video Summary
In this video, I delve into some of the questions submitted by our community during Office Hours. Starting off, we tackled the issue of using higher transaction isolation levels to resolve consistent deadlocks, emphasizing that while Snapshot Isolation can be beneficial in certain scenarios, stricter isolation levels generally increase lock contention and are likely to exacerbate deadlock issues rather than solve them. Moving on, I discussed various T-SQL constructs like table variables and Common Table Expressions (CTEs), highlighting their performance pitfalls and the reasons why they might not always be ideal choices for complex queries. The conversation then shifted to Azure SQL Database and Managed Instances, where I expressed my disappointment with these offerings, suggesting that while there are occasional use cases, in most scenarios, Azure’s products fall short compared to alternatives like Amazon RDS.
Throughout the session, we also touched on the nuances of UTF-8 collation versus N-VARCHAR data types, exploring how they might impact performance and storage in SQL Server 2019 and beyond. The discussion underscored that while UTF-8 can offer advantages for certain datasets, its benefits are not universally applicable, and careful consideration is necessary to determine if it’s the right choice for your specific use case.
Full Transcript
Erik Darling here, and gosh, you know what? It’s time for another Office Hours, where I answer your user submitted questions from the bottom of my heart. If you would like to support the things that I do here for you on YouTube, on your tube, you can sign up for a membership. They’re cheap, they’re easy. It’s a fun way to say thank you. About 60 other people have done it. So far, so, you know, you wouldn’t be alone in your gratitude. If you don’t have any spare change, you can’t hit the tip jar, because you’ve already tipped your barista $30,000 this month, you are more than welcome to do other things like like and comment and subscribe. And if you would like to be a part of the Office Hours magic, if you would like to ask me a question and have me answer it, that link also down in the video description. So I save you a lot of work. If you would need help beyond just asking a question on the internet, which many of you do, I am available as a consultant with very reasonable rates for all of these things. Health checks, performance analysis, hands-on query and index, server tuning, you name it. Putting out performance emergencies and training your developers so that you have fewer performance emergencies.
All together. If you would like to get some training from me to sort of reinforce any of the concepts and topics that we talk about here. I have about 24 or 25 hours of training available for about 150 USD when you use that discount code. And that lasts for life. It is not a subscription product. If you die and are reincarnated, you would have to resubscribe. I cannot offer. I cannot offer. I cannot. I cannot offer. I cannot offer. I cannot offer. I cannot offer. I cannot offer. I cannot offer transfers of the training to new souls. That is beyond my capabilities. I am not an omnipotent being. So I’m sorry. You’re just going to have to deal with that. SQL Saturday. SQL Saturday. New York City 2025 taking place May the 10th with a pre-con on May the 9th. Some performance tuning stuff with Andreas Volter. You can catch those things coming up in three months about. Around about three months. Close enough.
Yeah. Yeah. Today’s November. Sorry. February plus three is May. So that’s five. Good. All right. Finger math. All right. Let’s do these office hours. Okay. We’re going to hit up Excel and we are going to zoom in. We are going to answer this first question right here.
If I’m trying to resolve a consistent deadlock by increasing the transaction isolation level, am I doing anything obviously wrong? I thought it was a bad sign when I was stuck between picking snapshot and serializable. Gosh, you have got yourself all twisted up, my friend. No, higher isolation levels increase the lockiness.
So first, let’s separate snapshot and serializable. They do have some guarantees in common, but the mechanism by which they provide those guarantees are much different. So if you’re using serializable or repeatable read to try and reduce deadlocks, you’re not going to have a good time. Both of those isolation levels just increase the likelihood of deadlocks because they are far more intrusive.
Snapshot could certainly help deadlocks between read queries and write queries and to write queries, assuming that you are set up to accurately use the snapshot isolation level for both of those. And you are prepared to handle errors when there are conflicts between them. Read committed snapshot isolation. Since you did not provide further detail, you just said snapshot.
There are two optimistic isolation levels with snapshot in the name, read committed snapshot isolation and snapshot isolation. And snapshot isolation, the SI, could certainly help with rewrite conflicts as long as you are prepared to handle those or conflicting write queries that are deadlocking. Read committed snapshot isolation would not help with that.
Would not help with to write queries, but would certainly help with read queries and write queries. There’s not really much else for detail in this one, so there’s not really more I can say here. But I would urge you to not use stricter isolation levels to reduce deadlocks because that will have the opposite effect.
So, yeah, careful there. All right, moving on. Our next wonderful question here is, you have said that table variables, CTE, change tracking, and Azure managed instance all suck.
Do you have a full list of features to avoid? Well, we do have to separate these things into some categories. So, table variables and CTE are T-SQL constructs, which require a very carefully informed choice to pursue the usage of.
Table variables have many issues, performance issues when it comes to SQL Server. You can’t insert, update, or delete using a table variable. So, like with the table variable as a target, using a parallel execution plan.
So, if you have like a big heavy hitter insert query to find the data you need to put in there, having it run with a parallel query plan is probably going to be a lot better than having it run with a single threaded query plan. Table variables on their best day only get table level cardinality. You don’t get a statistics histogram with a table variable.
So, two fairly big downsides for the serious use of table variables in queries that have high performance demands. They can be very good for high frequency execution queries where the ultimate plan shape doesn’t affect query performance because all the stuff SQL Server does to temp tables to make them useful for performance reasons with more demanding queries doesn’t get done to table variables.
So, the caching, creating statistics, stuff like that. Change tracking, I mean, yeah, I tend to hate change tracking. I like change data capture a whole lot better.
And Azure managed instance is, it’s just a, it’s such a sad story with Azure managed instance. I really had high hopes for it and boy did that let me down. But these are all quite different categories of things.
I don’t really have a full list of things that you need to strictly avoid. I would, I would put Azure SQL DB and Azure managed instance on that list. But we’ll probably talk about that a little bit more when we get down to the next question.
Azure managed instance isn’t really a feature. Azure, it’s just a crappy way to host SQL Server. So, no, I don’t really have a full list of stuff.
But I do try to, you know, put it out there that, you know, all of these things are in the product for a reason. There are often reasonable uses for things. My main goal as a consultant though is performance tuning.
My job is to make queries go as fast as possible. And my disdain for a lot of the things that I talk about comes from the fact that they often don’t have good performance repercussions when used. So, like CTE in SQL Server, there’s no way to materialize them.
Unless you just dump them into a temp table on your own. Pound sign temp table. When you re-reference them, the whole query inside them executes over and over again.
That’s often not the behavior that a lot of people are after. So, like my experience with these things is generally just from a performance point of view. There are perfectly good reasons to use them when performance either doesn’t matter or when they’re not causing a performance.
They’re not the reason for a performance problem. Change tracking bugs me because it adds overhead to every single insert, update and delete. You have to not only maintain the table, whatever triggers are on there, whatever index views might have that table included in them.
Whatever foreign keys might be evaluated when that table gets modified. But, like all the indexes on that table. But now you also have to maintain the change tracking tables.
And like that all just sort of, like it just adds up and adds up and adds up. You’re just not doing yourselves a lot of, like any favors by adding more overhead to every transaction. That’s why I like change tracking because it’s asynchronous.
It works off the transaction log after things get done. So, like, you know, for like the T-SQL stuff, you know, again, if you’re going to use whatever, however you’re going to write your query, make sure that you are completely informed about the choices you’re making and the way you form the query.
Whenever you want to add in, like, a new thing, whenever you want to make SQL Server start doing a new thing on top of all the other stuff it’s doing, just remember that that’s not going to be free, right? Like, absolutely nothing is free.
Like, auditing stuff isn’t free, whether it’s triggers, like a built-in audit, you know, monitoring data changes, whether change tracking or change data capture, neither of those are free. Like, even change data capture, which I prefer, it’s a lot more data to hold usually, right?
Because you have this, like, shadow change copy of the table living in the CDC schema. Like, nothing is free. There are times when you can use these things and they will not have any awful performance repercussions on your server.
But, like, at some point, like, everything, with the exception of managed instance, and, like, I don’t know, if you have, like, a useless toy 5G database workload, maybe general purpose managed instance will do okay for you.
If you have actual needs, you’re gonna have to pay out the nose to the back teeth, up, up, up the backside, money-wise, to get the business critical version, which just isn’t worth it. Like, it’s just not worth it.
It’s too much money for what you get. And adding the fact that Microsoft support is, like, just a bunch of blind rats, like, running around, bumping into each other. Like, there’s no point in using Azure for anything.
So, but, you know, we’re getting a little bit ahead of ourselves. But, like, these are, like, you know, the way you write the query, that’s one thing. This is, like, you wanting to add something into the workload with, like, temporal tables or something.
Like, all these things have overhead and all these things have drawbacks. But let’s actually answer this next question. My company wants to move to Azure.
Lo siento. Microsoft must have offered them a sweet deal if they want to move to Azure. But Azure DB performance sucks. What do I do?
Like, really, it depends on how sweet of a deal they’re getting. Microsoft is not below or above bribing people to use their services. Like, cutting them deals, offering them money for, like, like, consulting, like, fake consulting bucks that they can, like, pay people, like, Microsoft partners to, like, do work with.
There’s all sorts of stuff that Microsoft will underhandedly do to get people to start using Azure so that they can have a bad time once they’re there. You know, and, like, really, like, the only alternative cloud provider that I think feel like does things right is Amazon. RDS is a much better product.
Not perfect, but I think a much better product than Azure SQL DB or Azure Managed Instance. And, you know, like, the VMs in Azure are, I mean, they’re probably about fine. But I think what kills me is every time I think about it, like, Microsoft owns SQL Server.
They could have just run away with an awesome hosted managed product for SQL Server. They could have run the board with it. But Amazon just did it so much better.
I don’t know how that happened. I don’t have an explanation for you. But, yeah, you’re right. Azure SQL DB and Azure Managed Instance are pointless, pointless products. I wish they’d disappear.
The only thing that you could possibly do is make, like, either a feature or performance comparison case with Amazon RDS, which I assume would be commensurate to what your company is looking at in Azure. But if Microsoft is just throwing money at people or throwing service at people, you might not be able to avoid it.
You just might be stuck. Where your company is going to be pissed is when, like, after a while, they start, they realize that, like, you know, either the Microsoft money runs out and the bill goes up to normal or they find out that, like, that money is not going to go as far because the stuff that, like, where you land in Azure wasn’t nearly what you actually needed in Azure. And all of a sudden the price goes way up, right?
Like, that’s where people start getting pissed. And then, you know, it costs an arm and a leg to move from one cloud to another. So, like, the only thing you can do is, like, get way ahead of the curve and offer, like, a price performance feature comparison and just pray that Microsoft hasn’t bought them already. That’s all you can do.
Next question here. Next question here. Any experience… There we go.
Any experience with using a UTF-8 VARCAR collation versus an N-VARCAR data type? Seems like in most circumstances UTF-8 is just unambiguously better for performance and storage on net new development for SQL 2019 plus. Yeah, so the new UTF collations are interesting because they’re still VARCAR, but they have, like, the extended character sets.
Not the special characters, but, like, they have a lot of, like, the Unicode… Like, they can have a lot of the Unicode characters in them. I am not a big collation person, so I actually, like, doodle around the internet a little bit trying to find, you know, some information that I could be useful to pass on for you.
And what I got over here is… I’ll put these in the video description. There’s a very good article on collation and Unicode support on the Microsoft Docs site.
There’s also a pretty good question on the DBA Stack Exchange website with a couple pretty good answers. One of them by a fellow named Solomon Rutsky, who is a big, big, big collation person, where they go through… Where they both talk about some stuff.
What I found really interesting in this answer was this chunk of stuff. And it probably helps to have a little bit more of this front first. So, like, where…
So, like, the only scenario that truly benefits from UTF-8 encoding is one in which all of the following conditions are true. And these are the following conditions. Data is mostly standard ASCII values, but has a small amount of varying range of Unicode characters. The column is currently or would otherwise be in VARCAR Max, meaning it wouldn’t fit into VARCAR 4000.
There’s a lot of data for this column or set of columns. It’s one gig or more when stored in VARCAR. So, I don’t know.
Once you start getting into some of the longer stuff with a little bit more nuance, I don’t know. Unambiguously better, I don’t know about that. But, you know, you might check enough of the boxes here that you might have a better time with it. So, potentially yes, but I don’t know for sure.
Alright. Let’s come back over here. And let’s try to answer this one. This is an interesting one.
Why SQL servers sometimes change at execution plans when max.dop change from n to n plus 1, n greater than 1? E.g. max.dop equals 14 and max.dop equals 15. No data changes, no statistics changes, no confif changes.
I’m gonna assume that’s config. I’m gonna take a wild guess at that one. Only max.dop changes between runs. So, I’m trying to parse this a little bit.
What I’m unclear on, a little unclear on, is if you’re asking me why the dop of a query might change if a plan changes, or if the execution plan changes if you specify different dops. The max.dop thing, like if you specify different dops for a query, you can certainly see different query plans. I actually have a blog post about that.
And I’ll put this in the show notes as well. And this was something that sort of surprised me a little bit. What I found out of it was that, like, the execution plan changed drastically when going from dops 1 to dops 2 and then above dops 2.
The reason why I found it interesting is because, you know, like, when you’re learning about parallel execution plan stuff, like, and you know, you start sort of like learning the process for it. What happens is someone will inevitably tell you that all queries start as serial execution plans, which is true. And if the cost of that serial execution plan is greater than your cost threshold for parallelism, and there’s nothing like a scalar UDF or an insert or modification where the table variable is the target that would prevent parallelism for the entire plan, then SQL Server will explore parallel execution plans.
If the parallel version of the serial plan is cheaper than the serial plan, then SQL Server will choose the parallel plan. Now, what kind of gets lost in there is that SQL Server, when it starts exploring parallel plans, will explore different parallel plans from the serial plan. It’s not just like, if we add parallelism to the serial plan, like, is it cheaper?
Like, though that is a choice, SQL Server will also explore other parallel plans. So the query plan for this thing at dops 1 looks like this, right? You have a couple scans, merge join, top, scan, oops, scan, another merge join.
And then at dops 2, the query plan looks like this. You get a couple scans, you get a hash join, you get a top, you get another top. And then you get over here, you get a nested loops join and a seek.
And then at dops higher than 2, the query plan changes again, where it almost looks like a parallel version of the original serial plan. Remember, this is the original serial plan, where it’s scan, scan, scan, merge join, top, merge join, right? And then the parallel version of this plan from dops 3 to 8 is scan, scan, merge join, top, scan, merge join, but now just with parallelism operators in it, right?
So if you change the dop of your query, if you say, you know, run it max.14, now run it max.15 or whatever, SQL Server might choose a different plan because you have more, like with more threads involved, a bit like the shape of the plan might make sense to do other stuff. And remember, you’re still going through all the cost-based optimization stuff at these different dops. And if you’re using different dops, you’re essentially telling SQL Server to make a new plan.
And so SQL Server is like, well, I’ve got this extra thread, and this other plan might be better with this other thread. I can do this more efficiently. Like, so, you know, it’s like you’re still going through all the cost-based optimization stuff.
The different dops just give SQL Server, like, another thing to think about what it might do some different work with. Now, you say that, like, data hasn’t changed, statistics haven’t changed. A lot of people say that, and it’s not exactly true, but I’ll go through and believe you on this one.
Most likely it’s just going to come down to SQL Server compiles a new plan, goes through cost-based optimization, right? Because you’re changing the allowed dops so SQL Server isn’t going to reuse a plan. It’s like sending a new query in effectively.
And when SQL Server goes and evaluates different plans with, like, even just one more dops, right? Like, this is dops 2, right? Where SQL Server makes these choices.
And then dops 3 to 8, SQL Server makes these choices. So you might be hitting, you know, one of the SQL Server’s famous tipping points where going from dops 14 to dops 15 changes SQL Server’s mind about, like, what would be the cheapest plan during cost-based optimization. So that’s what I got there.
But anyway, that brings us to the end of these five questions for this Office Hours episode. I’d like to thank you for watching. I’d like to think that you enjoyed yourselves.
I’d like to think that you had a good time. And I’d like to think that I’ll see you again in the next video. But I’ve been wrong before. I’ve been misled. Led astray.
Lied to. Let down. It’s a sad, sad life. It’s a sad life. Anyway. I’m going to delete these and try to figure out what to talk about next.
Find some nice uplifting SQL Server messages for everyone. But anyway. Thank you for watching.
Goodbye. Hello, I’m my friend. Bye-bye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Was there supposed to be a video?
Stop reading Beer Gut Magazine and fix it, dammit.
It is fixed! Thank you!