The Broken fn_xe_file_target_read_file DMF In SQL Server

The Broken fn_xe_file_target_read_file DMF In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into a frustrating issue with the `sys.fn_xe_file_target_read_file` dynamic management function in SQL Server, which has been broken since 2017. Despite Microsoft’s addition of an event date column to make filtering easier, something went terribly wrong due to uncoordinated development efforts. I demonstrate how this defect impacts query results and execution plans, showing you a workaround using explicit date conversion. Additionally, I share my experience with the `SP_Human_Events` stored procedure, which I’ve developed to handle similar issues efficiently. The video concludes with a thought-provoking question about the quality of production code in Microsoft’s software products, encouraging viewers to consider the thoroughness of testing and the potential for hidden bugs.

Full Transcript

Erik Goshdarnit Darling here with Darling Damnit Data. And in today’s video, I’m a little fuzzy. Let’s fix that fuzziness. Let’s take the fuzz off. I don’t need help looking any fuzzier, do I? Things are fuzzy wuzzy enough in this face. Today we’re going to talk about a rather disappointingly broken dynamic dynamic dynamic dynamic management function related to extended events. Now this is one of those things that for as long as, so this function used to be fine, but then in 2017 Microsoft added an event date column to it so that you could filter out date stuff from the DMF rather than going all the way into the XML to do it, which, sounded great. Except the summer interns were up to their old hijinks and apparently they did not talk to the Windows file system folks because something went terribly wrong. I’m going to show you what went terribly wrong and how you can fix it. Now this is now, you know, extended events have been the replacement for Profiler since 2012-ish, right? I mean, I know that they dropped in 2008 and that’s when they were supposed to launch it.

like really get awesome, but they just never did. Microsoft adds a lot of extended events every version of SQL Server, but the whole process of collecting and, you know, like analyzing extended events has always been awful. That’s why I’ve spent thousands of lines of code and hundreds of hours of my life writing stored procedures like SP human events to overcome some of the large gaps that I’ve found working with extended events generally. But before we do that, I’m going to blow your minds on this beautiful Friday and I’m going to tell you that you can support this channel for $4 a month by signing up for a membership at the link in the video description. If $4 a month is just too much for you, you would rather stuff that money in your mattress and hope that it doesn’t get moldy and rotten, get mouse eaten or rat eaten or something.

You can like and comment and subscribe. You can fill my heart with joy by your mere presence on the internet. If you have a SQL Server that you need consulting help with, that’s what I do for a living. Also shocking, I know, I am not just a YouTube celebrity. I also get my hands dirty and do actual work with SQL Server that other people either don’t want to do or don’t have time to do or can’t figure out what to do. So that’s me. And as always, my rates are reasonable.

If you would like some very high quality, very low cost training and you want to spend money before all those Black Friday sales kick in and people want to charge you way more money for SQL Server training content, you can get all of mine. About 24 hours worth for the rest of your God-given life for 75% off, which brings you to about $150 USD. It’s a hard deal to beat. Black Friday or no Black Friday.

If you are still on the fence about Past Data Summit and you’ve never seen one of my videos before, you can rest assured that I will be there November 4th and 5th co-hosting two days of wonderful performance tuning pre-cons with Kendra Little. It will be more fun than you’ve ever had in your life. Probably more fun than you thought could possibly be had with SQL Server.

So you could do that too. But with that out of the way, let’s have fun. All right. So this is a very quick Friday video.

Now, this is the DMF, the Dammit Management function that I’m talking about. Sys.fn.exe file target read file. Rolls right off the summer intern’s tongue, doesn’t it?

Now, this issue has already been logged here and I’ll put the link to this in the video description. But this is a known defect and Microsoft has just always been like, yeah, maybe we’ll do something about it eventually. I don’t know.

I’m on SQL Server 2022. It’s been broken since SQL Server 2017. Will they ever fix it? I don’t know. But we got ledger tables. We got dot feedback.

Priorities. Priorities. Maybe, maybe does no one uses extended event. Maybe that few people use extended events that they just don’t care.

That could be the case. But here’s the problem. If you were to query this DMF and you were to think that this timestamp UTC column would be something that you could very easily filter on with some date math, say, I want to find the last seven days of data, you would be horribly wrong. You would be mistaken.

Something in the file format of the system health extent, the XEL files. This is not limited to the system health extended event. This is all of them.

It uses like, I think it’s called like Windows epic time or something like that. And it’s just a weird number because it’s an epic, right? It’s just a big, long integer and it doesn’t get filtered correctly when you do this. Right?

It’ll get displayed correctly. It’ll get converted to a timestamp in UTC when you query it. But when you actually touch the file, things go awfully wrong. The only way to get around that is to write your query like this and actually convert that column to the correct date time to with a specificity of seven and compare that to whatever date math you want.

Now, I’ve already run these queries because, you know, I’m pretty sure this is going to be a Friday for you and I don’t want to, you know, plug up all your time running queries, even though they only take about 500 milliseconds. But here’s the first query, right? And if you look here, there is no filter predicate there.

If you look here, we’ll just read that there. If you look at this filter, it’s just saying where expression 1000 is not null. And if you look at the nested loops join, we don’t have anything there either.

It’s almost like that predicate on that column gets completely lost in the shuffle. But even worse is we don’t get any results back for the seven days that we wanted the results for. You might compare and contrast that with the query below where there is a convert on there.

And we do get a number of results back. In fact, in all, we get 566 rows back. That seems significant to me, a 566 row difference.

Actually, the difference between 566 and none. One might run that first query rather naively and assume that the SQL Server System Health Extended event just doesn’t have anything of value in it that we could use to troubleshoot our SQL Server. But that wouldn’t be true, would it?

That would be wrong. That would be incorrect. Now, the execution plan for the other one, you might notice if you’re quite eagle-eyed, has an additional filter in it right here. And this filter is where that predicate actually does get applied and applied correctly so that we can get data out of the System Health Extended event for the range of time we care about.

Now, my free store procedure, SP Health Parser, you can find it at my GitHub repo, does use very similar queries to this to pull data out of the System Health Extended event. And this was something that I struggled with a bit when I first started writing the query. And I was thinking to myself, well, you know what?

If someone is on SQL Server 2017 or better, we should not go digging into the XML to look for the timestamp of things. We should just use what’s in this dammit management function. And you know what?

This was a really annoying thing to deal with. Because, like, you’re running this query that should work and it doesn’t work. And, like, you start sanity checking yourself and then, like, you actually feel quite nuts. You feel quite preposterously insane.

And then you find, you start digging around the internet about this particular function. And you realize that it’s just broken and you have to fix your query to get around the brokenness of the function. Now, I’m going to ask you, I’m going to end this video on a question.

Is this production quality code? And I’m not talking about my query. I’m talking about the type of code that you need to do this sort of thing in order to fix. When people talk about worrying about, you know, the quality of Microsoft software products, SQL Server, the cumulative updates, well, no longer the service packs, but the cumulative updates that Microsoft puts out, some of the features that it inserts into new versions of SQL Server, one might start wondering just how thorough the testing is on these things.

And one might start to have real questions about just how high quality the code being implemented in your enterprise Ferrari database system is when you start running into stuff like this. Because who knows what else those summer interns worked on? And who knows where else stuff might be weird and broken?

So, I’m just going to leave you with that question. Ponder that for a moment. If you were reviewing code and you came across a bug like this, would you let that code go out into your production workload? It’s a good question to ask, isn’t it?

Isn’t it? Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Of RECOMPILE Hints And Query Store: Where Are My Parameter Values?!

Of RECOMPILE Hints And Query Store: Where Are My Parameter Values?!



Thanks for watching!

Video Summary

In this video, I dive into the world of query tuning with a focus on recompile hints and Query Store. Erik Darling from Darling Data shares his insights based on extensive experience in performance tuning, particularly when dealing with complex queries involving temp tables or table variables. He discusses the frustrations that come with trying to tune such queries using only the data available in Query Store, highlighting issues like missing parameter values and the difficulty of reproducing slow query scenarios. To illustrate these points, Erik demonstrates how recompile hints can be used effectively, contrasting them with the limitations of traditional methods. By walking through a practical example, he shows viewers how to leverage recompile hints for better performance tuning outcomes.

Full Transcript

Erik Darling here with Darling Data. And we’re very happy. I promise. We’re thrilled. We’re thrilled beyond compare. In today’s Lent Apology Free video, we’re going to be talking about why PowerPoint, not Excel, why PowerPoint decides to double advance when I click on it. Just kidding. We’re going to talk about Recompile Hints and Query Store. So I do the majority of my performance tuning work looking at Query Store data. Most people who I work with don’t have a good third-party monitoring tool that captures the type of stuff that, like, I really get into, like, you know, performance metrics, queries, running, resource usage, yada, yada, yada, yada. It really gives me a good sense of which queries I need to go of which queries I need to go after to fix stuff. So most of the time I end of the time I end of the time in query store.

I’m dealing with a lot of sort of weird situations. Query store is wonderful. Query store is wonderful. But I think just sort of in general query tuning when certain things are going on in SQL Server can be quite frustrating.

Recompile Hints are probably the least frustrating. What can become very frustrating is when the query you need to tune uses a temp table or table variable. And the process of populating that temp table or table variable is not quite straightforward.

So, like, you’re working on code in a stored procedure and you would have to execute a whole bunch of other stored procedure stuff that’s not enough to get the, you know, temp table or table variable populated in order to execute the query that you care about that’s taking a long time. That’s the first part.

So, like, when you look in query store, you just get the query that was slow. Even if it’s in a stored procedure, you don’t get the whole, like, stored procedure plan the way you do in the plan cache. You just get that one single query.

And if there are table value parameters, table variables or temp tables in use, those don’t show up. Where it gets also frustrating is if someone is using local variables or if someone is using option optimized for unknown, the compile time parameters, I mean, the runtime parameters are never in there unless you play some weird tricks, but the compile time parameter values won’t be in there for local variables or recompile.

So that’s also a bummer. That also makes troubleshooting stuff really difficult because you have no idea how to re-execute the query to reproduce what’s slow in order to figure out what to fix. Bah!

What’s the other one that I really hate? Oh! So, if you pass in a parameter, but the parameter isn’t used in, like, a where clause, like, let’s say you have four parameters, three of them get used in the where clause, and one of them gets used in, like, a case expression in the select list, the value for the one that gets used in the select list case expression won’t be in with the rest of the parameter values because that had nothing to do with cardinality estimation.

So, three really frustrating things that you can run into when you’re trying to tune queries, and I really wish that there was, like, you know, like an instant replay button or, you know, something cached along with a plan that would make queries like that runnable so that you could figure out, like, you know, get an actual execution plan, look at the, you know, like the operator time statistics and get, like, some better clue about where in the plan you need to focus to fix things.

Sometimes you can look at a query plan and figure it out. Other times, you know, that cached slash estimated plan just lies to you in too many places and too many ways to make that useful. So, what recompile hints do is almost the opposite.

It can be, in very large execution plans, it can be frustrating, and I’m going to show you why. In smaller plans, it’s fairly easy to, you know, figure stuff out, but we’ll talk about that in a minute. Before we do, though, you might be surprised to hear, this might come as a shock to you, that you can support this channel by signing up for a membership, and that there’s a link in the video description to do that.

You can do that for as little as $4. That’s cuatro dólares. Dollares?

I’m not good at it. I’m not multilingual. I only speak like English is my second language. But if you’re unable to fork over the four bucks a month, you can keep me company in other ways that are also heartfelt, like liking and commenting and subscribing.

I think we’re at about 20 members, and we’re at about over 4,700 subscribers. So I’m feeling, feeling quite loved and adored these days. Maybe, maybe at 5,000 I will just burst with joy.

If you need SQL Server consulting help from a young, handsome fellow with reasonable rates, I am very good at all of these things. If you need me to do something else, we can discuss what that something else is and figure out if it’s anything that I am also really good at. Just, you know, make sure it involves SQL Server.

It’s the only thing that I ask of you. If you would like some high-quality, low-cost SQL Server training that beats the pants off the competition’s Black Friday rates, you can get all 24 hours of mine for the rest of your God-given life for about 75% off.

That brings it down to about 150 US bucks for you, special folks out there. If you are a live and in-person type of person, and you want to see me and Kendra Little co-present two days of performance tuning madness, you can do that this November 4th and 5th at the old Pass Data Summit in Seattle, Washington.

It’ll be fun. That’s all I have to say about that. It’ll be fun, and if you miss it, you’re going to regret it, because it may never happen again.

Just don’t know where the future will take us. But with all that out of the way, let’s do what we normally do and party hard and talk about recompile hints and query store. So we’re going to use Stack Overflow, and we’re going to get rid of some indexes that I had for something else.

And I’m going to create this store procedure that has basically the same query in it twice. Really, what these queries actually do is completely unimportant. All I need to do is show you the difference in how things look in query store.

So I’m going to give this a few runs just to make sure. Actually, we’ll see how long this takes. All right, one second.

We can give this a few runs just to make sure that everything ends up in query store where we want it, because without that, we are sunk. And now let’s use my free store procedure, SP Quickie Store, to find plans for this store procedure. You might be looking at this and thinking to yourself, wow, Eric, that’s amazing.

I can’t even find stuff by procedure name in the query store GUI. And you’d be right. That’s why I’ve spent, like, thousands of lines of code and hundreds of hours of my life working on this store procedure, to make life easier for everybody, because Microsoft won’t do it.

Isn’t it nice? So we’re going to run this, and we’re going to have two entries in here for our two queries. We’ll see two individual query IDs here.

And we’ll see, well, I mean, I suppose two different plan IDs, too. That all makes sense. And this first one is going to not have a recompile hint in it, and this second one is going to have a recompile hint on it.

We’re not, like, comparing how if recompile helped or hurt these queries, because we really just ran the query with the one parameter. It’s not like there’s going to be parameter sniffing or the thing ran in one second, so there’s not really anything to really fix.

The only thing that I want to show you is that in the plan without the recompile hint, if we go to the properties and we look over here, if we go to the properties of the select operator and we look over here, we will have the parameter list and the compile time value for the parameter here.

This makes it really easy to, you know, pull that query, pull the query text out of QuickieStore’s results, plop it in a new window, and then do something like, you know, create a temporary store procedure to execute it with that parameter value so that you don’t end up having to do anything weird, like, you know, declare a local variable which screws things up or whatever else.

Great. That’s how that happens there. Now, let’s look at, what did I just close? I don’t know what I just closed.

I might have closed the wrong thing. We’re going to have to go back and figure that out. Did I close? Yeah. Oh, no, it’s over there. Things just moved around strangely, I guess.

Anyway. Yay! Let’s look at the query plan for the query with the recompile hint in it. Now, if we go to the properties of this one, we are not going to see, oh, I actually hit the shift key.

There we go. We’re not going to see the same parameter list over here. Why? Well, option recompile does something different from, obviously, what option optimized for unknown or using local variables is, where you just have no record of what those values are in the query plan XML.

What option recompile does is it treats your parameters and local variables like literal values, and those literal values get used when you touch, well, hopefully when you touch tables and indexes.

But if we hover over this thing, we’ll see, oh, baby, easy, zoom it. We’ll see the literal value that we passed in to the store procedure here as a predicate scanning the clustered index.

Now, again, this isn’t a query tuning competition. This is just to show you how this behavior might look a little bit different. Now, where this can get annoying in really big plans is if you have a lot of different predicates or parameters that you pass in, and you have a lot of different tables that get used, and some of them have parameters and some of them don’t.

You have like 20 joins and like a where clause with like eight different predicates in it, and you hit eight different tables. It’s kind of a hassle to go to each table, figure out which literal value predicates got used there, and then reconstruct things that way.

I really wish that Microsoft would make this stuff easier, both for option recompile, option optimize for unknown, and all the other stuff. I realize that when you use option optimize for unknown, and you get this sort of density vector guesses, the message is that it shouldn’t matter what the compile time values are, because you get the same crappy guess no matter what.

But it really does help to figure out, you know, what the, like I feel like the compile time values should still be included in there, because if you’re troubleshooting a slow query, obviously the local variable guesses for the data distribution of what you searched for didn’t work out one bit, and you need to focus on reproducing that and figuring out how to fix it, like obviously aside from just nuking the optimize for unknown hint.

A great way to test that is to add an option recompile hint instead and see if you get a better plan. Anyway, before I drift too far along here, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. I hope that I do not need to apologize for the length of this video to anybody out there and the entire internet, and I will see you in the next video. Goodbye, farewell, take care of yourselves.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Why You Should Always Use Unicode For Dynamic SQL

Why You Should Always Use Unicode For Dynamic SQL



Thanks for watching!

Video Summary

In this video, I delve into the importance of proper string typing when working with Dynamic SQL in SQL Server. I highlight common pitfalls, such as using `varchar` instead of `nvarchar` for Unicode characters, which can lead to data loss and incorrect results. By emphasizing the need to preserve Unicode-ness throughout your dynamic SQL strings, I aim to help you avoid these issues and ensure that your queries handle international characters correctly. Whether you’re crafting complex Dynamic SQL statements or designing robust database schemas, understanding how to properly manage string types is crucial for maintaining data integrity and preventing unexpected behavior.

Full Transcript

Erik Darling here with Darling Data. And I think the theme of this week’s videos is going to be length. Some people have taken to complaining about my length in the comments. And so rather than apologize for my length, as I apparently often have to do, I’m going to do some short ones this week so that everyone can be comfortable with my length. Now, this video is going to be about Dynamic SQL string typing. Since we’re going to continue with the theme of apologizing for lengths, using the right type and length and strings for Dynamic SQL is very important. You don’t want to apologize for having too short of a length when you’re crafting Dynamic SQL. So always be careful there. But the subject of this video is more along the lines of make sure that you know exactly what kind of data is going to end up in your Dynamic SQL strings.

Because if you have Unicode characters in there, they could disappear in a variety of different ways when you are drafting and crafting and concatenating all your Dynamic SQLs together. So, if you like this channel, you can join like 20 other people in getting a membership for like $4 a month. I know. Big money. Watch out. Erik Darling might get a new Adidas shirt soon. Lord knows I’ve been wearing this one for three years or something. If you are unable to fork over a Paltry $4 a month, you can do likes and comments and subscribes. Because, you know, you gotta have options, right?

If you need SQL Server help from a consulting perspective, these are all things that I am very good at. Just ask anyone. And you can, you’ll get an answer, I’m sure. If you need some low-cost, high-quality training, you can get about 24 hours of it for about $150 USD with that discount code at that URL. The link for all this stuff is down in the video description. So, you can hang out in there if you’re feeling clicky.

I will be live and in person at Past Data Summit, November 4th and 5th in Seattle, Washington. That week, I will most likely be, what do you call it, intoxicated? No, I’m gonna put up some weird, like, livestream-y videos from Past Data Summit. I might try to interview some smart people. I don’t know yet. I’m sort of getting that figured out because I am enjoying the YouTube video content thing.

I’m just not sure if I want to have, like, a bunch of pre-baked stuff going on during Past or if I want, you know, live, fun Past content. Who knows? Maybe I’ll give Steve Jones a wedgie. Anyway, let’s get on with… Oh, we faded to black. Ah, crazy. All right. So, let’s talk about Dynamic SQL. And this is all, you know, a lot of the videos that I record are based on experiences and interactions with clients of mine, the nice people who pay me to have, you know, the spare time in the day to record these videos.

So, we’re just gonna make sure there’s nothing going on. Even though indexes and database contexts have nothing to do with this demo, everything you see here could happen anywhere, even in Azure SQL DB, if you’re stupid enough to use Azure SQL DB. So, excuse me. Here’s some Dynamic SQL, right? And if we run this, what I want you to focus on is that we did not correctly type.

So, like, when you run Dynamic SQL, if you’re gonna use SP Execute SQL, or even if you’re gonna do exec, you probably should use Unicode strings because Unicode strings will do a better job of preserving Unicode data. But if you’re running, if you’re using SP Execute SQL, you have to have the input parameter typed as Unicode.

It will not accept a non-Unicode string or parameter or variable or anything when you’re, you know, executing it. And say, no, it has to be Unicode, dummy. So, what I see a lot of people do is, so that they don’t have to worry about n-prefixing every single string concatenation block, is they’ll declare a varchar max variable or parameter, and they’ll do all their Dynamic SQL concatenating into that thing.

And then, at the end of it, they’ll set their Unicode max string like this. Sorry, I circled the wrong one there. I squared the, rectangle the wrong one. They’ll set their Unicode string equal to the non-Unicode string, and then execute Dynamics SQL with that.

The problem is, like you might have seen because I hit execute on all that stuff, is that you lose the Unicode-ness of your stuff in there. So, these are some Japanese characters in this string, and you can see right off the bat that we just end up with a bunch of question marks for there.

And even converting the string back over to Unicode does not resuscitate them. So, when we select that string, we get a string of question marks. Not good. Not what we wanted there.

Now, pay a little bit more attention to this one, right? Because in this one, we’re actually doing almost everything right. Where we preserve the Unicode stuff here.

Oops. Oh no, it’s red. Ugh. Let’s do that again. Let’s fix that. Let’s make my pointer pink again. So, here we do have the Unicode characters, and here we do have the Unicode characters.

But when we actually go and select that string, this happened. And this happened because the string inside of the string was not correctly typed. This stuff can happen in all sorts of weird places.

So, this one down here is actually done correctly, right? So, if we look at this, and we run this, now we preserve our Unicode-ness all throughout, right? We get that good stuff there. We get that good stuff there. We get that good stuff there.

We returned everything that we should have. And, you know, just to sort of keep going on the same thing, it’s going to be fairly obvious since I’ve showed you all this stuff.

But, you know, making sure that all of your stuff is correctly typed is really important. Even if you do something like this, the fact that you do this is just VAR card does not, it doesn’t help that you end prefix this.

SQL Server is like, oh, that’s actually Unicode data. We have to change that. SQL doesn’t do anything to help you. It does absolutely nothing. You get question marks back from this. You really do need, if you’re going to be handling Unicode anything, to have everything be Unicode.

Now, this isn’t, you know, just dynamic SQL. Dynamic SQL is just an easy vehicle to show you what I mean. But, this is something that comes from a lot of different places, right? So, like, database design, you know, people used to really hem and haw and nitpick stuff like, oh, are we actually going to use Unicode?

Oh, will this table actually get that big? And, you know, they would make kind of dumb decisions and they would, you know, start the database off with non-Unicode strings and do things like make identity columns integers instead of big ints. And, I feel like that kind of stuff really comes back and bites a lot of application developers and a lot of, you know, architect type people because they made bad choices at the outset based on, you know, things that really shouldn’t have, you know, like application specifications that really should override like, oh, this is just the best practice.

So, you know, when I’m, you know, consulting with people and trying to help them with something that they’re making a first pass at, usually a couple of the things that I make sure we go over is that, or rather are that, you know, if you’re not sure, if you don’t have like a particular domain assigned to an integer column, like, you know, there’s only going to be like 16 or 20 like valid statuses for something, make that a big end because you don’t know how big that table is going to get.

You make an identity column or a sequence column or even if you’re generating IDs in the application to make sure that they are completely in order and monotonically increasing with no gaps, you got to be careful, right? But if your application is a runaway success and all of a sudden you hit 2 billion rows and go boom, what if, you know, you start expanding into foreign markets where people, you know, use Unicode characters for a lot of stuff, all of a sudden those name fields and those address fields and a lot of other things might go boink.

So, not just Dynamic SQL, but application design everywhere. I think, you know, if you’re going to do things safely for the long haul, for the long term, you should use Unicode as much as possible. There are just a few specific things about Unicode that for Dynamic SQL that you have to be real careful with.

Like, you know, you can’t execute SPExecute SQL, which is the only safe way to execute parameterized Dynamic SQL so you don’t get SQL injected. Everything has to be Unicode and you have to really mind all the concatenation stuff to make sure that all your strings are correctly prefixed with that uppercase N so that you don’t lose anything in the concatenation.

You’re probably not going to get like the whole string implicitly converted over to VARCAR because that would be, you know, kind of ludicrous. But, I don’t know. Some people find my pronunciation of VARCAR ludicrous. So, I don’t know. I don’t really know what to tell you there.

But anyway, yeah, Unicode. Be safe out there in the database world. Whether it’s Dynamic SQL or, you know, table design or anything like that, Unicode generally is the better choice to make sure that your application is safe and sound in the long run.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I hope that you will be out there with all your full Unicode self. So, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Advanced String Searching In SQL Server

Advanced String Searching In SQL Server



Thanks for watching!

Video Summary

In this video, I dive into advanced string splitting techniques in C-SQL and SQL Server, focusing on scenarios where strings are delimited by various characters. I start with a basic example of using the `SUBSTRING` function to extract text between two delimiters, explaining the nuances of its three arguments—especially how the third argument works differently from other substring functions. Then, I move on to more complex cases involving multiple delimiters and special characters like asterisks, exclamation points, question marks, and dollar signs. To tackle these scenarios, I employ `CROSS APPLY` twice to find the first and second occurrences of each delimiter, demonstrating a practical solution for extracting text between them. This approach helps in creating a minimal viable product (MVP) that can be easily adapted to various string patterns. If you’re curious about how to handle such intricate string manipulations or just enjoy solving complex SQL queries, this video is definitely worth your time.

Full Transcript

Erik Darling here. I hope no one heard that weird hand fart. With Darling Data, recently voted by BareGut Magazine to have the YouTube channel with the most accidental hand farts, which was really, I think the good folks at BareGut Magazine are psychic. How else could they know? Anyway, today’s video, we’re going to talk about advanced string splitting in C-SQL. SQL Server, or really just one aspect of it. I can’t possibly cover very advanced aspects of string splitting in SQL Server because that would be a long video. And I’ve gotten complaints about videos that crested the 20 minute mark. And boy, oh boy, the attention span on you kids. If you feel like you need to stim a bit and step away from the computer, YouTube was kind enough to provide you with a pause button somewhere over there. Oh, my fingers gone. Somewhere over in that corner. So you can always hit pause and come back after you’ve hand flapped and gargled and done your fidget spinner or whatever. But anyway, we’re going to talk about you and me and how you can buy me a fidget spinner. For the low cost of $4 a month, you can sign up for a channel membership, which will get you all of these videos.

If you don’t have the $4 a month, if you feel that I am unworthy of fidget spinning, you can like and comment and even subscribe to the channel and join 4,700 other dated darlings who get notified every time one of these gorgeous videos drops. If you need SQL Server consulting, I am of course available not 24-7, but sometimes seven days a week, usually during the day, but not before like 8 a.m. and definitely not after like 6 p.m. That’s when I do other stuff. I have a family and all that who also require my time, though the pay on that is significantly lower.

Anyway, if you would like to watch me do this whenever you feel like it, 24 hours a day, seven days a week, you can get all of my performance tuning content for life for about $150 with the discount code SPRINGCLEANING. What’s nice about that is that you don’t have to worry about anything after you get it. You know, there are no time commitments.

You can go off and stim and spin and whatever, flap your arms, whatever wackadoodle stuff you need to do, and then come back and watch more of it. If you think 20 minutes is a long time, 24 hours is even longer. If you would like to catch me live and in person, I guess this would be a total of 16 hours and not in 20-minute chunks.

You can catch me and Kendra Little November 4th and 5th at Past Data Summit in Seattle doing SQL Server stuff, performance tuning, getting wild all day long, between the hours of 8 p.m. and 6 p.m. most likely. If there is an event nearby you and you think, boy, this handsome visage that stands before me sure would be a great accidental hand fart addition to that lineup, let me know what that is. Who knows? Maybe I’ll show up.

And with that out of the way, let’s talk about this string splitting nonsense. Now, before we get into the advanced stuff, I need to show you the proper way to get, let’s just say, a string between two delimiters. In this case, we’re going to have the same delimiter twice, but in real life, actually in the example that we’re going to look at below, we will have a variety of delimiters in slightly different circumstances.

So, first, I need to show you the proper way to do this. If you just have two delimiters and you’re like, give me whatever’s between them. Now, it doesn’t have to be colons.

It could be any two delimiters, right? It might be a period and the next period. It might be a period and a comma or a space and then something. Whatever it is. There’s all sorts of uses for this sort of stuff.

So, we need the substring function. And we need to talk about the three arguments of the substring function. All right? The first argument, because this is something a lot of people mess up. The first argument of the substring function is the string that you want to sub for.

Ah. Minus one family-friendly point. All right? The second argument of the string split is the position that you want to start your substring at.

In this case, it is the position of the first semicolon in the string. Right? That one up there.

Plus the length of that character. This is something that’s pretty important. Like the length, you need to add that onto the position of that so that you don’t include that. If you want to include it, you can.

But in this case, we don’t want that colon to show up. We want the space between the colons. I’m losing more family-friendly points as the further this goes on. The third argument is not the end position.

A lot of people think, because in some places, substring does function differently. But the third argument in SQL Server is not the position of the end. It is the number of bytes after the first thing that you want to include in the string.

So that’s where things get more complicated. Because the first one, we just need to get the car index of the colon in the text plus the length of that. And the second one, we need to use advanced car indexing to get the position of the first occurrence in the string after.

Right? There’s a third argument for car index. And we’re going to start it at the position of the first colon in the string, of course, plus the length of the colon.

Damn. This is not going well. Then we have to do some other stuff.

We have to subtract the length of the delimiter and the length of the car index of the first occurrence. And that will give us the string between the two things. Now, I’m using the sys.messages table.

And I’m only looking for rows that have two colons occur in them. It’s medically improbable, but there it is. And this is just to make it a little bit easier.

Because if we didn’t have this, we would need all sorts of case expressions or other protections for the substring function to make sure that we don’t throw an error if we give an invalid length to the substring function. So that’s why that’s there.

But if we run this query, we’re going to get back the actual text of the message. And we’ll be able to verify in a few different places that this is correct. Right?

So let’s just take this one as an easy example. It’s from colon space percent d to colon. And that’s what we get right there in the parse string. There’s another good example a little bit further down that’s really easy to show in there.

I forget exactly where it is, but we’ll just look at this one. There’s two colons in that one. That one’s a little weird.

I don’t know. You get the point. It worked. Right? Actually, this is the… Actually, no. That one’s not so good. These ones are good. These ones are easy. So here we have is page percent d.

And that’s exactly what we get back in there. Some people would throw like a L trim, R trim, or a trim on this to get rid of spaces around that. I’m not that fancy.

So we’re just going to leave that in there. But anyway, the whole point is this all works. Right? This all works just fine. The situation I had to deal with was I needed to find the space between a variety of delimiters. And I needed to find the first occurrence of each one.

So what I’m going to do is show you a little bit about what the strings looked like for me. This is not, of course, exact. This is just sort of what…

This is enough to make a simple MVP. It’s minimal viable product or MVC, MCVE, complete example, minimal viable complete example or whatever they call it, where the strings were kind of weird. Some of them started with a number and then some of them started with a character of some sort or special character, not like a letter.

But they were all sort of set up like this where I needed to find the space between the first weird thing and the next weird thing. And I knew what all the weird things were. They were asterisks in this case, exclamation points, question marks, and dollar signs.

Right? So these are all the weird things that I had to find. And I had to find the first occurrence of whatever came first in the string and then the second occurrence of whatever came next in the string. And that required some serious brain time from me.

Now, just because I don’t remember what I actually did from this, I’m going to rerun all this. And I’m going to create these two tables and populate them. And then I’m going to show you that this table has one row for each instance of the special characters I had to find.

I didn’t specifically need this. It just made writing the query a little bit easier. So the substring is going to do exactly what we did up there.

It just looks a lot cleaner because all I have to do is put the columns in here and operate on the columns rather than have to generate all the expressions in the select and operate on all the expressions. Because that turns into a real confusing time with all the functions, sub-function stuff in there. T-SQL doesn’t make it easy to nest these things.

But what I did was I used cross-apply twice. The first cross-apply will go and find the top one. And it will get the top one from this query.

And what this query does is find the search position. So it finds the car index of the search string and the string that we care about. And then it looks for where search position is greater than zero because this actually helped me avoid a lot of errors.

And, you know, it was better this way. And then we order by the earliest search position. So search position ascending, so the earliest search position.

Then, in the second cross-apply, we’re actually going to reuse elements that project out of the first cross-apply. So note that this one is called x1, right? And if we come down here and we zoom in on this, in this one we’re going to do, this is like the second argument of substring.

We’re going to search from the search position of the search element that we care about and the string that we care about at the starting position of the x1 search position. Right? So, and then down here, rather than filter on greater than zero, we’re going to say where the position in this one is greater than the search position that we found for the first one.

Right? And then we’re going to order that by this. And now if we run all this, what we’ll get back is exactly what we should see.

Right? So just to highlight this string a little bit, the first weird character was a dollar sign. The second weird character was an asterisk.

So, and then the substring between those two was the number 23. And that holds up for these as well, where the first weird character was an asterisk. Sorry.

The first one, the first weird character was a dollar sign. The second one, the first weird character is an asterisk. That’s in the first position. The second one was an exclamation point. And the fourth position in the substring between asterisk and exclamation point was the number 12. So this works for all of these pretty well.

Granted, it might not be like the most explosively well-performing code in the world. If you have very, very large data sets, of course, indexing for this stuff does help a bit. But once you get into the realm of like searching through strings and stuff, a lot of performance stuff can happen that isn’t easily controlled by you or indexes or really anything else.

It’s all very fuzzy string stuff. As I’ve said before, strings were a mistake. Should not be in databases.

Everyone should have just learned binary and learned how to read binary representations of strings. I’m kidding. You shouldn’t actually ever have to do that. But anyway, I hope you enjoyed yourselves. I hope you learned something.

I hope that you find sort of weird query stuff like this as fun as I do to solve and write and figure out how to get it to work. I am not the sharpest knife in the drawer. So actually writing this code took me quite a bit of like, you know, head on keyboard moments.

But once it got there, boy, was I proud of me. I was like, I’m a big boy now. I don’t need a diaper anymore yet.

I mean, I might need one again. Who knows where the weekend will take me. But, you know, weird stuff happens between, in the delimited between Friday and Monday, weird stuff happens in there. Hopefully not with my colon, though.

Anyway, that’s probably enough there. Thank you for watching. Goodbye. Please don’t report this video.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Paging Queries And Batch Mode In SQL Server

A Little About Paging Queries And Batch Mode In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the nuances of paging queries and how they can be optimized for performance on SQL Server. Specifically, I explore the differences between using `OFFSET FETCH` in more modern versions of SQL Server versus leveraging Paul White’s method from 2009, which involves using `ROW_NUMBER()`. I highlight that while `OFFSET FETCH` is syntactic sugar for the `TOP` expression and doesn’t support batch mode execution, `ROW_NUMBER()` can be used to achieve similar results in a way that does benefit from batch mode. By walking through both query setups, I demonstrate how recompiling hints and compatibility levels play crucial roles in optimizing these queries. The video also touches on common pitfalls such as improper indexing, unnecessary locking hints, and the importance of separating informational columns from relational ones to improve performance.

Full Transcript

Erik Darling here with Darling Data, the only company that will hire me. Recently awarded by BeerGut Magazine is the only company that is using AI appropriately to generate BeerGut Magazine covers. It’s not really that great for anything else. Sorry to dash your illusions. I’m going to say something that’s kind of wild. If everyone says they have next-gen AI, no one has next-gen AI. That’s just the sad truth of the matter. In today’s video, we’re going to talk about paging queries in batch mode and how you write your paging queries can make a big difference if you’re on the type of SQL Server that can get batch mode happening. Of course, I have another video. I believe the title of it is still Paging Dr. columnstore, where I talk about actually creating a columnstore index to replace a whole bunch of nonclubs.org. Clustered indexes that were there to support paging queries. As you may know, when you’re supporting paging queries, they’re often based off of dynamic SQL generated in some way, either in the application or in a store procedure that generates dynamic SQL or whatever other bastardized way you have of generating dynamic SQL somewhere between the application code and the SQL Server that runs stuff.

Because those queries are all dynamic SQL. You don’t know what people are going to search on, order by, what columns they’re going to select, what other tables you might need to join to. And it just makes indexing for all that stuff, even if it’s just one table indexing for all the different stuff. Because I know the tables that you people make out there, they’re like 400 columns wide.

And indexing for every possible permutation of columns that could get passed in is a very tedious and difficult process. And having one columnstore index that makes all of those searches kind of generally good enough is a lot easier than having a billion nonclustered indexes to support a billion different potential search criteria. But here I’ve said too much. I’ve gone on too long.

What we need to do right now is talk about financial security, mostly mine. If you feel so inclined and so kind and so bold and you feel like Erik Darling deserves $4 a month for all this YouTube content, you can click the link in the video description and you can get a membership to my channel where you will be rewarded with more videos.

It’s your lucky day. Anyway, if you don’t feel that I’m worthy of $4 a month, perhaps you feel that I’m worthy of some other expression of gratitude, like maybe liking a video or maybe commenting on a video or maybe subscribing to the channel where you can join like 47 other data darlings out there in the YouTube-verse who watch my YouTubes and learn a lot about SQL Server for free.

It’s nice, isn’t it? If you are in need of SQL Server consulting, maybe from someone who looks a lot like me and you think, hey, that Erik Darling fella seems capable.

Let’s get him to help out with our SQL Server problems. I can do lots of stuff. And as always, my rates are reasonable. If you would like training that is not free, but you want to get inexpensive training that lasts for the rest of your life, you can get my everything bundle, which is about 24 hours of performance tuning training at the beginner, intermediate, and expert slash advanced slash pathologically brilliant level for 75% off, which is about $150 with that discount code.

The link to that magically is also in the video description, so you don’t have to worry too much about remembering all those letters and stuff up there. If you want to catch me live and in person, and actually, I’m actually hitting sort of a weird point with this, because I’m recording this video at the end of September.

Passes at the beginning of November, but these videos are already scheduled out past pass on my blog, but on my YouTube channel, they’re only about like two weeks ahead, or two and a half weeks ahead.

So pass is still, if you’re watching this from YouTube, pass is still going to happen. If you’re watching this from a blog post, pass is probably long done. You missed it.

Sorry. As soon as one catches up to the other, I’ll most likely take this slide out. But yeah, Pass Data Summit, November 4th and 5th. Seattle, me and Kendra, if you’re unaware of that arrangement, because this is your first video, I highly suggest you watch some other videos where I go into far more detail about what we’ll be doing.

If there is a SQL Saturday or a Data Saturday, but not both in your area, or some other SQL data-ish event, and you think, hey, Erik Darling, that fella sure knows a lot about those sequels.

I’d like him to come teach the sequels at my event. Let me know what that event is, so I can come do that, because Lord knows I need a break from this.

But with that out of the way, let’s talk about page inquiries and batch mode. This is not another weird, shilly video about paginated reports and Power BI.

I hate those words altogether. Individually, some of them are all right. But as a group, ruthless. Sort of like people.

Individually, some of them are all right. As a group, holy cow. Sweet meteor of death. End this madness. So, in order to talk about this, I need to go back in time.

More years than I want to admit at this point. I could grow a beard then, but it was all this color, not this color. And two of the first blog posts that I ever read were from around 2009 or so by Paul White, because it was a problem that I was trying to solve.

My first SQL Server job was at a market research company, and my initial exposure to SQL Server was doing a lot of file loading of stuff, and eventually I got entrusted with more and more stuff, and part of that was writing reports, and part of that was people being like, I don’t need a 100,000 row report all at once.

I just want to see what the top 100 or 500 or 1,000 rows are. And so I was just like, I don’t know how to do that. And so I started searching around various terms, flailing mercilessly at my keyboard, trying to figure out a good way to do this, and I came across these SQL Server’s central articles by Paul White.

Really only the first one stuck with me at first, because the second one was way over my head at the time. I’ve made peace with the second one over the years, but the first one is really what made the difference.

And not to get nostalgic and probably kind of emotional here, but these articles are really the first thing that kind of showed me that in SQL Server, the way that you write queries and do things differently can really make a big difference to performance, because I had found some other blog posts about doing similar things, but the results that I got from those were just terribly, terribly slow.

So these ones really, really changed stuff for me. And to this day, I still see people struggling with writing paging queries for a lot of reasons.

Like they put the entire select list that they want within the same query that they’re like figuring out the rows that they want.

You know, like one concept that I talk about a lot, it’s maybe not enough, both in my YouTube videos and in my actual for money training is the concept of separating informational columns from relational columns.

Relational columns are the ones that you, you know, filter on like your where clause, you join to, you have your group buys, your order buys. And like typically, like you can find some key column using like that, like those relational aspects of the query and then filter down to just the rows you’re interested in and then match those rows to, you know, the full table with the columns you care about.

So those are like the informational columns, the ones that you’re displaying to people, your relational columns are the ones that you’re doing all the other stuff to. Anything that requires math to do. You don’t need math to return a select list.

You need math to do like cardinality estimation and filtering and joining and grouping and all that other stuff. But you don’t need math for a select list. So think of it that way.

So I see people screw this up a lot. They’ll use join when they aren’t returning columns from the table and like exist would have been fine or not exist would have been fine. Because they use a join and not one of the exist clauses, they like have to put a distinct on it.

They’ll sometimes stick a bunch of logic in a view thinking that that makes life easier. And like one thing that almost everyone overlooks at first is indexing.

Because having good indexes in place to support your page inquiries is pretty gosh darn important. You know, another thing that comes up a lot when we’re talking about page inquiries is, you know, you’ll find these queries that are just absolutely festooned with no lock hints.

They’re just everywhere. And someone will be like, well, I don’t like this idea because someone might see weird results if they go from page one to page two.

And we have to have that query go back to the server. Let’s just cache the whole result in the application. I’m like, good lord. You care about data consistency when someone clicks to page two, but you don’t care about data consistency when you’re getting all the data that you’re going to shovel into the application.

Okay, sport. I got you. So since SQL Server 2012, when offset and fetch were introduced into SQL Server, there have been like basically two, you know, pretty approachable ways of writing paging queries.

I’m going to show you like a pretty good setup for one where within the paging portion of the query, and I forgot where the control key was, I am only getting the primary key of the post table, this ID comms, the clustered primary key.

And then outside of the relational part of the query, I get the informational part of the query by joining back to the post table on that clustered primary key to get the rest of the columns out.

Right? So that’s the first thing that I’m doing there. The problem with using offset fetch in more modern versions of SQL Server, not SQL Server 2012, is that there’s really nothing in there that’s eligible for batch mode.

And I’m going to show you a big difference between this query and another one, but what I need to do is add in a couple of hints here just to push the point across. One, I’m using a recompile hint so that there is nothing about the local variables I’m using to determine page size and the page number.

And the second is that I do have this query running in compat level 160. 150 would also do for this, but 160 is just the highest one. So I always sort prices high to low.

So we’re going to sort compatibility levels high to low too. And we’re going to use that. If we run this query and we look at what happens, we should turn on execution plans because that’s the kind and decent thing to do, isn’t it?

Make sure that we have a query plan to look at. Where would we be if we didn’t have query plans to look at? Imagine if someone said, imagine if someone just took query text and they put it in an email and they said, why is this query slow?

What can I change? And you’re like, why don’t you show me an execution plan? Because maybe an execution plan might tell me what’s slow in the query, not just having to guess it, you know, how many rows might be in these tables or what your indexes might be or any of that other stuff, right?

Good information to have. If we look at the execution plan, this thing takes, well, this is another sort of weird operator time thing around parallel exchanges.

This query actually takes about seven seconds. Why the gather stream says it takes 13 and a half seconds. I don’t know. I don’t write, I didn’t write that code. I’ve never seen that code.

I wish whoever is still at Microsoft and responsible for this code would put a little bit of effort in that, a little bit of elbow grease, get the summer intern, actually, summer is over, I guess, get the fall interns to do some code review on this and maybe start fixing some of this stuff.

But this takes about seven seconds. And offset fetch is what, as a lot of people accurately describe it, it is syntactic sugar for the top expression.

Since top is a SQL Server specific thing and offset fetch is, I guess, an ANSI compliant thing. It’s like every database, every other database does offset and fetch.

SQL Server is the only one that has top in it. It doesn’t have bottom. You can, of course, do top descending order, but no bottom. It’d be cool if SQL Server had a top and a bottom.

I would just get a kick out of typing, like, select bottom 100 or something. That would be fun for me. But anyway, this whole thing takes about seven seconds and we don’t have a good time with it.

And that’s going to be a lot different from another way. And this is the Paul White method. This is the method that you will read about. And these links will, of course, be in the video description.

But if you were to follow the link to optimizing, for some reason, with an S server side paging part one, you would see a query setup similar to this one, though I think mine has the superior formatting at this point.

We can give Paul a pass because that was 15 years ago that he wrote these queries. So perhaps his formatting standards have definitely improved since then in some ways. In some ways he’s questionable.

But, you know, New Zealand’s kind of like Europe. They all have weird tastes and habits. So in this one, instead of using offset fetch, we’re going to use row number.

Right? See, we are row numbering here. And then outside of the row numbering, we are filtering on that row number with some very fancy math, with some very fancy parentheses to get what we want. And just like the last query, we are recompiling to avoid any local variable weirdness.

And we are using the most recent compatibility level to reinforce the fact that batch mode on rowstore is something that can happen for windowing functions, but not something that can happen for top and offset fetch.

If we run this query, you might notice that it was a tad bit snappier. This was not a seven second ordeal for us to run this. This was, this whole thing took, if SSMS will be so kind as to let me pull this up.

This took about 1.1 seconds. Now, the reason why this, this plan looks a little funny is because it is mixed row mode and batch mode.

So some operators that execute in batch mode are only responsible for their time. And some operators that execute in row mode are responsible for their time, plus all their child operators.

So if we kind of come and look over here, we’ll see about half a second here. And that time go up here. Actually, I bet this is batch mode. Yeah.

So this happens in batch mode. Does this happen in batch mode? This does happen in batch mode. Right? So we got two batch modes. So we got five plus five plus zero. But then this gather streams is in row mode. And this gather stream shows us the 1.1, 1, 1.112 seconds.

And then that pretty much carries on through the top, which is row mode. Another, another parallel exchange, which is also row mode. This filter, which is presumably row mode, because it didn’t like, like this time didn’t change from this one to this one.

And then this adaptive join here, which since this is like, this is technically a, like an extension of having batch mode execution happen. The actual join type of the adaptive join is nested loops.

And nested loops does not support batch mode. So we get a row mode nested loops here. Basically what it comes down to is that the whole query at the end of it, this sort is going to be in batch mode.

That’s why the time goes down there. But the whole query takes about 1.1 seconds. This is something that you can get. Oops. This is something that you can get from a paging query that uses a windowing function because the windowing function is eligible for this window aggregate operator where the offset fetch version is not.

So if you’re writing batch inquiries on more modern versions of SQL Server, I do think that the way that you write the query matters a lot more than it used to.

Granted, there were some problems with offset fetch. If, and if like, even with good indexes, if you were like going deep into a result set that maybe didn’t show up as, as readily with using windowing functions, but the way that you write window, the way that you write paging queries makes a big difference now because the, if you use the windowing function version, you can get batch mode on rowstore with offset fetch, you can’t.

Now that would change if we were to create a non clustered columnstore index on the post table, uh, because then, you know, having that clustered columns, that’s not, sorry, that non clustered columnstore data source would make the, whatever operators and the, the plan that are eligible for batch mode, get batch mode, but without that in place.

And I don’t know if you’re going to be even allowed to do that sort of thing. Uh, we would, we, this, this would be the preferred method. So when you’re writing paging queries, uh, and you’re on newer versions of SQL Server, um, you know, there, there are definitely some tricks you can play to make them faster.

One of them being to use the windowing function method. Another one being to add a non clustered columnstore index to the table so that, uh, you get the, you get one index, all of your, you know, dynamic paging queries can use rather than having to have 40 different non clustered rowstore indexes to support every different combination of columns or where clause and order by and everything else.

So with that out of the way, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I hope that you will use this, these videos to start doing things more correctly in your SQL Server life. Um, it is, I believe Friday here, at least according to my swatch, uh, so I’m going to, I’m going to figure out some Friday things to do.

And, uh, I will see you in the next video. Uh, as soon as I’ve got sorted out exactly what the next, what I’m going to record next. So, uh, once again, thank you for watching.

Good, good job watching. full, Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

How To Get Rid Of NOLOCK Hints Forever

How To Get Rid Of NOLOCK Hints Forever



Thanks for watching!

Video Summary

In this video, I delve into a clever workaround for managing nolock hints in SQL Server queries—a technique that can help maintain database consistency and prevent potential issues related to dirty reads. By transforming your tables into views with specific locking mechanisms (like read committed snapshot isolation), you effectively discourage developers from using explicit nolock hints. This method not only simplifies code management but also ensures more controlled data access, making it a valuable addition to any SQL Server best practices toolkit. So, whether you’re dealing with team collaboration or just want to streamline your query writing process, this tip will surely come in handy!

Full Transcript

Erik Darling here with Darling Data. I’m going to show you a little secret to help you keep nolock hints out of all your code. All you have to do is replace all your tables with views and just sort of swap the names. So we’re going to create a view here called ViewMe, and then we’re going to create a view on that table with this readCommit. the view on that table with this readCommit.com. And then the next time some developer comes in and tries to run a query with a nolock hint, they’re going to get an error because there are conflicting locking hints. Pretty cool, huh? All right. It’s our secret.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Working With Binary and Varbinary Data In SQL Server

A Little About Working With Binary and Varbinary Data In SQL Server



Thanks for watching!

Video Summary

In this video, I dive into the world of binary strings in SQL Server, a topic that has been making its way to my top-performing blog posts on a regular basis. As someone who frequently deals with odd data structures like query plans and XML files, understanding how to correctly handle these binary values is crucial. I share practical examples from my work, including the nuances of converting strings to binary formats using `CONVERT` with specific cultures, and explain why simply casting or using `CAST` won’t cut it in this scenario. By walking through real-world issues like comparing query hashes and plan hashes, you’ll learn how to ensure accurate comparisons and avoid common pitfalls that can lead to incorrect results.

Full Transcript

Erik Darling here with, as the sign says, Darling Data. And as you know, signs never lie. Astrology signs, street signs, sign language, I don’t know, I guess you could count stuff like tea leaves and chicken bones and tarot cards and stuff in there too. Some of that slightly more open to interpretation, of course. So we’re not going to spend too much time on that. In today’s video, we are going to talk about binary strings. Why are we talking about this? Well, I’m going to be honest with you. Every month, or is it every month? God, it feels so often. I get these emails from Google because I have Google looking at my website and saying, this is the content that people find on your website. And one of the most popular blog posts that I have, which I don’t know, that says something maybe about my site, because, man, I have a lot of content on there that I find terribly interesting. But one of my top performing pages is about how to convert binary strings correctly in SQL Server. I’m not really sure how to take that. But here we go. Before we do that, and I don’t know, maybe I should start putting, I should start putting this in like, like random, like commando guerrilla warfare parts of the video so you just can’t skip over it. But I also try to make this a fun, lively and vibrant part of the video so that you stay tuned because who knows what I’ll say in these things. If you just love everything I do so much, if you just want to give me a big hug with money, you can sign up for a membership to the channel for four American dollars a month. And that’s a that’s a nice thing to do. Because the more people do that, the more the less in debt I am. So you know, that’s a great thing. If you are short for American dollars per month, for whatever reason, I don’t know, you’ve got a sick aunt and shkabogan, you can do other things to show your undying loyalty to darling data. You can like my videos, you can comment on my videos to make me feel like I’m not alone in this cold world. And of course, you can subscribe to the channel so that you get a notification along with like, I’m over 4600 other data darlings at this point.

So you can get notified whenever I drop drop this hot content on your heads. So those are those are other ways to do things. If you are in need of a SQL Server consultant, perhaps the kind of young and handsome consultant who can perform these rituals upon your SQL Server and heal it of its grievous wounds and injuries. I’m pretty good at all this stuff. And my rates are reasonable. If you would like high quality, high quality, low cost, there we go. Keep making sure I keep the highs and the lows in order there. On SQL Server, I offer all of the above beginner, intermediate and expert level training on things. God, my hand looks weird at that angle. It’s like long and knifey. You can get all of that training for life. Long live you. Long live you. For about 150 US dollars with the discount code spring cleaning, even though it is almost Halloween or getting into Halloween season. But a month away, I guess. It’s always spring at Darling Data.

If you would like to see me live and in person in about two months or actually, no, I guess just about a month now because it’s right after Halloween. Oh, boy. Time flies when you’re Erik Darling. You can catch me November 4th and 5th at Past Data Summit in Seattle, Washington. And as always, if there is an event near you that you’re like, God, why won’t Eric come to my town and talk to me about SQL or my weekend habits or, I don’t know, my fingernail collection, let me know what event is nearby you where we can talk about your fingernails and why you shouldn’t keep them. But with that out of the way, let’s talk about this SQL Server stuff, shall we? I think that’s a good idea. So I end up in my day to day life needing to pull data out of odd structures. By odd structures, I mean stuff inside a SQL Server that is usually stuff like query plans.

XML galore. The block process report, which more XML galore. The XML deadlock report, would you believe that’s even more XML? And it’s not on the list there, but the extended events, again, all XML. God, it’s just XML all the way down. Really, you would think that SQL Server was just like an XML document database, the rate that everything turns into XML in there. You may run into it elsewhere. You may find yourself importing XML files or JSON. I don’t know. Wave of the future. You might YAML. You might have another language that ends in ML or, sorry, another file format or specification that ends in ML that you need to mess with. And they all do strings weird.

The main issue that I find is when I need to compare the data that comes out of primarily XML structures, the stuff that I have to deal with is query related for the most part. So you have query hash and plan hash. Those are binary eight. And then you have SQL handle and plan handle. Those are var binary 64. So two entirely different beasts. And there’s, like, I know that there’s, like, some Microsoft posts with, like, shenanigans around, like, the stuff you can do with big ints.

I can’t always get that to work consistently. In some cases I can, in some cases I can’t. It’s, I don’t know. I just don’t understand what’s different in these things. Perhaps I am not XML enough. I don’t know. I’m not an XML expert enough to tell you. But I do want to show you some stuff that I’ve learned while dealing with binary strings.

That come out of various data sources. So if we look at this query, right, and we say, we want to find out if this string that came out of, let’s just pretend it came out of a query plan, is equal to this other binary value. SQL Server is going to say no. Right? That is not equal. That’s why we have a zero there. Remember, the logic for this query is if this equals this, then one else zero. So we do not equal this. So we get a zero back. Even if we do something like this, and we say, convert this string to var binary eight, we still get a zero back. Why? That is crazy.

We converted a string to var binary eight, or sorry, to binary eight. Why would it not match this thing? The answer is because when you convert strings in SQL Server that you want to say are binary or var binary, you have to use convert with one of the little culture things. There’s one conveniently right by my head there. There’s culture one and there’s culture two, but you can see that I’m using that same thing in the binary eight one that I’m going to show you now. So the difference here is that one does not attempt to add an OX to the beginning and two attempts to add an OX to the beginning. So depending on like what you’re dealing with, you might need to use one or the other. Right?

So like if we say, if we do these things, look at the different results that we get back. Right? So I want to show you all in one screen. If we just convert this string to binary eight, SQL Server does that like literally it says, Oh, this, this string is now, this is the binary eight representation of this string. All right. That, that, that, that, that is not what we want. What we want is SQL Server to preserve the binaryness of this string and actually treat this, this string as a bar binary value. We can do that with, with the OX using the, the one culture. And we can do that without the OX at the beginning, uh, using the, uh, the two, the two culture there. So if your string does not have an OX at the beginning, uh, that is one way to get that to happen.

The same thing goes for var binary of just about any length. I’m showing you what I get in here from, uh, like SQL and plan handles. It is the same deal. If you look at what we get from this one, this is not the binary that we have here. Again, this is SQL Server saying, Oh, you’re binary. Now you, you, you, you, you, we don’t care about you. We’re just going to turn you into something else. So the, the, the, the, the transmogrify you into some weird binary representation, right? Binary representation of what’s already a binary.

Big air quotes string. So if we were to change our query logic here to say, uh, convert, convert binary eight with the one on the end. Now we’re going to get that one. We’re going to get that match back. So if you need to do stuff like this with binary data that comes out of weird data sources, or if you’re getting a hankering to like, you know, pull data out of like, you know, XML for the plan cache block process report, deadlock report, or even, uh, some other extended event.

This is the kind of stuff that you have to do in my store procedures. When I’m like, this is from, these two are from quickie store. What I do is I pull the, I pull the value out, uh, as, um, as a string, but I have a computed column in my table that does the conversion over for me so that I make sure that I get the right value from whatever ends up in there. And this is the best way that I’ve found to do it. You can also do that with, um, with the regular, with, uh, sorry, with the, um, do I have two there? No, I didn’t, I just didn’t scroll down to the right one. You can also do that with SQL handles like this. So this one rather than just, this one converts to var binary 64 instead of var binary eight. Cause remember SQL and plan handles are var binary 64, uh, query hash and query plan hash are binary eight.

So that’s what I do when, when I’m pulling data, this is out for quickie store stuff in other parts of, um, my store, in other store procedures of mine. This is specifically from SP human events, uh, where, uh, I forget exactly which part this comes from. This might be from like the block process report stuff. What I’m doing specifically here is like notice for the first two that are binary eight, I’m doing the same thing where I’m getting that, like, you know, slash value slash text from the XML.

But for the top two, I could never get these to actually work like doing the excess hex binary thing, binary eight, they would always come back null or like an empty string or with the wrong value. So this is one of those things I was talking about, like this, this stuff can’t always work, but when I get it as a big int from the XML and I convert it as a, to var, to binary eight outside of that, then it works. Okay. So the big int to var binary eight and that case works everything else. No bueno.

But for the plan handle, which is var, var binary 64, I can do this neat little XML function trick with excess, excess colon hex binary. And that I can get to be var binary 64 natively. What’s the difference? Damn. Defino I look at this XML and it all looks the same to me. It just doesn’t work when I actually query it.

It’s strange. In SP Blitz Cache, there was a really, really, there was this weird bug in SQL Server up until like 2016 Service Pack something, two or two and a half or whatever, where it was like a plan hash or query hash values were becoming sort of corrupted in a weird way where they couldn’t like whatever was coming out of there was absolute junk. We were getting like wrong lengths and stuff. And this is what I had to do for, to get this to be right was I had to say convert binary eight, this bunch of zeros.

And then I have the right function there. And then I get a substring of the query hash cutting off the OX, just getting everything after the OX. And then I get, uh, well, sorry, that’s like the three to the 18th thing with this, this thing concatenated on and then convert that to a binary with the two culture. So that was a really nasty thing that was happening in there. As far as I know, I still can’t, I still couldn’t pull that out of SP Blitz Cache just because someone might be on a 2016 version that still has that bug.

So there, there, it’s going to stay with this confusing code, just trying to get a query hash and a plan hash to work. So be careful out there. If you need to work with a binary or var binary data, uh, how you handle it and convert, you basically can’t use cast for this because cast doesn’t allow you to supply a culture where you can tell, uh, like with convert, you can tell it how to treat the string with cast. Cast is usually a kind of crappy way of doing things. Um, I do kind of look forward to the day when Microsoft adds like the double colon cast that like Postgres and other databases have, but I don’t think that would work here.

I don’t think that would fix this either. I think you pretty much straight up need convert for this one. So, uh, be very careful out there with, with, uh, how you treat your binary data, because if you need to compare it to other binary data, it could be awfully, terribly, disastrously, and most importantly, incorrect. So, thank you for watching. Big chin. I hope you enjoyed yourselves. I hope you learned something. Uh, I hope that, uh, this video was of some use and utility to you. Otherwise, I’ll just go jump out the window. The last thing you’ll hear is the sound of shattering glass and then there’ll just be empty space on the screen for until my camera overheats, which takes about 40 minutes.

So, anyway, uh, I think, I, I think I hear Ma ringing the dinner bell. So I’m going to go, gonna go stuff myself full of chicken and broccoli. Uh, that’s what, that’s what, that’s what we do here at Darling Data. We, we stay healthy. Uh, anyway, thank you for watching. Uh, goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Cardinality Estimation Feedback In SQL Server 2022

A Little About Cardinality Estimation Feedback In SQL Server 2022



Thanks for watching!

Video Summary

In this video, I dive into cardinality estimation feedback—a feature introduced in SQL Server 2022 that not many people are talking about yet. With prerequisites like Query Store turned on and being in Compat Level 160, it’s a bit of an underdog right now. However, I demonstrate how this feature can help correct cardinality estimation issues by showing a query that initially had wildly incorrect estimates but eventually got fixed after running multiple times. By the end of the video, you’ll learn about different hints and considerations for enabling this feature, as well as when it might become more widely adopted in future SQL Server releases.

Full Transcript

Erik Darling here with Darling Data, home of the most reasonable rates in the SQL Server universe. Actually, finally acknowledged by BeerGut Magazine to have the most reasonable rates of any SQL Server consultancy that has any competence whatsoever. I’m sure you can find someone much cheaper who sucks at everything. So enjoy that experience with your life. In today’s video, we’re going to talk about cardinality estimation feedback. Why does it say CE feedback? Well, you’ve got cardinality which starts with C, and you’ve got estimation which starts with E, and then you’ve got feedback. So it all makes sense, I promise. This was a feature that got introduced to SQL Server 2022 that not a lot of people have talked about. Mostly because there’s probably not a lot of people on SQL Server 2022. Some other prerequisites is you have to have Query Store turned on and you have to be in Compat Level 160. So if you’ve got all three of those moons aligned around you, maybe you’re getting this and you just don’t know it. I don’t know. But I don’t really hear anyone shouting from the rooftops how great their cardinality estimation feedback is. So I don’t know. Maybe this will help open your eyes to just how okay cardinality estimation feedback can be. So I don’t know. Maybe this will help open your eyes to just how okay cardinality estimation feedback can be.

So before we do that, let’s talk about you and me feedback. And let’s say if you enjoy this channel content, if you like SQL Server stuff, maybe you just like looking at me. Maybe you like the sound of my voice. I don’t know. Whatever it is that keeps you coming back here doing things. If you would like to support my endeavors with this channel, you can sign up for a membership. It’s about four bucks a month at the low end. You can, of course, choose to be more generous if you’d like. But, you know, four bucks a month, not bad. All right, depending on how much, how many bucks a month you have. If your exchange rates are currently just too much for you, even if you’re an American and inflation exchange rates are just too much for you, you can like, you can comment, you can subscribe.

And you can fill my heart with joy as I answer YouTube comments to fill in all the lonely, lonely gaps in my life. If you need help with your SQL Server, and I guarantee you all do, and you would like someone with very reasonable rates to come work some magic on your SQL Server. Be like a little SQL Server masseuse and give your SQL Server a nice time. I’m available to do all of these things and my rates are reasonable. If you could use some high quality, low cost SQL Server training, you can get beginner, intermediate and expert level stuff from me, all for about 150 US dollars for life.

So long live you. You can use that link and that coupon code or you can just click the link in the video description and guess what? It’ll bring you right there. It’s amazing. Advanced hyperlink technology, ladies and gentlemen. Wave of the future. Live and in person, of course, coming up in, oh geez, just about a month. November 4th and 5th, I will be in Seattle, Washington at Pass Data Summit with Kendra Little.

I thought I was going to say Cat Lady Kendra Little. I think she just has dogs though. But I’ll be there co-hosting two days of performance tooting mastery and magic and that’ll be fun. And if there’s an event nearby to you, your heart, your soul, some family friendly part of you, and you would like me to show up there to talk about SQL Server, tell me what that event is so that I can try to go there.

Let’s make sure that they’re family friendly parts because as evidenced just moments ago, this is a family friendly channel. And now with that out of the way, let’s start our SQL Server party. This is not a slumber party. We’re going to be wide awake for this one.

We are going to do this one good. So I’ve got this query here and I have run this query 17 times. And if we look at this query, well, hopefully if everything works the way it has worked the last three times I’ve run through this, we will have the same execution plan all 17 times.

And we can see looking through all 17 of these executions, the SQL Server made an oopsie. It made a boo-boo, made a boom-boom. Look what happened.

It was off by a lot here, almost 8,000%. It was off by a lot here. Well, actually, that’s almost the same number of percent that that’s wrong, right?

That makes sense, though, since it’s a key lookup. And then over here, well, it’s still off by the same amount because that’s just the nested loops joined from this key lookup. And then that index seek to post, well, we guessed 10,001.

We got zero. So clearly, cardinality estimation has gone amok here. How amok has it gone?

Well, enough for SQL Server to care about it. Enough for SQL Server to want to give us some feedback on our cardinality. Now, of course, you can do that. You can see which queries are affected by cardinality estimation feedback and all sorts of other feedback mechanisms in SQL Server with my free amazing store procedure, SP Quickie Store.

And if we run this query just like this, notice that we get a whole bunch of stuff back. But the important stuff that we get back is up here. We have this query.

We can see it. It ran. There’s a query plan for it. But look in here. Feature description, CE feedback. We have feedback data. We have some feedback hints.

Well, I don’t think there’s anything else in there at the moment, though. It’s just some lonely empty JSON. And we have this pending validation. So we ran this 17 times, and it’s pending validation.

I don’t think there’s anything else in here. Nope, that’s empty. So what we’re going to do is we’re going to come back, and we’re going to run this one more time. All right?

And what happened on that final time? We got a different query plan. Good Lord. Look at that. Cardinality estimation has been fixed.

We started scanning the votes table. And, well, I don’t know. We still got some bad cardinality feedback there. We still have some bad cardinality estimates on this. We guessed 104, 535, and we got zero.

But you know what? It’s okay, because we got that one right up there. Maybe we just need another round of cardinality estimation feedback. Maybe we need one on the join now.

I don’t know. Maybe SQL Server is going to keep working on it. I’m not going to run this thing another 17 times to find out, though. So now if we go look at Query Store via SP Quickie Store, and we look at what happened, now we have two query plans for this.

We can see that Query ID 3 has now generated a Plan 3 and a Plan 4. So SQL Server has formally acknowledged the new plan. And if we look in here, we’ll see that we have for our CE feedback that SQL Server has disabled a row goal and that verification has passed.

And that’s what gave us the new query plan. And we can see that SQL Server applied the hint. Option use hint.

Disable optimize our row goal. Isn’t that all fantastic? We didn’t have to do anything. SQL Server just finally looked at a query plan and went, you know, it’s just a lot going on.

We’re really messing up. Maybe we should try a different strategy. And it finally applied one. Now, this is cool because it did work eventually. It just took, you know, what, 17 executions.

And then on the 18th execution, SQL Server was like, we got this. We’re doing this. We’re going the whole way.

So it’s cool that it does eventually work. So I’m psyched on that. So, you know, in five, six years when everyone gets to SQL Server 2022, and just like when it’s on the cusp of being end of life or end of support or whatever it’s called, and maybe you’ll be in compat level 160, but probably not, which I understand.

But maybe you’ll see this too. I don’t know. I mean, SQL Server 2022 is already like going on three years old. Wow.

Makes you feel old, doesn’t it? Crusty. But there’s pretty good documentation about how this all works and things that are in play. And I kind of want to just scroll down to the different hints that it can apply.

So, of course, there is the feedback correlation, which it uses the correlation model assumptions. These are the different correlation models.

We have fully independent, partially correlated, and fully correlated. I don’t know why it’s not fully independent, partially independent, and fully correlated. I guess it doesn’t really matter.

Maybe there should be a fourth step where it’s partially independent and then partially correlated, because you don’t know what percentage is partial. Right?

Is it 50-50? The 75-25? 60-40? We don’t know. There might need another step in there. You know? Moon phases. There’s feedback join containment, which goes off either simple containment, which is the old cardinality estimator model of doing it, or base containment, which is the new cardinality estimation model of doing things.

And here’s the one that we got, the query optimizer row goal. Yeah, buddy. This is the one that kicked in for us and did all sorts of wonderful things for our query, some sort of.

And then down here, we have some considerations for cardinality estimation feedback. And this is all the stuff that you need to have compat level 160. You have to have this not turned off.

This will turn it off. You need to turn it on or have it turned on. You need to not use the disable feedback hint. And you need to have query store turned on so that SQL Server can track these things. There’s some more stuff in here.

I don’t know if you’re… Oh, there’s some queries. Wow. Microsoft wrote queries for us. I bet they’re great. Thanks, Microsoft. Ah, look at that.

Okay. Well, that’s enough web browsing for one day. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

I’m going to scroll away from those queries. They’re hurting my eyes. I do hope that eventually… Well, I don’t know. I have a feeling that sometime… Sometime…

Is it September 20-something here? I have a feeling that sometime in the next… Let’s see.

Three to five or so months, we should start seeing SQL Server the next CTPs rolling around. That’s probably a reasonable thing. Probably a reasonable timeline on that.

So I have a good feeling about that. So hopefully we start seeing that soon. And I don’t know. Maybe SQL Server 22 will just be kind of one of those in-betweener versions that no one touched, like SQL Server 2014 or SQL Server 2017. Because 2016 and 2019, people seem to touch.

And SQL Server 20 v.next? Maybe that’ll be the one. Maybe 2022 is just… It’s kind of a cursed release.

I don’t know. But anyway… That’s probably when most of you will probably start seeing this feature kick in a little bit more for you. We hope.

By we, I mean me. Because I like fixes that do things automatically. And I like when they work correctly. So this one appears to do okay sometimes.

So good job, Microsoft. Investing in the future. Anyway, I can’t remember what I already said. So I’ll say it again, just in case.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you over in the next video, which will have equally reasonable rates as this video. All right.

Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Little About Optimizer Timeouts In SQL Server

A Little About Optimizer Timeouts In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into the intricacies of optimizer timeouts in SQL Server, dispelling common misconceptions about what they actually mean. Erik Darling from Darling Data shares insights on how these timeouts occur and discusses the different search phases that the optimizer goes through to generate a query plan. He also provides practical advice on monitoring and identifying queries that have met an optimizer timeout using Query Store and DMVs like `sys.dm_exec_query_optimizer_info`. Additionally, I touch on the importance of having a better relationship with your audience by encouraging support through subscriptions or memberships, which can be accessed via links in the video description.

Full Transcript

Erik Darling here with Darling Data. Feeling… Oh, I don’t know. Does it matter how I feel? Never seems to. The videos must flow. So here we are. In this video, we’re going to talk about… What is wrong with my face? It’s like a red… Looks like I got smacked. I have not been smacked recently. Perhaps… Perhaps to my own detriment. In this video, we’re going to talk about optimizer timeouts. Because everyone seems to think that optimizer timeouts mean… Your query took a long time to optimize. Which isn’t true. Oh no. Oh no, I’ve given it all away. But before we go in and we talk about these optimizer timeouts, we’re going to talk about how you and I can have a better, stronger relationship. If you love this content so much that you feel that… As little as $4 a month might be a good way to say thank you for all of these videos. You can subscribe or you can get a membership to my channel at the link below. In the video description. It’s a wonderful place to contain these links and whatnot. If you like this content, even the one-armed man type content, liking, commenting, subscribing, all wonderful ways to support my endeavors with SQL Server. At least, you know.

For now. If you need any consulting help with a poorly performing SQL Server, I have a variety of things that I am pretty good at. If you need something else, please let me know what it is. My rates are reasonable. For some low-cost, high-quality… It’s a good thing I didn’t mess that up and reverse those because low-quality, high-cost training… I don’t know. That sounds like what you get from everyone else. But if you want some high-quality, low-cost training, you can use this discount code and you can go to that link and you can get all of my high-quality, low-cost training for about $150 for life.

If you would like to see me live and in person, if you would like to see me strut about a stage with a microphone quite similar to this one, talking about subjects quite similar to what I talk about here, you can catch both Kendra Little and I co-hosting two days of performance tuning madness, madness, madness at Past Data Summit, November 4th and 5th in Seattle, Washington. Seattle’s lovely that time of year. You should see the sunsets over the water thing. There’s a ferris wheel and all sorts of other stuff. Quite nice.

Quite nice. So, with that out of the way, let’s talk about Optimizer Timeouts because this is something that comes up quite a bit in my consulting engagements when people pay me to talk about SQL Server. So, I guess, I don’t know. Are you stealing from me? Are you getting it for free? Oh, God. So, when your query passes all of the initial terribly boring stuff that queries have to go through before SQL Server starts optimizing them, like parsing and binding and whatnot, the Optimizer makes a first pass to assign this sort of initial starting cost to your query.

This is, I don’t know, sort of a low-level heuristic based on complexity. Things like the number of joins, sub-queries, I don’t know, things like that. Probably some other stuff.

I don’t know. Whatever makes queries more complicated to SQL Server. And it’ll assign an initial starting cost, which tells it not an initial time, like on a clock, but a number of steps that it’s willing to take in order to come up with a good enough query plan. So, some plans are trivial where there are no cost-based decisions.

I’ve talked about trivial plans on the Chan before. Some queries are relatively simple, but do have cost-based decisions. That gets you to, like, the, pretty much like the OLTP type queries, where it’s just like, oh, yeah, select a customer and their order or whatever.

So, that’s search zero. Some queries do require a bit more thought. That gets you to another search phase called Search 1, where SQL Server, you know, is willing to do a little bit, spend a little bit more time thinking about join types and ordering and whatnot.

And some require a real big think. You know, one of those, you know, grab a book and head into the bathroom type thinks about how to come up with a good plan for your query. And sometimes, some queries are real, real bad, and the optimizer will run out of memory trying to compile a plan for them, and you’ll get something that says memory limit exceeded.

That is what I wrote over here that was slightly cut off on the screen before. I do apologize for that. But the initial cost of your query sets what’s called a budget.

Most people are not aware of what a budget is. It is essentially your limitation. Something.

How much you are willing to spend on something. And that initial, that budget is dynamic and may increase. It will not decrease.

If the optimizer doesn’t explore every single possibility, if it does not exhaustively search through every possible combination of things, because it spends the whole task budget, that’s when it adds this statement opt early abort reason. And that’s when it’s timeout. So, to kind of show you where you can see some of the both initial phases, some of the stuff that I talked about up there basically.

You can look at this DMV called sys.dm.execqueryoptimizerinfo. And that will show you when you had a, how many times you had a trivial plan. And how many times like me you had no plan whatsoever.

And so just winging it. It’ll also tell you about all the search phases. So search zero, one, and two in here. It’ll tell you how much time it spent, how many tasks met it for all these things.

It will also tell you if anything went from zero to one or one to two. So that’s what I meant by it sort of being dynamic. It will also tell you if there are, if there have been any timeouts and if there have been any memory limit exceeded on here.

It might be interesting to monitor for memory limit exceeded, but you know, I don’t think most people are writing queries where that would be of any real benefit to you. So that’s where you can see that stuff. Where you can, how, one way you can find queries that have met that optimizer timeout.

Well, I’ve written a query here for query store where we can look and we can parse out the XML a little bit and we can find queries that have met the optimizer timeout. Now, what’s interesting about these is if, I don’t know, if we scroll down a bit, there’s all these queries that say stuff like msparam in them. There’s even this crazy XML namespaces query.

I don’t know who would write something that insane. But we have all these msparam queries. And if we open up the query plan for them, you can see that these are rather big and complex. The initial cost of this query though is 0.89 something query bucks.

So not a very expensive starting query. And, you know, if we zoom all the way out, it might be fun. But if we look at this one, we can see this is a rather large query plan.

So clearly SQL Server didn’t, you know, based on the initial starting cost was like, I’ll take this many stabs at coming up with a good execution plan. But if I get, if I go through all this, I’m just picking the cheapest possible one and throwing that out the door. If we look at a more realistic user query, now these are just, of course, these are msparam.

These are Microsoft internal queries timing out. If we look at an actual user query, right? And we look at this query plan, we’ll see one that’s far less complex.

There’s really only one, two, three joins in it, right? There’s an adaptive join from users to comments. There’s an adaptive joins from posts to votes.

And then there’s an adaptive join. Oh, sorry. There’s a non-adaptive hash join. Oops. I made a mistake. A non-adaptive hash join to join the results of both of those joins together. So there’s like essentially three joins in here.

But if you look at the initial starting cost, that is 13 query bucks or 13, I guess almost 13 and a half query bucks. Well, we can be kind and round up a little bit there. Make a donation to the Darling Data Home for Little Wanderers by rounding up a couple cents to the nearest 50.

And if we look at the properties of this, notice the compile time was only 46 milliseconds, but we still see this timeout down here, right? We still see SQL Server saying, I ran out of time to optimize this query. But what it really did was run out of tasks to optimize that query.

So the number of steps that it was willing to take and try to come up with a good execution plan for this query, SQL Server was just like, you know what? 13 and a half query bucks. I don’t know.

I don’t actually don’t know how many tasks you get per query buck or if it’s like gated a little bit. Because something some things in SQL, something inside of SQL Server like memory grants, SQL Server does use query cost as sort of gateways for figuring out who will get the next memory grant. So and but that’s bucketed from like less than 10, 10 to 99, 100 to 999, 1000 to 9999.

So maybe there’s like, you know, depending on not just like per query buck, but maybe for every like 100 or so query bucks, you get a different number of tasks allotted to you. I’ve never I’ve never quite had the temerity or the tenacity to go try and figure figure that part out. But of course, that also might be dynamic even within that.

And I just don’t want to give you wrong information. So I’m not going to give you any information on that because I don’t have any. I don’t want you to start thinking the wrong things.

But looking through some of the other metrics that we get back from query store, we can see compile, optimize duration or sorry duration and CPU. We can see compile memory. We can see some information about the queries like the number of executions, the average duration of the query, the average CPU time of the query.

But this this stuff over here slightly less sorry, this stuff over here highlighting. Thank you. Slightly less useful. This stuff over here for the purposes of what I’m trying to show you is that none of these spent a very, very long time compiling or parsing or binding or anything really.

Or they didn’t even use a lot of memory to do what they had to do. We see eight megs of memory for the first query, 31 megs of memory for the second query. Then if you’re like, you know, like a real memory stick or you’re eight megs, we went to the moon on that.

I don’t know what to tell you. Okay. I just I just don’t know what to tell you there. But we can see that we just did not spend a lot of time compiling plans or optimizing plans for the query.

For these queries. And now that I’m looking at this, I think that I actually forgot to convert those from from milliseconds to or from microseconds down to milliseconds. So those actually look a little bit bigger than what they were in the query plan.

But you know what? I’m 12 minutes into this thing and we’re just going to give me a slap on the wrist for maybe doing some math wrong. And we’re going to we’re going to carry on from there. But hey, it’s free.

At least for the majority of you, it’s free. So anyway, that’s about it for this. Just to sort of sum things up a little bit.

When you see a timeout in your query plan, like if you see that your optimizer timed out, that doesn’t mean that the optimizer took so long trying to come up with a good enough plan for your query that it just, you know, went just, you know, took its ball and went home. It just means that it tried through a certain number of steps or tasks or its budgeted tasks. And then it said, you know what?

I’ve spent enough. I’ve spent enough tasks on this trying to rearrange this query. I think I’m I don’t think I’m getting anywhere. I think I’m just going to give you this cheap, this good enough cheapest plan I found so far. So sometimes that’s fine.

We’re rather actually not even sometimes most of the time that’s going to be good enough. Other times you might need to, I don’t know, do other things to influence the optimizer to come up with a better plan. There have been some, well, actually, I think the use hints are still not quite functional.

It was like an optimized for analytic query or workload or something hint that I don’t think I’ve ever really gotten to work. But then there were some trace flags that would expand how long or rather how many tasks SQL Server was willing to go through to come up with the query plan. So those were, you know, actually, I’ve never actually seen those be useful for anything I was working on.

But I’m sure that they were useful for someone because they ended up as trace flags. So some someone out there probably at like Walmart or Walmart or SAP really needed them. So, you know, the rest of the rest of us little eyes, you know, we don’t get we don’t get stuff added to the product willy nilly.

Lucky to get a working cumulative update sometimes. Anyway, I hope you enjoyed yourselves. I hope you learned something.

I’m going to record some other videos now. So hopefully the key grips and gaffers don’t take their balls and go home. We got some work to do today.

Anyway, thank you for watching. My rates are reasonable.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Help! My Query Runs Too Long To Get An Actual Execution Plan!

Help! My Query Runs Too Long To Get An Actual Execution Plan!



Thanks for watching!

Video Summary

In this video, I share some tips on how to capture query plans for long-running queries in SQL Server. Specifically, I demonstrate two methods: using the SP Whoisactive feature and my own SP Human Events store procedure. These techniques are particularly useful when you need to gather detailed information about a query’s execution without having it run too long to get an actual plan. I also take some time to highlight the official BeerGut magazine website, which is now live with a collection of historical magazine covers that cover a wide range of topics, including SQL Server and community heroes like Sean Gilardi and Joe Sack. The video concludes with a bit of humor about query tuning and operator times, encouraging viewers to support my work if they find it helpful.

Full Transcript

Erik Darling here with Darling Data. And before we get into anything from this video, I realized that I have not… I did, but I did not… have not announced on YouTube the official website for BeerGut magazine is in fact live and in production. I added a bunch of historical magazine covers to it, and I wanted to and I’ll be adding more as time allows, but there’s quite a few up here now with really just some thought-provoking, really just wonderful topics that BeerGut magazine has covered over the years. This is a personal favorite of mine, about beer and the goth club. It’s a very good issue. There were some other good ones. You know, there’s some about me. That’s clearly probably the finest representation of me that I’ve ever seen, I’ve ever seen in my life. We have issues devoted to SQL Server, Community Heroes. There’s one about my friend, Sean Gilardi here. There’s one about even Joe Sack, the patron saint of databases. We’ve got one about Paul White. Some more about me. I don’t know. There’s one about some bass player who likes Entity Framework or something. We’ve got…

Probably, I think the finest… Probably the finest copy of BeerGut magazine in existence. I’ve got some autographed ones sitting around. BeerGut magazine did St. Patrick’s Day and boy, that was… That was pure chaos. We’ve also got some nice issues about… Well, our friend Taryn Pratt. We’ve got one about Kendra. We’ve got one about our Canadian friend. Michael Swart, who is… Again, a tremendously faithful representation of his likeness there. And, of course, we’ve got one about InfoSec Superstar. Super… Superstar? Not stupid star. That’s messed up. Sherrod DeGrippo. So, we’ve got… I’ve got magazine covers going back a ways here. And I’ll be adding more as my scanner allows. But, you know, it’s been fun, at least so far, creating these and getting that going.

But, anyway, let’s talk SQL turkey. Boy, I ruined that one, didn’t I? Spoiler alerts. Anyway, logo. Proper amounts of time paused on the logo. We’re going to talk about what happens or what you can do when your query runs too long to get a full actual plan. This can happen for a variety of reasons. I’m actually going to cover a slightly different scenario, too, just because I feel like it.

And, gosh darn it, it’s my YouTube channel. Before we do that, let’s talk about turkeys. Specifically, turkeys that cost $4 a month, just like memberships to this channel, if you feel like showing your support with a monetary donation. If you don’t have $4 a month because you spent it all on actual turkeys. I know inflation has been driving the price of turkeys up tremendously lately.

You can like, you can comment, you can subscribe. All will be well and happy in my heart if you do that. If you need help with your SQL Server. Your SQL Server is not performing very well. I don’t know. Maybe, you know, you tried sending it to some improv classes and it was like, this didn’t get much out of it.

If you need any of this stuff, that is my job and, as always, my rates are reasonable. If you would like some high-quality, low-cost training for a SQL Server, I’ve got that in abundance. I hope you consider 24 hours or so of training in abundance because I do.

24 hours is a pretty good amount of training for anything. You can get that for about $150 US dollars for life and you should take advantage of that while it is available. I mean, while your life is available, of course.

If you would like to see me live in person, November 4th and 5th, I will be doing two days of pre-cons with Kendra Little at Past Data Summit. If there is an event near and dear to your heart and you would like me to come there while you are still alive and do a pre-con, let me know what that is so I can submit. Or maybe you’re just so cool you can fast-track me right in.

Whatever works. Either way, let’s get on with this query party. So let’s say that we’ve got a query that runs for, I don’t know. I think for the sake of argument, we’re just going to say that this runs too long to get an actual execution plan for.

There are two ways that you can do this. One is if you are cool with, if you have like a fairly simple, like either just a single query or, you know, a store procedure where you kind of know that one query and it runs for a long time. What you can do is turn on actual execution plans, either using the little button up there that says include actual execution plans.

Or if you know which query in your store procedure causes you the most problems, inside of your store procedure, you can wrap that particular query and set statistics XML on. And then, you know, your query here. Oh, that’s the wrong button.

Your query here. And then at the end of that, set statistics XML off. And what that will do is for that one query, actual execution plans will be enabled. What that’s really helpful for is if you are in a situation where there are a lot of other little queries that do stuff around your big problem query, like assign variables, go look at stuff, maybe stick stuff in temp tables, things like that.

You can do all that stuff without fetching the query plans and then just fetch the query plan for your troublesome query. Once that’s running, now I have a, I mean, this store procedure that I’m using here only has one actual query in it. So we don’t really need to worry too much about anything else.

But if I run this and then I come over to this window, that’ll run for about 14, 15 seconds. And, you know, as we watch the time sort of pile up at the bottom there, we can see that query running all in there. And as we click on the query plan for that, since we have actual execution plans enabled for that query, we can actually see that query making progress with operator times in there.

Now, these times are not going to be absolutely dead to life perfect. Dead to life, that doesn’t make sense. Why did I say that?

Dead to rights? Dead to life. It’s like a hardcore band. Actually, now that I’m thinking about it, I’m going to start a hard, that’s going to be the official Beer Gut Magazine hardcore band is dead to life. What the hell is wrong with me?

But you can at least get some idea of like, okay, you know, we’re spending a long time in here or we’re spending a long time in here. Oh, I don’t like the way this looks. We spent a long time doing this, that, and the other thing.

So as the query makes progress, you can, with at least some, you know, semblance of reality, not like as good as having the actual plan, you can at least grab, like get some idea of like, oh, I see, we’re jammed up here. Let’s try and fix this and then run it again and see how much progress we made.

So that’s one way of doing it. So SP who is active will, at least for more, like, you know, more recent releases of it. Like, you know, there was a release of SP who is active now like three years ago where this was in there, but it was in the release before that too.

So hopefully you have updated SP who is active sometime in the last three or four years. Some of you don’t. Some of you never update anything.

It’s wild. So that’s one way of doing it. Another way of doing it would be to use my store procedure, SP human events. Now, this doesn’t solve the problem of capturing in-flight query plans, but it does, it can help you figure out which queries in a store procedure run for a long time and screen out the ones that don’t.

So the way that I’m running SP human events here is I’m capturing query performance metrics. I am only interested in queries that run for, that have one second or more of duration. And I am limiting things.

Oh, why did you go away? And I’m limiting things to just my session ID, which I have, you know, of course, thoughtfully signed up there. And I’m using the keep alive parameter so that I have a persistent ring buffer session that captures stuff. Now, that will, of course, clear out as it fills up.

But before it does, it’ll stick all sorts of live data in this window. Right. So if you right click on an extended event session and you say watch live data, a window just like this will show up and you’ll be able to add it and you’ll be able to see events as they come in. Where this is useful is because we’re capturing the query post execution plan event.

Right. Or sorry, query post execution show plan event up here. We actually get the full query plan once it’s finished here.

Right. So looking at all this stuff, we can see like the actual times in some of these operators or for all of these operators. These are, I think, mostly correct.

And we can see, you know, oh, we spent four and a half seconds there. We spent 3.8 seconds there. This is all stuff that we could, you know, like try to fix in various ways. This isn’t a query tuning video.

This is to help people who, you know, just need to either get advice about a query, capture a query plan. You know, if you want to ask a question on like Stack Overflow or Stack Exchange or wherever you prefer to do your Q&A stuff from. This is just ways to help you get better answers faster.

Posting pictures of query plans, posting estimated plans will often not get you very good advice very, very quickly unless there is something just so absolutely pathologically wrong with your query that someone can offer good advice on a rewrite. Just, you know, kind of like, you know, psychic finger training on things. So this is a couple ways to capture query plans for stuff.

I guess the nice thing about the extended event option is, you know, aside from being able to, with SB Human Events, aside from being able to screen out queries that, you know, don’t take very long or don’t meet a certain threshold, if you have a long store procedure with multiple steps in it, you can capture all of the query plans for those. Now, that doesn’t help if they all run too long for you to get an actual plan from, but it might help if you are, if that, if you are looking to, you know, capture results in a way that, you know, you can sort of have a record and log of and maybe come back and revisit.

You know, you can kick something off, go, you know, go to bed at night, come back in the morning and see if it’s done, stuff like that. So there are a couple ways to get query plans for things that run for a long time or for, you know, if you, you know, in a pinch and you just want to like do some, you know, like, let’s see how it goes tuning stuff. The SP WhoisActive method is pretty good for that because you can stop and start the query, make changes, see if those changes actually happen.

If SQL Server chooses your indexes, if it, you know, chooses a better plan, all sorts of stuff like that. So just a couple of quick ways of making your query tuning life easier in this video. Not too much else to go on here.

I’m not going to tune this query because I’ve intentionally made this query bad so that it took enough time to run and make sense for this set of demos. So we’re not going to do that. We are going to stop and marvel for a minute, though, at the fact that this query really only ran for 16 seconds about.

But SQL Server might be trying to tell us that it ran for almost 32 seconds. So, again, pretty please. If anyone from Microsoft or if anyone who knows someone at Microsoft, our friend Sam, someone at Microsoft, if anyone is out there, please have some work on the operator time code.

Because some of the things that I’ve seen make me question the very basis of reality. Anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that you will continue to watch and learn something. But if you don’t, I don’t know, maybe you’re off to learn Postgres or Big Data. You’re going to be a data engineer now.

You’re going to go DuckDB yourself or something. I don’t know. Whatever you want to do. Have fun. That’s the important thing. Consequences be damned.

Anyway, thank you for watching. Goodbye. I think it’s dinner time. Mmm. Dinner.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.