Learn T-SQL With Erik: Fun with RAISERROR!


Video Summary

In this video, I delve into some of the often-overlooked aspects of the `RAISERROR` statement in SQL Server. While many have moved on to using `THROW`, I still find significant value and utility in `RAISERROR`. I explore its various substitution parameters, such as `%d` for signed integers, `%x` for unsigned hexadecimal, and `%I64d` for bigints, pointing out the absence of native support for bits or temporal data types. This can be a bit cumbersome, but it also opens up creative possibilities for error handling in dynamic SQL scenarios.

I also cover how `RAISERROR` can be used effectively within loops to print messages without waiting for buffer fills, and demonstrate how adjusting severity levels can change message appearance. Additionally, I show how you can dynamically assign values to parameters like severity and state based on your code logic, providing a more flexible error handling mechanism. The video concludes with practical examples of using `RAISERROR` in dynamic SQL contexts, highlighting its benefits over traditional error functions.

Full Transcript

Erik Darling here with Darling Data. Today we’re going to be talking about some of the somewhat overlooked things about RAISERROR, because I still find value and use and utility in RAISERROR. Some folks have graduated to THROW, or semicolon-THROW, as I often see in the wild for no particularly good reason. If you just terminate your previous T-SQL statement, you won’t have a problem. So there’s that. Anyway, we’re going to do that. In the meantime, if you feel so bold, look a little bit further down in the video description. My face is up here, but I don’t care if you look down there for a minute. You’ll see all sorts of useful and helpful links for ways you can interact with me further. You can hire me for consulting. I do that as a full-time job. It’s nice. You can buy my training, including the Learn T-SQL With Erik course that this is teaser material from. It’s down there. You can become a subscribed member of the channel and give me a small amount of money every month to keep producing this wonderful content. You can ask me office hours questions for those episodes where I answer five user-submitted questions. I have a bit of a backlog now, but, you know, times are busy, so we’re rolling these videos out as we can. And of course, if you enjoy this content and you feel it would be a remarkable improvement in someone’s life if they were to view this content on a regular basis, I would suggest that you not only like and subscribe, but that you tell that person about this channel. Perhaps send them a link and encourage them to like and subscribe. So, with that out of the way, we need to discuss my travel plans.

For the remainder of the year, I will be in Dallas September 15th through 16th and Utrecht October 1st and 2nd for a couple of mini PASS On Tour events that Redgate is putting on. And of course, this all leads up to the greater, larger PASS Data Community Summit taking place in Seattle, Washington from November 17th to 21st, where I will be doing two days of T-SQL pre-cons with Kendra Little. So we do have many things to look forward to there. But with that out of the way, let’s talk about some of the fun stuff with RAISERROR.

Now, the first rule of RAISERROR is to make sure that you understand how to use it. If you look at the things you can use as substitution parameters with RAISERROR, you’ll notice that you have %d or %i for signed integers.

You have %x, or, that should be an uppercase %X. Let’s fix that. Don’t stay zoomed in on that for too long; it got lost in a copy-paste or something. We have %o for an unsigned octal. I’m not sure what that is.

I’m not a math person. I don’t know what an octal is. You have %s for strings. You have %u for unsigned integers. Well, apparently... I don’t know.

I don’t know why you wouldn’t sign an integer. And then you have %I64d for bigints. There are a couple of notable things missing here.

Like, say, bits. Right? Or dates, or times, or datetimes. If you want to use bits, or some temporal thing, you have to store the value as a different data type before you are allowed to use it with RAISERROR.

So if you have a bit, you often have to store it as a tinyint. And if you have a date, you often have to convert it to a string in order to present it as a substitution in RAISERROR.
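As a minimal sketch of that staging dance (all variable names here are made up for illustration), bit and date values get copied into substitution-friendly types first:

```sql
/* A sketch, not a definitive pattern: stage bit and date values
   in types that RAISERROR's placeholders actually support. */
DECLARE
    @is_enabled bit = 1,
    @as_of date = GETDATE(),
    @bit_as_tinyint tinyint,
    @date_as_string varchar(10),
    @big_number bigint = 9999999999;

SELECT
    @bit_as_tinyint = @is_enabled,                       /* bit -> tinyint, usable with %d */
    @date_as_string = CONVERT(varchar(10), @as_of, 23);  /* date -> string, usable with %s */

RAISERROR('enabled: %d, as of: %s, big number: %I64d', 0, 1,
    @bit_as_tinyint, @date_as_string, @big_number) WITH NOWAIT;
```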

So there are no native replacements for bits or temporal types, which is quite annoying. I find that quite obnoxious about RAISERROR, because now I have to declare all these additional parameters and hang on to them, and it’s just annoying.

Why we couldn’t get bits and dates for RAISERROR, I don’t know. It’s just the way it is. So there are some neat things with RAISERROR. Even if you just want a normal printed message, you can use WITH LOG if you want to send it to the error log.

I’ve never found a lot of use for this. Maybe people who do more dedicated SQL development work, who plow through the error log for various things, find this a useful addition to it.

I’ve never really used it all that much. Most of the code that I write these days is my stored procedures, where I don’t need to send anything to the error log.

There wouldn’t be any use in that. And most of the client code that I work on, there’s no sense in polluting their error logs with anything that would happen in there. But if I were writing some sort of maintenance procedure, this might be useful.
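For a maintenance-procedure scenario, the sketch is tiny; note that, as I understand it, writing to the error log this way requires the ALTER TRACE permission (or sysadmin), so treat that as an assumption to verify in your environment:

```sql
/* Severity 10 prints like a plain informational message; WITH LOG also
   records it in the SQL Server error log and Windows application log. */
RAISERROR('Index maintenance step finished.', 10, 1) WITH LOG;
```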

But notice, if we use 10 for the severity here, the error gets printed in, well, because I’m using dark mode, this text is some form of white.

If I were using light mode, this text would be some form of black. So local factors will apply. But the most important thing is that this doesn’t look much like an error, does it? It just looks like a regular printed string.

That’s why RAISERROR can be very, very useful when you’re working with things like loops. If you are writing a loop of any variety, or any sort of task where you want to print a message when things happen, WITH NOWAIT is very, very useful, because you do not have to wait for these messages to buffer anywhere before SSMS starts spitting them out.

If you just use print, you do have to wait for certain buffers to fill before the print messages come out. If you want your text to come out red, you can use severity 11 or higher. And now notice we get some additional stuff here along with the text being red.
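A minimal sketch of that loop pattern (the batch numbers and delay are stand-ins, not anything from a real workload):

```sql
/* WITH NOWAIT flushes each message to the client immediately instead of
   waiting for the output buffer to fill; severity 10 prints plain text,
   while 11 or higher would show up red in SSMS. */
DECLARE
    @i int = 1,
    @msg varchar(100);

WHILE @i <= 3
BEGIN
    SET @msg = CONCAT('Starting batch ', @i, ' of 3 at ', SYSDATETIME());
    RAISERROR(@msg, 10, 1) WITH NOWAIT;

    WAITFOR DELAY '00:00:01'; /* stand-in for real work */
    SET @i += 1;
END;
```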

This would be red regardless of dark mode or light mode or anything in between. System default, if you will. One thing that you do want to avoid, though, is using an aggressively high severity.

So here, the last one was severity 11, and now I’m going to use severity 20. Severity 20 is just about as high as you can get. But when I run this, well, we get a whole lot of stuff back.

I once worked with a client who was using RAISERROR for stuff, and one developer was putting a level 20 in there. Everyone thought that the database was corrupt and that there was some problem.

Every time the stored procedure hit an error, it would spit all this stuff back, and people would say, we’ve got corruption, run CHECKDB, run CHECKDB. Nothing came back.

Everyone was very confused until we looked at the code and I said, ah, well, guess what? We’re all safe. We all live another day. So, getting back to what I was talking about first with the wildcard replacements: you can put whatever you want in here, as long as it’s not a bit or a date or a time or a datetime; those you have to store differently.

But you can substitute things in, either as a string or as an integer. These are the replacements that I’ve chosen here, with %s and %i. Well, I guess there’s already a little line underneath it.

So we’ll give this one a hat. But what’s kind of nice in here is that if we run this, we will see the replacements all worked very nicely. Now, one thing that people do miss sometimes is that you can substitute anything in RAISERROR with a parameter or variable, right?

So in a lot of these, we were using just a literal value for severity and state. But you can assign these dynamically based on whatever your code does. You might, you know, assign some things a severity of 10.

If you don’t care about it being red and printing an error and maybe halting execution, you can assign things any way you want. So if you look at this, I’ve declared some variables in here in order to do a little bit of replacement.

So I’ve got error message. I’m just going to highlight this part so you can see a little bit more: severity, state, error number, error procedure, and error line. Now, in real life, if you were writing real, legit error handling, you would probably just use the various built-in error functions to assign these in your TRY...CATCH block.

And that would be totally fine. You could also assign your own things to them, if you want a little bit more control over what gets spit out to people. Like, instead of a full, crazy error message, you might just say, retry your procedure, or something.

You put whatever you want in there. It’s nice. It’s a good time. So what you can do is have any one of these replace things. Severity and state this time are, well, obviously, local variables, because I just declared them up there, but then we have all the replacement stuff that I would be using down here.

Right? So you’ll see number, procedure, line, message; all the stuff that you would put in a normal, good, verbose error message to help you figure things out. And if we run all this, we will get all that back.
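A sketch of that variable-driven shape (the variable names here are hypothetical, not necessarily the ones from the video): the built-in error functions feed local variables, which then drive RAISERROR’s severity, state, and substitutions.

```sql
BEGIN TRY
    SELECT result = 1 / 0; /* force an error to catch */
END TRY
BEGIN CATCH
    DECLARE
        @error_severity int = ERROR_SEVERITY(),
        @error_state int = ERROR_STATE(),
        @error_number int = ERROR_NUMBER(),
        @error_procedure sysname = ISNULL(ERROR_PROCEDURE(), N'ad hoc statement'),
        @error_line int = ERROR_LINE(),
        @error_message nvarchar(2048) = ERROR_MESSAGE();

    /* %d and %s placeholders are filled from the variables; note that any
       literal % characters inside @error_message would need doubling. */
    RAISERROR('error %d in %s at line %d: %s', @error_severity, @error_state,
        @error_number, @error_procedure, @error_line, @error_message) WITH NOWAIT;
END CATCH;
```

You could just as easily assign your own friendlier text to `@error_message` instead of the built-in one.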

I’m going to put that on a new line just so things are a little friendlier for zooming in, and we have our number, right? Good for us.

Right. And we see, well, we used 11 and 1 up here, so that doesn’t really do anything, but we did mess with our line number a little bit. And then we have the number that we wanted to replace, and the procedure name, right?

So if you write a lot of dynamic SQL and, you know, you want to raise an error, the procedure name is cool and all, but you could say something like dynamic SQL block 10 from whatever procedure, and you would be able to locate the dynamic SQL block that you named or labeled dynamic SQL block 10 with a comment in the code or something.

And, you know, granted, I write some long procedures, but that’s getting out of control. And we can put in whatever message we want here. So there’s all sorts of fun stuff that you can do with RAISERROR to make things better suited to your environment, more descriptive, and a little bit more dynamic than what you might get with just normal assignment from the various built-in error functions that SQL Server has.

I like using this stuff, particularly with dynamic SQL, because maybe in the error message I can do a preview of the dynamic SQL, or I can set the error procedure to whatever block I labeled in the procedure with a comment, something like that.

So there’s all sorts of neat things you can do to help you get exactly to the code that’s having a problem, things that maybe wouldn’t be so obvious if you were just using the normal results of the error functions.

So, an easy Friday video; nothing too crazy in here, but I hope you enjoyed yourselves. I hope you learned something, and I will see you over in the next video, which will be on Monday, which will be office hours, reliable office hours, where I will answer five user-submitted questions, because I will count carefully.

I will endeavor to count more carefully in the future. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: How to Pick a Temporary Object


Video Summary

In this video, I delve into the nuances of temporary object comparison within SQL Server, focusing on table variables and temporary tables. I explore their respective strengths and weaknesses to help you choose the right one for your specific use case. Starting off, I highlight the downsides of table variables, such as preventing parallel query plans when modified and lacking column-level statistical information. In contrast, temporary tables offer more flexibility with full parallelism support and better handling of statistics. The video then dives into practical examples, comparing execution plans between a table variable and a temporary table to illustrate how cardinality estimates differ significantly. By the end, you’ll understand why temporary tables might be preferable for complex queries that rely on accurate cardinality estimates, making this content invaluable for anyone looking to optimize their SQL Server performance.

Full Transcript

All right, Erik Darling here with Darling Data, continuing on the theme of learning T-SQL. I suppose it’s worth saying that all of this content is teaser content, of course, for my Learn T-SQL With Erik course. Today we’re going to be talking about temporary object comparison; not how to compare data from one to another, because that’s dull. INTERSECT, EXCEPT, whatever. More how to choose the right one for you and for each particular use case that you may have for a temporary object. So, before we move on to that, if you are saying, wow, a course where I can learn T-SQL with Erik? Well, yeah, in the video description there’s a link to do that. The course is still currently at the presale price of $250. But as soon as the advanced content is all done, which probably has maybe two more months left on it to be fully recorded and produced (it’s about a quarter of the way written), you will have to pay $500 for it. So I would strongly suggest buying it now. You can also do other things down there, like hire me for consulting. You can contribute money to the channel to support my work to bring you SQL Server content. You can ask me questions for my Office Hours episodes, which I really enjoy answering. And of course, if you enjoy this content, but not in a monetarily productive way, you can always like, subscribe, and of course, tell a friend, because maybe your friend is secretly rich and wants to hire me for anything. Cool. I’ll be leaving the house a few times between now and November. I will be in Dallas September 15th and 16th. I will be in Utrecht October 1st and 2nd. And I will be in Seattle November 17th to 21st. What do all of these things have in common?

Well, they are all PASS events put on by the nice folks at Redgate, who were kind enough to decide to have me show up at all of them and talk at you about SQL Server performance tuning. So with that out of the way, let’s compare temporary objects here. Let me make sure that I go to the right SSMS spawn, because, well, I need multiple of them. Let’s talk first about table variables. They have some upsides and some downsides. Downsides would include preventing parallel query plans when you modify them. At base, you need to put data into a table variable via an insert, and that will force whatever query populates the table variable to run fully single-threaded. There is a non-parallel plan reason for that.

There is no column level statistical information. There are no histograms available for table variables. This is true even if you index them. You can get some benefit if you create a unique index on them for things, but you still don’t know what values are in there that are unique. You often get poor cardinality estimates with them when you start joining on them to other tables or filtering on them locally with a where clause.

There is no recompilation threshold with some exceptions. There are some trace flags out there and there are some newer SQL Server features out there where you may find a recompile does happen. Most notably, if you put a recompile hint on the query. There we go.

You cannot SELECT ... INTO a table variable. Table variables use the collation of the local database that your procedure is executing in. They are only available in the current scope and batch, so you cannot declare a table variable out here and then reference it as easily as a temp table, either in dynamic SQL or in another stored procedure.

Modifications to table variables are not automatically rolled back on error or explicit rollback. They can be used in some functions, like a multi-statement table-valued function or a T-SQL scalar UDF. And of course, you can create indexes on them when they are declared.
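That declaration-time indexing looks like this (a minimal sketch; the table and column names are made up, and the inline INDEX syntax assumes SQL Server 2014 or later):

```sql
/* The DECLARE is the only chance you get to index a table variable. */
DECLARE @user_scores table
(
    owner_user_id int NOT NULL PRIMARY KEY,
    score int NOT NULL,
    INDEX ix_score (score) /* inline nonclustered index */
);

INSERT @user_scores (owner_user_id, score)
VALUES (22656, 1), (22657, 2);
```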

Temporary tables have a bit of a different profile. There’s no inherent parallelism restrictions. You can insert, update, and delete. And as long as SQL Server thinks a parallel plan would be a good idea, you can get yourself a parallel plan.

They do allow for statistical information and histograms. They can sometimes suffer poor cardinality estimates, though, because statistics can be cached when they are used in stored procedures. They do have a low recompilation threshold, so you don’t often run into the cached temporary table statistics problem, but never say never.

You can SELECT ... INTO them. Temp tables use the collation of tempdb. So if you ever look through my stored procedures that create temp tables, you might see a lot of places where I need to collate to the local database.

So, COLLATE DATABASE_DEFAULT, and that is to prevent string comparison issues if the collation of tempdb does not agree with the collation of the local user database that the stored procedure is running in. They are visible to child batches and scopes, so they’re very easy to reference in dynamic SQL or in other stored procedures that you might call. Modifications are rolled back on error or explicit rollback.
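The collation pattern mentioned above can be sketched like so (table and column names are hypothetical):

```sql
/* String columns in a temp table otherwise inherit tempdb's collation;
   COLLATE DATABASE_DEFAULT makes them match the calling database,
   avoiding collation-conflict errors in later joins and comparisons. */
CREATE TABLE #users
(
    user_id int NOT NULL,
    display_name nvarchar(40) COLLATE DATABASE_DEFAULT NOT NULL
);
```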

They cannot be used in some functions, unfortunately. And you can create indexes with the table: when you say CREATE TABLE, you can make the index definitions inline, or you can add them on later, depending on which you find to be most beneficial. There are a lot of insert patterns where you want to insert into a heap temp table to get a fully parallel insert, and then create an index later on it.

You also get the benefit of a full-scan statistics histogram when you create the index later, rather than the sampled histogram that you would get if you loaded the data with the index already there and then used the column in a WHERE clause, a join clause, a GROUP BY, or something. There are some similarities, though, and these are non-negotiable: they both live in tempdb and take up space there.
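The load-then-index pattern described above might look like this sketch, assuming the Stack Overflow sample database’s dbo.Posts as a source table:

```sql
/* Load a heap first so the insert can qualify for parallelism,
   then build the index, which also yields a full-scan histogram. */
CREATE TABLE #posts
(
    owner_user_id int NOT NULL,
    score int NOT NULL
);

INSERT #posts WITH (TABLOCK) /* TABLOCK helps qualify for a parallel insert */
    (owner_user_id, score)
SELECT p.OwnerUserId, p.Score
FROM dbo.Posts AS p
WHERE p.OwnerUserId IS NOT NULL;

CREATE CLUSTERED INDEX c ON #posts (owner_user_id);
```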

Table variables are not in-memory unless you are explicitly using the In-Memory OLTP feature and creating memory-optimized table variables, which do not look like the declaration of a normal table variable. They both log equally to tempdb’s transaction log. They are both session-scoped, meaning that once the session that created or declared the temporary object is gone, they are gone.

Global temp tables have slightly different rules, but we’re not talking about those. Simultaneous sessions can use temporary objects with the same name without existence errors, but that, of course, does not apply to global temporary tables. Now, that’s a lot of talking, isn’t it?

We’re already seven minutes, and still, all we did was talk. But now we can get down to some of the nittier, grittier stuff with them. So what I’m going to do is create a table, or rather declare a table variable called user scores. And I want you to note that I have some check constraints on here, and I have a couple indexes on here.

Now, the point of these indexes is to aid later queries. So what I’m going to do is I’m going to run all of this stuff, and I’m going to talk about these two hints in a second. But let’s run all of these, and let’s wait a moment, and we’ll get the actual execution plans for those when this is all finished.

So because I am using SQL Server 2022, and my compat level is 150 or higher, keep in mind this is a feature that started in SQL Server 2019 with compat level 150 up, assuming you’re on Enterprise Edition, of course.

You have paid the Microsoft friendship tax, and they have decided to try to help the performance of your workload. Because of that, I have an intelligent query processing feature called table variable deferred compilation at play here, which is going to get me table level cardinality for like a full scan of the table or something, a full join of the table.

But you will see in a moment that this does not help us with cardinality when we have a where clause. So if I were to apply either one of these hints to either disable that feature or to use the optimizer compatibility level 140, the table cardinality guesstimate for table variables would disappear.
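A sketch of those two hints (the table variable here is a stand-in, and the hint names are the ones I believe SQL Server ships; verify them against sys.dm_exec_valid_use_hints on your build):

```sql
DECLARE @user_scores table (owner_user_id int NOT NULL, score int NOT NULL);

INSERT @user_scores (owner_user_id, score) VALUES (22656, 1);

/* Either hint switches off table variable deferred compilation,
   so the optimizer falls back to its fixed one-row guess. */
SELECT COUNT_BIG(*) AS records
FROM @user_scores AS us
OPTION (USE HINT('DISABLE_DEFERRED_COMPILATION_TV'));

SELECT COUNT_BIG(*) AS records
FROM @user_scores AS us
OPTION (USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));
```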

But this is all finished running, and I’ve got a few queries down here. I’ve got one query saying where owner user ID equals 22656. I’ve got one query here asking for where score is greater than zero.

And I’ve got one query down here asking for both. So this should be interesting because, like we went over, when I created this table, I had a check constraint for score being greater than zero.

And I have an index on owner user ID and an index on score. When I inserted data into the table variable, both of these things were written in a way to get us the same owner user ID for every entry in here, right?

Every row in this is going to be 22656. And of course, we have score greater than zero in order to not run into any check constraint errors. But the cardinality estimates that I get from this, I mean, the totals are all correct, but, you know, not much else.

Starting with the top query that does the insert, you will see that there is a clustered index scan here, a clustered index insert here, and then there’s an assert over here, which the check constraint uses to make sure that all of the scores are greater than zero.

If you hover over the clustered index insert, you’ll see the two additional indexes listed that were inserted into for owner user ID and score. Anyway, moving on.

This query ran for 4.295 seconds, and you’ll notice that there are no parallelism indicators at all in this plan. That’s because, if we right-click and hit Properties, we will have this non-parallel plan reason over here, like I described earlier, where SQL Server will not be able to generate a parallel execution plan when we go and, what do you call it, run the query, do the insert.

There we go. Anyway. All right.

Moving on. Now, because I have the table variable deferred compilation intelligent query processing feature enabled here, and it happens for us, what SQL Server is able to do, when we just get a count of the entire table (WHERE 1 = (SELECT 1) will always mean true), is guess that 27,029 rows will come out of here.

Great. But this is where things do start to fall apart. Why?

Well, because all SQL Server knows is that there are 27,029 rows in the table variable. It still has no description of what values ended up in there. So when I change the query a little bit, and I say where owner user ID equals 22656, SQL Server, well, we still return 27,029 rows, but SQL Server guesses that 164 of them will meet that where clause.

There’s no histogram. If there were a histogram, we would see that. Now, excuse me.

If we look at other places, we’re going to see where I say, hey, show me where score is greater than zero. Well, we know that every row in there is greater than zero because of that check constraint, and we have an index on score, but SQL Server says, well, I think you’ll get about 8,100 rows back.

And look at that: neither the check constraint nor the index was of any benefit to cardinality estimation here. Going down even further, if we say where owner user ID equals 22656 and score is greater than zero, SQL Server says, I think you’ll get about 90.

But we still get back all 27,029. So it is difficult to prove a negative when looking at things like this. But if we look at the properties over here, there’s a thing missing from over here.

There’s usually, if you look at a query where statistics were in use, you will see an optimizer stats usage entry in the properties of the root operator in the query plan. But we do not have one of those here.

Again, it is difficult to prove a negative, but we can go on and use a temp table and prove a positive, can’t we? We can do something very scientific. So what I’m going to do is I’m going to create a temp table.

And I’m not even going to festoon it with all of the things that we had on the other one. I’m not going to do that. I’m just going to say we have two columns in this temp table.

They are both integers and they are both not null. There’s no check constraints. There’s no indexes. So that’s what we have going for us. So let’s drop this table if it exists, even though it doesn’t.

And let’s do that exact same insert. Now, if you look at the execution plan, you’ll note that we do get a partial parallel plan here. SQL Server does not use a fully parallel insert.

Why? I don’t know. Didn’t meet the cost. I think it was maybe the number of rows because SQL Server estimated 213 rows. And, of course, we got 27,000.

But, you know, I think there’s a cardinality estimation thing at play with if SQL Server decides that a parallel insert is going to happen or not. But anyway, we do not have, clearly do not have a non-parallel plan reason for the insert into the temporary table. And you’ll note that it does run in about 700 milliseconds.

I think the other one was, was it 3.4 or 4.3 seconds? I forget. But it was much longer than this, right? It was a long time. Now, we’ve got this table and we’ve got it populated.

So that’s our good first step. Now, let’s run the same queries, but now against the temp table that we had against the table variable before. We’re going to get back all our 27,029 results, which is a good first step for us here.

But if we look at these, of course, you know, table cardinality, very easy to guess. So this top one, you know, it gets a good, good grade here, right? 27,029 to 27,029.

Not much difference just yet. But now, boy howdy, look at that: we get accurate cardinality for both of these. When we said where owner user ID equals 22656, SQL Server generated statistics on the temp table, just on its own. Remember, there are no indexes on this thing.

SQL Server just generated statistics and said, hey, that’s 27,029. All right. I know. I got this one. I’m your friend.
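If you want to see those auto-created statistics for yourself, one way (a sketch; the temp table name and columns are made up) is to peek at tempdb’s metadata after filtering on a column:

```sql
CREATE TABLE #t (owner_user_id int NOT NULL, score int NOT NULL);
INSERT #t (owner_user_id, score) VALUES (22656, 1);

/* Filtering on a column nudges the optimizer to auto-create stats. */
SELECT t.score FROM #t AS t WHERE t.owner_user_id = 22656;

/* Auto-created stats show up with generated _WA_Sys-style names. */
SELECT s.name, s.auto_created
FROM tempdb.sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'tempdb..#t');
```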

And when we say where score is greater than zero, SQL Server said, hmm, got your back, homie: 27,029 of 27,029. So now not only do we have accurate table cardinality, we have accurate column-level cardinality when we start filtering on our temp table.

This becomes very important if we were to start joining a temporary object off to other, larger tables, where having statistics on the columns that we’re joining on would be beneficial to SQL Server generating an optimal execution plan, don’t you think? And if we combine these two predicates, where owner user ID equals 22656 and score is greater than zero, SQL Server will once again... remember last time it said like 90 or something?

Now we get the full number of rows that would actually come out of that temp table. Not to say that table variables do not have any uses, but for me, as a performance tuner, generally I am very interested in not only materialized results, but in getting better or more accurate cardinality estimates when I start doing other things with those materialized results. And this is where I mostly want to sway people to use temp tables instead of table variables, but I still have to spend a lot of time going over things like this.

So perhaps this channel just does not have enough reach. Perhaps not enough of you have told a friend about this channel and perhaps you should because there are a lot of people out there who still need to know these things. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video, where we are going to talk about some of the fun and joys and quirks of RAISERROR. It’s of course a much smaller portion of an error handling module, but you only get teasers because you didn’t pay for it.

If you buy the full course, you get the full content. All right. Anyway, goodbye.


Learn T-SQL With Erik: Just Use QUOTENAME!


Video Summary

In this video, I dive into the exciting world of Dynamic SQL and why using `QUOTENAME` is far superior to simply adding square brackets in your code. I explore how `QUOTENAME` can protect you from potential SQL injection attacks and handle strange characters or names that might appear as column or object names. Additionally, I share some amusing bonus content at the end, including a clever workaround for dealing with very long dynamic SQL strings that are too big to print in one go. Whether you’re looking to enhance your T-SQL skills or just want a laugh, there’s something for everyone!

Full Transcript

It’s me, Erik Darling with Darling Data. In today’s video, we are going to go over some of the kind of fun things with QUOTENAME when you’re using dynamic SQL, and of course why it is far superior to the old “I’m just going to stick some square brackets in my code,” because that doesn’t give you the same level of protection. And then there’s going to be some rather amusing bonus content at the end, which I do hope you’ll enjoy, because without your enjoyment, I die. No pressure. Anyway, if you look down in the video description, there are many important links in there, all very clearly labeled: places where you can hire me for consulting, places where you can buy my training, and of course at a nice discount, because you watch my videos and, you know, I think you’re a nice person and I think you deserve a break in life, especially if your employer is not willing to pay for training. It is difficult to afford the very, very high-priced, often not-worth-it training out there on the internet. You can also become a subscribed channel member, where you actually give me a small monthly donation in exchange for all of the wonderful content that I produce. You can ask me questions for my Office Hours episodes.

As long as I count correctly, I answer five user questions. And of course, if you enjoy this content, I would encourage you to like, subscribe, and of course tell a friend. The more friends you bring, the more people I have to talk to, and gosh, then I might bring joy to more people and I might live longer, which is hopefully good for all of us. Speaking of things that are good for all of us, I’m getting out of the house. Actually, I’m getting out of the country a couple of times. Once... oh, I mean, is Texas another country yet? I don’t know. Hard to keep track. So much going on in the world. But the nice folks at Redgate are dragging me around: to Dallas, September 15th through 16th; Utrecht, October 1st and 2nd; and of course PASS Data Community Summit in Seattle, from November 17th to the 21st, where I will be hog-tying and spit-roasting two days of T-SQL pre-cons with Ms. Kendra Little. So I do hope you attend those. Can’t wait to see you there. Again, no pressure. Might die if you don’t come. Wouldn’t be good for either of us. So with that out of the way, let’s talk about some of the joys of quote name here. Now, I am in such a busy state that I need multiple SSMS spawns open. So please don’t judge me.

No, I’m not here to bring pain to anyone. Anyway, important reasons why you need to use quote name. Well, it is somewhat unlikely that you will encounter someone naming a column the clown emoji out in the world, but you never know. You never know what kind of rapscallions you’re going to see out there. So what you need to do is protect yourself, because there are other things that people might do, like put spaces or dots or dashes or other weird things in as column names, that really cause issues if they are not properly quoted in a way where SQL Server can identify the full text as an object name.

So let’s create this table, which happens successfully even though we’re not using quote name here, but we do have to use it while we’re constructing dynamic SQL. That is, if we’re smart. Now, one thing that I like to do when I’m trying to write very safe dynamic SQL is to not depend on user input very much. I very much believe in taking the user input but only using it for validation, especially when you’re using dynamic SQL and you need to take dynamic input for things like database name, schema name, table name, and column name. All of that information is in your server’s metadata.

More specifically, from the server level down to the database level. So I like to take that input and not use it directly in the dynamic SQL. I like to generate my own inputs based on the metadata that’s in the server, where there’s no chance of SQL injection or anything malicious happening.

And I can weed any of that out. So whenever someone passes in a list of columns or a database name or a table name or a schema name, I always validate that stuff against server metadata.

And I always get the parameters and variables that I’ll use in dynamic SQL from the metadata directly, because it’s much, much harder for someone to tamper with that maliciously. Of course, with the remote DAC, you can do many fun things. So here’s one thing that I’ll do.

Let’s pretend that column names was a parameter to a stored procedure, and someone could pass in a CSV list of column names. I would go even further and validate those column names against the metadata for whatever table we’re looking at.

So there are all sorts of things that you can do to protect yourself from SQL injection. Using quote name is not only good for helping you avoid SQL injection, but also for helping you deal with objects that have strange characters in them. So let’s run this, right?

Let’s get a list of column names and let’s look at what dynamic SQL this generates, because this is going to fail mightily. If we look at what happened in here, SQL Server tried to run this select query. And this select query is not going to run with any of these things as column names without square brackets.

And it’s certainly not going to be able to select anything from a table name with a space in it. Keep in mind, there’s absolutely no underscore or anything here. It’s just a bare-ass table name.

So we want to use quote name to fix these things, in order to encapsulate them correctly. So if we rerun this, now instead of just taking the name from sys.columns, we’re going to say quote name, c.name from sys.columns. And down here, we’re going to wrap these various things in quote name as well.

Now, if this were a big old complicated stored procedure, there’s a pretty good chance that I would have an inner variable that I would set to quote name of the database name, but of course after verifying it against sys.databases or something. Just here, for brevity, I’m putting quote name around these, but in real, big code, I probably don’t want to have quote name everywhere.

Because I may forget at some point. So it’s much easier to set it in one place and then just have the object name you care about quoted. But now, of course, when we run all this and we get down to the print and the execute, this works just fine, right?

There’s no data in the table, but we at least run the select successfully and we get back our column names here. And the dynamic SQL that gets printed out would completely avoid any of the terrible errors that we saw before, because now we have square brackets around everything. Now, the nice thing about quote name is that it will produce a valid output for anything that is a legal, valid object name. Where you have to be careful is if things get longer than that.
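The shape of that demo can be sketched roughly like this. The table name is a stand-in, and the use of `STRING_AGG` (SQL Server 2017+) is my assumption, not the exact script from the video:

```sql
/* Sketch only: [dbo].[strange table] is a hypothetical name,
   and STRING_AGG (SQL Server 2017+) is an assumption.
   Every identifier that lands in the dynamic SQL goes through QUOTENAME. */
DECLARE
    @sql nvarchar(max) = N'',
    @cols nvarchar(max) = N'';

SELECT
    @cols = STRING_AGG(QUOTENAME(c.name), N', ')
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.[strange table]');

SET @sql =
    N'SELECT ' + @cols + N' FROM ' +
    QUOTENAME(N'dbo') + N'.' + QUOTENAME(N'strange table') + N';';

PRINT @sql; /* inspect what you built before running it */
EXECUTE sys.sp_executesql @sql;
```

Printing the built string before executing it is the habit that catches most quoting mistakes early.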

But we’ll talk about that in a moment. The reason why square brackets do not work as well as quote name does is because, if you were to just put your own square brackets and concatenate those into a string at some point in the dynamic SQL, they do act as an object identifier, but they do not quote out additional square brackets.

So quote name would fix this, right? Quote name would actually protect you; it would double up those brackets, or rather, it would escape those things.

So you would get valid results back; just adding in square brackets does not do that. So if we run this whole thing, we will see that we get valid results, but we actually executed sp_who right here, right? We got all this dumb information back that no one ever understands and is very confusing. But over in the messages tab, this is what we ran, right?

We executed sp_who, and then we dropped a table, right? And then when we try to select from that table at the end, we get invalid object name, right? And the reason for that is because just adding in square brackets will not escape any additional square brackets in here.

So this does not have the same power. If we were to do this and use quote name instead, so rather than saying square bracket plus this string plus square bracket, if we just say quote name of that same string, well, all of a sudden SQL Server is very, very happy to say, well, I’ve never heard of a stored procedure named sp_who drop table.

All right. That’s not a thing. So notice that when I made this string, there was no double square bracket here, but when I use quote name, look what happens here.

We get double square brackets here. So quote name does some extra stuff that just rolling your own square brackets doesn’t do. Now, you could totally do something on your own and say, well, if there’s a square bracket, I’m going to replace it with double square brackets. Or just use quote name.
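Here’s a minimal sketch of that difference; the input string is a made-up example:

```sql
/* QUOTENAME escapes an embedded closing bracket by doubling it;
   hand-rolled brackets leave it alone and break the identifier. */
DECLARE @name sysname = N'tricky]name';

SELECT
    rolled_by_hand = N'[' + @name + N']', /* [tricky]name]  (broken)  */
    quoted_safely  = QUOTENAME(@name);    /* [tricky]]name] (escaped) */
```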

Right. And unless there’s some strange reason why you can’t use quote name, like maybe you’re just allergic to the letter Q, I don’t know...

Use quote name. It’s very, very simple. But there are some cases where quote name will return a NULL if you make the string too long: the input is capped at 128 characters. So please do try to avoid that.

If you do have a situation where you need to square-bracket something that is longer than quote name will return a string for, where it would just return a NULL, you have my blessing to use square brackets and replace closing square brackets within the string with double square brackets. Okay.

You can do that. But for most cases (database, schema, object, table, view, stored procedure, any system-identified object), a mix of sysname and quote name is totally fine. And you can use different quote characters with quote name.

So you are not just stuck with square brackets. You can use single quotes. You can use parentheses.

You can use curly brackets. You can use double quotes. You can use those funny little backticks. But you cannot use tildes, I’ve learned. Sorry, Tilda Swinton. I do apologize. But if we run this, you will see that the word Erik has been bracketed, single-quoted, parenthesized, curly-bracketed, double-quoted, and backticked, but we cannot tilde Erik.
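A quick sketch of the quote characters in play; and, as in the video, the tilde comes back as a NULL:

```sql
/* QUOTENAME's second parameter picks the quote character. */
SELECT
    brackets = QUOTENAME(N'Erik'),        /* [Erik] */
    singles  = QUOTENAME(N'Erik', N''''), /* 'Erik' */
    parens   = QUOTENAME(N'Erik', N'('),  /* (Erik) */
    curlies  = QUOTENAME(N'Erik', N'{'),  /* {Erik} */
    doubles  = QUOTENAME(N'Erik', N'"'),  /* "Erik" */
    ticks    = QUOTENAME(N'Erik', N'`'),  /* `Erik` */
    tildes   = QUOTENAME(N'Erik', N'~');  /* NULL: not a valid quote character */
```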

Tilda and Erik were not meant to be. I will not be one of her husbands, sad to say. But one thing that is useful to talk about: like I was saying earlier, in my stored procedures I will often have an inner variable that I’ll set to quote name of something.

So let’s just say I let someone pass in a database name. I’ll hit sys.databases, right?

And I’ll get the database name from there, where the database name parameter they pass in equals the database name in sys.databases. And I’ll put quote name around that in an internal local variable. Where that can mess things up is if you have to do anything with it later to validate other metadata.

Because now you have a database name with square brackets around it, but your database name probably isn’t stored like that in sys.databases. Your schema names aren’t going to be stored like that in sys.schemas. Your table names aren’t going to be stored like that in sys.tables, and so on and so on.

So if you need to look up other metadata with something that you have already quote named, it’s really helpful to use parse name. Parse name is another built-in function that will actually remove those square brackets. So you see up here, I have declared this local variable DBO.

And if I run this, you’ll see that just adding quote name again will double things up. Well, actually, you can see there are three there, right? We add in another square bracket; quote name tries to do us a solid. But when we use parse name, we remove the square brackets.

So sometimes in any of my procedures, if you see something like where some object name equals parse name of some local variable, it’s because that local variable has a quote-name-quoted identifier in it, and that would mess up looking up other metadata.

But of course, if we pair parse name and quote name, we go back to just having single brackets. So there are all sorts of neat things you can do. And, last but not least, as I promised, I’m going to show you a couple of amusing things.
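The pattern looks roughly like this; the variable and database names are hypothetical:

```sql
/* Hypothetical names: strip QUOTENAME's brackets back off with PARSENAME
   before comparing against metadata views, which store unquoted names. */
DECLARE @database_name sysname = QUOTENAME(N'StackOverflow'); /* [StackOverflow] */

SELECT
    still_quoted = @database_name,
    unquoted     = PARSENAME(@database_name, 1); /* StackOverflow */
```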

Now, one thing that I run into a lot, and that I see people having issues with a lot, is when you’re writing dynamic SQL and whatever string you’re building is very, very long, so you might not be able to print it successfully in one go. It’s not a good time.

Because now you’re looking at having to chunk print statements with substring or something, maybe write a while loop to keep printing chunks until you run out of characters. That’s not a lot of fun.
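A minimal sketch of that chunking approach, since PRINT truncates nvarchar output at 4,000 characters (note that slicing at fixed offsets can split a line mid-statement):

```sql
/* Print a long string in 4,000-character slices. */
DECLARE
    @sql nvarchar(max) = REPLICATE(CONVERT(nvarchar(max), N'SELECT 1; '), 1000),
    @pos bigint = 1;

WHILE @pos <= LEN(@sql)
BEGIN
    PRINT SUBSTRING(@sql, @pos, 4000);
    SET @pos += 4000;
END;
```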

Now, I think generally for me, it’s a lot easier to produce an XML clicky column where I can just click on it and get the full thing somewhere else. That makes my life a lot easier. So let’s pretend that this is a very long string.

One way of achieving that is to use this processing-instruction alias. I just have an underscore in here, but I could put anything between these parentheses, like query text or something; the underscore just avoids a lot of strange XML stuff.

So if I select with the processing-instruction alias in parentheses (again, I just have the underscore in here for brevity) and I say FOR XML PATH, well, I get back a clicky column. All right.

And if I click this, I get it back; that’s where the underscore comes in. Now I have this thing that I can copy and paste out and do whatever I want with. What’s kind of funny is, you see that it produced a column with this name, which is XML and then a funny pseudo-GUID-looking thing. And what’s really funny is if you maybe skip the processing-instruction thing.

And you just alias a column with that same name, you will get an XML clicky column. And this one doesn’t have any of the festive XML bracketing on it. So maybe that’s more to your liking.

What’s also very funny is, I don’t know if you’ve ever used SET STATISTICS XML ON, but it produces an XML clickable for a query plan. And the column alias that SET STATISTICS XML ON uses is Microsoft SQL Server 2005 XML Showplan. If you alias a column with that name, you also get an XML clicky that you can click on and have open in a brand new tab for you.
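The processing-instruction version of the trick looks roughly like this; the string itself is a stand-in for your long dynamic SQL:

```sql
/* Return a long string as a clickable XML column in SSMS.
   The underscore is just the processing-instruction name. */
DECLARE @sql nvarchar(max) =
    N'SELECT 1 AS x; /* imagine tens of thousands of characters here */';

SELECT
    query_text =
    (
        SELECT
            [processing-instruction(_)] = @sql
        FOR XML PATH(N''), TYPE
    );
```

One nice property of processing instructions is that characters like `<` and `>` in the string don’t get entitized the way they would in a plain XML element.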

So those are my amusing long-printing tricks. All right. Perhaps I’ve gone too far.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow’s video, where we will talk about... let’s see, what is it? Ah, choosing temporary objects.

Well, we’ve got some work ahead of us, don’t we? Still, I still have to talk people out of table variables and they still make all the same excuses. So I’m going to ride that train till it dies.

I guess this is coming up on the 25th anniversary of people being wrong about table variables. So happy anniversary, table variables. Happy anniversary.

All right. Goodbye. Bye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 31




Questions:

  • I have a very slow delete statement (44 mins for 2000 records). Execution plan shows painful Clustered Index Delete is culprit. Parent and child foreign keys are indexed on two associated tables. Where else to look? Merge join is feeding problematic operator but associated sort isn’t causing pain. What would you check next?
  • What topics will be covered in the advanced material of your T-SQL course?
  • I’ve been thinking about career goals. Becoming an MVP sounds good. I already do a lot for the community. Good idea or bad idea?
  • You often mention pessimistic read committed holding locks for too long because of key lookups WITH UNORDERED PREFETCH. Does Read Committed always show WITH UNORDERED PREFETCH in plans where it holds locks for too long?
  • In theory, with the Microsoft push of all things to the cloud, query tuning and optimization should be a major priority for companies because CPU time (CUs) is money. In your experience, are companies actually lowering costs through query tuning or are they more focused on improving speed and reducing loads? I rarely hear stories of companies reducing Azure capacities or lowering licensing costs due to query tuning but I don’t have much exposure to it.

To ask your questions, head over here.

Video Summary

In this video, I dive into some interesting questions and discussions from our recent Office Hours session. We tackled a slow delete statement that was taking 44 minutes to process just 2,000 records, which led us to explore the execution plan details and suggested looking at actual rather than estimated plans for more accurate performance metrics. Additionally, I shared insights on my upcoming advanced course content, emphasizing its in-depth nature and wide breadth of advanced topics, from T-SQL to data storage strategies. The conversation also touched on career goals like becoming an MVP and the realities of query tuning in cloud environments, highlighting both the benefits and challenges companies face when optimizing their SQL Server workloads for cost savings or performance improvements.

Full Transcript

Erik Darling here with Darling Data, and we have a very, very exciting episode of Office Hours today. We are joined by absolutely no one. Who would want to stand here? Have a bad time. So, if you look down in the video description, you’ll see all sorts of helpful links where you can hire me for consulting, buy my training, become a Supporting With Money member of the channel, and even ask me Office Hours questions. If you enjoy this content, and you want to support me in some other way that is not giving me money, you can always like, subscribe, and of course tell a friend if you think that this channel would help them with whatever they’re dealing with in their life. Usually it’s SQL Server, but of course people find lots of solace in this channel for other reasons. If you want to see me live out in the world, the nice folks at Redgate are dragging me, well, to a few different places, at least through the end of the year. We’ve got Dallas in September, the 15th to 16th. We’ve got Utrecht (that’s the one in the Netherlands) from October 1st to 2nd. And we have PASS Data Community Summit in Seattle from November 17th to the 21st, where I will be doing two wonderful days of T-SQL pre-cons with Kendra Little.

And we do look forward to seeing you there. All right. Anyway, let’s talk about Office Hours, because we’re going to have some stuff to go through here, don’t we? We have to answer five questions, not an accidental four like last time. So that’s one, two, three, four, five. I didn’t mess that up. I can count. Wonderful. We have passed the first test of Office Hours.

All right. I have a very slow delete statement. 44 minutes for 2,000 records. All right. The execution plan shows a painful clustered index delete is the culprit. Okay. Parent and child foreign keys are indexed on two associated tables. Where else to look? Merge join is feeding problematic operator, but associated sort isn’t causing pain. What would you check next? Well, I’m going to go out on a limb here, and I’m going to say that you are looking at the estimated execution plan and not the actual execution plan.

If you’re looking at the actual execution plan, you would see helpful things like operator times, unless you’re using like SQL Server 2012 or something, in which case, well, you have bigger problems than this delete statement. So what I would suggest is to either find a place, a development server, where you can run this query and get the actual execution plan, or profile your production server in a way where you can catch this slow delete running. So, you know, just between you, me and the wall, or the camera, or the thing behind me, there are some things that you should look at first.

First, make sure that this delete statement was not just blocked for 43 minutes and 59 seconds and then finished in one second, because 2,000 records is not really where I’m going to think, oh, this makes total sense that this would take 44 minutes. So, you could look at the underlying table. Who knows?

There could be all sorts of things there, like it could be way over indexed, have like 999 indexes on it or something. I suppose it could be like a, there could be a columnstore index involved. That might slow things down a bit, but not like 44 minutes slow it down.

So, my suggestion would be to get some contextual information: maybe fire up the blocked process report and make sure that this delete isn’t just getting blocked for a very long time. And then, absent a development server with sufficient data to reproduce a long-running delete, I would profile the production server in a way where I can collect the actual execution plan and not rely on very misleading cached or estimated execution plans, either from the plan cache or Query Store. There are various ways of doing that.
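For reference, turning on the blocked process report mentioned above is a quick configuration change; the five-second threshold here is just an example value:

```sql
/* Emit a blocked process report for anything blocked longer than 5 seconds. */
EXECUTE sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

EXECUTE sys.sp_configure N'blocked process threshold', 5;
RECONFIGURE;
```

You still need an Extended Events session (or trace) that captures the blocked_process_report event to actually see the output.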

sp_HumanEvents, my stored procedure, does give you that ability. So, I would maybe do that. Otherwise, really just stop looking at estimated execution plans, and stop treating operator costs as a good source of durable performance metrics, because they are not.

All right. Next question. What topics will be covered in the advanced material of your course?

A lot of them, and they’ll all be advanced. So, there’s a lot of stuff in the beginner content, which, of course, also has an advanced component. And then there are also some not-directly-T-SQL things: things around performance, various ways of storing data that have upsides and downsides, things like that.

But there’s going to be a lot in there. It’s probably too much to really list and go through here, but it’s going to be a very, very in-depth and have a wide breadth of material of an advanced nature. So, keep your eyes out for that.

I’ve written a lot of the early modules in it. And as soon as I have a little bit more of that done, I’m going to get it tech reviewed and start recording.

I’m going to release the material for the advanced course a little bit differently than I did for the beginner course. I think I’m going to piecemeal it out in some smaller batches. Whereas with the beginner stuff, I put a lot out continuously over two or three weeks until it was done.

This one, I’m going to release some sort of smaller chunks, I think. Next question. I’ve been thinking about career goals.

Really? How old are you? You’re just thinking about them? All right. Becoming an MVP sounds good. Well, if it sounds good to you, don’t, why are you asking me? I already do a lot for the community.

Good idea or bad idea? I don’t know. I would personally never do it for various reasons that I will not go into here. But it would not be something that I would consider.

But if you already do a lot for the community and you feel that your contributions are ones that Microsoft would see sufficient free marketing labor in, then you should go ahead and become an MVP. You might make some good connections.

You might meet some nice people. You might, you know, get around the world a little bit in that way. But, you know, whether it’s a good idea or not sort of depends on, you know, how humbled and honored you are to carry marketing water for a, what is it, like $250 billion a year corporation? I don’t know.

I don’t know. I don’t know. Who knows? Maybe it would have some benefit for you. Don’t let me stop you. Anyway, let’s go on to the next question. You often mention pessimistic read committed locking holding locks for too long because of key lookups with unordered prefetch.

Well, it could be an ordered prefetch, too. Or it could just be an optimized nested loop. Does read committed always show with unordered prefetch in plans where it holds locks for too long?

Well, it’s not holding locks for too long. It’s just holding locks for longer than most people expect and even escalating those locks in a way that causes read queries to block write queries. So, like I said, you will either see unordered prefetch, ordered prefetch, or maybe even optimized nested loops in those query plans.

But more importantly, you should generally see the queries that generated those execution plans in a blocked process report or a deadlock report. So, don’t be afraid to use those to track things down as well. All right.

This is a long one. Oh, boy. I’m going to have to think about this one while I’m reading it, which means I might read it wrong. In theory, we do love theories here at Darling Data. With the Microsoft push of all things to the cloud, it’s been going for like a solid decade now.

Not just Microsoft. Lots of people who have cloud to sell want everyone to be in the cloud. Query tuning and optimization should be a major priority for companies because CPU time is money.

It sure is. A big component of my consulting. A little free marketing for me.

Not quite a $250 billion a year corporation, but I’ll do some free work for me. A big part of my consulting is tuning stuff in the cloud so that people need less cloud and spend less money on it. In your experience, hey, there we go.

Well, I led right into that one. Are companies actually lowering costs through query tuning? Well, they are when they work with me. I don’t know about anyone else, but whatever, you know. Or are they more focused on improving speed and reducing loads?

Well, query tuning is improving speed and reducing loads, at least in my estimation. I rarely hear stories of companies reducing Azure capacities or lowering licensing costs due to query tuning, but I don’t have much exposure to it. Well, it does happen, usually shortly after, you know, hiring me.

But, you know, if you’re not really trafficking in that world, you might not see a lot of success stories around it. Now, there is a flip side to that coin, because there are people I work with who went with the absolute cheapest things they could get in the cloud and have terrible performance problems because the hardware is just so, so underprescribed for the workload they’re expecting it to run. But, like, there are Microsoft salespeople out there who don’t know this thing from the thing behind them who will just try to do anything to get people started in the cloud, and they will forecast these terribly small resources and be like, oh, but it’s only going to be this much a month.

And then people get there, and they’re like, our workload doesn’t run anymore. What happened? And they’re like, oh, no, you might have to spend more money.

So I end up in some situations where you do have to hard-truth people and say: look, the hardware you came from was, let’s say, this big, and the hardware you’re on now is this big, and it’s not going well. There’s no automatic performance benefit just by right of being in the cloud. You still need to do things to make things faster.

So, there are some cases where you do have to tell people, hey, you made a bad choice with this hardware, or some Microsoft sales rep gave you bad advice on this hardware. You really do need more, at least in the meantime, until we can tune stuff and you can maybe shrink things back down. But a lot of people have either perhaps oversubscribed a bit, or perhaps they’re just unhappy with the cloud bill generally.

And they’re like, there’s got to be some stuff we can do to fix this. And that’s where I come in a lot, and I get queries and indexes and various workload bits and bobs tuned. And then they can reduce the size of those servers without causing any major workload regressions.

So, I hope that answers your question. And perhaps even gives me a nice sales pitch. So, thank you for that.

I appreciate you. Anyway, that’s about it for me. I’m going to go do something else. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you tomorrow where we will talk about some more T-SQL learning. We’re going to talk about, I think, quote name and dynamic SQL, and maybe some other little dynamic SQL tricks that I haven’t talked about 7 million times here already.

So, all right. Cool. Thank you for watching.

Goodbye.


Learn T-SQL With Erik: Cursor Loops Are Better Than While Loops



Video Summary

In this video, I discuss the merits of using cursor loops over while loops in T-SQL programming, sharing my personal belief that cursor loops are often easier to write and maintain. I walk through a basic example of how to use cursor variables in SQL Server, highlighting their self-cleaning nature and demonstrating how they simplify the process compared to manually managing row increments with while loops. By comparing this approach to yesterday’s video on while loops, I illustrate how cursor loops can be more efficient both mentally and for database performance, especially when dealing with non-contiguous data sets.

Full Transcript

All right, Erik Darling here with Darling Data, as the sign indicates. And today we are going to talk about how I think, it’s my personal belief, that writing cursor loops is much easier than writing while loops. So we’re going to talk about cursor loops in today’s video.

This is of course part of my little itty-bitty teaser bits of my Learn T-SQL course. If you would like to buy that course, hire me for consulting, become a supporting channel member, or even ask me office hours questions, all of those helpful links are down in the video description right below this.

So I would encourage you to click on those, should you be interested in pursuing our relationship further.

If you’re feeling so kind, you could always like, subscribe, and tell a friend about this channel. Of course, I need to leave the house a few times this year for the PASS on Tour events and, of course, for PASS Data Community Summit. I will be in Dallas September 15th and 16th; Utrecht, October 1st and 2nd; and PASS Data Community Summit in Seattle, November 17th to 21st, where I have not one but two magnificent days of pre-cons with Kendra Little, all about T-SQL.

We’ll be covering some of the material that I’m going over here. So with that out of the way, let our database party begin. We will get right into our cursor loops.

Now, like I said in the intro to this video, I really do believe that once you learn how to write a cursor, it simplifies a lot of things that you might have to consider and try to be smart about when using a while loop. So I think probably the best way to write a cursor in order to simplify even that part is to use what’s called the cursor variable.

And the nice part about cursor variables is that you don’t need to close or deallocate them. They automatically are self-cleaning, which I think is nice. So what I’m going to do is just run through a basic cursor loop to just populate some data and just kind of show you like the basic fundamentals of it.
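The basic shape he walks through might look something like this. This is a hedged sketch with made-up table, column, and variable names, not the exact demo code from the course:

```sql
/* A minimal cursor variable loop. All names here are for
   illustration only, not the course demo. */
DECLARE
    @c cursor,
    @id integer;

DECLARE @t table (id integer NOT NULL);

/* Tell the cursor variable its options and its driving query. */
SET @c = CURSOR
    FORWARD_ONLY STATIC READ_ONLY
FOR
SELECT v.id
FROM (VALUES (1), (2), (3)) AS v (id);

OPEN @c;

FETCH NEXT FROM @c INTO @id;

WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT @t (id) VALUES (@id);

    RAISERROR('working on id: %d', 0, 1, @id) WITH NOWAIT;

    FETCH NEXT FROM @c INTO @id;
END;

SELECT t.id FROM @t AS t;

/* No CLOSE or DEALLOCATE required: the cursor variable
   cleans itself up when it goes out of scope. */
```

Once the last row has been fetched, `@@FETCH_STATUS` flips from 0 to -1 and the loop ends on its own, which is the self-cleaning behavior described here.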

And then I’m going to do a demo like in yesterday’s video with while loops, where things are greatly simplified for us over needing to think about how to increment and how to find what to work on next. So I’ve got my cursor stuff declared up here.

ZoomIt, if it would be so kind as to actually zoom. We’ve got this stuff up here. And then I’m going to be using a table variable to hold some of our data, because I’m just putting a single row at a time in there. I don’t really care so much about the performance of this.

So the table variable is just fine for us here. And then I’m going to set the cursor variable with the options that I want and also tell the cursor variable what the query that populates the cursor looks like. That’s this portion down here.

I talk way more about the cursor options in the full class. So if you’re curious about these, well, please consider purchasing. And then we will open the cursor.

And while that cursor is open, we will do some fetching from the cursor. And then we will, you know, populate some stuff. And what I have in here is some additional stuff to kind of show you how the cursor makes progress through all of the values.

Because what I want you to see is how the fetch status changes as the cursor finishes. And then we will be looking at the results down here. So after I finish highlighting this gigantic loopy piece of code, we’re going to run this.

We’re going to get a result back very quickly with all of the data that we populated our table variable with right here. Over in the messages tab, though, this is where we’ll see sort of how the while loop, rather the cursor loop made progress. And how this is a little bit easier than the while loop equivalent.

See, we didn’t have to tell the cursor loop what row it was going to get next. Like, we didn’t have to increment some value and then go find that value. The cursor loop is based on this query.

And this query here just goes and finds the next thing for us to do anything with. Actually, the cursor keeps track of this position for us. So we got all of the stuff that we wanted.

And of course, while the fetch status was zero, which it was for all of these iterations through, we went and got a new row to work on and did what we needed to do. And then at the very end of the cursor loop, the fetch status changed to negative one. That means that the loop killed itself and we didn’t have to tell it to do anything.

So this is one way that cursor loops can be a little bit simpler to write than while loops, once you understand the basic syntax. Now, to compare the cursor loop to the while loop in yesterday’s video: there were lots of things that I had to do in yesterday’s video in order to figure out a starting and ending position for the while loop, and how to increment to get the next row. Not just doing naive incrementing, adding one to the ID, but having to think about how to increment things so that I was making sure that I got the next value.

So I’m going to show you how this is simplified for the cursor loop. Now, in yesterday’s video, I showed you like both a naive and then a smarter way of doing things. But in the smarter way of doing things, of course, was more efficient.

But of course, in this case, we can get more efficiency without having to worry about thinking about stuff too much, which is nice because we can now think about other things like where to go for dinner and what sort of wine to drink. And I don’t know, like where we want to go on vacation next. We’ll become more efficient people.

And so we have better things to do with our minds than figure out what to do in a while loop, right? Which is not fun. Which is not making good use of time, right? It’s not a productive use of our brains, whereas thinking about food and wine and vacations, terribly productive use of our brains.

So what we’ve got from this table is, of course, the same non-contiguous situation as we had yesterday, where the IDs that we’re working off do not just count up like one through ten or whatever, the way that the row number function does. The row number function rows are over here. So we can see that we have some big gaps in a lot of this between what number is next or not.

So what we can do to write a cursor to do this is pretty much what we did before. But I want to show you how we don’t really have to do as much setup for this, right? We don’t have to find a min and a max ID.

We don’t have to decide how to increment. We don’t have to like do a lot of pre-work to figure out our cursor starting place, current position, and ending place. The cursor does that for us.
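The setup difference being described might look something like this; a sketch assuming a `dbo.worker` table with gappy `id` values (the names are mine, not the demo’s):

```sql
/* No MIN/MAX lookup, no increment bookkeeping: the driving query
   defines the work, and the cursor tracks position through gaps. */
DECLARE
    @c cursor,
    @id integer;

SET @c = CURSOR FAST_FORWARD
FOR
SELECT w.id
FROM dbo.worker AS w
ORDER BY w.id; /* ids can be 1, 4, 6, 7, 10... it doesn't matter */

OPEN @c;

FETCH NEXT FROM @c INTO @id;

WHILE @@FETCH_STATUS = 0
BEGIN
    /* hypothetical unit of work */
    UPDATE dbo.worker
        SET processed_count += 1
    WHERE id = @id;

    FETCH NEXT FROM @c INTO @id; /* no manual increment needed */
END;
```

Each fetch hands back the next id that actually exists, which is why no row gets processed twice and no empty ids get visited.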

So let’s run this. And I am going to show you some of the sort of guts of what happens as we run through this. And just like in the last demo, we’ll see that the cursor does a great job of, you know, while the fetch status is zero, just going and finding the exact next row that we needed based on the query we gave it to populate the cursor.

So this is all very nice. And we go through and we don’t like double work anything. And we don’t have any sort of no ops where we like maybe incremented by one and found a row that didn’t exist.

And then at the very end, after we hit 100, we have our fetch status of negative one. This does double print, but it does not double work 100 here. It’s just a double printing thing that I don’t know.

I’d say I’d fix it in post, but we’re already posted. So anyway, running through the cursor loop like that, we end up with no dupes, right? Every row has a count of one.

Whereas when we did the naive looping with the while loop yesterday, we had a lot of dupes, where we were either working with the same number over and over again, or we would have double worked and sort of no-op’d things. So cursor loops: usually much more efficient mentally, and I guess in the database physically, than writing while loops, especially if you do not write while loops in a way that intelligently goes through your data and finds what to work on next. So I do hope that you will perhaps spend some time learning how to write effective cursors in T-SQL.

It can be done. I know that cursors get a lot of bad press for performance, and that is true: cursors can sometimes perform poorly when compared to set-based operations.

But comparing cursor loops versus while loops, cursor loops are generally on the more efficient side, at least for us as the writer and maintainer of the code. And especially on the database side, where if we make a mistake with a while loop, we could pay quite a penalty in the database, whereas we don’t have the ability to make that mistake with a cursor, because the cursor is keeping track of things for us.

And, you know, computers, pretty darn good at keeping track of things. Anyway, that’s it for me. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you, well, this is going out on Friday. So the next video will be office hours. And I realized in the last video that I accidentally only copied four questions. And I’ve maybe lied about answering five questions.

So, I don’t know. Should I do six questions in the next one? Or should I just do the normal five? I haven’t decided yet. I don’t want to set a bad precedent here.

Start miscounting and playing catch up and all that. Anyway, thank you for watching. All right. Goodbye. Happy Friday, too. Because you should at least get one of those in your life. All right.

Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Writing Smarter While Loops



Video Summary

In this video, I delve into the world of while loops and share some insights that might surprise even seasoned database professionals. We explore how using `continue` and `break` can alter a loop’s behavior in unique ways—like skipping numbers or quitting early when bored with repetitive tasks. Additionally, I discuss the importance of proper incrementing to avoid inefficiencies and ensure your loop processes each value only once. This is crucial for maintaining database performance and avoiding redundant operations. The video also touches on cursors as an alternative that can simplify loop logic and handle increments automatically, making them a valuable tool in your T-SQL arsenal. Whether you’re just starting out or looking to refine your skills, there’s always something new to learn about effectively communicating with databases.

Full Transcript

Erik Darling here with Darling Data, and we are going to continue our T-SQL learning because that’s what we like to do around here. We like to learn about databases, and more importantly, we want to learn how to communicate more effectively with our beloved databases. They are important things in our lives, and we should be able to talk to them openly and honestly about things. So that’s what we’re going to keep doing. That jumped way too far ahead. If you like this material and you would like to buy this full training course, you remember these are just little bite-sized snippets of things. Down in the video description, there are links for all sorts of things that you can do to interact with me, like hire me for consulting, buy this training that you’re seeing little pieces of, become a supporting channel member, ask me questions for my Office Hours episodes, wow! And of course, the usual like, subscribe, tell a friend if you feel like your friend is having difficulty effectively communicating with the database partners in their lives.

And maybe they could learn a thing or two over here and have healthier database relationships. The nice folks at Red Gate are putting me in front of full public display via the Pass on Tour and Pass Data Community Summit events. We’ve just wrapped up the New York event, so now we’ve got Dallas in September, the 15th and 16th, Utrecht in October, the 1st and 2nd, and Pass Data Community Summit in Seattle, November 17th to 21st, where I will have two days of T-SQL pre-cons with Ms. Kendra Little.

And that’s going to be great because then everyone in Seattle is going to know SQL and maybe that’ll help clean up some of Seattle’s well-known maladies out there. But anyway, let’s get on with our database party here and go talk about while loops. Now, there are a couple of things that you can do in while loops that I learned probably a little bit later in my database career than I feel fully comfortable admitting to, but that’s okay. You know what? We all learn, we all grow. It’s fine.

So of the two things that I learned about, continue came along way later. I didn’t realize this was a thing until I saw someone else use it and I was like, continue, what the hell? But instead of break, I would often just use return instead, but you know, six of one, half dozen of another.

But continue is neat because continue will bring us back to the first begin in a while loop, which is, I think, a neat thing to be able to just say: ah, I’m done here, let’s go back to the beginning. And of course, break will end things early. So let’s say that our task is to count to 100, but like many people in the tech world, we do not have the patience for repetitive tasks.

So I don’t know, will we get to a hundred? Probably not. Are we going to skip some numbers? We probably are. But this is one way that you can sort of use continue and break in a loop to, you know, make the loop behave a little bit differently than just cycling through and counting from one to 100. For example, we can use continue every time we get to a new number where the modulo of 10 equals zero.

We can just add on 10 and skip ahead a little bit, because it gets really boring. And then maybe we might decide that, I don’t know, we’ll get to around 89 or so and we’ll just quit. And that’s what our loop will do. Every time we hit a new 10, we’ll skip a few because we got bored.

And then when we get down to the end of our loop, we will say, oh, I got tired. Sorry, not happening for us. So these are neat things, continue and break. And also, you know, I guess return and break would be pretty similar, but you can use continue and break or return to sort of control the flow of a loop a little bit in slightly different ways.
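That idea can be sketched roughly like this; my reconstruction of the shape of the demo, not the exact course code:

```sql
/* Count toward 100, skip ahead at each multiple of 10,
   and quit early out of boredom. */
DECLARE @i integer = 1;

WHILE @i < 100
BEGIN
    SET @i += 1;

    IF @i % 10 = 0
    BEGIN
        SET @i += 10; /* got bored; skip a few numbers */
        CONTINUE;     /* jump back to the top of the loop */
    END;

    IF @i >= 89
    BEGIN
        BREAK; /* got tired; end the loop early */
    END;

    PRINT @i;
END;
```

`CONTINUE` sends control back to the loop’s first `BEGIN`, and `BREAK` exits the loop entirely; `RETURN` would exit the whole batch, which is why the two are roughly interchangeable here.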

But one thing that is important to consider when dealing with a while loop is, of course, how to increment through things so that you maybe don’t get stuck on repetitive tasks in a loop. So this is something that I talked about in one of the other videos earlier this week, where if you assume that whatever task you’re doing within the loop is working with a perfectly sequential number, then you might run into situations where either you don’t find a next value, or something else you do in the loop means that you attempt to run the loop for values that aren’t actually there.

So if you’re always incrementing whatever is driving the loop by one, you can end up in situations where your loop is sort of ineffective or doing a lot of repetitive work in places. So we’re going to create some tables here for a while loop to work off of. And we’re going to stick some data in these tables.

And this is the somewhat boring part, but that’s okay. We have more interesting things to think about. So we’ve got some data in our worker table. And if we look at what’s in the worker table and we start looking at the gaps in here, keep in mind that the ID values that we put in are not perfectly sequential.

You might be expecting it to go from one, two, three, four, five, six, seven, eight, nine. I guess I should have gone down to 10 there. Here we’re jumping around quite a bit.

It’s like one, four, six, seven, 10, 11, 13. So this is not a perfectly sequential set of numbers that we’re working off of. This is something that you need to think about quite a bit when you’re writing while loops. And for the reasons that I said before, you don’t want to either do repetitive work or do essentially a bunch of no-ops while you’re running through things.

So if we go and we run this while loop doing, let’s call it naive incrementing, where all we do is set the value plus equal one, then this will run through and do things. But it’s not really going to do things the way that we want. Because now, if we look at what we actually processed, we will end up having processed a bunch of dupes until we found other rows, right?

All of these ID values are values that we ran through the loop multiple times until we found a new value. So this is not a good time, right? We ended up processing parts of this loop too many times, or, if there was a no-op sort of thing in there, we would have just been spinning in a circle doing no real work.

And that’s not good, right? We don’t want to write inefficient loops, because loops are already sort of inefficient generally. You know, loops and databases, usually not the first thing you want to use. There are generally set-based things you would want to use instead.

There are of course perfectly good reasons to use loops, especially to batch modifications into units of work. But we don’t always want to just jump right to loops, right? We want to make sure that we are using the loop for the right reasons, and then in the right way.

So if we were to change the way that we increment the loop, and rather than just incrementing by one, we were to drive our loop by using whatever value we found the last time around, then we would have a much tighter loop that actually does significant, meaningful work on each trip through.

So rather than just add one, we’re going to add whatever increment we find in there. And this loop will have done less work, because we wouldn’t be meaninglessly spinning through things. And we would have absolutely no dupes processed in here either.

Rather than having a whole bunch of fours and threes and fives and stuff, we only processed each value once. Each time through the loop, we got a brand new value, set to the value that we should be using for the next turn through the loop.
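The incrementing difference can be sketched like this, with hypothetical table and column names; the key is asking the table for the next id that actually exists, instead of blindly adding one:

```sql
/* Drive the loop with the next real id, not a naive @id += 1. */
DECLARE
    @id integer = 0,
    @next_id integer;

WHILE 1 = 1
BEGIN
    /* find the next id that really exists, however big the gap */
    SELECT TOP (1)
        @next_id = w.id
    FROM dbo.worker AS w
    WHERE w.id > @id
    ORDER BY w.id;

    IF @@ROWCOUNT = 0
    BEGIN
        BREAK; /* nothing left to work on */
    END;

    /* hypothetical unit of work on @next_id goes here */

    SET @id = @next_id; /* the found value drives the next turn */
END;
```

With the naive `SET @id += 1` version, every gap in the id sequence either repeats work or does a no-op pass; this version touches each existing row exactly once.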

So just be very, very careful when you’re doing this. In the next video, we’re going to talk about using cursors for something similar, but also how cursors do a lot to simplify loop logic. If you are writing a traditional while loop like this, you know, cursors in these cases will do just about the same performance-wise.

So if you’re already doing a loop where you’re incrementing through things and doing units of work, a cursor is not really going to be worse. But cursors actually do a lot to simplify the logic required to set up and run through the loop, and even help you not have to figure out what you need to increment to, because cursors will keep track of that for you.

So in a lot of ways, it pays to learn how to write cursor syntax over a while loop, because the cursor syntax will actually do a lot of the thinking for you. And you don’t have to get into some of the nuances of while loop handling and how you want to go through the while loop, because cursors will fix that.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in the next video about our beloved cursors. So I look forward to seeing you there. Anyway, thanks for watching.


Learn T-SQL With Erik: Control Your Flow



Video Summary

In this video, I dive into the world of control flow language in T-SQL, explaining key elements like branching, looping, and error handling. I highlight common pitfalls such as null evaluation and improper use of `GOTO` labels, emphasizing the importance of proper termination to avoid unintended code execution. The discussion is part of a series aimed at helping developers navigate these often-overlooked but crucial aspects of T-SQL programming. Whether you’re a seasoned SQL Server professional or just starting out, there’s always something new to learn when it comes to mastering control flow. So, if you enjoyed this content and want more insights like this, consider becoming a member of the channel or even joining me on my upcoming tour events with Red Gate in Dallas and Utrecht. Your support helps keep the knowledge flowing!

Full Transcript

Erik Darling here with Darling Data. Sometimes I want to say Darkwing Duck Data, but that wouldn’t make any sense. We’re going to keep going with the Learn T-SQL sort of preview material. And today we’re going to talk about some elements of control flow language. We’ll get more into exactly what all that means when we get to the video, but for now, just understand that I am controlling the flow of information and that’s how I choose to do it. So if you look down in the video description, there are many helpful links where you can do things like hire me for consulting, buy my training with discounts included in there, become a supporting member of the channel, ask me office hours questions, and all that good stuff. And of course, if you enjoy this content, but you don’t want to give me money for anything, that’s cool too. Just do all the usual like, subscribe, tell a friend stuff. And I guess we can consider ourselves even, maybe, in some way. You never know, right? The bigger your reach gets, the better chance there is that someone will maybe want to give you money someday. I am going to be going on a partial tour of the world. Just finished up Pass On Tour New York City, but the fine folks at Red Gate are going to be wheeling me out for events in Dallas, September 15th to 16th, and Utrecht. That’s in the Netherlands, a beautiful little hamlet, October 1st and 2nd.

And then of course, there will be the Pass Data Community Summit taking place in Seattle, November 17th to 21st, where I have two days of T-SQL pre-cons with Ms. Kendra Little. And I do look forward to all of that. So with that out of the way, let us do our T-SQL partying here. Now, SQL Server has a variety of control flow elements in it. ZoomIt is doing this weird thing. Like, what happens down there, ZoomIt? What are you doing? Why are you messing with me? So if you want to do some branching, there’s if and else. If you want to contain certain things within a little flow of control, you have begin and end.

You have navigational stuff like a go-to label, and then you have looping elements like while loops. And then of course, we have the usual things that we can do within a loop to make or break progress. And then additional control flow stuff is also error handling.

Now, one thing worth noting here (I’m going to cover this more when we talk about performance stuff in the advanced material) is that control flow elements don’t necessarily control which query plans are compiled at all, ever. You have to do some extra work in order for that to happen. But branching queries within code often has very, very profound effects on cardinality estimation and query plans.

So be very careful if you are the type of person who does that. Anyway, we’re going to cover just some basic control flow stuff in here; there are more specific modules that we’re going to get to over the next few weeks, to talk about things more in depth where it makes more sense.

But one thing that I see quite a few people mess up and be caught off guard by is exactly what branching really controls once you start sort of evaluating expressions. For example, if we look at this right here, you know, like I see a lot of stored procedures start with these if things. And I’m looking at the code and I’m like, I think there’s a bug here.

And they’re like, no, no, no, it’s fine. Why? And I’m like, well, because like this will only do this first thing. And by that, I mean, like if we look at this code branch right here, we have a local variable declared. It’s an integer and it has a value of 10.

And then we say if the value is greater than five, then we print a couple of things. And if the value is less than or equal to five, then we’ll print a couple other things. But look what happens when we run this and we look at the messages tab.

We get value is greater than five. Hello world. Goodbye world. So what happened was we printed the value was greater than five. Let me move this thing.

That little highlight line can get a little annoying. So because the value was greater than five, we printed this as an immediate consequence of that. But then we also printed this. And in the second if branch, which was the value is less than or equal to five, because it wasn’t, we didn’t immediately print this, but we did print goodbye world.

Right. So this conceptually is difficult for a lot of people. And I do run into bugs around this sort of logic quite a bit, where if you want to contain these things, you do need to have explicit begin and end around them in order to make sure that you only follow the correct path.

And in these sort of branched logic places, when you have something like this, the if will only do, or not do, the first thing immediately after it. So in this case, it’s this first print statement. So when we ran this one, it didn’t print this, but it did go and print this, which, you know, is something that I see people struggle with quite a bit.
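The scoping gotcha being described boils down to something like this; a minimal reproduction with made-up values, not the course demo itself:

```sql
/* Only the FIRST statement after IF is conditional,
   unless you wrap the block in BEGIN/END. */
DECLARE @v integer = 10;

IF @v > 5
    PRINT 'value is greater than five'; /* conditional */
    PRINT 'hello world';                /* NOT conditional; always runs */

IF @v <= 5
    PRINT 'value is less than or equal to five'; /* conditional */
    PRINT 'goodbye world';                       /* always runs */

/* Messages tab:
   value is greater than five
   hello world
   goodbye world */

/* Contained correctly: */
IF @v > 5
BEGIN
    PRINT 'value is greater than five';
    PRINT 'hello world';
END;
```

With `@v = 10`, the second branch’s first print is skipped, but "goodbye world" still appears, because it was never part of the if at all.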

Another thing that people will struggle with quite a bit is null evaluation, or what to do if you hit a null. A lot of people just don’t plan for it. So if we were to run this, SQL Server just says the value is not greater than five.

Goodbye world, which, you know, is correct. Null is not greater than five, which I agree with. But that’s not where the problem is.

It’s just that if you were to go and try to do something with a null, well, that might upset whatever logic you’re working with to figure things out. Running queries with a null value ending up in there might not be a good thing. So when you’re doing this sort of stuff, it’s best to explicitly handle nulls in some way, and either stop execution, or take a different branch, like this sort of no-op that just says: no, this was null, we can’t do anything with null. And from there, move on.

You know, you could maybe use is null or something, but it wouldn’t get you the same handling for a situation where you didn’t want to work with nulls further down. So always make sure to plan for nulls in your data and handle them accordingly, even in this sort of branching logic. Always test with a null at some point, because if you don’t and you end up with a null, even if you swear up and down this could never be null, well, I don’t believe you, because I’ve seen too many people be wrong about this sort of thing.
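One hedged way to make the null branch explicit, sketched with hypothetical values:

```sql
/* Handle null first, so neither comparison branch runs on unknown data. */
DECLARE @v integer = NULL;

IF @v IS NULL
BEGIN
    PRINT 'value is null; not doing anything with it'; /* explicit no-op branch */
END;
ELSE IF @v > 5
BEGIN
    PRINT 'value is greater than five';
END;
ELSE
BEGIN
    PRINT 'value is not greater than five';
END;
```

Because null compared with anything is unknown, without the first branch a null quietly falls into the "not greater than five" path, which may not be what downstream logic expects.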

Another thing that is worth talking about here a little bit: I don’t see these used a lot, but where I do see them used in code, people often do not terminate go to logic correctly. So if we were to run this whole thing, right, we have the local variable declared, it’s an integer set to null. And we’re going to have these different go to labels in here.

And I have these returns commented out just for a little bit of convenience. But this is where each one of these things would end up. Right.

And a lot of people think that when you go to a label, that just means the end of something happening. But that’s not true at all. If we run this, look what we get back. Value is null.

Unknown world. Value is greater than five. Hello world. Value is not greater than five. Goodbye world. Why are we here? So if you want the termination of code after you have hit a label, you really do need to add these returns in, or else you will just keep on going. You will just keep on running right through code.

I think that was, yeah, that was all. So if we run this, then we get the correct termination of whatever block we entered, whatever label we jumped to. So be careful with if branching, using begin and end properly.

Always test for nulls. And if you are going to use go to labels and you’re going to jump around within code, make sure that you have a return to stop execution after you have done whatever you need to do within that label. Otherwise, you will end up doing probably a lot more than you bargained for.
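A rough sketch of the fall-through behavior; without the returns, execution runs straight through every label below the one you jumped to:

```sql
/* GOTO jumps to a label; it does NOT end anything by itself. */
DECLARE @v integer = NULL;

IF @v IS NULL GOTO null_label;
IF @v > 5 GOTO big_label;
GOTO small_label;

null_label:
    PRINT 'value is null';
    RETURN; /* remove this and both labels below run too */

big_label:
    PRINT 'value is greater than five';
    RETURN;

small_label:
    PRINT 'value is not greater than five';
    RETURN;
```

With `@v` null and the returns in place, only "value is null" prints; comment them out and all three messages appear, which is the surprise being described here.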

Anyway, next up, we’re going to be talking about while loops. So I’m generally far more excited about that. But I don’t know.

These are things that I find people struggling with. So, you know, I talk about them because I don’t want you to struggle. I like you. You seem like a nice person. You know, I don’t necessarily think we need to hug, but you seem OK.

So I want you to have a nice life. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something.

And I will see you in the next video about while loops, where we will... well, I guess it’ll be a surprise, right? If I tell you what I’m going to do, then that ruins everything.

I’m going to keep you on your toes out there. Anyway, thank you for watching.


Learn T-SQL With Erik: SET, SELECT, and Data Types for Local Variables




SQL Server Performance Office Hours Episode 30




Questions:

  • Hi Erik, Thank you for the great content. Why is it better to have a higher sample rate for update statistics?
  • Hi Erik! I had a procedure with a chain of CTEs and UNIONs. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues. I went in desperation and popped an OPTION (RECOMPILE,HASH JOIN) in the end of the CTEs where they were inserting into a #Table. This only took it to less than 2 seconds! How on Earth can this has such a performance effect and SQL Server could not find out by itself? Thanx
  • I recently purchased the Everything Bundle and am enjoying each module. Is there a recommended viewing order? For example, the Index Tuning module seems to reference the Hardware modules.
  • If you were curating a Paul White’s Greatest Hits compilation what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL Servers. Remember: All of Paul’s posts are at https://www.sql.kiwi/

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five of your most pressing SQL Server questions during one of our live Office Hours sessions. Whether it’s about the benefits of a higher sample rate for update statistics, optimizing performance in complex queries using OPTION(RECOMPILE) and HASH JOIN, navigating through the Everything Bundle modules, or delving into Paul White’s invaluable blog posts, I cover it all. I also share some upcoming events where you can interact with me in person, including the PASS On Tour event in Dallas and Utrecht, as well as the Pass Data Community Summit in Seattle. If you have any questions or topics you’d like to discuss, make sure to leave them in the video description or reach out directly—your feedback helps shape future content!

Full Transcript

Erik Darling here with Darling Data, and it’s time to do one of those there Office Hours episodes where I answer five of your most burning SQL Server questions, or whatever you ask me about. Anyway, five. If you want to interact with me in any way, you know, virtually, physically, whatever, look down in the video description. All sorts of helpful links. You can hire me for consulting, buy my training, become a member of this channel, and support my efforts to bring you high-quality SQL Server content. Ask me Office Hours questions on episodes just like this. You could be one of the lucky five people who gets a question answered. And of course, if you enjoy the things that I do here and you think someone else might enjoy them as well, you know, like, subscribe, tell a friend, all that good stuff. I will be live and in person a few more places through the end of the year. Dallas, September, September 15th to 16th for the Pass On Tour event there. Utrecht, same event, different place, October 1st and 2nd. And of course, Pass Data Community Summit in Seattle from November 17th to 21st, where Kendra Little and I have two days of T-SQL pre-cons in which you should go to both so that you can spend, you can maximize your time with us. And with that out of the way, let’s go answer these questions. These have been sitting around for a little bit because I pre-recorded a lot of stuff while I was on vacation.

And getting back into things. And boy, did you ask a lot of questions. So first up, we’re going to do this one right here. Let’s make sure ZoomIt’s working. So: hi, Erik. Thank you for the great content. Hello, you. You’re welcome. Why is it better to have a higher sample rate for update statistics? Well, the good news is it’s not always better. Sometimes the default sample rate is good enough to pick up a good general description of what data lives in your indexes and describe that in the histogram. Other times you do need to up the sample rate in order to catch more granular things. If you are using a higher sample rate for some statistics and getting better representation of your data in the histogram, there’s no reason to fear or think that you need to do that for all of them. But that is just sort of how that works. If you are in a situation where even a full scan update of statistics is not doing a good job of describing your data, you might try filtered indexes or filtered statistics so that you have a smaller range of data.

You might create a filtered index or statistic on a particularly interesting data point for you and have a more accurate description of just that interesting data point. So again, not always better. Sometimes you do have to increase that. You know, I’ve worked on a few different systems where a regular default sampled statistics update did not do a good job.

I think we hit a sweet spot for some of those between about 10 and 20 percent. But I have a few demos that I do in my performance tuning stuff where I have to do 25 or 30 percent. And then there’s even one where I have to do a full scan in order for SQL Server to get things right.

So it is not a sort of across the board thing. It’s a very specific thing. And I wouldn’t recommend doing it for every index because or every statistic because that may be counterproductive to your maintenance time. But if you find that there are some statistics where it does work better, you did a good job and you should keep doing it.
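To put those knobs in T-SQL terms, here’s a sketch using hypothetical object names (dbo.Posts and its Score column are stand-ins, since the real systems aren’t shown):

```sql
/* Hypothetical names throughout: dbo.Posts and Score are stand-ins.
   Default sampling is often fine; raise the sample only for
   statistics you've verified need it, since higher samples cost
   more maintenance time. */

/* A sweet-spot sample rate, like the 10-20 percent range mentioned: */
UPDATE STATISTICS dbo.Posts WITH SAMPLE 20 PERCENT;

/* The rare statistic that only behaves with a full scan: */
UPDATE STATISTICS dbo.Posts WITH FULLSCAN;

/* PERSIST_SAMPLE_PERCENT keeps later automatic updates from falling
   back to the default sample rate (SQL Server 2016 SP1 CU4 and up): */
UPDATE STATISTICS dbo.Posts
WITH SAMPLE 20 PERCENT, PERSIST_SAMPLE_PERCENT = ON;

/* And when even FULLSCAN can't describe the data well, a filtered
   statistic narrows the histogram to just the interesting slice: */
CREATE STATISTICS s_Posts_HighScore
    ON dbo.Posts (Score)
    WHERE Score > 100
    WITH FULLSCAN;
```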

All right. Next up here. Hi, Erik. Hi, you. How are you doing? Oh, there’s even a thanks on this one with an X. That’s a spicy thanks. This thing’s blurry. Is it just my eyeballs?

Let’s see. Can I get it? Oh, there we go. I’m a little bit more clear now. Hi, Erik. I had a procedure with a chain of CTEs and UNIONs. You’re off to a good start. It was taking almost 40 seconds and I was tasked to come up with a solution immediately due to production issues.

Oh, you have a high pressure job, my friend. I went in desperation and popped an OPTION (RECOMPILE, HASH JOIN) at the end of the CTEs, where they were inserting into a temp table. This took it to less than two seconds.

How on earth can this have such a performance effect when SQL Server could not figure it out by itself? Well, you left a few details out of here. You know, I mean, obviously, I’m not asking to see the full definition of the query, because a chain of CTEs and UNIONs?

That’s pay-me-money territory. And the fact that you have a RECOMPILE on there makes me wonder if there are any parameters or local variables involved that could be influencing the optimizer’s estimation process.

But there are lots of cases where SQL Server may misunderstand the number of rows that might come out of certain operations: rows that get filtered by your WHERE and JOIN clauses, survive GROUP BY and HAVING type stuff, and make progress through the plan. In general, you might end up with a lot of nested loops joins that execute way, way more times than the optimizer anticipated.

So without seeing the query plan, my guess would be that you had a lot of nested loops joins in there in that plan and that SQL Server was getting stuck in them, perhaps even serial nested loops joins. I guess there’s also the possibility that you had some untimely merge joins, especially if you had to sort data before going into those merge joins. But the reason why the optimizer didn’t find it?

Well, I mean, you know, it could be a lot of things. It could be that you have statistically inaccurate information in your database. It could be that you had a parameter sensitivity issue.

It could be that you were using local variables and SQL Server was not estimating things well. You could be using a table variable in there, too, and getting a bad estimation. But also SQL Server could just be doing a poor job of cardinality estimation without any of those things.

And you ended up with some nested loops joins that were completely inappropriate for the number of rows that you were managing. As for why SQL Server couldn’t find out by itself, well, you know, that’s a query complexity thing. The optimizer only has so much time that it’s willing to spend on a query figuring stuff out.

And sometimes it does not have a chance to explore or accurately or correctly cost all of the alternatives. So, you know, this is where stuff like query hints really do come into play. You could try simplifying the query a bit more, maybe removing any elements from the query that would harm estimation, local variables, table variables.
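As a quick illustration of the local variable point, here’s a hypothetical sketch (dbo.Posts is a stand-in table): the optimizer can’t see a local variable’s value at compile time, so the filter gets a density-based guess rather than a histogram-based estimate.

```sql
/* Hypothetical sketch: dbo.Posts and Score are stand-ins. */
DECLARE @Score integer = 10000;

/* Compiled without knowing @Score's value, so the cardinality
   estimate comes from density information, not the histogram. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.Score > @Score;

/* With a recompile hint, the runtime value gets sniffed and the
   histogram can be used for a (usually) better estimate. */
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.Score > @Score
OPTION (RECOMPILE);
```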

Maybe, you know, if there’s any sort of parameter sensitivity, you could find ways of dealing with that. But if you’ve got things solved with RECOMPILE and HASH JOIN, I don’t see a reason for you to keep tinkering with things. Sounds like you’ve got it under control.
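For reference, the asker’s fix probably looked something like this (a made-up query shape, since the real procedure isn’t shown):

```sql
/* Sketch only: table and column names are stand-ins. RECOMPILE gets
   a plan compiled for the current values; HASH JOIN forbids nested
   loops and merge joins for the whole query, which is a blunt
   instrument -- reserve it for queries where loop joins were proven
   to be the problem. */
INSERT INTO #Results (PostId, VoteCount)
SELECT p.Id, COUNT_BIG(v.Id)
FROM dbo.Posts AS p
JOIN dbo.Votes AS v
    ON v.PostId = p.Id
GROUP BY p.Id
OPTION (RECOMPILE, HASH JOIN);
```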

So, you know, the optimizer is a big piece of software, and it’s made by humans, just like most software. I’m going to draw the line a little bit at LLM-generated software, because that’s not generated by humans, and that’s full of mistakes too. But, you know, the optimizer can’t do everything.

It can only make a good enough guess most of the time at a good enough plan. All right. Next question here. I recently purchased the Everything Bundle.

Woohoo. Thank you for your support. And I’m enjoying each module. Is there a recommended viewing order? For example, the Index Tuning module seems to reference the Hardware modules. So I think as long as you do the starting modules first, and I think it’s, what’s it called, the Essentials modules first, you should be in pretty good shape.

If there’s anything in the other modules that references stuff back, you can always go watch those other modules afterwards and figure out where things connect. But I don’t really have a recommended viewing order aside from starting in Essentials, and then whatever you find interesting and appealing, or whatever applies to problems that you’re trying to solve. So that’s about all the advice I have there.

All right. Last question. Gonna finish on a strong one. Oh, my dear friend, Paul. All right.

If you were curating a Paul White’s greatest hits compilation, what articles would you consider essential reading? And what profession would best suit his personality if he wasn’t consistently dunking on poorly performing SQL servers? All right.

So I’m going to answer the second question first. Paul White would be an excellent sandwich maker. Paul White would make the best sandwiches you’ve ever tasted. They would leave you with a sense of peace and well-being that you didn’t know was humanly achievable, just like his blog posts do. So that answers that question.

As far as curating Paul White’s greatest hits: the SQL.kiwi site is where all of his posts live now. This is actually something that a lot of people miss. A lot of people will Google Paul White and a topic and they’ll go to the old SQLPerformance site, or they might even find a dead SQLblog link.

But all of Paul’s posts these days are on SQL.kiwi. And I’m going to put this in the video description because it’s that important. All of Paul’s posts are there.

You can find lots of one-off posts that are amazingly long, detailed, and full of great information. But if I were to curate Paul White’s greatest hits, it would be any post that is part of a series. Because I know when Paul White writes a series of posts, that man has got his head around something.

And that’s where I think the best stuff comes in, because you have this learning pathway: a series of things about a particular topic that wouldn’t all fit into one post. So whenever Paul White writes a series, that’s the stuff that I would put in the greatest hits. If I were going to make a greatest-hits book in tribute to Paul White’s writing, I would start with all of the posts that are part of a series.

You know, as far as stuff that has influenced me, I think the series of posts that has influenced me the most from him is certainly the Isolation Level series. I have spent a lot of time with that one, especially as I’ve been presenting more and writing more about isolation levels. So that one has had, I think, the most recent big impact on me.

But there are many posts of his that, you know, I have read at different points in my career. Paul White has, of course, been blogging for quite a while. He was actually the first blogger I ever read when I was trying to figure out how to write fast performing paging queries.

He had a couple of posts on SQL Server Central. But yeah, like there are posts that I’ve had to read at like various stages in my career because there were things in each of them that I didn’t fully understand until I understood a lot of other stuff first. And so like it took a lot of puzzle pieces for me to actually get some of the points that he was making, even from many years ago.

So, I don’t know. Of course, as always: thank you to Paul for everything you’ve done over the years. You are just an amazing human being and teacher.

So thank you for that. But yeah, I would start with anything that’s a series. That’s the really crazy stuff: you have to sit down and concentrate, and spend some time with those in order to fully get them.

But anyway, that’s five questions. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you in another video another time. Oh, okay.

Let’s just, we know it’s tomorrow at noon Eastern. So let’s just stop pretending. Anyway, cool. Thank you for watching.


A Little About Using Windows Performance Recorder



Video Summary

In this video, I delve into using Windows Performance Recorder to uncover deeper performance issues in SQL Server queries. After running a set of demos where certain queries took significantly longer than expected, I couldn’t pinpoint the exact cause based on traditional query plan analysis and wait stats. To get to the bottom of it, I turned to Windows Performance Recorder, which provided valuable call stack information that led us to discover the impact of read-ahead reads in this particular scenario, with trace flag 652 as the eventual fix. This video is a great example of how sometimes you need to go beyond the query plan and use additional tools like performance recording to truly understand what’s happening under the hood.

Full Transcript

Erik Darling here with Darling Data. And today’s video is an interesting one. This video is going to talk about using Windows Performance Recorder to track down a deeper level of performance problem in SQL Server. This is actually based on a demo that I had recorded a little before I went on family vacation, and it bothered me the whole time. When I got back, I started looking at it again and really couldn’t figure it out. So I showed a few people, and they were like, I don’t get it either. So I used Windows Performance Recorder and found some stuff. And then it was actually my good dear friend Forrest McDaniel who reminded me about a trace flag that we could use to fix this one. But this is an interesting one, so I hope everyone’s buckled up. Anyway, if you look down at the video description, you’ll see all sorts of helpful links where you can hire me for consulting, buy my training, become a member of the channel, and support the stuff that I do here to bring you this content. Ask me office hours questions for my office hours episodes; those usually come out on Mondays, where I answer five of your questions. And of course, if you enjoy this content but you don’t want to pay me to do anything, you can always like, subscribe, tell a friend, and help this channel get bigger and better in other ways. Of course, I still have some speaking events coming up. I just finished PASS On Tour New York City, but there are still events in Dallas coming up in September, where I have a half-day pre-con and a regular session. Then there is the one in Utrecht, where I have just about the same thing. And of course, there is PASS Data Community Summit coming up in Seattle, November 17th to 21st, where I have two days of T-SQL pre-cons with the wonderful and talented Kendra Little. I look forward to seeing you at both of those, not just one of those.
You should go to both of those because that’s how you maximize your knowledge and learning. So with that out of the way, let us get to the database party.

Now, just to sort of bring things back to where they were, right? I had this set of demos where there were a couple of queries that I ran. And in one of the queries, I used OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL')). What ended up happening in these query plans was: the first query, with row goals enabled, ran for about five and a half seconds, 5.6 seconds. Good enough there. But the other query ran for 40 seconds. And all 40 of those seconds were sort of spent in a place that didn’t make any sense to me. So I’m just going to shuffle these query plans around a little bit. The biggest time suck in these query plans was right in this branch. So this one took 3.386 seconds, which isn’t bad, right? This is the fast query plan. This one took 37 seconds, which is bad, right? That’s like a 10x slowdown for this one operator. And at first, when I watched this happen, I couldn’t for the life of me figure out what was going on with it.

Because when I looked at the wait stats for the queries (so if we get the properties over here and we open up WaitStats), there was not a whole lot of wait stat evidence about what might be going on in here. Right? The waits for the quick query were, of course, not really anything interesting. But then the waits for the slow query (let me just make sure that’s the right one there), the waits for the slow query really didn’t add up to anything either. Just looking through these, nothing in here said, this is why I took 30 seconds. None of these waits are doing very much.

There’s very, very little wait-wise going on in here. So what I did was fire up my friend, Windows Performance Recorder, which, since this is a VM, I had to do over here. What I’m going to do is show you sort of what I do to get Windows Performance Recorder stuff. Now, of course, the first thing you have to do is actually run it to collect things, but I’ve already done that, because I don’t want you to sit there and wait for a long time.

So, once I had things recorded for the two queries, what I usually do is go into the CPU stacks, and then I’ll go into SQL Server. I’m going to make this bigger and do some zooming so you can all see what I get up to. The first thing you want to do is select everything in here, and then you want to say, look up symbols.

This is going to go out to Microsoft, look up symbols, and resolve the names of any public symbols that it can. It’s going to take a couple of seconds to run. Maybe I should have done this before I did the video, but, you know, I’ve got pretty fast internet, so I’m not too, too worried about it.

While this is running: there are a few things in here that I like to change, for the group paths and the fold percentage. For the group paths, I usually just delete that and hit enter. And then for the fold percentage, I put in zero here and hit enter.

And this will sort of expand everything to what we care about. Now, looking at the stuff in here, I’m just going to slide this over so it’s a little bit more clear.

Oh, that didn’t help much at all, did it? If we look at what’s going on in here, we have this locate column internal thing, but there’s not really a lot of time spent in this one where we care about stuff. Where this sort of becomes more useful is when you start comparing one set of call stacks to another, but we need to open up the other one and do stuff there first.

So the CPU stacks here just show some stuff like locate column internal. I think I actually have to squish that a little bit so that I can use ZoomIt to zoom in on things. Otherwise it doesn’t really cooperate with me, but for some reason, ZoomIt doesn’t want to cooperate when this thing is up anyway.

So we’re just going to skip that part, but we don’t really have anything too, too interesting going on in here. All right. This is the fast plan.

So let’s go look at the slow one and crack open this bad boy. We’re going to go back into CPU stacks, back into SQL Server, and make this as big as we can. And what we have in here is some interesting stuff.

Once we get rid of the group paths, and once we do the fold percentage of zero, what we’re going to see is that we have some new things in here. The couple of new things that we have in here are read ahead helper and read ahead.

And, sorry, I think this one, too, was interesting. If we were to compare one set of call stacks to another, this is really where we had way more stuff going on.

Right. This is where we spent way more CPU time. And this was the only way that I could figure out what happened.

Because again, just looking at the wait stats and other stuff for the query, there was no sign at all that this stuff was coming up to the forefront. But then if we bring these percents over, we’ll see that these things were present in this call stack trace where they weren’t really in the other one. And they have fairly high percentages as far as stuff that executed.

There’s all sorts of other stuff that you could look at in here, like flame graphs of things, but I’m not enough of a developer to understand those fully. So this is how I was like, huh. And then, as I was sharing this stuff with people, I was just like, what? This is a weird thing.

Again, my friend Forrest McDaniel was like, hey, there’s a trace flag to disable read-ahead reads: 652. I think Paul White’s blogged about it, but Forrest remembered it, so Forrest gets full credit on that one.

But if we run this query, we turn on trace flag 652 along with 7418. Trace flag 7418 just changes how operator times show their time in the query plan. In a row mode plan, the operator times are cumulative going from right to left.

But in batch mode plans, they are per operator. So 7418 makes row mode plans act like batch mode plans and just has the operator times show per operator. So this one isn’t part of what solves the problem.

652, which disables read-ahead reads for the query, is what solves the problem. So we rerun this query with those trace flags enabled, and we look at the execution plan without read-ahead turned on. And what I don’t want you to think is that read-ahead is a bad thing.
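For a troubleshooting session of your own, the trace flags discussed here could be enabled at the session level, something like this sketch:

```sql
/* Troubleshooting-session sketch. 652 disables read-ahead reads;
   7418 changes how operator times are reported, as described above.
   Session-scoped here: don't enable 652 globally or in production. */
DBCC TRACEON (652, 7418);

/* ...run the query under investigation, with its actual plan... */

DBCC TRACEOFF (652, 7418);
```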

Read-ahead is generally a very, very good thing. It turned out to be a negative thing in this plan because of how many of them we had to do. There are two kinds of killer latencies.

Typically, in computer systems, there’s the stuff that of course has a long latency, right? You might see that with lock waits and other things that, when they drag on for too long, cause other things to start having problems. But then there’s the death-by-a-thousand-cuts, swarm-of-mosquitoes type of latency, where you have to do one thing.

And even though that one thing is really fast, if you have to do a lot of it, well, you can end up in a pretty tough spot. So if we look at the execution plan for this with read-ahead disabled, this actually ends up running pretty quickly even compared to the fast plan before. It’s about 500 milliseconds faster.

Not that that’s the meaningful difference; really, it was the difference between this one and the row-goal-disabled query where it got interesting. So, when you’re troubleshooting query performance, you might find all sorts of good things to look at contextually for the query.

The query execution plan is always going to be your first source of truth for where time was spent in the query plan, so get the actual execution plan for that. Then you might look at wait stats for the query plan and see interesting things in there. Sometimes you might even look at sys.dm_exec_session_wait_stats for your session and see if there’s anything in there.

Because, as I’ve talked about a few times before, Microsoft screens certain waits out of actual execution plans; it doesn’t report on them. So you might look at the sys.dm_exec_session_wait_stats view to see if there’s anything additional that would be useful for troubleshooting.
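A minimal check of that view for your own session might look like this:

```sql
/* Wait stats SQL Server has recorded for the current session,
   including waits that actual execution plans screen out.
   Available from SQL Server 2016 onward. */
SELECT
    dsws.wait_type,
    dsws.waiting_tasks_count,
    dsws.wait_time_ms,
    dsws.signal_wait_time_ms
FROM sys.dm_exec_session_wait_stats AS dsws
WHERE dsws.session_id = @@SPID
ORDER BY dsws.wait_time_ms DESC;
```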

Other times you have to go deeper and figure out stuff that a query plan would never report on. Call stacks are one of them. And Windows Performance Recorder is a much easier way to just record general call stacks while a query is executing.

Easier than, say, using the debugger and having to attach to a thread, click through, and wait until you set breakpoints and stuff. That’s where it gets really difficult. And of course, the nice thing about Windows Performance Recorder is that if you find interesting call stacks in here, you can actually use something like WinDbg to set breakpoints on the interesting things you find.

So you’re not just hunting through a haystack of things, hitting enter, and letting the debugger make progress after each new step. So this was an interesting one. Again, read-ahead is generally not bad.

Don’t go disabling read-ahead. I don’t want you to take that away from this video, or to think that read-ahead reads are awful. Generally, they’re very good things, and they help performance quite a bit. But in this case, they ended up being the problem for this one particular query, and tracking down what the problem was, was, to me, very interesting and very fun.

And of course, I do always have to thank my dear friends for helping me remember the things that I cannot possibly keep all contained in my head at once. So anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you over in the next video, which should be an office hours episode, because the next video will be on a Monday. Today being Friday. All right.

Cool. Goodbye.
