SQL Server Performance Office Hours Episode 63

SQL Server Performance Office Hours Episode 63



To ask your questions, head over here.

Chapters

  • 00:00:00 – Introduction
  • 00:02:34 – What Metrics Do Monitoring Tools Tend to Mislead People the Most With?
  • 00:07:46 – Why Increasing MAXDOP Can Help with Spills
  • 00:11:58 – Conclusion and Next Video Topic

Full Transcript

Erik Darling here, your old friend, your old monitoring tool mogul, ErikDarling here with Darling Data, and it is officially a Tuesday now, we are on the Tuesday office hour schedule, and so this is where I go ahead and, well, I do my best to answer five, not one, not two, not any number less than five community submitted questions. For free, because I still have better answers than AI, I think, most of the time. Down in the video description, if you would like some answers that are better than AI about your specific, I don’t know, database, query, whatever, you can hire me to do that sort of consulting.

It’s a wonderful process. I highly recommend it. And you can also buy my training, which is, again, well, not to keep beating that horse, thank that one. No way that one’s winning the Kentucky Derby.

You can also become a supporting member of the channel, and you can find a link to ask me office hours questions, just like the nice folks who have asked me five questions this week have. And of course, if you don’t have a question, or you don’t have any money, you can do something else that’s nice.

You can like, subscribe, and even if you’ve got one or two, you can tell a friend about the channel. And maybe they will also like and subscribe and tell a friend. And you see how that cascades into something just absolutely wonderful for everyone.

If you need free SQL Server performance monitoring, I’ve got it. Believe it or not, that link is also baked into the video description, so crazy times, right? Totally free, totally open.

Open source. Just runs a bunch of good T-SQL to capture a bunch of good data. Puts it into pretty charts and graphs for you. It’s got a whole dashboard and cool stuff.

And if you’re, you know, really into the robots, and no one can talk you out of talking to robots, there are built-in MCP tools in the server so that you can begin to have the robot friends look at your performance data and maybe tell you what to do.

What’s going on. I can’t vouch for whether they will be right or wrong. All I can tell you is that they will have been pointed at the data that I have collected. And, you know, whatever they surmise from that is on them.

I have no control over that part. So, anyway. Stuff coming up. I actually get to take a couple of these off the chart soon.

Pass on Tour Chicago. That is soon. It’s like the end of this week. Holy crap. I have to go do something.

And then I will be in Poland May 11th through 13th. Flying directly there from Chicago from little Poland to actual Poland. Then I will be home for a little bit prior to Data Saturday Croatia.

And then home for a bit longer until past Data Community Summit in Seattle, Washington November 9th through 11th. At least, you know, as far as what I can tell you. I have advanced T-SQL pre-cons at Pass on Tour Chicago, SQL Day Poland, and Data Saturday Croatia.

We’ll just have to see what awaits us for Past Summit in Seattle, Washington. And, yeah. So this is still ChatGPT’s take on May in New York.

I don’t quite get it. But, you know, the robots, huh? Anyway.

Let’s answer you some questions here. Let’s see. Is SOS scheduler yield always CPU pressure or can it indicate other problems? So I don’t want to say that SOS scheduler yield always means CPU pressure.

But it does indicate CPU stuff. Because it is queries voluntarily yielding time on a scheduler. So other queries can get some time on there.

Be fair to everyone. Feel the warm embrace of the CPU collectivism, I suppose. And then, so, like, if this gets up, you know, real high compared to server uptime or, you know, you know, it often, like, when you say that, it quite often does accompany incorrect or at least ineffective parallelism settings.

So high CX weights and that stuff. Because, of course, parallel queries use multiple threads. CPUs harder than single threaded queries.

You know, you could also have a vaguely or, I don’t know, perhaps monstrously untuned workload that puts a lot of CPU pressure on the system. But, yes, if you’re concerned, if you’re seeing SOS scheduler yield as a very prominent weight on your server, then certainly I would investigate CPU intensive queries.

You know, prominent being, you know, either for sampled periods of time. It is quite high. Or perhaps, you know, if you’re looking at the server since startup.

And you’re seeing, like, you know, SOS scheduler yield is, like, near or equal to or higher than, like, server uptime. Then that’s probably where I would start. Let’s see here.

How do you tell? When blocking is actually the root problem versus just slow queries piling up. Well, suppose we could come back to weight stats on this one. If you’re experiencing high LCK weights.

Either from the perspective of total amount of time that queries are spending on those weights. Or if the average milliseconds per weight on those LCK weights are on the high end. Then that would be a good indicator.

If I were you, I would most likely be looking at it. To start, you know, assuming that you don’t want to, like, get emotionally invested in anything. You could start by turning on and analyzing the block process report.

I’ve got a free store procedure that does that. It’s called SP human events block viewer. You could also look at the system health deadlock extended event. Or you could, or sorry, the system health extended event deadlock stuff.

SP blitz lock will do that. And you could also spin up a dedicated deadlock extended event. Which I do prefer because at least when you are able to set one up in a way that creates and stores data in a file target.

Then you have a bit more fidelity in the deadlock data you are collecting. So first stop, look for high LCK weights. Either total hours of uptime or high average milliseconds per weight.

First stop, you know, turning on the block process and XML deadlock stuff. Looking through those. But of course, you know, you could also just maybe, I don’t know, do something crazy.

Like point a completely free open source SQL Server performance monitoring tool at your server. And have all of that stuff get collected for you. Ta-da! Look at that.

Wouldn’t that be nice? All right. Is there ever a case where having fewer indexes actually helps performance overall? Yes, I would say that there are many cases.

As long as you still have good indexes that fit your queries. I’ve got a store procedure called SP index cleanup that will, again, totally free, totally open source. That will help you identify unused and duplicative indexes.

And get those all merged in together so you can clean those up. There are three primary things that, let’s just give it the blanket term over indexing. I’m not saying this, like, because I think there’s a certain number of indexes that will, like, cause a problem.

I don’t want to put any magic numbers in your head. But what I would say, like, under the blanket term over indexing, the primary things you’re going to see are, you know, buffer pool competition. Because you have all these different objects competing for space in the buffer pool.

Most SQL Servers that I see. Do not have an adequate amount of memory to deal with the amount of data that they are lugging around. You would also see an effect on the transaction log.

Every index that gets modified as data in the base table gets modified has to write their changes to the transaction log. So the more of that you have to do, the busier your transaction logs become. And, of course, from the point of view of lock escalation.

Because the more indexes you have, the more locks you’ll have to take. And the more, well, I mean, only attempts at lock escalation. But, certainly, you know, having to lock all those different objects, even if it doesn’t result in a successful lock escalation attempt, can be unpleasant.

So, sure. So there’s all those things. But also, like, you know, sort of consider, like, the workload patterns.

You know, like, anything that’s remotely data warehousey or staging tabley is almost always going to be slowed down to some manifestly, you know. And that’s a thing. dismal degree by having more indexes to insert data into on let’s just call it a bulk load that’s always that you know that’s one thing to consider if your workload is not just standard stock and standard OLTP you know I guess there’s there’s maybe some other like let’s just call it optimizer stuff you know the how it might take its time or perhaps you know run out of its budget for steps to optimize a query maybe not pick the best like a better available index you know the optimizer just like you and me the more choices more chances you give it to screw up the more choices you give it the more likely it is to have an issue there all right what metrics do monitoring tools tend to mislead people the most with that’s an interesting one I don’t I don’t have a a firm answer on this I think my main thing with monitoring tools is that they they tend to sort of like just sort of saturate you with information and expect you to be able to parse it apart I wouldn’t call it necessarily misleading I would I would maybe call it more like misdirecting so like if you’re if you’re looking at a performance issue and your monitoring tool is just like well 1300 people logged in and 1500 people logged out watch out for that you know it’s just like it’s not really getting you to where you’re going you know like you know monitoring is monitoring though right monitoring is just collecting a bunch of metrics storing them and letting you decide monitoring is not analysis monitoring is not you know getting you helping you get to a root cause necessarily it’s just sort of collecting data for you to do something with that’s why when I was building my performance monitoring tool there were some things that I wanted to do with it that would make your life easier you know like putting the MCP tool server stuff in there was one of them because it lets you have you know one of your robot friends look at the performance data and you know help you find stuff in there you can ask whatever questions you want of the data that gets collected so like you know like maybe it’ll look at it and maybe it won’t find the particular thing that you’re interested in but you can always point it to that right prompt engineering I believe they call it right something like that uh and you know that’s also why you know like all the you know over time I built the tool up to really help you like when you see a spike on a chart or a graph be able to get to the queries that were most likely involved with that spike in the chart or the graph so that you have an easier time of just you know like figuring out what was going wrong in the server at times where you might have been concerned about it so again I don’t think it’s necessarily that they mislead you I think that they they tend to misdirect you with over collecting and um stuff like that all right oh let’s see uh finally something this is an interesting one we increased Mac dot max stop not Mac dot Mac duple uh and spills went away this seems backwards why would that help well there’s actually uh I think a fairly simple explanation for this let’s say you have a query that runs at max.4 and I don’t know let’s say one gig of memory right because all plans in SQL Server start as serial plans uh that’s when a memory Grant gets assigned to them and if SQL Server later chooses a parallel plan that memory Grant is divided amongst your dop threads that are executing in the plan um it it doesn’t even necessarily have to like it could be one of two things right it could be like that four gig memory Grant uh was not enough for uh split into the one gig grants across your four threads that that could have been too small uh to begin with right for any one of them but then also you know not all parallelism is perfect not all parallelism is created equal so if one of your threads gets a lot more rows it still gets that same amount of memory SQL Server doesn’t say oh well uh you know you got 10 million rows and everyone else got one million rows we better give you more of that memory Grant you know let’s let’s be fair here uh so uh you know you could either have like all your threads spilling or just one of your threads sort of getting an unfair number of rows and spilling and if you increase max stop you spread those rows out further so that perhaps the same memory Grant um might uh might alleviate some of those spills uh that would be that would be like the first place I go with it um sometimes increasing dot does sort of get you out of parallel skew problems so that’s where I was going with that but anyway uh that would be my guess um you know all of those threads uh having fewer rows on them so the memory Grant goes a little bit further that would be that would be my sort of thing there anyway that is five questions right two three four five six yeah there we go uh thank you for watching I hope you enjoyed yourselves I hope you learned something and I will see you in tomorrow’s video where we’re going to talk about a little kind of a fun little thing that I did with dynamic SQL recently um and so there’s that anyway thank you for watching

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Moving Indexes To A New Filegroup: Microsoft Still Hates You

Moving Indexes To A New Filegroup: Microsoft Still Hates You


At some point you’re going to want to move some indexes to a new filegroup. Maybe you’re separating data across storage, maybe you’re cleaning up after someone who put everything on PRIMARY and walked away, maybe you’ve got your reasons and they’re none of my business.

Whatever the cause, you’d think this would be a solved problem in a database that’s been around since the Clinton administration.

It is not.

How bad it gets depends on what you’re moving. Let’s go from least painful to most painful, because the pain here is instructive.

Normal Indexes


We’ll define normal as an index that isn’t carrying gobs of LOB data around with it. If that’s what you’ve got, life is easier. Not simple, but easier.

The part that surprises people: you cannot just rebuild the index onto the new filegroup. There is no ALTER INDEX REBUILD WITH (MOVE_THIS_SOMEWHERE_USEFUL = ON). That would be too goddamned easy, and we don’t do easy here (unless it’s a RECOMPILE hint).

What you have to do is fully script the index out. Keys, includes, uniqueness, filters, and any particular settings it was created with.

Then you recreate it on the new filegroup with DROP_EXISTING turned on.

CREATE UNIQUE NONCLUSTERED INDEX 
    whatever
ON dbo.SomeTable
(
    column_one,
    column_two
)
INCLUDE
(
    column_three
)
WHERE column_one > 0
WITH
(
    DROP_EXISTING = ON,
    ONLINE = ON, 
    (and all the other stuff you can or might want do, like PAGE compression)
)
ON [NewFileGroup];

Yes, you can make the new FG the default so that you don’t have to worry much about including it on every script, but who knows?

Maybe you created more than one new FG. You’re weird out there. I know you.

Miss a column in the include list, fumble the filter predicate, forget it was unique, and you’ve now changed the index instead of just moving it.

The work isn’t hard, exactly. Plenty of stored procedures and code examples exist to script out all your indexes.

It’s just tedious and unforgiving, which is its own kind of hard. Like a Cormac McCarthy book.

Heaps


If you’ve got heaps, your life is about to get worse.

What sucks is that I typed that and then realized it sounds like something an LLM would say.

Ah, screw it.

You can’t rebuild a heap onto a new filegroup, because there’s no index to rebuild. The data is just sitting there in a pile.

To move it, you have to put a clustered index on the table, which physically relocates the rows to wherever that clustered index lives.

If you’ve been meaning to fix those heaps anyway, congratulations, you get a small hit of satisfaction here. Build the clustered index, leave it, move on with a slightly better schema than you started with.

But if the table is supposed to be a heap, you’ve now got to script out dropping the clustered index you just created. Which turns it back into a heap on the new filegroup. So the move costs you a create and a drop for something that was never supposed to have an index in the first place.

LOB Data


Now we get to the part where I want someone at Microsoft to do this process.

Once. Just once.

There are products that should be experienced by the people who make them, and this is one of them. I get the sense that it often isn’t.

This applies to clustered tables with LOB columns, and it applies to your heaps with LOB data too, because LOB makes everything worse uniformly. Oh, and if you’ve got nonclustered indexes with LOB data in them, well… you, too.

When you do the create-with-DROP_EXISTING dance to move a table, the in-row data moves. The LOB data does not. It just stays where it was, staring at you, refusing to relocate. You can verify this yourself by checking allocation units before and after and watching the LOB_DATA unit sit exactly where it started.

The fix comes from a Kimberly Tripp post that has saved a lot of people a lot of grief over the years (Understanding LOB data). The trick relies on a quirk of how SQL Server handles partitioning: LOB data physically moves when the object transitions from non-partitioned to partitioned, or from one partition scheme to another. So you make the table partitioned, which forces the LOB data to move, even if you have no actual interest in partitioning anything.

The sequence goes like this:

  1. Create a partition function and a partition scheme.
  2. Apply the scheme to the table by creating the index on it with DROP_EXISTING. That moves the data onto the scheme.
  3. Then create the index AGAIN, this time onto the plain filegroup, with DROP_EXISTING once more, which makes the table non-partitioned again and moves everything, LOB included, on your target filegroup.

You read that correctly. It takes two index creates with DROP_EXISTING to move LOB data. The table briefly becomes partitioned for no reason other than to trick the engine into picking up the LOB allocation unit and carrying it along.

CREATE PARTITION FUNCTION pf_temp_move (bigint)
    AS RANGE RIGHT
    FOR VALUES (9223372036854775807);

CREATE PARTITION SCHEME ps_temp_move
    AS PARTITION pf_temp_move
    ALL TO ([NewFileGroup]);

/* Move onto the scheme. LOB comes with it. */
CREATE UNIQUE CLUSTERED INDEX 
    whatever
ON dbo.SomeTable 
    (some_bigint_column)
WITH
(
    DROP_EXISTING = ON
)
ON ps_temp_move (some_bigint_column);

/* Move back onto a plain filegroup. Table is no longer partitioned. */
CREATE UNIQUE CLUSTERED INDEX 
    whatever
ON dbo.SomeTable 
    (some_bigint_column)
WITH
(
    DROP_EXISTING = ON
)
ON [NewFileGroup];

And here’s the kicker: If the table has nonclustered indexes on it, both of those moves rebuild every one of them. Onto the scheme, then off the scheme.

You are reading that correctly too. Every nonclustered index gets rebuilt twice.

Say, have you been meaning to clean up some indexes for a while?

Picking A Boundary Value


The partition function needs a boundary point. You want a single boundary that sits higher than any value that exists or will plausibly ever exist in your clustering key, so that everything lands in one partition and nothing actually gets split up. You’re not partitioning for real. You just need the engine to think you are.

If you’re clustered on something with a sane data type, this is easy. Use the maximum value for the type:

  • int: use the int max, 2147483647
  • bigint: use the bigint max, 9223372036854775807
  • uniqueidentifier: use FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF, but read the note below before you trust that
  • date, datetime, datetime2: 99991231 is the standard pick

On the GUID one, be careful. SQL Server does not sort uniqueidentifier values by reading the bytes left to right the way you read them on screen. It sorts on the last group of six bytes first, then works backward through the groups. It’s a genuinely strange ordering and it trips people up constantly. The good news is that the all-F’s GUID still sorts highest no matter how you slice it, because every byte is already maxed out, so FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF is a safe upper boundary. Just don’t go assuming the rest of GUID ordering matches what your eyes tell you.

If you came in from a heap, you’re free to pick whatever column has the most reasonable data type to build your temporary clustered index on. You’re dropping it afterward anyway (maybe), so choose the one that makes the boundary value easy.

And Then There Are The String Clusterers


I see you out there. Clustered on a string because at the time it seemed like a simple thing to do and nobody was around to stop you.

Now you’ve got extra thinking ahead of you. You need a boundary value that sorts higher than every string already in the column, which means working out how many bytes the column holds and then building a value out of enough z’s, or whatever sorts highest under your collation, to clear the top of your data. REPLICATE is your friend here, padding a character out to the column’s length so your boundary outranks everything.

/* A boundary higher than any value in a varchar(50) clustering key */
DECLARE @boundary varchar(50) = REPLICATE('z', 50);

And even that depends on your collation deciding that ‘z’ sorts above whatever garbage is actually in there. Mixed case, accented characters, and case sensitivity all get a vote. So you don’t just get to pick a max value off a chart like the rest of us. You get to go think about collation sort order.

Go to hell, all of you.

Your Online Operation Isn’t As Online As You Think


You set ONLINE = ON, you tell yourself you’re being a responsible adult, and you expect the move to glide along without anybody noticing. Then you watch shoving all that LOB data around generate a shitload of tempdb contention, and suddenly your nice online operation is causing blocking anyway, just through a side door.

So your “online” rebuild is online in the narrow sense that it isn’t holding a long schema lock on the table itself, but it’s lighting up tempdb badly enough that everything else fighting for tempdb pages gets to wait in line behind you. The blocking didn’t go away. It just moved somewhere you weren’t looking. Watch your tempdb allocation page contention while this runs, because that’s where the pain shows up, not on the table you’re moving.

ONLINE Is A Suggestion, Not A Promise


It gets better, by which I mean worse. ONLINE = ON only loosely guarantees that your operation won’t block anything. It is not the iron contract people treat it as.

Kendra Little wrote up a great example of an online rebuild that ran offline and took exclusive locks the whole way through, with no warning and no error (Ugly Bug: SQL Server Online Index Rebuild Sometimes Happens Offline Without Warning).

Her repro used ALTER INDEX REBUILD WITH (ONLINE = ON) on a table that had previously had a LOB column dropped, which leaves the table in a state where the engine falls back to an offline operation and holds X locks the whole way through.

Joe Obbish also wrote a post about something similar on my site, here. Joe’s post is about column store indexes, though. That may not apply to you.

WAIT_AT_LOW_PRIORITY, the thing that’s supposed to be your lord and savior from the schema lock at the end, offered her no protection against those locks. This is the same LOB ghost haunting you from a different room. Maybe under your bed. Maybe in your closet. Maybe in your fridge.

The broader point is the one that matters even when you’re not stepping on that specific bug. Your online index build still has to take its locks, minimal as they’re supposed to be. And if something is already in the way when it goes to take them, your online operation gets blocked. Now it’s sitting there waiting, and everything that shows up behind it gets blocked too, because it’s holding its place in the lock queue while it waits for the lock it needs. One stuck online rebuild turns into a blocking chain, and that chain can sit there for a long, long time while you wonder why a “no downtime” operation took your application down. This also makes you look like an asshole for saying that “I can do this fully online and not cause any blocking, boss”.

There’s also some version voodoo waiting for you here. Because moving an index to a new filegroup forces you into CREATE INDEX with DROP_EXISTING rather than ALTER INDEX REBUILD, your access to WAIT_AT_LOW_PRIORITY depends on your version.

For ALTER INDEX, that option has been around since SQL Server 2014. For CREATE INDEX, the WAIT_AT_LOW_PRIORITY syntax only showed up in SQL Server 2022, along with Azure SQL Database and Managed Instance.

If you’re on 2019 or earlier and doing a filegroup move, the one saving grace you’d reach for to manage the Sch-M lock at the switch-in isn’t available to you, even though the people doing plain in-place rebuilds have had it for years.

This is exactly the situation I wrote ProtectSession for. If you’re going to kick off one of these moves on a busy server, set yourself up to watch for and deal with the blocking it can cause, rather than finding out from the people whose queries are piling up behind it.

The Short Version


Normal indexes cost you a careful script and one create. Heaps cost you a create and a drop (maybe). LOB data costs you a partition function, a partition scheme, two creates, and a rebuild of every nonclustered index twice. And if you clustered on a string, you also get to do collation homework before you can even write the boundary value down.

None of this needed to be this way. But here we are, and the data isn’t going to move itself.

An additional thing to consider: SQL Server has many different build strategies for indexes. It may choose to build indexes on giant tables single-threaded. It may even choose to build all 23 nonclustered indexes on a huge table single threaded while you’re partitioning on/partitioning off to move LOB data.

The story gets even more tawdry and sordid if you’re using an Availability Group in Synchronous mode. You might see a lot of really nasty pile ups on HADR_SYNC_COMMIT. You do have the option of switching to manual failover and asynchronous commit for a bit, but that’s between you and your RPO goals. If you’re moving a significant amount of data, it may be a long wait.

Get into sports, dummy, as a wise man once wrote on a bathroom wall.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Indexed View Maintenance

Learn T-SQL With Erik: Indexed View Maintenance


Chapters

  • 00:00:00 – Introduction
  • 00:02:16 – Index Views Considered Harmful
  • 00:04:03 – Common Misconceptions About Index Views
  • 00:05:33 – Disallowed Features in Index Views
  • 00:08:44 – Performance Comparison Before and After

Full Transcript

Erik Darling here with Darling Data, and today’s video should be a fun one. We’re going to talk about making sure that indexed views don’t ruin your modifications, because they sure can under certain circumstances.

Down in the video description, you will see all sorts of helpful links, including a link to purchase the full course material. Remember, these are just tiny little crumbs of the full course material, which you can go get down in the video description below.

Or if you attend one of my upcoming advanced T-SQL pre-cons, you will also get the full course material for free 99. It’s amazing how that works.

You can also hire me for consulting. You can become a supporting member of the channel. You can buy me half of a New York City cappuccino every month if you’d like. Is it $4?

It’s not an incredible amount of money. And if you would like to ask me office hours questions, you can do that. And of course, if you are not feeling monetarily obligated towards me, which I understand many people just are very happy to just take things for free, you can like, subscribe, and tell a friend.

Fill a hole, fill a void in someone’s life that this channel would obviously, it would just complete them in ways that you can’t imagine. Or maybe you can imagine.

Maybe it just completed you. Maybe it just completed you in those same ways. Over on my GitHub repo, I have a free open source SQL Server performance monitoring tool. And it doesn’t suck.

It is all the stuff that I care about monitoring performance-wise on the SQL Server packaged up, given to you. You can point it at your SQL Servers. You can start getting great information about what’s going wrong with them performance-wise. Excuse me.

Spring is springing here, and now I’m starting to get a little allergic to things, so you’ll have to forgive my throat clearing on that one. There’s also a built-in MCP server.

So if you like the robots and you want to have the robots talk to your performance data and give you summaries or some analysis on them, maybe even give you some feedback on what you should do to fix it, you can do that. And they’ll do it, I mean, depending on how you call the robots, it might be free.

It might not be. I don’t know. All right. Tokens ain’t free, I guess. But anyway, this will be the last video that I record before going to… Jacksonville, Florida, May 1st for an advanced T-SQL pre-con.

You can still buy tickets. At least you better be able to still buy tickets, because it’s not for a couple days after this, so…

Or maybe… Actually, this one’s Thursday. The next day, Friday. Hey, you better hurry up before Jacksonville is gone. After I get back from Jacksonville, I will be on my way to Chicago, Illinois, May 7th and 8th for a pass-on tour east.

I will also be doing an advanced T-SQL pre-con. I will be doing an advanced T-SQL pre-con. I will be doing an advanced T-SQL pre-con there. Mmm. Many chances. After that, I will be at the lovely SQL Day in Poland, May 11th through 13th. And I’m flying right from Chicago to Poland.

It’s gonna be crazy times. Boy, I hope the weather’s nice. And in Poland, I know it might be hard for you to believe, but I will also have an advanced T-SQL pre-con there.

After that, I will be at Data Saturday Croatia with, believe it or not, an advanced T-SQL pre-con. And after that, I will be at PassData Community Society. I will be there. PassData Community Summit, the westest of all the summits in Seattle, Washington.

And there, well, I’m just gonna have to surprise you with what I’ll be doing there. It’s gonna be out of this world. Anyway, it is still April-ing outside.

Next week’s video will… Next week I will be debuting the May graphics. It’s gonna be wonderful and fantastic. You’re gonna be just as terrified of it as I am, I think.

Anyway. T-SQL Server Management Studio. When most people think about index views, they rightfully think about all the stuff they can’t do with them.

And I sympathize with that because, man, so many times they’ve been like, oh, if only you could do this, if only you could do that, it sure would be nice. And I realize that all the air has gone out of the room as far as making index views more powerful because everyone’s like, well, you could just use batch mode.

And that’s true in a lot of cases. But it’s also not true in a lot of cases. um batch mode is not always better than a completely pre-aggregated set of data so getting things like a min and a max in an index view you can’t do it and that sucks right this is lazy um but you know like really having those aggregations maintained somewhere uh can make read queries a lot faster especially when you consider that on standard edition um batch mode is still terribly hobbled everything maxes out at a dopp of two so if you still need like a dop above two like say four or eight or even six all right uh you don’t get it right it’s it just sucks and it’s annoying because you didn’t pay the friendship tax to microsoft but uh index use uh you know they have they have many challenges of their own uh we’re going to talk about those a little bit more i have another file queued up for next week’s stuff that we’ll talk a little bit index use in a slightly different way but um you know you know beside from the things that are disallowed in them uh you may you may still want and need no expand hints uh when you query them to keep uh sql server from expanding them into the underlying queries and ruining all the hard work you did to index that view just like filtered indexes and computed columns you need to have some anti-set options lined up correctly so that you do not experience terrible errors or queries not matching to your to your uh your your rocket science query tuning efforts um if you so you got to get this stuff lined up if you want them to work correctly but um we’re just to make things just like nice and compact here we’ve got this view which is not indexed yet but it is set up to be indexed by having a schema binding thingy here and a count big thingy here and of course we have the correct grouping that we need to do here this view already exists the problem is this view still takes 15 entire seconds to run right we are not having a good time with this view look at that well 14.2 seconds close enough uh you know sql server is like uh merge join that’s a good idea when is a merge joint ever a good idea right make that a hash join make that a parallel hash join and this thing would probably be about five seconds all right let’s say let’s try that let’s see what happens let’s do this and let’s let’s come on i’m clicking on you listen to me why don’t you ever listen to me let’s do a option i don’t know if you can hear the sirens outside but that’s another lovely side effect of spring the weather gets nice and i open my windows and new york’s like screw you here’s some sirens in your youtube video all right so let’s see let’s uh estimated plan what do we get look at that parallel hash join isn’t that a thing of beauty oh my word oh it’s gorgeous it’s wonderful let’s see what happens remember the last one was 15 well 14.2 seconds man wow parallel hash join 2.2 seconds actually let’s let’s go let’s go the tape i’ve been lied to by uh ssms before but yeah okay fine 2.2 seconds elapsed that is beautiful why would sql server pick a serial merge join plan when it could have had a beautiful parallel hash join plan i don’t know sql server sometimes i i i want to migrate to postgres when i see what you do but um when when we’re talking about you know uh trade-offs and query tuning and should i do this or should i do that i mean it’s our job to test these things right it’s our job to make sure that the the changes that we’re affecting have positive effects on the workload as a whole not we if we make one query a little bit faster but we completely ruin a whole bunch of other queries we didn’t do a good job right so we can’t have that but let’s let’s take a look right now uh at what an update to the post table currently looks like all right so i’m gonna do a begin transaction and a rollback and in the middle we’re gonna hit this little helper uh in a table valued function here called what’s up locks if you’re not familiar with this it’s available on my github repo with all my other grand stuff um so you can go get it there if you really want it but if we run this and we look at what happens when we update 100 rows uh we get you know we get a few x locks it’s not that big a deal uh the execution plan uh is pretty simple well for the for the update it’s pretty simple for for what’s up locks it’s clearly a complete disaster it’s well not a disaster but it is kind of a nightmare but uh here we have this thing and this you know we seek into an index and we do our update everything is just fine and dandy and even if we update 28 000 rows right if we hit john skeet and we say well it’s like 27 900 and something like you know we this thing does you know uh sort of lock the entire table but there’s no other competing locks so the lock escalation there not really uh any like all that unexpected for updating that many rows the execution plan does change a bit right but it’s it’s still a it’s still a pretty you know um you know still a pretty efficient plan for updating 27 900 and something rows but it can’t be too angry at that right that’s not not so not so bad but now let’s come back to our view here right and let’s let’s create a unique clustered index on our index view right and well this this takes a second to create and it’s not it’s like you know creating any other index it takes it takes a moment but but wait but while you’re sitting here and you’re sitting there waiting for that index to create you get to do all sorts of other things like run sp who is active maniacally and more manically rather and and like you know sit there and stare at your availability group and you’re blocking and whatever else and wait for it to finish but it’s finished now so that’s great and now with this done right with with this with this view indexed this all happens relatively quickly now we do one tiny little seek into the clustered index view uh sql server even suggests another index on here which we’re not we’re not going to add because we can just pretend 500 milliseconds is fast enough but now the the query plan for our update is going to change a bit right it doesn’t really get meaningfully slower for a couple reasons um i mean it gets a little bit slower but not like terrible and we have a lot more complexity in here now all right so because now we have to maintain the indexed view and so now we seek into the post table and we do all the updating we need to have the post table and then we have this sequence operator and the sequence operator says after you happen you happen i’m going to sequence you right and now down here we have to maintain the indexed view or rather the clustered index on the view which requires touching both tables in our case though i have added good indexes to support my index view so this is not a complete disaster right sometimes you do need good indexes in place to support the query underneath your index view to make reassembling the indexed view faster right it’s it’s a crazy world like like like when people say it’s like turtles all the way down it’s indexes all the way down it’s just oh i need an index to tune this and now i need an index to tune this and i’m going to make an index view but now i need indexes to make updating maintain maintaining my index view faster it does take some effort right and it does take some testing and stuff and uh updating 28 000 rows you know again this is half a second last time it it this does slow down a bit right it’s two points per second right again this does slow down a bit right it’s 2.6 seconds now so this is not the perfect world but now you can’t have everything all the time but uh you know we and we have a much more still a like this this plan was already a little bit more complex on top all right with all the sorting and splitting and filtering and whatnot but now maintaining the index view is still well it’s still like the bulk of our effort right that take that’s like you know you know there’s like 600 ah screw you there’s like 600 milliseconds here but two two two full seconds down here in the index view maintenance phase um you know most of it is not um assembling the index view right because we’re only at 325 milliseconds here that’s like 1.7 seconds total just updating the the values in the index view perhaps if we put that nonclustered index on the index view it would be it would be faster but you know let’s not let’s not get ahead of ourselves so if you want index views to work well for you uh you need to consider read queries and modification queries in your workload uh like with anything else modification query wise the more rows you get involved the longer something’s going to take a lot of the times um you know when i’m looking at trying to tune modification queries uh there’s like almost nothing from the read portion to tune everything you need to tune is in the right portion which is when things like batching become so much more valuable and useful and interesting to to get in because uh it turns out updating smaller chunks of work is going to take a lot longer than you think it’s going to take a lot longer than you think it’s going to take a lot longer than you think it’s going to take a lot longer than you think it’s to time well you may take the same amount of time to iterate over the table and update all 100 million rows or something uh you’re much kinder to your server in the process and things get a lot less like nuclear meltdowny when you’re doing that anyway thank you for watching uh yes this if you run a creator alter on an index view it drops all the indexes um but thank you for watching i hope you enjoyed yourselves i hope you learned something i hope that you will use indexes and index views responsibly in your SQL Server.

Thank you for watching. I hope you learned something. I think I may have already said that. Anyway, I’m good now. Goodbye.

Thank you. You are very kind people in the world. All right. Adios.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Computed Columns

Learn T-SQL With Erik: Computed Columns


Chapters

Full Transcript

Erik Darling here, Darling Data. Feeling my Darling Data dandiest today. Beautiful day outside, I think. I haven’t been outside yet, but I hear, and I’m lying, I walked back and forth to the gym. It was pretty nice. I can’t complain. In today’s video, I’m going to show you a little snippet, a brief, a little whiff of some of the content that I cover in my course Learn T-SQL with Erik. We’re going to talk about computed columns today, and I’m going to talk about, well, we’re going to do two things.

One, we’re going to see if a bug still exists in SQL Server, and then I’m going to show you some stuff around determinism with computed columns, because that struck me as interesting when I was writing it. I was one of those, like, I like this. You’re cute. Anyway, done with the video.

In the video description, you will find all sorts of helpful links where you can hire me for consulting, as that’s my job. You can buy my training, including the Learn T-SQL with Erik course. There is a video. Would you believe, I’m just so well thought out, I put a link in the video for the course that I’m selling.

It’s crazy how on top of things I am. You can become a supporting member of the channel for as little, few, minimally as $4 a month. You can say, hey, Erik, thanks for creating all this video content.

For free, you idiot. You’re feeding the robots. Aside from that, you can also ask me office hours questions. And, of course, if you’re feeling just really kind and generous, but perhaps not monetarily, perhaps you are pinching pennies.

You’re saving up for that big house in the hills or something. You can do stuff for free, like subscribe and tell a friend about my channel so that people keep becoming subscribers. And the little number gets bigger.

And that’s what I like to see. If you are out there in the world and you’re like, gosh, I sure would like some high-quality SQL Server performance monitoring that’s absolutely free. Boy, howdy, have you come to the right place.

Let me tell you, I’ve got that. It’s on GitHub. It is totally free. It is totally open source. There are absolutely no requirements on your part as far as giving me information goes. It is a bunch of really good T-SQL collectors.

Run against your server. Get important information. And distribute them into appropriate beautiful charts, graphs, and other. There’s a heat map in there.

There’s all sorts of good stuff. And if you are maybe not a fan of feeding the robots, but at least enslaving the robots, making them do work for you, there are built-in optional MCP servers so that you can have the robots do analysis on your robots. You can have them do the performance data for you and write up a bunch of stuff about it.

Again, they’re pretty good at looking at the data and telling you stuff that happened, but I’m still a little wary of the advice on how to fix some of these things. I can’t fix that part, but I can do the other stuff. Anyway, that’s good enough there.

Again, these are the last chance to buy tickets for my pre-con in Jacksonville. Day of data. That’s the one in Florida. I know that there are several Jacksonvilles.

At least I would imagine there are several Jacksonvilles. Jackson is a fairly popular last name here in America. So there’s probably many Jacksonvilles. But this one is in Florida.

And I’ll be there May 1st and 2nd. Or at least May 1st. I don’t know. The 2nd, we’ll see. We’ll see what happens. I get really tired sometimes. But I will be there May 1st with the advanced T-SQL pre-con.

So buy those tickets now. Because that’s… I may never come back to Jacksonville again. I don’t know anything about it.

After that, I will be in Chicago, Illinois. May 7th and 8th for Pass On Tour. After that, I will be at SQL Day Poland. May 11th through 13th.

I will be teaching advanced T-SQL pre-cons at both of those. Shockingly, I will then be at Data Saturday Croatia. June 12th and 13th. Where I will also be presenting an advanced T-SQL pre-con.

You might sense a pattern forming here. I really enjoy teaching about advanced T-SQL. It’s kind of fun.

And then, I don’t know. Barring extreme world events. I will be at Pass Data Community Summit in Seattle, Washington. November 9th through 11th.

So you can come… You can go to there. And we can hug in the rain. And I don’t know. There’s that bridge that everyone puts gum on. It’s gross. Yeah.

I don’t know. It’s… Yeah. Anyway. It is the end of April. We have… But… This today and tomorrow’s video, really. And before I debut my image for May.

And I’m going to tell you. It’s a weird one. All right. It’s going to get weird. But for now, let’s go into SQL Server Management.

See, I got a little lost there. I was like, wait a minute. What am I doing? Oh, yeah. Computed columns. Computed columns.

I enjoy them very much. They have very good… They do very good things for very bad queries, I find. People often overlook them.

And even more often, they misunderstand them. Just like every other feature, there are tradeoffs and a bunch of limitations and, you know, some general difficulties that are rather annoying.

One of the biggest things that I get annoyed with when it comes to computed columns is not being able to team them with filtered indexes in the way that I want, where you can’t have a computed column in the filter part of a filtered index.

You can put it in the key or the includes, but you can’t put it in the filter definition, which, you know, I mean, to an extent, I understand. Because that index would just constantly be morphing in strange ways, depending on when you update things.

But, you know, a fella can dream. Maybe Microsoft can have Copilot fix that, right? Where’s all the code? I don’t know. Where’s all that AI development, right?

I don’t know. Anyway. You can run into some other problems. Those are the problems we’re going to discuss today around determinism, often requiring convert to be used with a style specified or a culture specified, depending on how you like to refer to that optional third argument to the convert function.

And, of course, one of the biggest misunderstandings is people seem to think that if you don’t persist a computed column, you can’t do anything useful with it. And that is just a bunch of hogwash.

Even a non-persisted computed column can generate statistics, can be indexed, and is often a path of least resistance to getting that column added, at least in the near term. There are some upsides to persisting computed columns, but we’re not talking about those today.

I talk about those in the full course content. So, you know, you’ll just have to buy the course and do that. Or you can come to one of my pre-cons and get the course for free. I’ll put the link in the description. It’s crazy how that works out, isn’t it? But much like any other column in a database, computed columns do generally tend to become much more useful when you index them, because that’s when SQL Server can, like, have that data in a useful order for seeking and searching and all that other stuff that it does, right?

Even putting things in order has amazing benefits in a database. Who would have guessed? But probably the most obvious use case for computed columns is generally to make some expression sargable, or provide better cardinality estimates for search predicates that would normally cause problems for us.

You can think about one of the most common ones. It’s like, you know, is null column zero or empty string or something like that. And you’re like, I can’t change the code. Crap.

But I can add a computed column to this table, and I can index it. And then SQL Server can say, oh, look, there’s a computed column on the expression is null some column something else.

And I have an index on it. I can use that index. Isn’t it wonderful? So that’s neat and life-granting sometimes. Anyway, just like with filtered indexes, there are some ANSI set options.

There are some rules you must follow in order to not hit errors when things are when queries attempt to modify the table that has computed column or a filtered index applied to it. If you don’t do this, you’ll be unhappy.

Things will start going bad for you. But non-persistent computed columns, of course, the value is calculated on the fly when the column is accessed.

There is almost no locking or overhead when you add them. The values are not stored on disk unless or until you index them. The expression may be well, you know, the expression not maybe the expression does get expanded into the query during compilation and expansion will happen even if the column is indexed.

So if you want to find out what all that means, again, that I cover that in the full course material. Persistent computed columns are a bit different. The value does get stored on disk just like regular columns or irregular column or irregular columns.

The space usage is equal to the data type size. It is updated when the base columns that make up the expression change, assuming that you have assuming that the computed column takes other columns into consideration in its expression because I guess I suppose you could you could not do that at all.

And the expression is still there. It is still expanded unless you use trace flag 176. So if you want to see that in action, again, you can buy the full course material. It’s not not rocket science, not like trace flag 176.

So the first thing we’re going to do is we’re going to see if this bug still exists in SQL Server because I do enjoy showing people this bug. So we have a table here called Ono and Ono has a persisted computed column in it that is just ID times two.

And if we try to create a filtered index on that table, we will get an error and that error will say filtered index C cannot be created on table dbo.ono because the column CC and the filter expression is a computed column.

Rewrite the filter expression so that it does not include this column. That is a very straightforward, very good error message. But this is the part.

This is the fun testing part. Did Microsoft fix this bug yet? If we create a table like this with that computed column and with a filtered index in line in the table definition, can we still do this or will this throw an error?

It still works. Microsoft has still not fixed this bug. I’ve been telling about this for years and they just refuse to do anything about it. And what’s funny is now if you try to query that table, you just get a bunch of indexes, a bunch of errors, not indexes.

You have an index. You have an index that causes the error. And look at this one. Cannot. Oh, there’s more. Sorry about that. Look at that.

It keeps going. It just keeps red texting me. Cannot retrieve table data for the query operation because the table ono schema is being altered too frequently because the table dbo.ono contains a filtered index or filtered statistics. Changes to the table schema require a refresh of all table data.

Retry the query operation. And keep in mind, there are no rows in this table. I didn’t put any rows in here. All I did was try to query.

Even if we just try to get a count from this table, SQL Server will say, invalid column name, cc. You’re like, what? What are you talking about? There’s a column? Cc?

What? You’re crazy, SQL Server. What’s going on with you? But anyway, back to computed columns. Again, probably the most accessible use case for them is to like when you see a query that’s just like function column or column plus column or something like that.

You can read the green text up there. When you see that stuff in a query, you’re like, man, that sucks. And especially like the column plus column stuff or the column plus value or something like that.

You’re like, it’s like substring, left, right, replace, upper, lower, things like that. You’re like, man, that’s annoying. How am I going to index for that?

You can’t. Unless you compute that expression and then you index it. But when it comes to creating computed columns, especially ones with dates involved. Then you have to be very careful and you have to make sure that you create them in a way that is deterministic.

So I’m going to teach you about that today. So we have this index in place on the post table on creation date and last activity date. And we have query plans turned on because we are smart professional presenters.

We don’t screw these things up. We’re not amateurs anymore. We’ve graduated. We got our presenter six sigma delta phi black belt thing. And the first thing we’re going to do.

Is alter the table posts. And we’re going to add this computed column. The computed column is going to say, hey, if last edit date is null, replace it with 1900 0101. And then we’re going to look and see if this column is indexable.

And we can use this kind of neat. Let me actually do this zoomie thing. And once again, thank you, Aaron Stellato and company for making SSMS zoomie wheel things for the results. So I don’t have to do zoom it on all this stuff.

And we can use this. Column property function. And we can feed it the table name and the column name. And we can ask this question of the function is indexable. Well, are you?

And of course, this comes back with a zero because it is not indexable as written. So this is this is not going to work out for us. We do not. We cannot use this column because it is not indexable. And we want our column to be indexable so that it gets all of the attendant superpowers that an index column gets in SQL Server.

Of course, cast. Will not help us here. Cast is the crappiest function in all of SQL Server.

Convert is way better. And this is still not indexable. So we bid you adieu cast. You stink.

And I hate people who use you. Even past past the past versions of myself that have used that used cast. I hate them too. So now we’re going to try this with convert. But notice we only it’s convert with only two arguments.

I mentioned earlier that there is a third. Convert argument that is off that is required for this that will make things work for us. So even just using convert alone does not solve this problem.

It is still not indexable. And if we try to create an index SQL Server will say no. We cannot do that because it is non deterministic. Damn the gods.

All right. So let’s try this again. But let’s do this the right way. So now we are going to say. We’re going if this is null. If last edit date is null.

Please replace it with this string converted to a date time using this culture one one two. And I think I already did this. So this is going to throw an error probably.

Oh no I didn’t do it. Now it’ll throw an error. There we go. But now if we say are you indexable or is indexable. We come back with the one. So the big the big win here was not only using the convert function.

But also feeding it this. This this this culture or style third argument. So now that it is indexable we can create this index. And and all of all of a sudden things things get better for all our queries ever.

That were you saying where is no last edit date greater than nineteen hundred or one or one or something. All right. So whatever you were doing before that stunk and made life terrible.

Now SQL Server can have an easy time with that. Right. Now again one thing that SQL Server is relatively good at. Is expression matching.

Right. So if we say something like this. Right now we can. Now we’re going to actually have a query plan. I don’t know. Did I turn those off. I don’t know what I did.

I lost track. But now we say hey SQL Server. Can you use that. Can you use that index and SQL Server says well yes I can. I can seek directly into that index. Thank you for asking. Is indexable.

Good stuff. We are indexable. But we do not want to be slack with our data types. Right. We do not want. If we if we write queries like this. We deserve all the hardship trials and tribulations that arise in our lives.

We should do everything we can to make sure that when we are converting strings to dates times date times date time twos or whatever other temporal element we care about in here that not only do we use convert but we use convert with an appropriate style so that in a non ambiguous date format so that we do not hit any weird ambiguity issues. What. What we are doing.

What writing the query in this way does is remove ambiguity. Right. And it makes it indexable. We cannot index and ambiguity can we. And now this query will as well pay dividends and remove as we remove slackness and we will seek into the index that we created on that column.

So with that out of the way. Thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. And I will see you in tomorrow’s video. Bye. Where we will talk about something else. Probably T SQL related cuz that’s that’s what we’re that’s what we’re pushing out there lately isn’t T SQL all right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 62

SQL Server Performance Office Hours Episode 62



To ask your questions, head over here.

Chapters

Full Transcript

Oh, hello, Erik Darling here with Darling Data, your friendly host for this week’s episode of Office Hours, in which I answer five questions that you, the greater SQL Server community, and what community is greater than the SQL Server community, have asked me since the last time, so we’ve got that going for us, don’t we? Down in the video description, you will find all sorts of helpful things, you will find ways to interact with me emotionally, intellectually, but most importantly, financially. You can hire me for consulting, you can buy my training, you can become a supporting member of this very channel, and of course, you can also find links to continue to ask me Office Hours questions, and if you like this content, if you feel like this content would make a meaningful addition to someone’s life, you can like, subscribe, tell a friend, help the channel grow in other profoundly meaningful ways that are not an emotional, intellectual, or financial connection with me.

I mean, you could perhaps imagine your own set of those connections through watching these videos, but one generally will not exist. I will not acknowledge it publicly. If you need SQL Server monitoring that doesn’t suck, for free, open source, no email sign-up, no phoning home, no weird telemetry data, just all the T-SQL collectors that I would care to run on a server to monitor its performance, cool stuff, weight stats, blocking, deadlocks, top queries, and all of this goes into a very pleasantly colored dashboard for your consumption.

And if you are a fan of our new robot friends that have sprung up from vast data centers around the world, there are optional built-in MCP server tools so that you can just ask questions of your performance data and not have to worry about the robots running ramshackle or repshow, whatever that word is. I forget that word. Sometimes they suck.

Sometimes they just jump out of my brain while I’m talking. It happens. The robots trampling all over your SQL Server, you can do that. It’s pretty okay. This will probably be your last chance to buy tickets for my pre-con down in Jacksonville.

That will be on May 1st. When this gets published, it should be April 28th, so you should hurry up and buy your tickets if you care to see me talk about advanced T-SQL live and in person. All attendees of the pre-con will get access to the Learn T-SQL with Eric material.

That’s a whole lot of hours of recorded material, and it’s way more than gets covered just in the one-day class. Other places in the world will be coming up very soon. Wow, it’s just creeping right up on you.

May 7th and 8th, I will be in Chicago for Pass on Tour Summit East in the Midwest. Not the actual East. The actual East would be like New York.

Or Boston or something like that. We’re in Chicago, though, so it’s the East of the Midwest or something like that. I’ll also be at T-SQL Day in Poland, May 11th and 13th. I have advanced T-SQL pre-cons at both of those events, so you should get your butts in seats for those.

Because who knows how much longer we’ll have to talk about advanced T-SQL for. You never can tell. After that, I will be at Data Saturday Croatia, June 12th and 13th.

As well as Pass Data Community Summit taking place in Seattle, Washington, November 9th through 11th. I don’t think that full tickets are on sale for that one yet. But as soon as they are, you should line right up and go to Seattle.

Just go live there. But with that out of the way, it is still April, and we are still baseballing. The Mets and the Red Sox are not off to a very good start.

They are having some trouble. But that’s okay. Because we still get to watch and sort of enjoy baseball. And, you know, win, lose, or draw, you still drink the same amount.

So that’s the kind of wonderful thing about sportsaholism. You can be a sportsaholic and everything is just all the same. Anyway, I believe we need to go over to this Excel file.

And we need to answer some questions here. Hey, Eric. Hey, you.

How are you doing? I send this blurry-eyed after a painful evening in the SQL mines. Well, that sounds terrible. Are you a dwarf? How did you end up in the SQL mines? Without index tuning.

Okay. Do you know any ways to encourage an update to a table that participates in an indexed view to only seek the members of the view? Batching works.

But if I get the batching even slightly wrong. Then SQL will scan 70 million rows rather than seek for like 500. I would probably try doing a little. So I’ve covered this technique in a few different videos.

I can’t precisely recall the titles of them. But it might be like around like query transformations and stuff like that. But what I would probably do is.

You hit the index view with a sub query and a no expand hint. So that when you are updating the table. It is aware of the indexed view.

And it will hit only members of that. That would probably be the way that I would encourage it the most directly. But, you know, this is one of those things where there are many ways that you could have asked this question.

That would have involved sharing a little bit more information. So that I could give you a more intelligent answer. Because perhaps there are many questions that I have as a follow up for this.

But, you know, you could have pasted the plan. You could have gone to plans.erikdarling.com. And you could have shared the plan via that.

There are many ways with which you could have presented me with sufficient information to give you a better answer. But I would probably like combine. So like as far as getting the batching right.

Like I would probably want to do like where the, you know, the key. So like whatever ends up being the unique clustered index in the index view. I would say like, you know, update table select top.

You know, however many batch rows you want to do. From the index view with a no expand hint. So that SQL Server only.

The SQL Server is better able to understand what your goal is. That would be the way that I would first attempt it. But it’s a little hard to give you much more there without seeing some plans or some T-SQL.

Do you and Brent Ozar argue about logical reads being important, not important for query tuning? Nope. I’ve never had an argument with him about that.

I just disagree completely. There’s nothing to argue about. I’m familiar with cross apply and select top one. Good for you.

And I’m familiar with row number filtered with RN equals one. Well, that all depends on what you alias row number is, isn’t it? Alias it is. All sorts of things, right?

In your optimizer rules, I wish SQL Server had video. You show off cross apply into row number that you later filter to row number equals one. How do you decide when to prefer cross apply into row number?

Well, I look at the query plan. If I am unhappy with the performance of top one and I am unhappy with the performance of row number, I may sometimes combine them. And then you often get the best of both worlds where you get a top above the row number or whatever is generating the row number.

It could be a segment in the sequence projected row mode. It could be a window aggregate in batch mode. It all depends on the query plan and which one runs the best.

That’s why we have these alternatives to explore. It should be fairly obvious right now that I care most about what makes the query run the fastest. So if one of those seems a little too slow for me, then I’ll try the different variations there.

There’s not really a rule that I have beyond that. Which weight stats do people freak out about? That usually don’t matter much.

You know, honestly, it’s kind of like you have two camps. You have people who have been like mentally conditioned to freak out over like, you know, like CX weights. And then you have people who have no idea at all what any weight stat really means or when to freak out about it.

So, I mean, probably not. I mean, probably this, I think the CX weights are probably the most obvious answer there. People don’t seem to freak out too much about any other weights.

You know, I can’t really think of anything that, you know, like don’t matter much. I mean, any weight can matter in sufficient quantity. But usually it’s the parallelism related weights that people are like, there’s too much of them.

I’m like, okay, well, compared to what? I don’t know. It’s always fun to sort of get into their heads and start asking like, well, why do you think there’s so many of them?

Could it be because, you know, your queries suck and you have no good indexes? Things like that. Like perhaps you have not changed your parallelism settings at all.

Who knows? There’s all sorts of funny things that can happen on your way to, in a friendly, reasonable rated consultant fashion, get to the bottom of these mysteries with people.

Why do you feel the weight stats are too much? Are the weight stats in the room with us right now? There are many ways you can go with that. But usually the CX weights are the ones that people lean on the hardest as being a problem.

You know, there’s a lot of dumb blog content out there about them. You know, a lot of the scripts that people use to measure weight stats kind of only give you like, what percentage of weights are the various weights?

They don’t really give you context. Like how long has the server been up? How many hours of those weights have accrued compared to how long the server has been up? Things like that.

So there are many things that, you know, just, you know, matter a lot contextually that people don’t take into account. Anyway, I think that’s probably good enough here. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I’ll see you in tomorrow’s video. I have not quite yet decided what I’m going to talk about. But when I do, boy, are we ever going to talk about it. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Filtered Indexes

Learn T-SQL With Erik: Filtered Indexes


Chapters

Full Transcript

Erik Darling here with Darling Data and in today’s video we are going to continue our T-SQL learning adventure where I show you little snippets, tidbits, whimsical little chunks of the full Learn T-SQL with Erik course material. The full course is of course available at my site training.erikdarling.com and there’s a link down in the video description for you, you special son of a gun you, to get $100 off the price of the course. Also down in the video description, all sorts of other helpful links for you to engage in consensual monetary exchanges with me.

You can hire me for consulting. You can buy Learn T-SQL with Erik or any of my other courses. You can become a subscribing, what do you call it, member of the channel.

And then if you want, now taking place every Tuesday, you can ask me office hours questions for free. And of course if you like this content, but perhaps not enough to give me any money for doing any of these songs and dances for you. I understand, you know, can’t all be daddy, rich daddy warbuckses.

You can always like, subscribe. And tell a friend. That is a fantastic way to make this channel grow and expand to something bigger and better and more beautiful than it was before.

If you are a real cheapskate, and I’m just kidding, I understand. Frugality is an important thing. I think Benjamin Franklin said some stuff about that.

I don’t know what any of it is though, because I dropped out of high school. But what I did do pretty well at was building a free SQL. Server monitoring tool.

All of the stuff that I care about monitoring for performance in a SQL Server. Totally free, totally open source. No weird email, phone home, telemetry, any of that stuff. You can just download it.

Run an executable. And start getting great performance feedback about your SQL servers. And then if you want to talk to, or have your robot friends talk to the performance data. And just the performance data.

And not go in there and start dropping tables and indexes. And writing wacky queries. And figuring their life out on your time. I have optional, opt-in, built-in MCP servers. So that you can start having your robots at least tell you what’s going on in the performance data.

I don’t know if I take their advice for fixing things. Maybe, maybe not. Depends, but you know.

What can I tell you? You’re not going to pay me to do it if that’s all you care about. Getting out into the world. Boy, I love going places. Doing things.

Seeing people. You know, amongst my adoring fans. Hopefully not getting shot. I’ll be in Jacksonville, Florida. May 1st and 2nd for Day of Data Jacksonville.

I’ve got an advanced T-SQL pre-con there. As you might be able to tell by the URL. So buy those tickets. Because they’re flying out the door fast.

And I would hate to see you lose your seat. Lose your seat to an alligator or a snake or a bottle of blue Gatorade or something. So, you know. Don’t, don’t, don’t wait.

Benjamin Franklin probably said something about waiting too. Hesitating or something. I don’t know.

Again, I’m unclear on a lot of things, historical. But then, after that. I will be in Chicago, Illinois for Pass On Tour. Again, advanced T-SQL pre-con. Buy your tickets.

I will be at SQL Day in Poland. 11th through 13th. Again, advanced T-SQL pre-con. Buy your tickets. And then, I don’t know. I come home. I take a shower.

I take a little nap. I do a whole crap load of work and don’t sleep for a month. And then I am back in Croatia for Data Saturday Croatia. Wow, a lot of Data Saturdays out there.

I’m still unclear on the difference between SQL Saturday and Data Saturday. But people, people like to fight about that. I don’t get involved with those things though. Just like Benjamin Franklin said not to do. So, I will be there.

Again, advanced T-SQL pre-con. Buy your tickets. And then I will be in Past Data Summit Seattle. Past Data Seattle Summit. In Seattle, Washington.

Not the other Seattle. In Zurbistan. November 9th through 11th. So, you know. That will be nice too.

I’m going to stay in a hotel far away from everyone else. But it is still April. What can I say? I’m getting ahead of myself.

All that talk about May and November and June and whatnot. We are still Apriling here. So, let’s April ourselves. And let’s talk a little bit about filtered indexes.

Now, you just can’t talk about indexing in SQL Server really without talking about filtered indexes. They are a very, very important thing. Conceptually, they are just not that hard to figure out.

It’s an index with a where clause. It only indexes some of the data. It qualifies for the where clause. I don’t know. Like the benefits of that just seem rather apparent to me.

But filtered indexes, of course, function best when they meaningfully reduce the total number of index rows compared to unfiltered indexes. Or when they function to improve statistical information about a segment of data that otherwise might get lost in the shuffle a little bit during the normal course of building a 200-step histogram. Filtered statistics can, of course, do the same thing without the extra index to maintain.

So, those are always, always on the table. But, you know, also having a smaller data source to read from is typically considered an advantage to most people. Especially people who are unnaturally obsessed with logical reads.

Curse you all. There are some rather well-known oddities and limitations with them. And perhaps some less known ones as well.

But, you know, like stuff you can’t do in filtered indexes. You can’t use an OR clause or any other sort of disjunction. You can’t use NOT IN. They can’t be used as a candidate foreign key index.

You can’t create them on index views, which I always found stupid. They can’t reference other tables, which I guess makes sense, right? How would that even look?

That syntax would be weird. You can’t filter on a computed column. Now, you can have a computed column in, like, the key of the index or the includes of a filtered index. But you cannot have it in the filter definition of the filtered index.

Which, again, always found that rather annoying. But, I guess, you know, whatever. They don’t play well with local variables and or formal parameters. And, by extension, they don’t play well with the forced parameterization database setting enabled.

That’s this one right here. You can’t use the IGNORE DUKE KEY creation option with them. And Merge has several issues with filter indexes that Microsoft insists are by design.

And not just out of pure laziness. You also, if you are going to use filtered indexes and you don’t want your data modifications to fail. You need to pay very careful attention to the ANSI set options that your client uses while connecting to SQL.

SSMS just happens to satisfy all these. But many ORMs and other client drivers do not subscribe to this set of core beliefs. So, you might just have a bad time there.

But, anyway, let’s create this index. Oh, I’m not even connected to SQL Server. Isn’t that wild? It gets dusty here sometimes.

And you would think that a filtered index like this would be very, very easy for SQL Server to use. But, alas, if we run this query. And we say, SQL Server, give me a count of comments where the user ID is null.

SQL Server says, you know what? Well, use a different index. Not our filtered index.

That’s not confusing. Not confusing at all. But, hey, anyway. The joys of overlapping demos. You can see that this index up here is called post ID filtered. And this is not.

So, it did not use our filtered index. It said, I got bigger ideas. But if you force SQL Server to use the filtered index, you start to get a sense of why it may have not used that index. Because SQL Server is like, oh, well, you know what?

I’m just not sure. I’m just not really sure what’s in there. So, we scan the entire filtered index. Right?

And then we do a key lookup. And why on earth would we do a key lookup here? Why on earth would we be doing this? Well, SQL Server applies this predicate in the key lookup. And it says, well, I got to go back to the clustered index and figure out if user ID is null.

Even though I feel like we were fairly explicit in our filtered index definition. This will only be indexed where user ID is null. But alas, alas, we do not have that.

So, another kind of funny thing you can do is run this query. Right? And, oh, well, we use our filtered index now because the cardinality was low enough.

Right? So, now SQL Server naturally was like, oh, I’ll do that. But it still has to go back to the clustered index and figure out if user ID is null.

So, the only stuff that could possibly be in here is where that is null. I don’t have a great explanation for that one. However, if we change our index definition.

And this is one thing that is in every single index analysis script that I’ve worked on. I’m not claiming credit for saying that I put it in there. Especially with SP Blitz Index.

But SP Index Cleanup also checks around this stuff. But… If you have a filtered index and the filter definition columns are not at least in the includes, they should yell at you about that. So, if we change our index and we say, you know what, we’re going to include user ID in there.

And all of a sudden SQL Server is like, oh, yeah, well, by the way, now that I can figure that all out in one index, I don’t really have to worry about it. Now, there’s fairly well documented stuff that I said before about filtered indexes. It’s not playing well with local variables or parameters or anything like that.

And there’s all sorts of ways to get around that, right? So, like, we create this index on the users table where reputation is greater than or equal to 100,000. And we run this query and we say, SQL Server, go use that index.

And SQL Server, well, it doesn’t use that index, right? Because it uses an index called reputation. But it didn’t use an index called reputation filtered.

So, that’s kind of a bummer. But if we throw option recompile on there, then SQL Server all of a sudden is like, oh, well, it’s not, look at that. It still didn’t use our index.

Well, you know what? Again, the joys of overlapping demos. There is an exception to this, though. And that is if you are assigning a variable, a value to a variable like this, even with an option recompile hint, SQL Server can’t use the filtered index.

And the reason why it can’t use it is because assigning a value to a variable like this disables the parameter embedding optimization. So, be careful with that. If you find yourself with a filter in this situation, this very situation, at home or at work or wherever you run your queries from, what you can do is stick that result into a temp table and then assign the variable from the temp table if using the filtered index.

The filtered index is important to you because the recompile hint won’t fix that. Another neat thing you can do if you want to play all sorts of funny tricks on SQL Server and you want to use a local variable or a parameter is you can hard code some stuff. So, remember our filtered index up here is on reputation being greater than or equal to 100,000.

You separate those zeros, it’s 100,000. And you can always write your query in a very funny way with some literal values mixed in. So, this first query.

It’s going to say where reputation is greater than or equal to reputation. And the reputation local variable is less than this hard coded 100,000 value. And then you can also say another union all because you need to be able to find both outcomes of this, right?

And you can say where reputation is greater than or equal to reputation. And reputation is greater than or equal to 100,000. And reputation, the local variable, is greater than or equal to 100,000.

So, you can get a very interesting query plan out of this with all sorts of startup predicates, right? These startup filters right here. Normally, I would see filters in a query plan and say, curse your eyes.

I hate you. But these ones are kind of interesting because they have startup expression predicates on them. Meaning that rows don’t pass any of this.

Rows don’t go here unless they pass this filter, right? So, you can see there’s zero action going on in this one. That is a zero and that is a zero. But down here, right?

This is where we’re saying, ah, is it greater than or equal to 100,000? And SQL Server says, yes. And it goes and gets one row from our filtered index down here. So, that is kind of a neat way to work around some of the limitations with filtered indexes.

I do apologize for the weird demo thing up there. Again, overlapping demos. I guess I neglected to drop indexes before starting off on this adventure.

But I promise. In the actual course material, I was way more prepared. So, I’ve got that going for me. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you not tomorrow, probably. But I will see you on Tuesday for office hours. All right.

I’ve got a four-day weekend. All right. Go me. Everything is coming up, Eric, for once. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

The Quiet Death of Foreign Keys

The Quiet Death of Foreign Keys


Chapters

  • 00:00:00 – Introduction
  • 00:02:35 – Unique Constraint and Foreign Key Relationship
  • 00:04:47 – Disabling the Unique Constraint
  • 00:06:12 – Orphan Rows and Schema Changes
  • 00:08:04 – Retrusted Zone and Index Rebuild
  • 00:09:26 – Cautionary Advice

Full Transcript

Erik Darling here with Darling Data, and in today’s video, we are going to be talking about the quiet death of foreign keys. Shh, go to sleep. It was something that happened actually while I was working on SP index cleanup, and I found it amusing enough to record a video about, because it’s not something that I had run into before.

And it also resulted in me having to fix something in SP index cleanup, so here we are with that. As usual, down in the video description, you will find all sorts of helpful links to engage in monetary exchanges with me. You can hire me for consulting, you can buy my training, you can become a supporting member of the channel for as few as $4 a month.

My god, what a tremendous value that is. And if you’d like, you can also ask me office hours. Questions now taking place every Tuesday.

And of course, if you enjoy this content, and to some degree, perhaps not in a way that encourages you to express yourself monetarily towards me, you can always do nice things like like subscribe and tell a friend. If you really like free stuff, you should grab my free SQL Server performance monitoring tool. It’s totally free, open source, no email, no phone home, no nothing funny.

No weird stuff. No weird hijinks or shenanigans in there, despite those being two of my favorite bars in Times Square. But it’s just a bunch of T-SQL scripts that run, collect data, put it into pretty charts and graphs, and help you get root cause analysis of your SQL Server performance issues.

It’s also got built-in robots that are opt-in only, so there’s just a bunch of MCP tools that are well-trained on the collected performance data, and you can get really good summaries, at least summaries of what happened. Perhaps not high-quality advice like you would get from a young, handsome consultant like yours truly, with reasonable rates, of course.

But it can at least kind of give you a sense of what’s going on in there. I will be out in the world. I will be in Jacksonville, Florida, doing an advanced T-SQL pre-com.

The link is here. You can also just search for Day of Data Jacksonville, and you will find me. And links to that.

And links to come hang out with me for a Friday and learn a whole lot of stuff about advanced T-SQL. I will also be attending some other events spread around the country and the space-time. I will be at Pass On Tour in Chicago.

Again, advanced T-SQL pre-com there. So buy your tickets. Come hang out with me in Chicago. I will be taking one lucky winner to a Cubs game, if the Cubs are playing. And I’m not too tired.

I don’t know what the contest is yet, but you do have to show up to play. I will be at SQL Day Poland, May 11th through 13th. Again, advanced T-SQL pre-com there.

And then I’ll be home for a little while, I don’t know, mopping floors, brushing my teeth. I don’t know, trying to regain some of my health after being in Chicago and Poland and Jacksonville and who knows where else along the way. I think I have layovers in Amsterdam both ways, to and from Poland.

So you never know what’s going to happen. So we never know what’s going to happen there. It’s a wild city. And then I will be at Data Saturday Croatia, June 12th and 13th. Again, with a pre-com there.

Data Saturday, that’s a long URL. I wish Data Saturdays had friendlier URLs. And then I will be, of course, at Pass Data Summit in Seattle, Washington, November 9th through 11th. All right.

So it still being April and us still being in the thrall. And then we have a new baseball season. We will continue to show this image. I did spend some time making new images for, I think, up through October of this year.

And they came out really well. I can’t wait for May to come along so I can show you what crazy stuff got come up with for May. Those aren’t good words for people.

Anyway, let’s see here. We need to go to SQL Server Management Studio so I can show you a funny thing about foreign keys. So, again, this happened while I was working on SP index cleanup.

And I was testing some stuff because, you know, one thing that is made great by the robots is the ability to have them write a bajillion different tests. Because they don’t get tired and they don’t want to be like, man, that’s a lot of type. I don’t feel like doing that.

The robots will just write whatever tests you tell them to. You can say, write some code variables. Write some verification tests. Write some happy path tests. Write some adversarial tests. Write some tests that will make my code work better.

And they’ll do it. You know, you still have to mind them a bit. But they will do it.

You know, mind your prompts. So we’ve got two tables here. We’ve got a table called parent. And we’ve got a table called child. And we’ve got some folders popping up at the bottom for some reason.

And what I want to notice, what I want you to notice here. What I’ve already noticed is that we have some ID which is just, you know, kind of a useless primary key on the table. It’s an identity column.

And then we’ve got a column called parent ID. And parent ID has a unique constraint on it. And then we’ve got the child table down here. And that child table, in fact, does have a foreign key that references the parent ID column of the parent table.

All right. So I believe I created both of those. I can always run this again just to make sure we get error messages.

That’s a good sign. But then we can put a couple rows into parent like this, cool mom and cool dad. And then we can put a couple rows into child, the children of cool dad and cool mom up there.

And just to show you, before any of this happens, our foreign key. And once again, thank you, Aaron Stilato and team and company. I guess I’m not going to say thank you, Microsoft.

I’m going to thank the lovely people who make these things possible. We can zoom in. We can zoom in on our SSMS results without having to zoom in. We can also zoom in if we feel like it.

But this constraint is not disabled and is not not trusted, which means it is not disabled and it is trusted. So we just double-negative ourselves on that one. But look what happens if you disable that unique constraint on the parent table.

Now, normally, like if you were running something like this, you might not always, like, I don’t know. I mean, I guess this pops up. But it’s white text, right?

It’s not like red text. It’s not like, you know, error text. It’s just, ah, informational only. You know, you might see this stuff like a null value is eliminated because of an aggregate. And you’re like, ah, so what?

I meant to do that. Dummy, why do you have to tell me that every time? But what happens is SQL Server, let me make this a little bit more readable for you nice folks at home. Warning.

Foreign key. FK child parent on table two. Parent on table child referencing table parent. Why can’t we get a schema prefix in there? Microsoft, what schema are we in? Do you know?

Are you aware? Was disabled as a result of disabling the index UQP. So disabling the unique constraint on the parent table disabled the foreign key. Isn’t that wild, right?

So, like, if we look at that same query now after disabling it, we will see it is disabled and it is not trusted anymore. Right? It is.

It is an untrustworthy foreign key. It’s dirty. It’s been sullied. It’s been corrupted. It is not in a good place. And now we can put whatever rows we want in child. Like, we don’t have a matching row for this.

That foreign key is no longer checking to see if anything useful is happening. And we have an orphan row in our parent-child relationship. We’re just letting orphans in.

Next thing you know, we’re just letting anybody. Right? It’s crazy. Even if you rebuild the index on the parent table, the foreign key does not automatically come back. Right?

It is still disabled and it is still not trusted. And if you remember, you know, I learned, I first learned of this syntax from Kendra Little. It’s the double check on the check constraint.

Right? The check check. So if we try to run this and we say, hey, we need to do, we need to get this foreign key back in action. We will find that this foreign key has some trust issues.

Right? And the trust issues arise from the fact that we have that orphan row in our child table. Or orphan or child 9999.

And so we are still in a screwy place. We would need to delete that child before and then we could re-enable everything. Right?

So we are now back in the retrusted zone. And I’ll just run this to let you see. It is not disabled and it is not not trusted. Okay. So. Be careful out there.

If you are disabling indexes and, you know, you are not checking to see if foreign keys exist between tables. And that index happens to be unique. You might turn off your foreign keys by accident.

Which you don’t want to do. And now I protect against an SP index cleanup. So you can go grab that and feel a little bit warmer and a little bit fuzzier about running it. And getting rid of all those crappy indexes that idiots have created over the years.

And I don’t know. Maybe someday you’ll say, you know what? I want a young handsome consultant with reasonable rates like Erik Darling to come look at my indexes. And maybe someday you’ll drop me a line.

Anyway. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I’ll see you in tomorrow’s video. Where we’ll talk about something about T-SQL I assume. Alright.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 60

SQL Server Performance Office Hours Episode 60


Chapters

  • 00:00:00 – Introduction to Performance Monitoring
  • 00:02:59 – Why You Don’t Need Expensive Third-Party Tools
  • 00:04:56 – Filtered Indexes and Their Safety
  • 00:07:31 – Logical Reads vs. Query Performance
  • 00:08:48 – Nested Loops Joins Explained
  • 00:10:01 – Office Hours Wrap-Up

Full Transcript

Erik Darling here with Darling Data and we’re going to try a little blog experiment because I feel like it and it’s my world and I just live in it. I’m going to be cutting content down to three days a week, five day a week thing. It’s a little bit much with my schedule right now.

So Tuesday, Wednesday, Thursday will be office hours and then a couple posts about or videos about whatever. And then Monday and Friday, they’re going to be kind of floater days. Sometimes I might write something.

Sometimes there might be extra videos or content on those days. But from here on out, we’re going to do three days a week instead of five days a week. So the content stretches a little bit further because like I said, times are busy right now.

Anyway. So it is Tuesday, which is the new official office hours day. So you’re all just going to have to get used to that.

Same as always, though. Down in the video description, good lord, down in that super helpful video description crafted by the most helpful human being in the world, me, or I guess maybe Claude. I don’t know.

You’ll see all sorts of helpful links where you can interact with me in various monetarily beneficial ways. Like you can hire me for consulting. Or buy my training. Or become a supporting member of the channel.

All very valid reasons to click on links as far as I’m concerned. But you know, you can also do some free stuff like you can continue to ask me Tuesday office hours questions. And you can like, subscribe, and tell a friend.

Also down in the video description, just like the slide says, free SQL Server monitoring. Totally free. Totally open source.

No email sign up. No phone home. No telemetry. Weird stuff. High drinks going on. I don’t care. I just don’t. The only thing I care about is putting big monitoring out of business. Because they don’t deserve to live.

It’s all the T-SQL stuff that I would run and collect if I were performance monitoring a SQL Server on my own. Sitting there hitting F5 all day. 30 windows open. Looking at wait stats, blocking, deadlocks, top queries, you name it.

It gets into all that business. And it puts it in nice pretty charts and graphs for you. And if you are a robot aficionado, you can use your best robot friend to talk to all of your collected monitoring data so that you don’t have to set it free out into your server to let it run crazy DMV queries and whatever else you might trust it to do or stop trusting it to do pretty quickly. Speaking engagements, I have, as far as I know, a whole bunch coming up.

But there is a surprise one down in Jacksonville, Florida. It’s Day of Data Jacksonville. It will be there May 1st and 2nd.

If you want to sign up for the pre-con, which I’m putting on May 1st, you can do that there. Advanced T-SQL. Good stuff all around. And then after that, I come home for like a day.

And then I have to do other stuff. I will be in Chicago May 7th and 8th for Pass On Tour. Again, pre-conning there.

Advanced T-SQL. You should buy tickets and go. It will be there. It will be fun. I will be at SQL Day Poland May 11th through 13th. That pre-con is on a Monday. I’ve never done a pre-con on a Monday before.

So that will be an interesting experiment. And then I will be at Data Saturday Croatia June 12th and 13th. And then I guess I’ll be home for a little bit, toddling around. I don’t know.

I’ll figure some way to make myself useful. And then I will be at Pass Summit in Seattle, Washington November 9th through 11th. So you heard all that here first.

But it is still April. It is not quite May. It is not quite travel season. My green screen is being a real turd today. So we’re just going to deal with a little fuzz at the bottom and near my arm over here. I don’t really know why that’s happening. But I don’t change anything. I feel like one of my clients when I’m like, nothing changed, but it’s all screwed up. That’s me right now.

Nothing changed, but it’s all screwed up. That’s fun. Anyway. Yeah. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. then a bunch of things in dead last that’s it so if you want performance monitoring you get my thing don’t don’t pay anyone $1,000 a server for a year for the garbage that they peddle to you it is bordering on fraud to call any of that SQL Server monitoring let’s see what we got here that did it did uh is option use hint query plan profile still useless you useless as the day it was born just like all those third-party monitoring tools I have never never had a good use for that one but anyway it’s like it was one of those things where it’s just like oh cool it’s just that by the time you like I go to run it it’s like I don’t need it you can just do something else our filtered index is safe for super high-right systems or is that asking for trouble well that depends on which side of the filter that right is happening on doesn’t it I mean you know uh if it’s if it’s absorbing rights it’s no different from another non-clustered index but if it’s not absorbing rights it’s a bit safer than isn’t it it’s like you’re not inserting updating if like the way that you’ve crafted your filtered index precludes it from some modifications and well it’s kind of that index is lucky day huh dodge that bullet so um I’m not sure where I’m not sure where where that question came from that’s an interesting one why do I sometimes see a seek logical reads than a scan all right so we’re gonna have the talk again no one gives a crap about logical reads they’re a stupid metric I guess if you were tuning queries in 2008 you might care about logical reads for some reason I’m not sure why but maybe maybe if you still have that mindset you could look at you would look at logical reads and say gosh this query does a lot a lot of logical reads I wish it did fewer logical reads and maybe you would do something about that but the the things that you should care about when you’re tuning a query from from a performance point of view you should care about CPU and duration no one knows how fast or slow a logical read is no one if you reduce logical reads uh no one’s going to come thank you for it if you make a query faster if all of a sudden a query goes from three seconds to 500 milliseconds or something just to throw some numbers out there someone might thank you but no one’s gonna say ah bang up job fellow you you fixed all those logical reads and hardly any of those now it’s dumb right uh but you know uh it’s drives you bad you can you can’t break people of these habits can you next thing you know like why is ple sometimes lower when SQL Server does a seek who cares it’s not or not worth spending your time on SQL Server huh I know I know that I know that one sometimes picks nested loops when the row count is huge what makes it choose that uh well you know the the usual garden variety stuff you might be doing something that inhibits uh cardinality estimation in some way you might be using a table variable or you might have local variables somewhere in there uh you might be suffering some other form of cardinality mis-estimate that would uh that made the tricked SQL Server into uh doing uh nested loops over a different type of join and and sometimes you you might have written your query in a way that excludes certain join types from being available to the optimizer for example both hash and merge joins require at least one equality Predicate and if you don’t have at least one of those if your joint let’s just make it easy let’s just say uh you want to know if like a date column in one table is between two a date column or two in another table uh there’s no equality Predicator it’s just it’s greater than less than maybe with an equal greater than new equal there but uh sql server cannot use a hash or merge join in those cases so that that might be another reason um but if you if you really want to like you know show me show me show me how you do that trick uh you can you can upload your query plans to plans.erikdarling.com uh it’s it’s my site for getting free query plan analysis right in your browser it all runs client side but you can you can optionally choose to share those plans so if you wanted to share a plan with me and you wanted to say ask me a question about a specific query plan and say why is sql server doing a nested loops join here that would be a very adequate way of doing it so you could try that on for size if you’d like but anyway it’s tuesday i’m not used to that uh and and we we have answered five questions and so office hours is done now where i’m gonna go do something else perhaps perhaps perhaps who knows what i might do anyway thank you for watching i hope you enjoyed yourselves i hope you learned something and i’ll see you in tomorrow’s video where i’m going to show you a little something a little bit funny about foreign keys all right thank you for watching

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Contributing To My Open Source Projects

Contributing To My Open Source Projects


Chapters

Full Transcript

Erik Darling here with Darling Data. And in today’s video, I’m not going to talk about SQL Server. Specifically, I’m going to talk about contributing to open source projects, which is a subject that has become nearer and dearer to my heart. Sort of like the metal shards that Iron Man is desperately trying to keep away from his. But the reason why I wanted to record this is because I feel like people are a little too shy about contributing to them. And the reason why I say that is because, look, if you’re here, you probably know I’ve been working on a couple of SQL Server monitoring tools. We have the Lite version and we have a full version. And I’ve got this Performance Studio application, which is a portable SQL Server query plan analyzer. And I want to, of course, start by acknowledging that these applications would not be near as good as they are if it were not for the reception that the SQL Server community has given them. Both requesting, well, not both, requesting features, contributing code and reporting bugs. These are all very, very valuable things. The part that I want to talk about today is requesting features. Because I love a feature request.
It’s like, cool, this thing could get better. The thing is, when you request a feature, you’ve most likely got some vision in mind for it. You likely have what it would look like in a perfect world to you when you think of this thing. I don’t have that when you request it. And so when someone requests a feature, the very first question that I ask them is, hey, do you want to contribute the code to do this? Sometimes they say yes, and that is absolutely fantastic. The process for that is, of course, they, you know, fork the repo and they either do the work themselves or they have their little robot companion, the way I have my little robot companion do a lot of the coding work. And, you know, they submit a pull request for it. And then I have my robot companion review their code changes. Sometimes there’s a few rounds of back and forth between the robots getting things right. Sometimes there’s some additional testing that has to be done. But overall, it’s a pretty smooth process. And the reason why I ask that is getting back to what I just said. When someone has a feature request in mind, they likely have the best idea of what that feature should look like. Right? So when I ask that question, though, sometimes people say, no, I don’t know how to code or no, I’m not familiar with the code base. You don’t have to be with me. Right? You can have your robot companions work on stuff and do things. It doesn’t have to be handwritten because mine certainly isn’t.
Right? I’m not sitting there writing every initial code that goes into these applications because I’m not a front end person. I am not a C sharp person. I’m a database person. And I have a very good vision or rather I had a very, very specific vision of the tools that I wanted to build. And I use the robots to help me do that. Without the robots, none of this would be possible. Right? Like I’m not learning C sharp and XAML and all the other stuff. I’m like, I’m using these well-defined languages and I’m saying, hey, robots, use these languages to build around the thing that I do or the things that I know, which was essentially the original set of sort of data collectors that feed all the charts and graphs in the monitoring tools. Right? Like all the stuff that comes in here. These are all things that I had written scripts and procedures for to gather over the years. And these are all the things that I knew I wanted to see in pretty charts and graphs.
So that that’s really how the whole thing started. There are some GitHub repos out there that probably do a lot more mission critical work for like, for example, the Linux kernel. It’s a pretty important one that wouldn’t, that probably don’t accept pull requests written by our little robot companions. I don’t have that particular brand of, um, I don’t know, uh, I guess, uh, importance to, to these because, uh, at the end of the day, anything that comes in here is fixable and is not going to like bring down infrastructure. Right? Like it’s like, you’re not going to see like the electric grid go out because like you, you, because you, you submitted a code that has a bug in it. Right?
It’s just something that can get fixed and cleaned up and it’s not a big deal. This is pretty low pressure stuff. It’s a free SQL Server monitoring tool. And while I do want it to be bug free and I do want it to be a great monitoring tool that remains free and remains out there in the community for everyone to be able to use and not have to give oodles and gobs of money to these monitoring tool companies. Right? Like I have a very specific mission. Um, I’m not making money off this, right? Uh, if anything, I’m probably losing money off this.
Uh, but I get paid in what I call petty cash and that petty cash is every time someone cancels a contract with a monitoring tool vendor and starts using this instead, I’m a happy camper. So please, if you have features, if you have things in mind that would make this better, don’t let the fact that you’re not a coder or you’re not a programmer or maybe you’re just a DBA hold you back. You can use your robot friends to implement that feature, get it right the way you want it and contribute it. I’m totally okay with that.
You don’t have to know everything inside and out. You don’t have to know every line of code. It doesn’t have to be perfect the first time. I’m happy just getting the feature submission and having something new in here that is helpful to people. So if you’re out there and you’re, and you’re looking at the GitHub repo and you’re thinking, wow, there’s a lot of stuff, but man, if it had this one extra thing, it’d be great.
And you’re like, you want to contribute and like, you want to make the feature request. That’s cool. Like sometimes I’ll think it’s cool enough to pick it up and do it on my own or think it’s easy enough to pick it up and do it on my own. And maybe it’s like something that doesn’t require a lot of interpretation, right?
Like sometimes there are some things that come through and I’m like, oh yeah, no, I get what you want that I can do that. But there are other times when it seems like more of a, I don’t know, sort of like someone else had their own very, very perfect idea of something when it came to their mind. But maybe like the description of it wasn’t quite clear or maybe there’s some, you know, room for interpretation in there.
I want you to be able to contribute to this. I’m not holding anyone back. You want to use your robots? That’s fine, right? It’s, again, it’s okay with me. What I care the most about though, is making sure that this is a great tool. The thing about open source stuff, and this has been said by many smarter people than me and probably, actually, you know what?
I’ll just say it the way I’ve always heard it. Open source is free software, but it’s not free development. Every time someone makes a feature request, that’s time that I have to spend implementing it. And while I don’t mind that, it’s also something where if you have a little robot friend and you could implement it, it would be a great help for me.
Because then that’s time that I get to spend doing something else, right? Working on other things. Spending quality time with the people in my life instead of, you know, sitting there and prompting and trying to get things right and going back and forth a little bit. So, if there’s a feature you like and there’s a feature you want, cool, you can make the feature request because, again, I love great ideas.
But that doesn’t always mean that those great ideas are going to get implemented quickly or immediately. If getting them implemented quickly or immediately is important to you, you could consider supporting the project monetarily, which gives me a kind of a little kick in the butt to get things in there a little bit faster, but not a lot of people have chosen to do that.
And again, that’s okay. I get paid in petty cash. So don’t worry about that. But if it’s something that’s really important to you or if this tool is helping you or your company, you could also consider supporting it in that way. Anyway, that’s probably about enough for this. Thank you for watching. I hope you enjoyed yourselves. I hope you’re enjoying this free software.
If you have any, you know, questions, comments, concerns, you have any feature requests, you have any bug reports, you have any ideas, you have any questions about it, GitHub is wide open for you. GitHub is also free, right? You can do that stuff. Maybe you’ll have to sign up for an account there if you really want to do something, but I promise there’s no head biting going on.
Anyway, I’m not quite Linus level of grumpy yet. Maybe someday I’ll get there, but not today. Anyway, thank you for watching. Enjoy your weekend. I believe it’s Friday. At least that’s what the robot calendar is telling me. Anyway, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Query Plan Analysis In Your Browser

Free SQL Server Query Plan Analysis In Your Browser


Chapters

  • *00:00:00* – Introduction
  • *00:00:29* – Free Tools Overview
  • *00:01:13* – Supporting Memberships
  • *00:02:16* – Pre-SQL Server Performance Monitoring
  • *00:04:25* – Browser-Based Query Plan Analysis
  • *00:08:45* – Sharing and Exporting Options

Full Transcript

Erik Darling here with Darling Data, and continuing in my builder phase of life, you know, at least until the robots get too expensive. I’ve decided, I mean, if you’ve been, you know, dealing with me in any way, shape, or form over the past couple months, you’ll know that I’m working on two free open source tools. One of them is a full SQL Server performance monitor, and the other is sort of a portable query plan analyzer. I’ve taken it a step further, because some people seem allergic to downloading and trying things, and I’ve put the query plan, the free query plan analysis, right in your browser. Ta-da! I mean, this isn’t it. I’ll get to it in a minute. First, you have to listen to me talk about other things. Like, maybe you’re like, wow, Erik, all this free stuff, you must kind of know what you’re doing with the database. Maybe we should hire you for consulting, or maybe we should learn from you. There are links where you can do that down in the video description. It doesn’t hurt to click on them. No executable required. You can also, for as little, for as few, for as tiny little breadcrumb of $4 a month, become a supporting member of this channel.
Be real groovy, groovy-goolies. You can ask me office hours questions for free. I mean, at some point, I might have to start charging a quarter for those, but we’ll see how the economy picks up. And of course, if you, something else you can do for free, if you enjoy this content, and you enjoy the things that I do in my life, you can like, subscribe, and tell a friend, so that you get notified when I do other things, and your friends get notified when I do things, and then everyone gets happier together, I think. Speaking of which, pre-SQL Server performance monitoring. Gratis, or gratis, or whatever, however you say it. I don’t know. I can’t do voices.
You know, I tried a Christopher Walken like 15 years ago, and I just learned my lesson. Yeah, again, no voices. Totally free. Totally open source. No email, no phone home, nothing like that. It’s just all the stuff that a monitoring tool should monitor. A modern monitoring tool should monitor in SQL Server. You know, all the important stuff when you need to troubleshoot a performance issue. And of course, if you want to stay really, really modern, I’ve got built-in MCP servers, where you can just point them at your performance data collected over time, and space, and any other measurement you want to throw in there.
And you can have the robots go through your performance data, find problems, surface things, tell you about stuff, and you don’t have to lift a finger, aside from to say, you know, go look at this thing, right? Send them off to war, right? Anyway. Anyway. Again, happy surprise pre-con day. I will be in Jacksonville, Florida. Maybe it’s warm there, finally, because New York is still not warm. It is. What day is it? April something, and it is still cold here. I’m angry about that. So I’m happy that I’m going to Florida, where I’ll probably wear shorts. Not in front of a crowd, but I might wear shorts independently on my own. Anyway.
It’s a vibe, you know? Other places I’ll be, where it will hopefully also be warm by the time I get there. Chicago, Illinois. Dubious. May 7th and 8th. Not sure about warmth. SQL day Poland. May 11th and 13th. I don’t know. Poland sort of has a reputation for being cold.
I’ve only ever seen it in war movies, where I think it was supposed to look bad, so I’m going to just try to be optimistic about Poland being warm by then. Data Saturday, Croatia. June 12th and 13th. I’ve got to imagine that anything in that area of the world by June, I might be sweating by then. And then back to shivering at past Data Summit. New Community Summit. Date in Seattle. November 11th through 13th. 11th. 9th through 11th.
There are so many other 11th on there, I get confused. Anyway. Let’s talk about free query plan analysis right in your browser. Now this is not an advertisement for Microsoft Edge. Because I hate Bing and this browser is honestly, and it’s fine if you’re into that sort of thing. But if you go to plans.erikdarling.com, that is a brand spanking new subdomain under my website, under the Darling Data umbrella of websites, which is really just erikdarling.com, you will find this lovely interface.
And under that interface, you can either paste in plan.xml or you can upload a plan file. Now, I know what you’re thinking to yourself. There’s already a paste the plan. Yes, there is. But there’s a slight difference here. This one actually analyzes your query plans, and it does not, by default, save your query plans. All of the analysis is done in your browser.
It does not leave your browser. It does not go out into the world. There’s no plan file saved anywhere. You can do that, but you don’t have to do that. So, if you go to choose file, I’m just going to choose a query plan that I have saved immediately here.
You get back information about your query that looks a bit like you would get from the Performance Studio application that I’m building. Up here, when we look at this stuff, we can see some runtime stats about the query, right? We can see, well, oh gosh, golly and gosh, there was a missing index and no parameters were passed in.
We have this lovely graph down here of the wait stats, right? So, we can see, like, sort of graphed out what our query waited on. We don’t have to go digging through plan XML anymore. And then, down in this section below, we will have all the warnings generated by our query plan things that you should probably pay attention to, right?
If you look through this stuff, that’s all critical warning, yada, yada. And, of course, we get back our, well, there’s also the query text in there, but then we get back a lovely graphical representation of our query plan, right? Just like in Performance Studio, right? Stuff over here, look at all this good stuff.
And then, down at the bottom, we have the full text analysis, just like you have in Performance Studio, where I break down everything in a way that a human can hopefully understand. So, if you ever want an opinion on a query plan, and you are not allowed to maybe share it publicly on a site like StackExchange or StackOverflow, or maybe you’re not allowed to paste the plan somewhere where that plan is going to get saved off somewhere, then that’s one thing you can do.
There are also a couple other things in here that are neat. There are two buttons up at the top. One of them is to export HTML.
So, if you want to save off, like, all the HTML from here, you can do that. See, we open that up, and it looks just like it did. Well, I mean, this part’s a little bit different, because I don’t want to, like, export images to you.
But this part, so we have, like, the operator tree from here, and we have all the full text and everything else down there. So, if you want to share your plan, you can hit this button that says share. And when you hit this button that says share, you will be able to choose how long that plan stays saved on my little server for.
It is a secure server. I have done my best to make it unhackable. I’m not saying it’s unhackable.
I’m not challenging anyone in the world. I’m just saying I have taken reasonable precautions against anyone breaking in there. And it will say, you can do this, and you can have it expire after anywhere between one day and one year. And if you hit continue, it will say, are you sure?
You’re okay with this, right? Like, if you click twice, it’s not an accident. All right?
I’m just saying, like, once, oops, twice, that’s on you. But even if you’re like, I got confused, there’s a button up here where you can immediately delete it, right? If you say, ah, I was confused and drunk.
I was off my keister that day. I didn’t mean to put that there. You can, if you’re like, I just messed up, you can immediately delete it, and it goes away, right? And that URL, just the file is gone.
But then if you want to, but if you actually want to share it, then you will get this URL that you can share this plan with whoever you want, with all the analysis baked in, so that you don’t get, I don’t know, you don’t have to, like, explain anything. All the explanation is here. And you can say, how do I fix it?
What do I do? What did I do wrong? How did I end up in this place? How did I end up so doomed and damned with this query plan? And someone else can read it and look at it and say, here’s what I do. You can, you can even ask me, but that, that, that costs, if you want to do that, that’s this consulting link over here.
And that’s where you can, you can, you can get help with your SQL Server from, from me, right? Or with your query plans from me. That’s, that, that, that part is not free, unfortunately.
I cannot, cannot dedicate that much time. But this stuff I’m happy to do and to help people out, right? All right. Anyway, free query plan analysis, right in your browser, doesn’t leave your browser.
Sharing is optional, storing it is optional, and you get to choose how long you want to store it for, right? So you can even, you can even like share it with someone. And as soon as they get it, you can immediately delete the link.
Or if you, you’re uncomfortable with that, you can just export the HTML yourself and just share that with someone. So it doesn’t live on the internet, right? I don’t see anything from this, right? There’s nothing here that I get out of this.
All right. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you will, you will, you will start getting some free query plan analysis right in your browser. All right.
Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.