Blog

The Quiet Death of Foreign Keys

The Quiet Death of Foreign Keys


Chapters

  • 00:00:00 – Introduction
  • 00:02:35 – Unique Constraint and Foreign Key Relationship
  • 00:04:47 – Disabling the Unique Constraint
  • 00:06:12 – Orphan Rows and Schema Changes
  • 00:08:04 – Retrusted Zone and Index Rebuild
  • 00:09:26 – Cautionary Advice

Full Transcript

Erik Darling here with Darling Data, and in today’s video, we are going to be talking about the quiet death of foreign keys. Shh, go to sleep. It was something that happened actually while I was working on SP index cleanup, and I found it amusing enough to record a video about, because it’s not something that I had run into before.

And it also resulted in me having to fix something in SP index cleanup, so here we are with that. As usual, down in the video description, you will find all sorts of helpful links to engage in monetary exchanges with me. You can hire me for consulting, you can buy my training, you can become a supporting member of the channel for as few as $4 a month.

My god, what a tremendous value that is. And if you’d like, you can also ask me office hours. Questions now taking place every Tuesday.

And of course, if you enjoy this content, and to some degree, perhaps not in a way that encourages you to express yourself monetarily towards me, you can always do nice things like like subscribe and tell a friend. If you really like free stuff, you should grab my free SQL Server performance monitoring tool. It’s totally free, open source, no email, no phone home, no nothing funny.

No weird stuff. No weird hijinks or shenanigans in there, despite those being two of my favorite bars in Times Square. But it’s just a bunch of T-SQL scripts that run, collect data, put it into pretty charts and graphs, and help you get root cause analysis of your SQL Server performance issues.

It’s also got built-in robots that are opt-in only, so there’s just a bunch of MCP tools that are well-trained on the collected performance data, and you can get really good summaries, at least summaries of what happened. Perhaps not high-quality advice like you would get from a young, handsome consultant like yours truly, with reasonable rates, of course.

But it can at least kind of give you a sense of what’s going on in there. I will be out in the world. I will be in Jacksonville, Florida, doing an advanced T-SQL pre-com.

The link is here. You can also just search for Day of Data Jacksonville, and you will find me. And links to that.

And links to come hang out with me for a Friday and learn a whole lot of stuff about advanced T-SQL. I will also be attending some other events spread around the country and the space-time. I will be at Pass On Tour in Chicago.

Again, advanced T-SQL pre-com there. So buy your tickets. Come hang out with me in Chicago. I will be taking one lucky winner to a Cubs game, if the Cubs are playing. And I’m not too tired.

I don’t know what the contest is yet, but you do have to show up to play. I will be at SQL Day Poland, May 11th through 13th. Again, advanced T-SQL pre-com there.

And then I’ll be home for a little while, I don’t know, mopping floors, brushing my teeth. I don’t know, trying to regain some of my health after being in Chicago and Poland and Jacksonville and who knows where else along the way. I think I have layovers in Amsterdam both ways, to and from Poland.

So you never know what’s going to happen. So we never know what’s going to happen there. It’s a wild city. And then I will be at Data Saturday Croatia, June 12th and 13th. Again, with a pre-com there.

Data Saturday, that’s a long URL. I wish Data Saturdays had friendlier URLs. And then I will be, of course, at Pass Data Summit in Seattle, Washington, November 9th through 11th. All right.

So it still being April and us still being in the thrall. And then we have a new baseball season. We will continue to show this image. I did spend some time making new images for, I think, up through October of this year.

And they came out really well. I can’t wait for May to come along so I can show you what crazy stuff got come up with for May. Those aren’t good words for people.

Anyway, let’s see here. We need to go to SQL Server Management Studio so I can show you a funny thing about foreign keys. So, again, this happened while I was working on SP index cleanup.

And I was testing some stuff because, you know, one thing that is made great by the robots is the ability to have them write a bajillion different tests. Because they don’t get tired and they don’t want to be like, man, that’s a lot of type. I don’t feel like doing that.

The robots will just write whatever tests you tell them to. You can say, write some code variables. Write some verification tests. Write some happy path tests. Write some adversarial tests. Write some tests that will make my code work better.

And they’ll do it. You know, you still have to mind them a bit. But they will do it.

You know, mind your prompts. So we’ve got two tables here. We’ve got a table called parent. And we’ve got a table called child. And we’ve got some folders popping up at the bottom for some reason.

And what I want to notice, what I want you to notice here. What I’ve already noticed is that we have some ID which is just, you know, kind of a useless primary key on the table. It’s an identity column.

And then we’ve got a column called parent ID. And parent ID has a unique constraint on it. And then we’ve got the child table down here. And that child table, in fact, does have a foreign key that references the parent ID column of the parent table.

All right. So I believe I created both of those. I can always run this again just to make sure we get error messages.

That’s a good sign. But then we can put a couple rows into parent like this, cool mom and cool dad. And then we can put a couple rows into child, the children of cool dad and cool mom up there.

And just to show you, before any of this happens, our foreign key. And once again, thank you, Aaron Stilato and team and company. I guess I’m not going to say thank you, Microsoft.

I’m going to thank the lovely people who make these things possible. We can zoom in. We can zoom in on our SSMS results without having to zoom in. We can also zoom in if we feel like it.

But this constraint is not disabled and is not not trusted, which means it is not disabled and it is trusted. So we just double-negative ourselves on that one. But look what happens if you disable that unique constraint on the parent table.

Now, normally, like if you were running something like this, you might not always, like, I don’t know. I mean, I guess this pops up. But it’s white text, right?

It’s not like red text. It’s not like, you know, error text. It’s just, ah, informational only. You know, you might see this stuff like a null value is eliminated because of an aggregate. And you’re like, ah, so what?

I meant to do that. Dummy, why do you have to tell me that every time? But what happens is SQL Server, let me make this a little bit more readable for you nice folks at home. Warning.

Foreign key. FK child parent on table two. Parent on table child referencing table parent. Why can’t we get a schema prefix in there? Microsoft, what schema are we in? Do you know?

Are you aware? Was disabled as a result of disabling the index UQP. So disabling the unique constraint on the parent table disabled the foreign key. Isn’t that wild, right?

So, like, if we look at that same query now after disabling it, we will see it is disabled and it is not trusted anymore. Right? It is.

It is an untrustworthy foreign key. It’s dirty. It’s been sullied. It’s been corrupted. It is not in a good place. And now we can put whatever rows we want in child. Like, we don’t have a matching row for this.

That foreign key is no longer checking to see if anything useful is happening. And we have an orphan row in our parent-child relationship. We’re just letting orphans in.

Next thing you know, we’re just letting anybody. Right? It’s crazy. Even if you rebuild the index on the parent table, the foreign key does not automatically come back. Right?

It is still disabled and it is still not trusted. And if you remember, you know, I learned, I first learned of this syntax from Kendra Little. It’s the double check on the check constraint.

Right? The check check. So if we try to run this and we say, hey, we need to do, we need to get this foreign key back in action. We will find that this foreign key has some trust issues.

Right? And the trust issues arise from the fact that we have that orphan row in our child table. Or orphan or child 9999.

And so we are still in a screwy place. We would need to delete that child before and then we could re-enable everything. Right?

So we are now back in the retrusted zone. And I’ll just run this to let you see. It is not disabled and it is not not trusted. Okay. So. Be careful out there.

If you are disabling indexes and, you know, you are not checking to see if foreign keys exist between tables. And that index happens to be unique. You might turn off your foreign keys by accident.

Which you don’t want to do. And now I protect against an SP index cleanup. So you can go grab that and feel a little bit warmer and a little bit fuzzier about running it. And getting rid of all those crappy indexes that idiots have created over the years.

And I don’t know. Maybe someday you’ll say, you know what? I want a young handsome consultant with reasonable rates like Erik Darling to come look at my indexes. And maybe someday you’ll drop me a line.

Anyway. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I’ll see you in tomorrow’s video. Where we’ll talk about something about T-SQL I assume. Alright.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 60

SQL Server Performance Office Hours Episode 60


Chapters

  • 00:00:00 – Introduction to Performance Monitoring
  • 00:02:59 – Why You Don’t Need Expensive Third-Party Tools
  • 00:04:56 – Filtered Indexes and Their Safety
  • 00:07:31 – Logical Reads vs. Query Performance
  • 00:08:48 – Nested Loops Joins Explained
  • 00:10:01 – Office Hours Wrap-Up

Full Transcript

Erik Darling here with Darling Data and we’re going to try a little blog experiment because I feel like it and it’s my world and I just live in it. I’m going to be cutting content down to three days a week, five day a week thing. It’s a little bit much with my schedule right now.

So Tuesday, Wednesday, Thursday will be office hours and then a couple posts about or videos about whatever. And then Monday and Friday, they’re going to be kind of floater days. Sometimes I might write something.

Sometimes there might be extra videos or content on those days. But from here on out, we’re going to do three days a week instead of five days a week. So the content stretches a little bit further because like I said, times are busy right now.

Anyway. So it is Tuesday, which is the new official office hours day. So you’re all just going to have to get used to that.

Same as always, though. Down in the video description, good lord, down in that super helpful video description crafted by the most helpful human being in the world, me, or I guess maybe Claude. I don’t know.

You’ll see all sorts of helpful links where you can interact with me in various monetarily beneficial ways. Like you can hire me for consulting. Or buy my training. Or become a supporting member of the channel.

All very valid reasons to click on links as far as I’m concerned. But you know, you can also do some free stuff like you can continue to ask me Tuesday office hours questions. And you can like, subscribe, and tell a friend.

Also down in the video description, just like the slide says, free SQL Server monitoring. Totally free. Totally open source.

No email sign up. No phone home. No telemetry. Weird stuff. High drinks going on. I don’t care. I just don’t. The only thing I care about is putting big monitoring out of business. Because they don’t deserve to live.

It’s all the T-SQL stuff that I would run and collect if I were performance monitoring a SQL Server on my own. Sitting there hitting F5 all day. 30 windows open. Looking at wait stats, blocking, deadlocks, top queries, you name it.

It gets into all that business. And it puts it in nice pretty charts and graphs for you. And if you are a robot aficionado, you can use your best robot friend to talk to all of your collected monitoring data so that you don’t have to set it free out into your server to let it run crazy DMV queries and whatever else you might trust it to do or stop trusting it to do pretty quickly. Speaking engagements, I have, as far as I know, a whole bunch coming up.

But there is a surprise one down in Jacksonville, Florida. It’s Day of Data Jacksonville. It will be there May 1st and 2nd.

If you want to sign up for the pre-con, which I’m putting on May 1st, you can do that there. Advanced T-SQL. Good stuff all around. And then after that, I come home for like a day.

And then I have to do other stuff. I will be in Chicago May 7th and 8th for Pass On Tour. Again, pre-conning there.

Advanced T-SQL. You should buy tickets and go. It will be there. It will be fun. I will be at SQL Day Poland May 11th through 13th. That pre-con is on a Monday. I’ve never done a pre-con on a Monday before.

So that will be an interesting experiment. And then I will be at Data Saturday Croatia June 12th and 13th. And then I guess I’ll be home for a little bit, toddling around. I don’t know.

I’ll figure some way to make myself useful. And then I will be at Pass Summit in Seattle, Washington November 9th through 11th. So you heard all that here first.

But it is still April. It is not quite May. It is not quite travel season. My green screen is being a real turd today. So we’re just going to deal with a little fuzz at the bottom and near my arm over here. I don’t really know why that’s happening. But I don’t change anything. I feel like one of my clients when I’m like, nothing changed, but it’s all screwed up. That’s me right now.

Nothing changed, but it’s all screwed up. That’s fun. Anyway. Yeah. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April.

April. April. April. April. April. April. April. April. April. then a bunch of things in dead last that’s it so if you want performance monitoring you get my thing don’t don’t pay anyone $1,000 a server for a year for the garbage that they peddle to you it is bordering on fraud to call any of that SQL Server monitoring let’s see what we got here that did it did uh is option use hint query plan profile still useless you useless as the day it was born just like all those third-party monitoring tools I have never never had a good use for that one but anyway it’s like it was one of those things where it’s just like oh cool it’s just that by the time you like I go to run it it’s like I don’t need it you can just do something else our filtered index is safe for super high-right systems or is that asking for trouble well that depends on which side of the filter that right is happening on doesn’t it I mean you know uh if it’s if it’s absorbing rights it’s no different from another non-clustered index but if it’s not absorbing rights it’s a bit safer than isn’t it it’s like you’re not inserting updating if like the way that you’ve crafted your filtered index precludes it from some modifications and well it’s kind of that index is lucky day huh dodge that bullet so um I’m not sure where I’m not sure where where that question came from that’s an interesting one why do I sometimes see a seek logical reads than a scan all right so we’re gonna have the talk again no one gives a crap about logical reads they’re a stupid metric I guess if you were tuning queries in 2008 you might care about logical reads for some reason I’m not sure why but maybe maybe if you still have that mindset you could look at you would look at logical reads and say gosh this query does a lot a lot of logical reads I wish it did fewer logical reads and maybe you would do something about that but the the things that you should care about when you’re tuning a query from from a performance point of view you should care about CPU and duration no one knows how fast or slow a logical read is no one if you reduce logical reads uh no one’s going to come thank you for it if you make a query faster if all of a sudden a query goes from three seconds to 500 milliseconds or something just to throw some numbers out there someone might thank you but no one’s gonna say ah bang up job fellow you you fixed all those logical reads and hardly any of those now it’s dumb right uh but you know uh it’s drives you bad you can you can’t break people of these habits can you next thing you know like why is ple sometimes lower when SQL Server does a seek who cares it’s not or not worth spending your time on SQL Server huh I know I know that I know that one sometimes picks nested loops when the row count is huge what makes it choose that uh well you know the the usual garden variety stuff you might be doing something that inhibits uh cardinality estimation in some way you might be using a table variable or you might have local variables somewhere in there uh you might be suffering some other form of cardinality mis-estimate that would uh that made the tricked SQL Server into uh doing uh nested loops over a different type of join and and sometimes you you might have written your query in a way that excludes certain join types from being available to the optimizer for example both hash and merge joins require at least one equality Predicate and if you don’t have at least one of those if your joint let’s just make it easy let’s just say uh you want to know if like a date column in one table is between two a date column or two in another table uh there’s no equality Predicator it’s just it’s greater than less than maybe with an equal greater than new equal there but uh sql server cannot use a hash or merge join in those cases so that that might be another reason um but if you if you really want to like you know show me show me show me how you do that trick uh you can you can upload your query plans to plans.erikdarling.com uh it’s it’s my site for getting free query plan analysis right in your browser it all runs client side but you can you can optionally choose to share those plans so if you wanted to share a plan with me and you wanted to say ask me a question about a specific query plan and say why is sql server doing a nested loops join here that would be a very adequate way of doing it so you could try that on for size if you’d like but anyway it’s tuesday i’m not used to that uh and and we we have answered five questions and so office hours is done now where i’m gonna go do something else perhaps perhaps perhaps who knows what i might do anyway thank you for watching i hope you enjoyed yourselves i hope you learned something and i’ll see you in tomorrow’s video where i’m going to show you a little something a little bit funny about foreign keys all right thank you for watching

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Contributing To My Open Source Projects

Contributing To My Open Source Projects


Chapters

Full Transcript

Erik Darling here with Darling Data. And in today’s video, I’m not going to talk about SQL Server. Specifically, I’m going to talk about contributing to open source projects, which is a subject that has become nearer and dearer to my heart. Sort of like the metal shards that Iron Man is desperately trying to keep away from his. But the reason why I wanted to record this is because I feel like people are a little too shy about contributing to them. And the reason why I say that is because, look, if you’re here, you probably know I’ve been working on a couple of SQL Server monitoring tools. We have the Lite version and we have a full version. And I’ve got this Performance Studio application, which is a portable SQL Server query plan analyzer. And I want to, of course, start by acknowledging that these applications would not be near as good as they are if it were not for the reception that the SQL Server community has given them. Both requesting, well, not both, requesting features, contributing code and reporting bugs. These are all very, very valuable things. The part that I want to talk about today is requesting features. Because I love a feature request.
It’s like, cool, this thing could get better. The thing is, when you request a feature, you’ve most likely got some vision in mind for it. You likely have what it would look like in a perfect world to you when you think of this thing. I don’t have that when you request it. And so when someone requests a feature, the very first question that I ask them is, hey, do you want to contribute the code to do this? Sometimes they say yes, and that is absolutely fantastic. The process for that is, of course, they, you know, fork the repo and they either do the work themselves or they have their little robot companion, the way I have my little robot companion do a lot of the coding work. And, you know, they submit a pull request for it. And then I have my robot companion review their code changes. Sometimes there’s a few rounds of back and forth between the robots getting things right. Sometimes there’s some additional testing that has to be done. But overall, it’s a pretty smooth process. And the reason why I ask that is getting back to what I just said. When someone has a feature request in mind, they likely have the best idea of what that feature should look like. Right? So when I ask that question, though, sometimes people say, no, I don’t know how to code or no, I’m not familiar with the code base. You don’t have to be with me. Right? You can have your robot companions work on stuff and do things. It doesn’t have to be handwritten because mine certainly isn’t.
Right? I’m not sitting there writing every initial code that goes into these applications because I’m not a front end person. I am not a C sharp person. I’m a database person. And I have a very good vision or rather I had a very, very specific vision of the tools that I wanted to build. And I use the robots to help me do that. Without the robots, none of this would be possible. Right? Like I’m not learning C sharp and XAML and all the other stuff. I’m like, I’m using these well-defined languages and I’m saying, hey, robots, use these languages to build around the thing that I do or the things that I know, which was essentially the original set of sort of data collectors that feed all the charts and graphs in the monitoring tools. Right? Like all the stuff that comes in here. These are all things that I had written scripts and procedures for to gather over the years. And these are all the things that I knew I wanted to see in pretty charts and graphs.
So that that’s really how the whole thing started. There are some GitHub repos out there that probably do a lot more mission critical work for like, for example, the Linux kernel. It’s a pretty important one that wouldn’t, that probably don’t accept pull requests written by our little robot companions. I don’t have that particular brand of, um, I don’t know, uh, I guess, uh, importance to, to these because, uh, at the end of the day, anything that comes in here is fixable and is not going to like bring down infrastructure. Right? Like it’s like, you’re not going to see like the electric grid go out because like you, you, because you, you submitted a code that has a bug in it. Right?
It’s just something that can get fixed and cleaned up and it’s not a big deal. This is pretty low pressure stuff. It’s a free SQL Server monitoring tool. And while I do want it to be bug free and I do want it to be a great monitoring tool that remains free and remains out there in the community for everyone to be able to use and not have to give oodles and gobs of money to these monitoring tool companies. Right? Like I have a very specific mission. Um, I’m not making money off this, right? Uh, if anything, I’m probably losing money off this.
Uh, but I get paid in what I call petty cash and that petty cash is every time someone cancels a contract with a monitoring tool vendor and starts using this instead, I’m a happy camper. So please, if you have features, if you have things in mind that would make this better, don’t let the fact that you’re not a coder or you’re not a programmer or maybe you’re just a DBA hold you back. You can use your robot friends to implement that feature, get it right the way you want it and contribute it. I’m totally okay with that.
You don’t have to know everything inside and out. You don’t have to know every line of code. It doesn’t have to be perfect the first time. I’m happy just getting the feature submission and having something new in here that is helpful to people. So if you’re out there and you’re, and you’re looking at the GitHub repo and you’re thinking, wow, there’s a lot of stuff, but man, if it had this one extra thing, it’d be great.
And you’re like, you want to contribute and like, you want to make the feature request. That’s cool. Like sometimes I’ll think it’s cool enough to pick it up and do it on my own or think it’s easy enough to pick it up and do it on my own. And maybe it’s like something that doesn’t require a lot of interpretation, right?
Like sometimes there are some things that come through and I’m like, oh yeah, no, I get what you want that I can do that. But there are other times when it seems like more of a, I don’t know, sort of like someone else had their own very, very perfect idea of something when it came to their mind. But maybe like the description of it wasn’t quite clear or maybe there’s some, you know, room for interpretation in there.
I want you to be able to contribute to this. I’m not holding anyone back. You want to use your robots? That’s fine, right? It’s, again, it’s okay with me. What I care the most about though, is making sure that this is a great tool. The thing about open source stuff, and this has been said by many smarter people than me and probably, actually, you know what?
I’ll just say it the way I’ve always heard it. Open source is free software, but it’s not free development. Every time someone makes a feature request, that’s time that I have to spend implementing it. And while I don’t mind that, it’s also something where if you have a little robot friend and you could implement it, it would be a great help for me.
Because then that’s time that I get to spend doing something else, right? Working on other things. Spending quality time with the people in my life instead of, you know, sitting there and prompting and trying to get things right and going back and forth a little bit. So, if there’s a feature you like and there’s a feature you want, cool, you can make the feature request because, again, I love great ideas.
But that doesn’t always mean that those great ideas are going to get implemented quickly or immediately. If getting them implemented quickly or immediately is important to you, you could consider supporting the project monetarily, which gives me a kind of a little kick in the butt to get things in there a little bit faster, but not a lot of people have chosen to do that.
And again, that’s okay. I get paid in petty cash. So don’t worry about that. But if it’s something that’s really important to you or if this tool is helping you or your company, you could also consider supporting it in that way. Anyway, that’s probably about enough for this. Thank you for watching. I hope you enjoyed yourselves. I hope you’re enjoying this free software.
If you have any, you know, questions, comments, concerns, you have any feature requests, you have any bug reports, you have any ideas, you have any questions about it, GitHub is wide open for you. GitHub is also free, right? You can do that stuff. Maybe you’ll have to sign up for an account there if you really want to do something, but I promise there’s no head biting going on.
Anyway, I’m not quite Linus level of grumpy yet. Maybe someday I’ll get there, but not today. Anyway, thank you for watching. Enjoy your weekend. I believe it’s Friday. At least that’s what the robot calendar is telling me. Anyway, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Query Plan Analysis In Your Browser

Free SQL Server Query Plan Analysis In Your Browser


Chapters

  • *00:00:00* – Introduction
  • *00:00:29* – Free Tools Overview
  • *00:01:13* – Supporting Memberships
  • *00:02:16* – Pre-SQL Server Performance Monitoring
  • *00:04:25* – Browser-Based Query Plan Analysis
  • *00:08:45* – Sharing and Exporting Options

Full Transcript

Erik Darling here with Darling Data, and continuing in my builder phase of life, you know, at least until the robots get too expensive. I’ve decided, I mean, if you’ve been, you know, dealing with me in any way, shape, or form over the past couple months, you’ll know that I’m working on two free open source tools. One of them is a full SQL Server performance monitor, and the other is sort of a portable query plan analyzer. I’ve taken it a step further, because some people seem allergic to downloading and trying things, and I’ve put the query plan, the free query plan analysis, right in your browser. Ta-da! I mean, this isn’t it. I’ll get to it in a minute. First, you have to listen to me talk about other things. Like, maybe you’re like, wow, Erik, all this free stuff, you must kind of know what you’re doing with the database. Maybe we should hire you for consulting, or maybe we should learn from you. There are links where you can do that down in the video description. It doesn’t hurt to click on them. No executable required. You can also, for as little, for as few, for as tiny little breadcrumb of $4 a month, become a supporting member of this channel.
Be real groovy, groovy-goolies. You can ask me office hours questions for free. I mean, at some point, I might have to start charging a quarter for those, but we’ll see how the economy picks up. And of course, if you, something else you can do for free, if you enjoy this content, and you enjoy the things that I do in my life, you can like, subscribe, and tell a friend, so that you get notified when I do other things, and your friends get notified when I do things, and then everyone gets happier together, I think. Speaking of which, pre-SQL Server performance monitoring. Gratis, or gratis, or whatever, however you say it. I don’t know. I can’t do voices.
You know, I tried a Christopher Walken like 15 years ago, and I just learned my lesson. Yeah, again, no voices. Totally free. Totally open source. No email, no phone home, nothing like that. It’s just all the stuff that a monitoring tool should monitor. A modern monitoring tool should monitor in SQL Server. You know, all the important stuff when you need to troubleshoot a performance issue. And of course, if you want to stay really, really modern, I’ve got built-in MCP servers, where you can just point them at your performance data collected over time, and space, and any other measurement you want to throw in there.
And you can have the robots go through your performance data, find problems, surface things, tell you about stuff, and you don’t have to lift a finger, aside from to say, you know, go look at this thing, right? Send them off to war, right? Anyway. Anyway. Again, happy surprise pre-con day. I will be in Jacksonville, Florida. Maybe it’s warm there, finally, because New York is still not warm. It is. What day is it? April something, and it is still cold here. I’m angry about that. So I’m happy that I’m going to Florida, where I’ll probably wear shorts. Not in front of a crowd, but I might wear shorts independently on my own. Anyway.
It’s a vibe, you know? Other places I’ll be, where it will hopefully also be warm by the time I get there. Chicago, Illinois. Dubious. May 7th and 8th. Not sure about warmth. SQL day Poland. May 11th and 13th. I don’t know. Poland sort of has a reputation for being cold.
I’ve only ever seen it in war movies, where I think it was supposed to look bad, so I’m going to just try to be optimistic about Poland being warm by then. Data Saturday, Croatia. June 12th and 13th. I’ve got to imagine that anything in that area of the world by June, I might be sweating by then. And then back to shivering at past Data Summit. New Community Summit. Date in Seattle. November 11th through 13th. 11th. 9th through 11th.
There are so many other 11th on there, I get confused. Anyway. Let’s talk about free query plan analysis right in your browser. Now this is not an advertisement for Microsoft Edge. Because I hate Bing and this browser is honestly, and it’s fine if you’re into that sort of thing. But if you go to plans.erikdarling.com, that is a brand spanking new subdomain under my website, under the Darling Data umbrella of websites, which is really just erikdarling.com, you will find this lovely interface.
And under that interface, you can either paste in plan.xml or you can upload a plan file. Now, I know what you’re thinking to yourself. There’s already a paste the plan. Yes, there is. But there’s a slight difference here. This one actually analyzes your query plans, and it does not, by default, save your query plans. All of the analysis is done in your browser.
It does not leave your browser. It does not go out into the world. There’s no plan file saved anywhere. You can do that, but you don’t have to do that. So, if you go to choose file, I’m just going to choose a query plan that I have saved immediately here.
You get back information about your query that looks a bit like you would get from the Performance Studio application that I’m building. Up here, when we look at this stuff, we can see some runtime stats about the query, right? We can see, well, oh gosh, golly and gosh, there was a missing index and no parameters were passed in.
We have this lovely graph down here of the wait stats, right? So, we can see, like, sort of graphed out what our query waited on. We don’t have to go digging through plan XML anymore. And then, down in this section below, we will have all the warnings generated by our query plan things that you should probably pay attention to, right?
If you look through this stuff, that’s all critical warning, yada, yada. And, of course, we get back our, well, there’s also the query text in there, but then we get back a lovely graphical representation of our query plan, right? Just like in Performance Studio, right? Stuff over here, look at all this good stuff.
And then, down at the bottom, we have the full text analysis, just like you have in Performance Studio, where I break down everything in a way that a human can hopefully understand. So, if you ever want an opinion on a query plan, and you are not allowed to maybe share it publicly on a site like StackExchange or StackOverflow, or maybe you’re not allowed to paste the plan somewhere where that plan is going to get saved off somewhere, then that’s one thing you can do.
There are also a couple other things in here that are neat. There are two buttons up at the top. One of them is to export HTML.
So, if you want to save off, like, all the HTML from here, you can do that. See, we open that up, and it looks just like it did. Well, I mean, this part’s a little bit different, because I don’t want to, like, export images to you.
But this part, so we have, like, the operator tree from here, and we have all the full text and everything else down there. So, if you want to share your plan, you can hit this button that says share. And when you hit this button that says share, you will be able to choose how long that plan stays saved on my little server for.
It is a secure server. I have done my best to make it unhackable. I’m not saying it’s unhackable.
I’m not challenging anyone in the world. I’m just saying I have taken reasonable precautions against anyone breaking in there. And it will say, you can do this, and you can have it expire after anywhere between one day and one year. And if you hit continue, it will say, are you sure?
You’re okay with this, right? Like, if you click twice, it’s not an accident. All right?
I’m just saying, like, once, oops, twice, that’s on you. But even if you’re like, I got confused, there’s a button up here where you can immediately delete it, right? If you say, ah, I was confused and drunk.
I was off my keister that day. I didn’t mean to put that there. You can, if you’re like, I just messed up, you can immediately delete it, and it goes away, right? And that URL, just the file is gone.
But then if you want to, but if you actually want to share it, then you will get this URL that you can share this plan with whoever you want, with all the analysis baked in, so that you don’t get, I don’t know, you don’t have to, like, explain anything. All the explanation is here. And you can say, how do I fix it?
What do I do? What did I do wrong? How did I end up in this place? How did I end up so doomed and damned with this query plan? And someone else can read it and look at it and say, here’s what I do. You can, you can even ask me, but that, that, that costs, if you want to do that, that’s this consulting link over here.
And that’s where you can, you can, you can get help with your SQL Server from, from me, right? Or with your query plans from me. That’s, that, that, that part is not free, unfortunately.
I cannot, cannot dedicate that much time. But this stuff I’m happy to do and to help people out, right? All right. Anyway, free query plan analysis, right in your browser, doesn’t leave your browser.
Sharing is optional, storing it is optional, and you get to choose how long you want to store it for, right? So you can even, you can even like share it with someone. And as soon as they get it, you can immediately delete the link.
Or if you, you’re uncomfortable with that, you can just export the HTML yourself and just share that with someone. So it doesn’t live on the internet, right? I don’t see anything from this, right? There’s nothing here that I get out of this.
All right. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you will, you will, you will start getting some free query plan analysis right in your browser. All right.
Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Introducing sp_QuickieCache: 80/20 Plan Cache Analysis to Find Your Worst SQL Server Queries

Introducing sp_QuickieCache: 80/20 Plan Cache Analysis to Find Your Worst SQL Server Queries


Chapters

  • *00:00:00* – Introduction
  • *00:00:30* – Recent Video About 80-20 Analysis
  • *00:01:00* – New Store Procedure for QueryStore
  • *00:01:30* – Useful Links in Description
  • *00:02:06* – Free Resources and Services
  • *00:02:34* – SQL Server Performance Monitoring Tool
  • *00:03:00* – Built-In MCP Tools
  • *00:03:11* – Query Data Protection
  • *00:03:28* – QueryStore Benefits
  • *00:03:40* – Surprise Pre-Conference
  • *00:04:00* – Upcoming Conferences
  • *00:04:58* – SPQuickieCache Store Procedure
  • *00:05:03* – SPQuickieCache Overview
  • *00:05:20* – High-Impact Parameter Analysis
  • *00:05:31* – Query Scoring and Tuning
  • *00:06:45* – Detailed Query Information
  • *00:07:17* – Plan Cache Insights
  • *00:08:09* – SPBlitzCash Database Checks
  • *00:08:26* – Find Single Use Plans
  • *00:09:05* – Find Duplicate Plans
  • *00:09:37* – Query Store Alternatives
  • *00:10:17* – Thank You and Future Videos

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we’re going to talk about, let’s see, did it happen right? It happened, right? I had to take, if you pay attention to the goings on of Erik Darling and his Darling Data life, there was a video recently about adding sort of 80-20 query analysis into SPQuickieStore, which was very, which continues to be a very, useful thing for me. Uh, it’s in the latest release of the, my, my SQL Server, uh, performance troubleshooting scripts, uh, doesn’t have a snappy name, no first responder kit, but, you know, someday, someday I’ll figure that out. Um, but, I was working with someone recently who, uh, was, was philosophically opposed to QueryStore. Much, you know, there are various reasons why, some, some, some, you know, uh, some right, some wrong, but, you know, I just refused to turn it on, and we needed a way to look at QueryStore. And I wanted to use a similar way, so, uh, I ported the find high impact section of code over to a new store procedure, and I also added a couple bells and whistles to it. So, uh, we will, we will look at that, and we will also, uh, actually, no, we’re, we’re only going to do that. That’s, that’s all we’re doing today. We don’t have, we can possibly cram another iota of interesting things into this.
Uh, down in the video description, you will find all sorts of helpful links for, uh, our lives, to make our lives intertwine wonderfully financially together. Uh, you can hire me for consulting, you can buy my training, and for as few as $4 a month, you can, you can, you can feed a starving consultant, um, uh, some tiny fraction of a New York cappuccino or espresso. Uh, for free, though, you can ask me office hours questions. Maybe you can ask me how to draw other things in SQL Server Management Studio, and we can try that. Uh, and of course, if, if your enjoyment and appreciation of me, uh, is not, is not measurable in money, uh, you can always like, subscribe, and of course, uh, tell a friend about, uh, whatever it is you find useful here.
Or, I don’t know. Maybe you just like Adidas t-shirts. Right? Could go either way. Uh, for free! Also, you can download my SQL Server performance monitoring tool. Um, no, no, no weird telemetry or email sign-up required. I don’t, I won’t spam you with crap. Uh, it’s just all the stuff that I care about monitoring, uh, performance-wise in a SQL Server, uh, collected and, and made into beautiful charts and graphs with all sorts of very user-friendly ways of, uh, looking at that data and getting root cause analysis of your SQL Server performance issues. Uh, there is, uh, also a built-in set of MCP tools that you can have your robot friends talk to about your performance data and only your performance data.
And it’s a magnificent thing in the world, right? Because you, you don’t have to let them loose into your, into your production SQL Server to, like, start running all sorts of crazy queries. They just touch the performance data that’s already been collected, and that’s much easier for them to understand, too, right? Because we, we retain all the important details. We don’t lose them, right? When various things go away and age out of SQL Server, we collect it all over time. It’s beautiful. It’s a wonderful thing, and it’s free.
I have a surprise pre-con. Um, I don’t know, I guess someone else’s airline tickets got canceled or something. I don’t know. Maybe they, sick, maybe their parakeet died. I don’t know. Uh, but I will be at Day of Data Jacksonville, Florida, May 1st and 2nd. Boy, howdy. Look at that. Um, and I will be doing my advanced T-SQL pre-con shenanigans there, so, uh, if you want to come see me in Florida, it’s been a long time since I’ve been to Florida.
Apparently, I’ll be there. Alright. Uh, other places in the world I’ll be. Golly and gosh, look at all these wonderful people who decided to let me, like, be in public, in front of people, and, uh, not stuff a sock in my mouth yet. Uh, I’ll be at Pass on Tour, Chicago, uh, May 7th and 8th. I will be at SQL Day, Poland, May 11th and 13th. Uh, I will be at Data Saturday, Croatia, June 12th and 13th. And then I will be at, in Pass Summit, Seattle, uh, November 9th and 11th. Ah, man, Pass Summit, Seattle. There we go. I am, I am not even drinking today yet.
That’s the funny part. Alright. Cool. With that out of the way, let’s look at this new store procedure. Uh, I think I have to go to Management Studio. Yeah, I remember what that looks like. Alright. Cool. So, uh, this is, this is it. SPQuickieCache. Pay no attention to the terrible red squiggly underlines.
Um, and this is what you get back. So, if, if you have been using, um, uh, SPQuickieStore and you’re on a relatively new version of it, you will find that there is a high-impact parameter, which gives you a similar set of stuff. Uh, the whole idea here is to find queries that hurt you across a variety of metrics and sort of score them and, uh, present them to you in an order in which you should tune them.
Right? Because that is a, a wise and wisdomful thing to do. So, uh, this is sort of what you get back. Um, let’s zoom in on the results here. Up at the top, we tell you how many plan cache entries and all this other stuff we captured. And, um, you know, it was good stuff, right? Like, good information up here. Uh, and then down in this section, this is where your 80-20 queries live. Um, I have taken it upon myself to do some neat things in here, like tie statements back to the procedure that they live in. Uh, I think if we scroll down a little bit further, there’s another one, but I might be wrong because I might’ve run some stuff between now and then. Yeah, it’s not in there, but, um, you know, like for like this line, you get like just create procedure. And there are some things that we don’t get back at the procedure level that we get back at the query level. So, uh, there’s all sorts of neat stuff in there, right?
Like, uh, let’s see up here, we have create procedure, any word, right? And this is like, misses the query hash, but down here we see like the queries from that store procedure, right? So that, that’s, that’s, that’s how it works. But then over to the right, we have some other things too. Uh, we have this impact score and we have, uh, the areas in which queries returned, uh, high signals, right? Uh, so, um, this one up here up at the very top spilled a lot, right? That was how many times it spilled. And if we look over here, uh, we’ll even see, uh, the total spills and the stuff like that. So, and, you know, max spills and all these other things. So we get some, some high level information back about like how much damage this thing is actually doing. So we can, we can make smart choices. Um, down here, uh, there is some information about the plan cash, right? So, um, like, like severe plan duplication and single use plan bloating, plan cash stability, right? And single use plans and all these other neat things at the database level. Um, I also added some of these database, well, I was working on this.
I added some of these database level checks as well to SP blitz cash. So if you are an SP blitz cash fan or aficionado or whatever, um, you’ll, uh, I think at some point when Brent does another release, you will see, uh, the database level breakdown of, um, of, uh, duplicate plans and single use plans. So if there’s like a single database on your server, that’s responsible for all this, it used to just be at the server level, right? And just be like, there’s a lot of plans for this query, but now it’s like for the database, like, look how bad this is. Now, just like all my other store procedures, uh, there is a help parameter where you can see how everything is set up and gets used and that it’s MIT licensed and that, you know, we, um, you know, give this stuff away for free. I give this stuff away for free. Uh, but the important part here is that it lists out all the parameters. And if you look in here, there are a couple of neat little doodads, like find single use plans and find duplicate plans because a lot of the times, Oh, we don’t need you. Uh, you’re, you’re a quickie store. You, how’d you sneak in there? Um, so a lot of the times when, you know, you see alerts like, Oh, there are a lot of single use plans or, Oh, there are a lot of duplicate plans. Uh, you’re like, well, where are they? How do I, you’re going to make me go find those? How do I, how do I do it? I’m hope lost and hope hopeless and helpless. How can I possibly be a more self-sufficient person in the world? Don’t worry. You can be, you don’t have to be, you can be codependent on me. So if we run this query, uh, or rather run SP quickie cash with find single use plans, I will return to you single use execution plans, and I will give you a command to get rid of them. Right? So that’s, that’s cool there. And then there’s also find duplicate plans, and this will give you the top 10, uh, most duplicated plans in your cache. Right now, it appears this TPC database is really, uh, just a nightmare mess of, of things, right? Look at this 1600 plans and 1600 executions, right? And the, the story doesn’t look good there. Right. But, um, you know, this will help you maybe find queries that need parameterizing, um, or maybe it will help you turn on force parameterization for an entire database to avoid the problem. But either way, uh, you get some useful information back there. Uh, this store procedure, just like all my other store procedures are available in my GitHub repo. That’s at code.erikdarling.com. Uh, again, type that out for you. So you can be even more codependent on me. You can be codependent on my code code.erikdarling.com. You can get this here. Uh, you can start using it. You don’t have to turn query store on for it. Um, uh, so that’s, that’s cool too. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you over in tomorrow’s video where we will do, I don’t know, something stuff. We’ll figure it out.
I do all this stuff at the last minute anyway. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Can you teach us how to draw a hotdog using SQL Server’s geometry?

Can you teach us how to draw a hotdog using SQL Server’s geometry?


Chapters

https://gist.github.com/erikdarlingdata/20df5b8604d3673f474de798120a891f

Full Transcript

Can you teach us how to draw a hotdog using SQL Server’s geometry? Gee, that would be swell. Well, you know, who am I to deny a reasonable request? So here we go. Here’s SQL Server Management Studio. Here’s a bunch of geometry. Here’s a hotdog. I didn’t pick the colors. I don’t get to pick the colors. These are just the colors that SQL Server Management Studio chose. when it drew the object. It’s a bit of an 80s hotdog, which I appreciate. If I were walking in a mall food court in our glorious decade of the 80s, when everything was better, and I saw a neon sign or some other signage that had a hotdog that looked like this on it, you can bet I would have spent probably at that time about 28 cents on a hotdog. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 60

SQL Server Performance Office Hours Episode 60



To ask your questions, head over here.

Chapters

Full Transcript

Erik Darling here with Darling Data, and in today’s video, Monday has thrust itself upon us, and it is time to answer not one, not two, five office hours questions submitted by you, the greater community. Before we get into that, down in the video description, there are all sorts of wonderfully curated, helpful links for you to interact with me in ways that we will both enjoy immensely. You can hire me for consulting, you can buy my training, you can become a supporting, paid member of the channel for as few as $4 a month. And if you don’t find that you get $4 a month worth of joy or information or entertainment out of this channel, you can do some other stuff. Like, I don’t know, you can ask me an office hours question like we’re doing today, and you can, I don’t know, maybe that is worth $4 a month to you.
You can also like, subscribe, and tell a friend. If your enjoyment of this channel is somewhere between one of whatever the smallest unit of your local currency is and $3.99 of your local currency, you can just do something like that. You can say, well, you can say, well, you know, it’s worth like $0.99 whatever’s to me, like cents of whatever your variety you have and say, oh, I’ll send this link to someone who might like it. You can also, for free, download, you can buy it for free, my free SQL Server monitoring tool. Totally free, totally open source, no email, no phone home, no nothing weird.
It just runs a bunch of stuff and gathers a bunch of metrics that I would look at if I were a performance monitor, which now I am, and it puts them in pretty charts and graphs. And you can click through those pretty charts and graphs, and you can get to like the root cause of performance problems on your SQL Server. And if you prefer to have your robot friends do the heavy lifting, then there are optional built-in MCP servers you can use in order to have them do the work for you and give you some analysis on your performance monitoring data.
Not just them going out and running crazy DMV queries on your production server. I have a new surprise. Hey, look at that. A new surprise pre-con. Day of Data Jacksonville had one of their presenters drop out, and so I am taking their place.
Silver Metal Eric, coming your way. I’ll be doing a pre-con on advanced T-SQL performance tuning. And so you can come see me there if you are in the greater Jacksonville area. So that’s May 1st and 2nd. And then I’m home. Oh God, my kids are going to grow so much.
I’ll be in Chicago May 7th and 8th for Pass on Tour. I will be in Poland May 11th through 13th for SQL Day Poland. Crazy. Then I’ll be home for a little bit and I will be at Data Saturday, Croatia June 12th and 13th.
And then at least as far as I know, finally at Pass Data Summit, Seattle. It’s a community summit. Comumity. November 9th through 11th.
So all that going on. But for now, let’s answer some champion baseball office hours questions here. All right. Zoom it. Are we going to do it? Oh, we did it. There we go. Where can I download your awesome database AI art?
You can’t. That’s for me and me alone. You can take a screenshot of it if you want, but I’ll be in it sort of. But I don’t know. Maybe maybe that makes it more memorable.
All right. Old darling data hunting your screenshots. But, you know, I don’t know. I’d make a calendar, except they’d all be a year late. And I know how you people are with money.
Anyway, come on. Zoom it. And, you know, free tools. What can you say about them? Jeez Louise. On a hundreds of million row table, why does update statistics sometimes run multi-threaded and sometimes single-threaded, even though the cost is way above the cost threshold for parallelism?
Same reason that happens for a query. The optimizer says, this one’s going to go parallel. This one’s not.
Cost threshold be damned. You might induce it to, you know, what do you call it? Use a parallel plan by maybe choosing a higher sampling rate. That might work sometimes, depending on how the optimizer feels that day.
Unfortunately, there just aren’t as many tricks for this as there are with, like, you know, inducing a query to go parallel. Right? There are things you can do there, but this, you know, kind of stuck a little bit. All right.
This is a long… I had to start charging by the sentence here. When I use the legacy cardinality estimator for a query, the estimates only change a tiny bit. But the query goes parallel instead of serial.
Same stats, parameters, etc. The only significant cost-related difference is that the estimated number of rows without a row goal for the most IO heavy node of the plan goes down. Does any of that match your experience with the legacy cardinality estimator or trying to encourage parallelism?
I’ve got to be honest with you, it’s a very specific situation. I’m not sure that I have encountered your very specific situation. I’m not sure what other local factors apply to your very specific local situation.
Does that match my experiences? No, not exactly. Not in either case. But if it’s working for you and that’s what you want to happen, I’m very happy for you.
Like, have you ever tasted my mom’s cooking? Can you realistically tune your way out of a terrible data model or is that just delaying the inevitable? Yes, you can tune your way out of a terrible data model.
It is just annoying. You know, but it really does depend on in what direction your data model is terrible. You know, like, you know, the most common one that I see is like the overly wide table.
It’s like 200 columns and you’re like, oh, we’re going to need some wider indexes if we want to deal with this. But, you know, usually it comes down to indexing when you when you need to do that. You know, if you ever suggest changing a data model to people, they just start crying because of the amount of things in the application that would have to change and get tested in the whole works.
You know, it is usually incumbent upon us, the data query performance tuning community to deal with the malfeasance of whomever designed these terrible data models and find some way to accommodate them with our our talents and skills. You know, you can, but it sucks. When do you decide a query is too complex and needs to be broken into multiple steps?
Well, usually about the time that it starts not performing well and usually about the time that it starts not performing well because SQL Server does not fasten the nuts and bolts together in the right places. However, there are different ways of approaching this, of course, you know, temp tables are a very nice materialized mechanism for, you know, material like like sticking a portion of some query into an object that SQL Server can then derive statistics on and you can even index it. And so, you know, like my decision is basically, is this query too slow?
Yes. Is it too slow because SQL Server is not figuring something out correctly because it’s complex? Yes.
Then it is time to do it. Usually the appearance of more than one CTE, especially when those CTE start getting joined together is a good sign. Sometimes it is, you know, if there’s a bunch of derived joins and, you know, stuff like that going on with it.
Other times it might just be, you know, some arbitrary number of sub queries in the select list or in the where clause or even in a join clause. Right. But the decision always starts with, is this query performing acceptably?
And if the query is not performing acceptably, then it could be due to complexity, but it could be some underlying pathological issue as well. Right. So sometimes, sometimes it is a break, breaking the query up is the best possible option.
Other times there are, there are other things, other steps that you’re going to have to take regardless of whatever small steps you, you, whatever, whatever multiple, however many multiple steps you choose to break it up in. And I think, you know, at least the nice thing about breaking a complex query up into multiple steps at some point is that it gives you very specific information and feedback about which portions of the query require your attention. Right. Because there might, there might be some portions that, you know, run very quickly on their own.
There might be some portions that run very slowly on their own. But sometimes it really does come down to like putting those two portions or however many portions together is what causes an issue. Right. So for me, it’s really just, you know, it starts with is performance acceptable?
If not, what steps do I need to take to make it perform acceptably? Right. So that’s, that’s pretty much it for me there. All right. Thank you for watching.
I hope you enjoyed yourselves. I hope you learned something and I will see you in tomorrow’s video where I’m going to answer one very special office hours question on its own. All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Partitioned Views

Learn T-SQL With Erik: Partitioned Views


Chapters

Full Transcript

Erik Darling here with Darling Data. And we’re going to finish off this Friday by talking about partitioned views. And look, there are a lot of things I could say about partitioned views that are great and grand and that have come in handy for me over the years in ways that I’m like, wow, thank you partitioned views. Thank you for not being normal table partitioning. Thank you for existing. But mostly in this video, I just want to kind of go over a couple of things that, you know, are not fun about partitioned views. I want to show you some things that are fun, but some things that are not fun are things like, you know, like the setup of them is like a bit arduous. And if you want your partitioned views to be writable, good luck. It’s a hard road. And I’m going to talk about some of the things that disallow a partitioned view from being writable as well. So that’s our lesson from today. Sorry for pausing on the big reveal there. Apparently, I just forgot to click the mouse. Anyway, down in the video description, if in case you were unaware, down in the video description, you will find all sorts of helpful links.
You can hire me for consulting, you can purchase my training, and you can become a supporting member of the channel. If you don’t feel like giving me any money, well, I don’t like it, but what can I do from here? I can’t exactly punch through a screen here. You can always ask me office hours questions for free. That is a free activity you can partake in. If you would rather hear me answer a question than some robot, I suppose that’s as good a way to do it as any.
Of course, if you enjoy the high-quality, flawless, unmatched content that I produce on this SQL Server channel, the most important SQL Server content that you will ever see in your life, please do like, subscribe, and tell a friend. And continuing in the tradition of you not feeling like giving me money, if you would like to stay on that path, if you’d like to stay that course, you can download my free SQL Server performance monitor. It is wonderfully free. I make the same amount of money whether you use it or not.
So, like, okay, do or don’t. It’s up to you. But it is a replacement for paid commercial SQL Server monitoring tools in many cases, in many ways. And, you know, it just looks at all the stuff that I would look at if I were, you know, coming in to look at your SQL Server as a consultant.
Logs at all the tables, gives you pretty charts and graphs, tells you all sorts of things about your data, allows to get you some root cause troubleshooting stuff done. And if you feel like having the robots do the work for you, it’s got MCP tools built in so that your robots have read-only, well-defined access to the monitoring data.
And they can usually make better sense of that than if you were to just let them loose on a server running DMV queries there and about. Anyway, if you do not prefer the robots, if you do not want to have an MCP analysis of your life, you can come see me out in the world. I will be going places, doing things, talking about SQL Server as long as it’s still alive.
There’s something weird on my shirt that I can’t get rid of. I don’t know what it is. It might just be the light. This might just be an old shirt. It’s hard to tell. Anyway, I will be in Chicago May 7th and 8th for Pass on Tour, the east of the west, I guess that is.
I will also be at SQL Day Poland May 11th through 13th. That’s looking up to be a great conference. I can’t wait to see the Dwarves of Vroslav. I keep hoping I say that right.
And then I will be home for a little bit and then over at Data Saturday, Croatia, June 12th and 13th. Lovely time of year for Croatia, I hear. Distinct lack of dwarf statues around the city of Zagreb, but that’s okay because I’m sure I’ll find other things to be amused by in such a lovely place.
And then, of course, November 9th through 11th, I will be at Pass Data Summit in Seattle, Washington. So these are all things that cost money, so come see me so I can make some money back. Anyway, for now, it is March. No, wait, it is April 1st, right?
Yeah, it is April 1st. Well, today I’m recording this on April 1st. I’m publishing this far down the line because, crap. Ah, the magic is ruined for you, isn’t it?
Anyway, let’s talk about partition views. So this is the first thing, is the arduousness of setting them up, right? So, like, I’m just doing this yearly for the votes table in Stack Overflow 2013. So I’ve got, like, 2008, 9, 10, 11, 12, like, this is my preamble stuff.
But then, like, you’ve got to create a table for each one of these things. And one thing that’s very, very important to do on table create is make sure that whatever your partitioning element is, SQL Server has a valid way of eliminating certain, like, partitioned table elements from your query plans.
It will not take the hint otherwise. So here I’m saying, look, SQL Server creation date is going to be between these two dates, or betwixt, I guess. Why don’t we have a betwixt keyword that just says, that acts like greater than, equal to, and less than?
That would be nice. That would be a good extension. Connor Cunningham, if you’re out there listening, we’re going to have between, which is going to stay the same, and we’re going to have betwixt, which is going to do that, all right?
Write it down. All right, so you’ve got to do that for each table, right? You’ve got to give SQL Server a way to know what data is going to be in what table so that, you know, like, if you want to get partition elimination, you can do that, all right?
And then you have to create a view, and you have to list your columns. Now, it is kind of cool where, like, if you do this, and you have, like, different columns and different tables, you can make that work with just, like, placeholders.
So that’s fine, too. Just be careful with null, because I learned from Kendra Little that SQL Server, if you put, like, some column equals null into a view, or if you, like, select null into a temp table, SQL Server defaults to calling it an integer.
So that might, you might find that disagreeable. So make sure that you are strict with your data types if you’re expecting a different one, other than integer for your placeholder columns there. Anyway, after you create your view, then you have to get your data into these various tables.
That’s all quite boring stuff. And if you’re wondering why I created my tables as heaps, it’s because I’m going to start with these tables as unique, having unique clustered indexes on them.
And I’m doing that for a reason that we will expand upon in a moment. So if you want your partition views to be writable, there are some prerequisites.
You need to have a unique clustered primary key. This is why I’m starting with just a unique clustered index, because it must be a unique primary key.
It must be a clustered primary key. That includes the partitioning column. You need non-overlapping check constraints on a partitioning column, which means there are no gaps, no overlaps.
You need identical schema across all member tables, same column names, types, and nullability. You need, they all need to be in the same database. You cannot do that across, across databases.
If you want your partition view to be writable, that would just be insane. And you must use the union all syntax, not union.
There are some limitations as well. Like you can’t have an identity column in any member table. You must strip these tables of their identity-ness. You can’t have a computed column as a partitioning column.
And you can’t use any timestamp columns. That makes the view non-updatable. So screw you timestamp columns anyway. Constraints?
Well, you can’t have default. Well, yeah. So you can’t have default constraints. You can’t have cascading foreign keys. And you cannot put triggers on member tables. You can only put instead of triggers on the view.
You can’t use the default keyword or an insert or update. So that’s fun. You must provide all columns and insert statements.
You can’t do bulk operations like bulk insert or BCP. And you can’t have any self-joins with the view or member tables when you are, if you want this view to be writable. So like if you were trying to do like, oh, I need to put data in here that doesn’t exist.
And you like write, ah, insert, blah, blah. Where not exist, select, blah, blah. That SQL Server will say no. There are all sorts of limitations around data types like XML and other lobs.
User-defined types, CLR types, not supported. Not that anyone uses, though. That’s insane. And some other like just weird random trivia stuff. No full-text indexes, no indexed views.
Check constraints must be trusted and enabled. And all member tables must have the same number of columns. So if like some of the flexibility that you get with indexed views is reduced quite significantly if you want that partitioned, sorry, partitioned views is lost a bit if you want that partitioned view to be writable, not indexed views.
Though those have many restrictions as well that I find unfortunate. But if we look at execution plans, and I realize this might not be connected to anything, but that’s okay.
We’ll get there. Ah, see, I knew something weird would happen. There we go. Anyway, it ran. It worked. All right, let’s give that a second run just to make sure. All right, so we’ve hit.
You can see that we are touching our partition view here. And this partition view is all those tables. But our query plan only shows one table being hit, right? Because we’re only looking at 2010 and 2011.
And because I went and created all those check constraints across all of my, all the tables in the indexed view, SQL Server knows precisely where to go. Right?
And if we do the same thing with a multi-year query, then we will see that we hit more than one table, but we did not touch all of the tables. We just hit the necessary ones, which were 2010, 11, and 12. So SQL Server’s optimizer can be very smart with how it directs queries to index views, even if those queries are parameterized, right?
So if we look, this is a store procedure, which accepts a start date and end date, and you can probably guess where they end up. This thing basically just runs this query, right?
We run that, and we’ll see in the execution plan that some of these lines actually have things, actually have data flowing through them, like 2010, 11, and 12, but the rest of them do not.
So this is sort of dynamic partition elimination, right? So if I change, well, you know what, if I change this to 2012.01.01 and 2013.01.01, and we run this, the execution plan, you’ll see that SQL Server is able to dynamically just get us to the 2012 table.
So the optimizer is pretty good about this. Where that can get confusing is if you look in like a cached plan, so like either in the plan cache or query store, or even if you just get like an estimated plan for this, like you’re gonna, like, you notice what we see here.
We see three thick arrows, right? And that’s because when we cached in, or we compiled and cached a plan for this, it was for 2010 through 2013, right?
So we see three thick arrows here, right? So like cached and estimated plans for these can look pretty confusing because you’re gonna, like, they might look like they’re hitting all of the tables and doing all of the work when they are really not.
So the one point that I wanted to make a little bit with this is if we try to update this, because remember, I only created unique clustered indexes on my member tables, right? So if I try to run this update, it’s gonna fail, right?
And it’s gonna say it’s failing because a primary key was not found on, well, it says just 2008, but really none of the tables have primary keys. So if I create primary keys across all of those tables, right, which doesn’t take too long, you know, they’re all pretty, fairly small tables, and we already have an index on the columns that we’re primary keying, right?
So this is all pretty quick, but, and this is why I try to tell people, trying to make these things writable is often not worth it, right?
It is not worth the effort. Just like, it’s much, much, you’re much, much better off spending the time and coding efforts to just hit the right table somehow, like use dynamic SQL or do something else, right?
Like an instead of trigger that does things because like, you know, like we run this update, man, it’s not fun, right?
So yeah, making these things writable, again, not worth the mental effort and then modifications against them, man, they’re not a lot of fun either, right?
Like if we look at the query plan for this, man, we heave-hoed a lot of work into this one, right? There’s a lot of spooling going on, right?
This thing is a big, wide mess. Even though we, like, it’s like, okay, all right, well, we’re hitting the clustered index on just two tables, 2011, 2012. Man, oh man, we do a lot of work, a lot of work here, right?
It is not worth your time and effort to make the partition view writable. Please just spend your time either writing an instead of trigger or writing, you know, some sort of dynamic SQL that touches the right table based on user input, hopefully sanitized user input, of course, but I don’t know.
You can’t have everything unless you hire me. My rates are reasonable. Anyway, this is just a snippet of some of the class material from Learn T-SQL with Eric.
You can, of course, purchase the whole course via a link down in the video description. There is even a coupon code attached to that.
So if you have enjoyed anything that we’ve talked about today, you can learn much, much more from the entire corpus of class material. Anyway, thank you for watching.
I hope you enjoyed yourselves. I hope you learned something. I’ll see you in tomorrow’s… Oh, no, I’ll see you in Monday’s video for Office Hours. I lied to you. I apologize. All right, it’s a bad way to end things.
I should say I love you. I’m sorry or something like that, right? All right, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Partitioning and Column Store

Learn T-SQL With Erik: Partitioning and Column Store


Chapters

  • *00:00:00* – Introduction to Partitioned Columnstore Tables
  • *00:05:02* – Querying Data and Segment Elimination
  • *00:10:37* – Vote Type ID and Common Data
  • *00:13:02* – Conclusion and Future Topic

Full Transcript

Erik Darling here with Darling Data, going through some more of the Learn T-SQL with Erik snippets, things that I feel are important for people to know about, regardless of if they purchase the course or not. But, of course, because there is a course for sale, then I would, of course, appreciate it if you purchase the whole course so that you can learn all the things that I think are important for folks to learn. So, today we’re going to talk about partitioning in columnstore because there are important differences between partitioned columnstore tables and partitioned rowstore tables. One of the sort of superpowers that columnstore has is the ability to use metadata about which row groups have which data in them, and it can skip entire segments that do not contain relevant data.
That’s a wonderful thing. That is something that rowstore indexes do not really have. All right. So, la-dee-da. But today we’re going to kind of look at how those two things, how these two things pair together and how, you know, much like with, you know, normal partitioning, the way that you access data does sort of rely on the partitioning key in order to make this as efficient as possible.
Before we do all that, down in the video description, you will find the most important links in your entire life. You can hire me for consulting. You can purchase my training materials, including the Learn T-SQL with Erik course, which has a little coupon code attached to it at the moment. You can become a supporting member of the channel. And then totally for free, you can ask me office hours questions and you can like, subscribe and tell a friend all about how wonderful all of this content is.
So they can, they can, they can learn as much about SQL Server as you, right? Wouldn’t that be just great? Another incredibly important link down in the video description is where you can get my free SQL Server monitoring tool from. Uh, it’s totally free, totally open source. There is no obligation or requirement on your part to, uh, talk to me or, uh, you know, pay me anything for it.
Uh, it’s just something that I wanted to do for the SQL Server community because I don’t feel that the big monitoring tool companies are doing their jobs well and haven’t been for a long time. And this is my way of writing that situation. Uh, it is a bunch of T-SQL collectors that would grab the same information that I would if I were, uh, doing a performance, uh, analysis or tuning engagement.
Um, it displays all that information in beautiful charts and graphs and gives you all sorts of abilities to click through those charts and graphs in order to find the root causes of your performance problems. If you’re too busy for all that, you can have the robots do it. There are optional MCP servers where you can, uh, let the robot of your choice talk to your performance data that I’ve been collecting.
And it can give you all sorts of nice summaries and talk through things and, you know, just give you general advice on what’s going on or at least information about what’s going on on your server. Uh, so that you are better prepared for the work you have to do. Uh, anyway, if you prefer the, uh, the old human to human thing, uh, you know, I, I still like human contact once in a while.
That’s why I’m going to all these things. Holy crap. That’s a lot of travel. Uh, I am just in order. I will be in Chicago, Illinois, uh, May 7th and 8th for pass on tour East, which is sort of weird cause it’s in the Midwest, but you know, what, what can you do there? Uh, then I will be traveling from Chicago to SQL day in Poland, May 11th through 13th.
Uh, then I’m home for about a month and I will be back out, uh, in the world going to data Saturday, Croatia, June 12th and 13th. And then, um, at least until someone makes me an offer, I can’t refuse, I will be, uh, doing nothing until November where I will be in, uh, Seattle for, uh, pass summit West, November 9th through 11th. I’m not quite sure what to call that one right now, but I’ll work that out.
Anyway, the meantime, let’s, let’s get beery and cheery and let’s talk about partitioned columnstore tables. So, uh, what I’m going to show you first is this query, right? And this query is going to hit a table that I have pre-created, uh, that is the votes table from the Stack Overflow 2013 database.
Uh, it is a clustered columnstore table and it is partitioned by the creation date column by year, right? So we are not partitioned by bounty amount, but the bounty amount column is still interesting. Because if we select some data from this table and we look at the messages tab, we will see that we read 33 segments and skipped 71 segments.
But why? Well, if you’re interested, uh, a part of the course materials is a view that I’ve created that gives me details on what’s going on with my, uh, columnstore stuff. And if we look at that, we will see that we have partitions two through seven and partitions two through five are eligible, right?
Where bounty amount equals zero exists. That is two plus eight plus 11 plus 12, which I’m pretty sure is 33. And then we have, uh, some ineligible parties, segments and partitions, uh, where we have, of course, six with sort of overlap between and partition five there.
But there are, uh, six ineligible segments in, uh, partition five and then 28 and 37, which I am fairly sure adds up to 71, but, uh, I haven’t got 71 figures to sort that out with. So I will let you do that math on your own. So the message really is with all this.
And if you just feel like this being a TLDR for the whole video, that’s cool with me, but just like with, uh, any other index, um, but particularly with partition columnstore, uh, because the creation date column is our partitioning key. If we want to really team up, um, you know, the, like, like segment elimination plus partition elimination, we need to access data via the partition column. So if we run this query, we will see this is, um, uh, looking for bounty amount equals zero.
And the bounding is 2010. Well, just the year 2010, right? Greater than equal to 2010.01 less than, uh, 2011, less than 2011.01.
And we look at the messages tab, we read 11 segments and skip zero, right? So all 11 segments that we, we could have read, we read, right? And that the segments in here are going to be ones that like, like, like I said, in another video, like the batch mode plans don’t show us which, how many partitions we touch.
You know, kind of leaves a little bit of guesswork to the, to the curious mind. But, um, this is how many, uh, like, because this is partition by year, we’re just looking at the year. We’re just looking at the partition that has data for the year 2010 in it.
And there are 11 segments within that year, right? So we didn’t skip any of them, but we were able to skip like all the segments and partitions that aren’t relevant. If I run this query where I’m looking at 2010.12.02 through 2010.12.31, right?
So this is just a portion of, uh, the year. And we look at the messages tab, then we’ll see that we only read two segments and we skipped nine of them, right? So we were able to like, even within one partition, we were able to skip segments that didn’t have relevant data to us.
So that’s, that’s pretty cool there. Um, I forget what this query is supposed to show. I should have left some notes for me for myself, maybe.
But, uh, anyway, uh, we can see that only some of these, um, have the bounty amount equals zero in them, right? So the min data ID in these is zero. The min data in these two is not zero.
So, um, there are two partitions where that have no zero bounties in them, which is a little weird, but not the end of the world. Anyway, uh, if querying the data in other ways kind of gives us different, you know, um, sort of fingerprints, uh, for segment and partition elimination. Uh, none of these are, I don’t think any, there are any examples of partition elimination for the following queries cause we’re not hitting creation date anymore.
But if we look at, come back to this query, we’ll see that there are, you know, all like the max data ID and, uh, partitions three through seven is 550, right? So if we run this and look at it, we’re going to hit like just about everything, but we are able to skip, like, because we’re like, we’re going to read all the partitions, but we’re able to skip a lot of segments based on the metadata that SQL Server has about, um, like what’s where. And, uh, the same thing, if we have a really high post number, right?
Like SQL Server is going to be able to figure out like, Oh, like I, I don’t need to read a lot of this because like, like this post ID has to be like, like, like it’s a very high post ID relative to the, like, like max post ID in the post table. So that, that post ID doesn’t, hasn’t been voted on until like recently. So we’re able to skip over a lot of segments in order to get to just the segments where that post ID might exist.
But if we, um, if we query by a very low post number, right, this is a very high post number. This is a very low post number, like post ID one, three, eight people have been voting on post ID one, three, eight, like basically since, you know, I don’t know, like forever, right? That’s one three is a very low post number.
So we can’t eliminate anything with that one. We have to read all hundred and four segments. We don’t skip any of them. Uh, same kind of deal with vote type ID 16, right? Uh, well, I mean, maybe not totally, but we’re able to skip eight or sorry.
We read 83 segments and we skip 21 vote type ID 16 is of course approved edit suggestion. Uh, so I guess that’s kind of okay. Um, another thing that will not help you in your journey to, um, and you know, it, it, it, it, it, it, it giveth and it taketh away.
So like, you know, people are most interested in data that is, you know, unfortunately sort of common, right? Um, like at least in the stack overflow world, vote type IDs two and three, those are up votes and down votes. Those are the two most important types of votes that, you know, um, that we can get.
So like, let’s, you know, these two, right? Up mod and down mod, right? Oh, that, not that first one. So like, these are going to be common. These are going to be like scattered throughout all of the partitions and all of the segments because, you know, they’re the most common thing.
We could pair this maybe with, you know, like some year boundaries in order to get partition elimination. But if we just need everything, you know, we still have to read through all the segments where this data might exist, but it’s okay. Like generally, I think, because I mean, you know, reading through a 53, actually this, this table is twice as big because I have to put double the amount of data in the clustered columnstore table in order for it to, um, really make, uh, like any sort of performance blip happen.
Um, so this table is actually very, very, uh, about a hundred and something million rows. Um, you know, like we, we do read a lot of them and we do a lot of that, but like the whole thing takes 18 milliseconds reading from that. And, you know, the whole query finishes very quickly anyway.
So columnstore great for reads, right? Very, very, very much speed up read queries. Uh, you can certainly pair it with, uh, partitioning in order to get partition elimination plus segment elimination, which is, uh, can be very useful for, uh, very large tables, but most tables probably won’t need it.
Um, most tables that, you know, I would say, uh, under the like a hundred million or so roll mark, aren’t going to see vast benefits from adding, from, from commingling partitioning and columnstore. That being said, if you anticipate this, these tables growing to very large sizes, you may want to start that journey early. You may not see the benefits immediately, but once that table gets, if that table that’s like 200 million rows, doesn’t see a lot of like, you know, uh, action from that.
You might try, you might start seeing it around like 500 million or a billion rows where all of a sudden you’re able to cleave out not only, uh, segments, but also you’re able to get partition elimination. And whittle down and figure and just read smaller numbers of segments, even within a partition. Anyway, that’s about it for me here.
Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that you will buy the course, which is helpfully linked for you down in the video description. And I will see you in tomorrow’s video where we are going to talk about partitioned views. So we have that brilliant thing to look forward to.
Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Learn T-SQL With Erik: Partitioning != Performance

Learn T-SQL With Erik: Partitioning != Performance



Fair warning, the audio got pretty garbled on this one. If you don’t like space alien voices, hit mute and/or skip ahead to the transcript.

Chapters

Full Transcript

Erik Darling here with Darling Data. And today’s video, we’re going to talk about how in SQL Server, at least with rowstore indexes, let me make sure that I lay out all the caveats up front before some actually person shows up and like, before they even watch the video, because Lord knows there’s a lot of that on the internet. Partitioning with rowstore indexes is not a query performance feature. It is, of course, a data management feature. Before we get into that, though, I did want to thank the nice folks at Mother Duck for hooking me up with a nice rubber duck thing here. It says data person over there, backwards hat and glasses, kind of like Run DMC Duck. And on the bottom it says, Mother Duck, you’re the one, you make data so much fun. I thought I was making data fun, but apparently they make data more fun. So thank you, Mother Duck. Down in the video description. You will see all sorts of helpful links. If you feel like spending some money with me, on me, together, you and I. We can go shopping. You can hire me for consulting, buy my training, become a supporting member of the channel, ask me office hours questions. And of course, if you would like to, you know, I don’t know, hang out and make this channel more socially acceptable, well, you can like, subscribe and tell a friend.
Get notified when I publish these works of art. The material that we’re talking through today is, of course, part of my Learn T-SQL with Erik course. This is just a small snippet bit of the full course material. And there’s a link with the coupon code down the bottom where if you feel like purchasing the entire thing, then you can do that at a discount. Another thing that you can get absolutely for free is my new SQL Server performance monitoring tool. All right. It is a free open source, no email signup, no weird telemetry telling me about your SQL servers. It’s just a bunch of T-SQL collectors going in, collecting all the stuff that I would look at during my consulting engagements. And it’s all there to help you get a handle on SQL Server performance.
It is basically all the stuff that I answer questions about when I’m working with people. So I thought it was a pretty good thing to just let loose into the world. You know, there’s only one of me. I can’t scale beyond this one of me. I’ve had a very hard time teaching people to do the correct things with SQL Server. Despite all the years of blogging and videoing and training and everything else, it seems like people are just scared of SQL Server. I don’t get it.
But if you feel comfortable with doing robot stuff, there are optional opt-in MCP tools that you can use. It’s got a server built right in there so you can have your very favorite robot talk to your performance data and answer questions about it. How good those answers are, I can’t tell you. The summaries are at least pretty good, but at least some of the advice is maybe not quite all there yet.
The robots haven’t gotten the message on some things. Anyway, if you like human-generated advice and all that other stuff, you can catch me out and about in the world, traveling all over the place, trying to bring enlightenment from one human to another.
Sort of in order here, I’ll be at Passa with my right hand removed from being my body. Pass on tour in Chicago. That will be May 7th and 8th. That is, I don’t know, I guess about five weeks from now.
At least as of this recording. SQL day in Poland, May 11th through 13th. That is a three-day adventure. That is not a two-day adventure. Then I’ll be home for about a month and heading off to Data Saturday, Croatia, which is going to be a grand old time.
That’s June 12th and 13th. And then, at least as far as I know, that’s all I have going on with my life for a little bit. And then in Seattle, Washington for the Big Pass Summit, November 9th through 11th.
So, if you, again, want to come get a human hug, this is where you can get human hugs from. At least from me. I mean, you might have other sources of human hugs that you prefer.
You may just want to learn about SQL and not human hug me. I understand either way. It’s fine with me. But for now, man, there is so much baseball to go that I am just a very happy person. If you are the type of person who likes buying stocks, playing the market, buy some Coors Light stocks because it’s going to be a busy summer for me.
Anyway, let’s talk about partitioning. So, I’ve got a table called Votes Partitioned. It’s already partitioned because who would want to sit there and watch me partition a table?
It’s not an enjoyable experience, right? But it is partitioned by, let me make that a little bit bigger. Thank you, Aaron Stilato, for project, for product managing this wonderful live result set scrolling, zooming into SSMS.
Of course, you could always scroll results. Now we can zoom on results. But this is my Votes table.
It is partitioned by the creation date column. And it is partitioned by year, right? And you can see that I have followed the partitioning bible. I’ve got an empty row group on either side.
And, well, everything is pretty okay there. Now, it’s sort of annoying. Now, this has nothing to do with partitioning performance in general. So, it’s annoying that batch mode doesn’t show the details of which partitions were accessed in the show plan XML.
So, for a lot of these demos, I’m going to be disallowing batch mode so that I can show you kind of what’s happening with them. So, if we run this and we look at this query, or rather, we look at this query plan, we say, look, we executed our execution plan. We did a great job.
We have obeyed many, many rules of partitioning. And we will see that we only had to access one partition to find our data, which is wonderful, right? Because we just looked for the year, the data from the year 2013, right?
And since our data is partitioned by this, we can find that data easily. The thing is, there is absolutely no difference between seeking into a partition like this and there is seeking into a B-tree index that would happen to lead with creation date. Because our clustered primary key on this table leads with creation date.
So, we can do that sort of thing. It’s almost like just having an index. It’s like, now we have all sorts of other stuff now kicking, going in, and getting a problem, and causing problems for us. So, some things that we normally need to not pose tremendous issues for C++ server, things that cause issues for partitioning.
For example, our partitioning problem is in the daytime. And it seems that Antiglator’s optimizer hasn’t been able to handle this as a study for lately. But, uh, when, uh, when, uh, when, uh, when, uh, when, uh, when we, uh, when we declare a local variable in the day, we think that we need an option to compile that we’ve done here.
That will allow the primary writing optimization to happen. Uh, um, we don’t necessarily get the C++ data results. Unless that battery maybe gets not quite as snappy as the other area. And now we have this whole kind of a strange area where we are.
And we can see the access to all of the partitioning, even though C++ server is able to handle that in a, in a, in a similar way. But, uh, the thing that we know more in that sub-clarity reform is also sub-clarity, uh, partitioning in the nation. So, wrapping a column, like, uh, wrapping a column in the, in the year function, just as bad partitioning as it is with, uh, as it is with the normal index.
Um, and, of course, if you try to VCC and run on any kind of, uh, even, even something that is sort of transparent from the optimizer is convert for eight. Right? Because it’s, like, like, again, C++ server is a smart number of this stuff.
But, um, if you try to convert from eight, eight, uh, here, we, we do not get the results that we would, we would want to get, you know. So this query is also not quite as snappy. And if you look over here, here, you will see, uh, well, this actually doesn’t, it doesn’t show anyway.
I’m not sure what happened here. We don’t even get the actual number of partitions that we, that we, we used on there. So, that’s a fun one.
Apparently, we eliminated it and did not. So, so, the other problem you can run into with, uh, tables that you have partitioned is around line indexes. So, uh, it, what, what, what we’re showing here, you’re getting an end on creation date.
Right? So, creation date dates, the partitioning column. So, those bills don’t justify, like, getting in a creation date, these, these, these query plans, and it’s sort of, or, if you, you know, I, first of all, this used to look at the query plans like this, that, like, basically just says they need to top one, like, from, um, the, from the table, and, like, it’s fine, right?
So, it’s very easy to do. But, if we, uh, run on that query read, we say, hey, I want to get no type by date. And I have an unhonustered index.
I have a line to the partitioning scheme and everything else. Because I want to be able to swap data. And, oh, I do not find my indexes. And I cannot swap data.
And that would lose the data. And it manages the partitioning. Well, this, this doesn’t, doesn’t wait until so soon. Right? See, the results of my list does not have the yielded ability to do things in the same way. Notice, when I’m joking, we have that possibility.
Now, we have a plan with a two-screen magnet. There’s a paracarons. Yadda, yadda, yadda, yadda. And you look over here. Remember, we’re going to see, we, just, just, just to find the new type ID. We looked at all of the politicians.
Right? Right? So, non-aligned indexes can compose real, real politicians. We looked at all of the politicians. Just to sort of contrast that in some of the same thing. If I want to get, like, a min, min, type ID in the votes table.
I already have it. I have the same, basically, like, that index up, you know, I’m going to hover it over right over there. There.
But we need to type ID. So, see, this should be explained to my min, min, type ID very, very reasonably. So, instead, it looks at all of the partitions and scans things. Right? The optimizer is just very, very interesting. The same thing with that.
If we look at this, this unpartitioned table. And we say, getting an min, type ID, I have. I have the same, basically, like, of course, the table of the partitions and the index can’t be aligned.
It’s just, like, I have another index that we can use a type ID. We can get a query branch, which you want. We’ll say, say, again, get the top one, so I can find . Partitioning doesn’t mess that up.
So, we look at some differences here, right? I’m going to run all these queries together. So, the first one is using a non-partitioned table, right?
I’m saying . The second one is using a non-aligned index on the partitioned table, and the third one is using the aligned index on the partitioned table.
Three very different performance quiz proposals. The first thing, we have the plan shapes that we want. We just, in order to find my name from a query from a table, it has that index, right?
The index presents that data in that order. Even if it was in the sense in order, it would just say, okay, go to the other third and work. These are the plan shapes we want.
But using that aligned index, we have to scan all that, but the 3.5 steps in a query, instead of that 0, the second query we do using the non-aligned index. On the partitioned table, right?
So, that’s also something to think about. There are other types of queries that can pose similar performance outputs. So, for example, if I say, you can give, like, you can give, like, you can give, like, you can give, like, you can give, like, you can give, like, you can give, like, you can give this list of the query we’re in shape that you would expect.
So, I say, you can give the top five, I know this index is equal to type ID, just to make sure that, like, it stays consistent. And, and, and, a lot about type ID, this decision is very quickly, we have a short story in here, here, like, this is all zero analysis, right?
Nothing in here is taking time. If I choose the non-aligned index on the, pardon, that is still fast, and that is still, just about the same data we got before, and maybe not exactly the same, but it is, it is close enough here.
Here. This line index, if I tried to run here, it would take a lot. If we come over here and look, you’ll see this query, when it ran, took a full minute, exactly, to the second, one minute, right?
We scan that index, that 35 seconds, that can be sorted, maybe this build, that in about six seconds into the mix, and then between, uh, let’s see, uh, 41, what’s, that’s 52, so, uh, this is about six seconds, the loop joint itself, 41 seconds, going to be, look out, 52 million, right?
Uh, that’s not great. That’s not a good strategy. Right? Uh, all sorts of queries, in the central server, can get very, very strange, once you lose partitioning.
So, so, the next time someone sees the partitioning of the formal feature, laugh at them. Say, it’s a date of the internet feature, and if you’re looking at the closest, it’s, I mean, it’s not what you can find, right?
Because, they’re asking for it. This is, it’s not new information. Uh, you can work with robots a bit, but, like, like, if you grab the, uh, partition numbers from, like, all these tables, and dump them into a table, like a table variable, you find here, here, and if you say, give me these things, and then, and you rewrite my query, right?
So, we just put all the partition into that table, and we have to rewrite our query in kind of a strange way, right? Let me say, type ID, uh, from, the cross-apply button here, right?
In here, we are saying, okay, well, we need to take the partition numbers that we have just put into that to them table, and we need to record them, we need to use those weird, weird dollar sign partition functions, and we need to feed them the creation date column, and we want the table to partition it by, and we say, hey, match the partition number, and that turns out, alright, right?
It’s like, almost the square here, and shape, shape, and line. It’s a little bit more complicated, but it does, does, does best define it. But there’s a lot of times when you just, you’re not going to rewrite my query, as much as possible.
If you’re using ORM, good, good luck with that, right? Dummiesies. If you’re dealing with a vendor application, where you can’t rewrite the code, maybe you’re as free as you want, good luck with that, right?
There’s like, things that you can do, but, they’re not always straightforward, right? They’re not always easy as things in the world, well. And, if you, but if you’re allowed to rewrite the query, you’re going to take advantage of some of this stuff, you can work around it, since the social distancing is partitioning, and get performance links you would get, if you didn’t bother to partition that table in the first place.
So, again, partitioning, and these are sort of, with rows or indexes, it’s not really a strong component feature. If you need to swap data, and all that other stuff, I, I get using it.
But, you also have to, in order to do that, you have to have your indexes aligned, to the partitioning, and if you don’t do that, and you will not really need to do the switch choices. Which is, probably, you might not want to do that in the first place.
But, when it comes to this, like, like, like, normal, running and hitting stuff, partitioning is not a solution, that you should be exploring for that.
Like, just this normal indexing the table, you know, much, much more abundantly, much, much more efficiently, without having to redo the entire table, and worry about all this stuff that comes along with partitioning, and then getting a very, very client, when you at least expect it.
Anyway, thank you for watching, I hope you enjoyed yourselves, I hope you learned something, and again, this is, this is a snippet from my larger environment, he sees, there’s a link down that video description, with a coupon for the cash, if you want to purchase the whole course, and learn the full breadth of the material, you can do that.
Anyway, thank you for watching, I hope you enjoyed yourselves, I hope you learned something, and I will see you in tomorrow’s video, where we will, inspect, partitioning with columnstore indexes, where there can be some performance benefit.
Alright, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.