SQL Server Performance Office Hours Episode 49

Query Store shows this plan has the lowest average duration, so it must be the best plan. Should we force it?

One of my queries uses an eager index spool operator that’s costing 45% of the query. I understand it’s building a temporary index, but I can’t figure out what’s triggering it – there’s no Halloween protection needed and no self-joins. The query runs fine if I break it into temp tables. Is there a way to see what the optimizer is trying to protect against, or how to prevent the spool without rewriting?

Let’s say your friend was hired as the sql developer/DBA for a very mature application. This application uses a model where each client has their own database, but each database has the same schema. Most of these databases are hosted in Azure SQLDB. However, many top-paying clients host their own server/on-prem sql and refuse to upgrade past 2014/2016. Your friend notices a lot of transformations done in procedures are way quicker using some of the features/functions in newer versions. Would your friend be an idiot for creating two separate variables, one with the sql statement for the old versions, and one for the new, then executing the appropriate one based on the SERVERPROPERTY(‘Edition’) value? How would you handle this? Also, will you be my friend? Thanks, from [hopefully] your new friend.

Postgres and Oracle have skip scans. Why Doesn’t SQL Server?

I am interested in your AI course but my company probably won’t be on SQL Server 2025 for a while. How relevant will this content be in a year or two? The AI world seems to change rapidly.

To ask your questions, head over here.

Video Summary

In this video, I dive into some interesting questions submitted by viewers during my Office Hours session. I address topics ranging from query store optimization to handling different database environments and SQL Server versions. Whether you’re curious about forcing a plan that shows as having the lowest average duration or dealing with eager index pool operators in your queries, there’s something for everyone. Additionally, I explore strategies for managing code deployments across multiple server editions, suggesting approaches like using environment variables or parent-child store procedures to handle different SQL Server versions without complicating your existing codebase too much. The video also touches on the absence of skip scans in SQL Server compared to other database systems and why this feature might not be as critical due to how SQL Server optimizes query plans. As always, I encourage viewers to engage by submitting their own questions or becoming a supporting member to help keep the content relevant and useful for all SQL enthusiasts out there!

Full Transcript

Erik Darling here with Darling Data, and we are in the midst of a Monday. But I suppose that’s good news for you, because that means you get to experience yet another Office Hours. I think we are coming real close to closing in on episode 50, which I will do absolutely nothing to commemorate.

But anyway, let’s talk through our stuff. I will answer five user-submitted questions to my Office Hours link. Now, if you look down in the video description, you will find all sorts of helpful links.

And not to spend too long here, but I recently spent some time redoing my consulting offerings, so there are some new ones in there for folks who might need different sorts of engagements than the typical blocks of hours that I was selling in the past.

Those are still available, but I do have some new stuff if you want to check that out. I’ve also added a lot of training and redone some of the training offerings in that you can now buy them as subscription products at the monthly or yearly rate. So if you are on the fence about spending some chunk of money all at once, you can now spend a smaller chunk of money, either on a monthly or yearly basis.

You can also use the links below to become a supporting member of the channel. If you just like all the stuff that I talk about, but you want to spend like, I don’t know, four bucks a month just to buy me half a New York espresso here, you can do that. There’s also a very helpful link down the bottom if you want to ask me office hours questions for one of these things.

So if you have a SQL Server question and you’re like, boy, it would be cool if Erik Darling answered that on YouTube, you can submit one and have that happen. And as always, please do the old like, subscribe, tell a friend, all that stuff. This training page is going to get fuller and fuller, at least until things start happening.

I will be at DataTune in Nashville. So all of the, I’ll just get out of the way. I have pre-cons at all of these.

I will be teaching my advanced T-SQL material at all of these. That’s DataTune, Nashville, March 6th and 7th. Data Saturday, Chicago, March 13th and 14th.

SQL Day, Poland, May 11th and 13th. And Data Saturday, Croatia, March 6th and 7th. I should probably fix the alignment on those a little bit, but you know what? I kind of like them staggered some.

I think it gives it a dynamic vibe. It’s like, whew. It’s just misaligned. That’s my fault. Anyway, we’re trudging through January here in fabulous New York City. So we’ll get there someday.

Drunken marshmallows. Data marshmallows. Anyway, let’s answer some of these questions. Here we go.

All right. Query store shows this plan has the lowest average duration. So it must be the best plan. Should we force it? Well, you know, that’s an interesting question because I would, I would be tempted to try it out and see how it goes. So should you force it?

Um, you know, depends on a few things. I’ll talk through that in a moment. But, you know, you don’t need my permission to try stuff. Right.

You can always, you know, like right click and force it. And if things go well, you can just be like, hey, I tuned a query. And if things don’t go well, you can always enforce it. Right.

But, uh, so, you know, like in the, in the, in the, in, it could totally work out in your favor. Right. It could, it could absolutely just like, like cool. It’s done. But you know, like you might need to be a little bit careful because you might need to investigate some other query metrics, uh, around that. Uh, you know, like, um, you could, you could look in, you know, I would, I would prefer if you used my store procedure, SP quickie store for this type of analysis.

But if you look up the, you know, cause you know, searching by query ID and query stores can’t be done. Right. You just can’t do it.

Uh, so if you use SP quickie store and you search for that query ID and you see all the different plans, then what you might find is some differing metrics between them. Uh, there might be something around CPU and duration. Well, I mean, obviously duration is different cause you said this one has the lowest, but you may find some other stuff to be interesting.

Like some other, some of the other plans may be parallel, uh, and some of the other plans, uh, may produce, uh, way more rows than the other. There is a row count column and query store, which can be very helpful for this. So, uh, it could be that your query has the lowest duration for a very small number of rows, like, I don’t know, 10 or a hundred or a thousand and other, uh, other times in this query runs, uh, the number of rows that it deals with is much higher.

Uh, in which case that low duration query, which might just be like some, you know, tiny little serial plan that, you know, this, you know, got a couple of nested loops joins and whatever, and just finds it’s, you know, thousand rows and moves on. Well, there’s a reason it’s the lowest duration, right? Cause it was the least like physical effort in the database.

You might find that other, other times in that query ran, there were much more rows involved. So what you could do is a couple things. Um, I have another store procedure at my GitHub repo that is code.erikdarling.com. And it is called SP query repro builder.

And what you can do is it’s almost has like the same plumbing as SP quickie store, but it has a different purpose. And the purpose is to give you a runnable query based on query store stuff. So like, you know, what, what I like, like the nice thing about query store is that it will, um, you know, uh, give you an execution plan and the query that produced that plan.

And there’s all sorts of stuff in that query plan that can help you reproduce the query. Like if your query is using, uh, you know, uh, formal parameters from your store procedure, and you know, you’re not using like recompile hints or temp tables or local variables, then it’ll set up that query to run with all of the appropriate settings. That your query ran with when it hit the database.

Uh, and you know, you will, what you will find is an executable version of the query with all of the different parameter values that were used for it. So you could experiment a bit with that and see if caching a plan for that low duration thing, and then running other, uh, you know, parameter values for that query maintains the same low duration. So that, that is the approach that I would take, right?

Cause that, that, that might make life easy. Like I said, though, there are some limitations to things. Like if you’re, if your query involves a temp table or a table variable, I can’t tell you what rows ended up in there. I can’t even tell you what the definition of it is.

Um, if you’re using local variables, those disappear with the, um, the values for those disappear in cached plans. And if you’re using recompile hints, then I will do my best to pull literal values out of the query plan that were used, but you will have to figure out which, um, which parameters they belong to. Cause I do not know that once you have recompiled, I, I lose that.

I lose that sense and sensibility. So that’s, that’s, that’s the approach I would take there. All right. Uh, one of my queries uses an eager index pool operator.

That’s costing 45% of the query. I understand it’s building a temporary index, but I can’t figure out what’s triggering it. There’s no Halloween protection needed and no self joins.

The query runs fine. If I break it into temp tables, is there a way to see what the optimizer is trying to protect against or how to prevent the spool without rewriting? Oh, well, it’s an eager index pool, which means one of two things.

And neither of them are Halloween protection. Uh, it means that now you usually tablespools are what get used for Halloween protection. And that’s of course to prevent, uh, you know, like sort of infinite loop modifications to a, to a, to a, to a query, uh, when a modification query runs.

But, uh, what, what you would want to do is try, try one of two things with your query, whichever, whichever table, uh, the eager index pool is being built off of the first thing you try. The first thing you do is add a force seek hint, uh, force seek hint to your query and see if there is an actual seekable index on that table. If you get an error, uh, then there isn’t one.

If, uh, you get an error, what you need to do is look at the eager index pool, because there will be some properties of the eager index pool, which will help you create a permanent index on the table. Um, so that SQL Server does not need to build an index every single time that your query runs. So just to make sure we covered that, uh, the, the eager index pool is not for Halloween protection.

It is, uh, strictly a performance optimization. Usually, well, actually it is not usually it’s on the inner side of a nested loops join. Uh, other things you could try doing if you felt like it, um, say option like hash join, probably, um, that might be one way to get around it, but I can’t guarantee that will be faster.

Um, usually the two things that you do is see a SQL Server, like missing an opportunity to use an index that I have with the force seek hint. If it’s not, then creating a permanent index will get usually get rid of the eager index pool. But sometimes you might find that you still need the force seek hint because the optimizer kind of a dumb, dumb some points.

All right. Let’s see what we got here. This is a long one.

Good Lord. Uh, let’s say your friend was hired as the SQL developer DBA for a very mature application. Uh, all right. Hot, mature applications want you to develop for them or something.

Uh, this application uses a model where each client has their own database, but each database has the same schema. Wild. Never heard of that before.

Uh, most of these databases are hosted and hosted in Azure SQL DB. God help you. I wish I had some holy water to throw on you. However, many top paying clients host their own server on prem, uh, own server on prem SQL and refuse to upgrade past 2014, 2016. Your friend notices a lot of transformations done in procedures are way quicker using some of the features function, the newer versions.

Would your friend be an idiot for creating two separate variables, one with the SQL statement for the old versions and one for new, then executing the appropriate one based on server property addition value? How would you handle this? Also, will you be my friend?

Thanks from hopefully your new friend. Well, I don’t know. You, you, you, you might talk too much to be my friend. I, I can be honest with you. It’s a, it’s a lot of words. The beginning and end of that are very far apart. I’m exhausted now.

Um, so I, I, I don’t know if I would handle this like in the code, like at execution time. Uh, I would, I would probably be more, um, in favor of like, like this being a code deployment strategy. Like, you know, you’re, you’re rolling out code and you’re like, oh, you’re on, you’re an Azure SQL DB where you have access to new stuff, but it’s still Azure SQL DB.

So it sucks. Uh, then, then you would, then you would roll out the, like the specific code for that. Um, I would, I would probably not want to make this a like runtime decision in every like store procedure or query that’s executing.

Um, like, like hopefully you’re using store procedures where, you know, you can like, make, you know, like change your deployments. Uh, process to, uh, roll out like a different code base to, uh, a different, uh, different, like, you know, version edition of SQL Server. But, um, I, I don’t think I would want to do this in the store procedure.

Cause you’re just thinking like very, very long complex store procedures, depending on how long and complex your store procedures are currently. I would be a little scared of that. Um, you know, even with dynamic SQL, I would be, I would be a little frightened of that.

Um, you know, if branches and query plans have, uh, very, very strange side effects, unless you, uh, bury things. So I suppose one way of handling that would be to change if like, if you, if you just wanted to straight deploy all the code, no matter what, one way of handling that would be to change the model to, um, have, have like a, like a, like a parent child store procedure thing. And then be like, Hey, if like, you know, like run this parent store procedure, like take all the parameters that we would take from, uh, like whatever the call is.

And then like execute a child store procedure based on like environment variables. That would be one way of handling it where, you know, you don’t have to tinker with the actual deployment process too much. And you still get sort of like the runtime effect of things, but without like making your store procedures insanely long and complex and dealing with like the, you know, the sort of like a lot of the repercussions of like if branching and store procedures, that would be, that would be another way of handling it.

But, uh, like I said, I wouldn’t, I wouldn’t want to, I wouldn’t want to be like, I wouldn’t want like giant if branches in the store procedure telling you, you know, dictating which, which path to follow for like a bunch of queries. Like that, that’s not a good time. So one of the, one of those other two things, either deployment specific code, uh, based on environment variables or a parent child caller situation with your store procedure, where everyone gets all the code, but only like one store procedure, one like version of the store procedure will run based on those environment variables.

All right. Uh, Postgres and Oracle have skip scans. Come on.

Postgres just got skip scans. Don’t act like, don’t act all highfalutin with me, pal. Um, Postgres and Oracle have skip scans. Why doesn’t SQL Server? I don’t know.

They haven’t put it in there. Maybe they will someday. Uh, skip scans are cool though. And, uh, I, I, I do, I do sometimes wish SQL Server had them because like what, like what a skip scan means is like you would have to like worry a whole lot less about like a lot of different, like nonclustered index permutations for things. So like, let’s say you have, uh, like, I don’t know, let’s just use like, you can either like do like the posts or the votes table in, in the stack overflow database where in the post table, you have a post type ID.

It’s a like highly non unique. Um, you know, cause it’s like just post types. It’s like eight of them and like, like most of them are one and two, which are questions and answers.

And like in the votes table, you have vote type ID. And like most of them are one, two or three, which is like marked as answer up, vote down, vote. And so like, like, like if you had an index that leads with either one of those, but then you have like another, like another column in the table, I think probably let’s just make it easy.

Let’s say it’s on the post table and you have an index on like post type ID and score. And your query is like, you know, like, like just like where score is, um, like score equals 1000, right? Just to throw an example out there.

If you don’t have an index that leads with score, you’re looking at having to scan, um, scan an index in order to, uh, like figure out like, like what, what score is 1000. And like the typical B tree index, that means like you’re scanning the entire thing, right? So like your index leads with vote type ID.

So your index is ordered by vote type ID. And then within like, you know, all the post type ID ones, like score is ordered in ascending order from like, you know, let’s just say zero to one. Like a five kabillion, right?

Uh, what a skip scan allows you to do is like, essentially it gives you like a, like, like a, almost like a double seek thing where like, you’ll break the index into chunks. So like the first thing you would do, it would go like, let’s just say like you would replace it with like post type ID equals one. And then you find all the thousands for post type ID equals one.

And then instead of like, like just scanning and scanning, like then the next thing you would do, you would skip and then you would jump to like post type ID equals two. Then you would find all the scores, score equals a thousand for post type ID two. And then like three and four and five until you finish and get to eight.

So, so like, you just kind of like chunk and index up and you just almost like a, like, like, like seek. And then many like seek to like what your actual predicate is. And then like another one.

And then you, it’s almost like if you did like a union all right. So like one way to, one way to think about that is like, like, like instead of saying like score equals a thousand, it would be like we’re post type ID equals one and score equals a thousand or post type ID equals two and score equals a thousand. It’s almost like you just union all of those queries together.

So skip scans would be awesome to have, right? Like they’re like pretty cool. Cause right now you have to say you can write some like weird recursive CTE queries to like emulate them, but it would be great if like we just had them, right? It’d be a very powerful thing for, especially people who like a lot of reasons why like so many over index tables and SQL servers.

Cause we don’t have nice things like that. All right. Uh, all right.

You can be my friend. Look at you with your nice pithy answers that allow me to talk about things. I can sell. I like you. I am interested in your AI course, but my company probably won’t be on SQL Server 2025 for a while. How relevant will this content be in a year or two?

The AI world seems to change rapidly. You are right. The, the greater AI world changes very quick. Every week there is a new most powerful reasoning thinking thing yet that like just blows the doors off everything that has happened before and is revolutionizing the AI world. Microsoft SQL Server ain’t that right?

Microsoft SQL Server moves at a glacial pace because I mean, you know, it’s never been terribly fast moving, but you know, let’s, let’s not pretend that, uh, you know, fabric isn’t training every valuable resource and dumping it into a useless cesspool. But anyway, um, the stuff that I cover in my AI course, the only thing that would like, like change in the material is certain preview features may become generally available. So like vector indexes, like vector indexes, like, like, like, like a couple other things that stuff might go from G from, from preview to generally available.

And I will update the course content is that stuff changes. But the course content itself is focused largely on, you know, just like the best way to approach, um, you know, vector stuff in SQL, certain SQL Server, how it works and how it works is not going to change that much. And I do cover all the preview features in there.

So like, it’s not like you’re just gonna like, you’re just gonna be like preview features. I don’t know. Like, like I do talk through all of the preview features in there, like, you know, like how they work, what they do, like vector indexes, vector search, um, you know, float 16 for what it’s worth. So like, I do talk through all that stuff and I will again, update the material when it, you know, changes, if it changes, because it’s not guaranteed that anything will go from preview to generally available.

But, um, you know, like, like, like really it’s, it’s, it’s a lot of just like foundational material, but like, like, like how it works, like what, what you should care about the best way to do things. Like there’s a lot of stuff in there. Um, it doesn’t make a difference if the wider AI world is changing.

It’s how you use it in the context of SQL Server will not change. So, you could buy it now and watch it and be prepared for a year or two. You could buy it now and in a year or two, watch it.

It doesn’t matter. Right? Uh, just buy it. Buy it. Arrgh. Come on. Rent is due. Buy the course.

All right. That’s probably good for me here. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope you buy the course because rent is due. I will see you in tomorrow’s video where actually, you know what, I’m going to go, I’m going to be talking more about the new AI course. So I don’t know.

It’s one thing leads to another, doesn’t it? One thing always just drags right into another Monday to Tuesday to Wednesday to Thursday to Friday. Then you die.

All right. Thank you for watching.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.