Signs Your SQL Server Execution Plan Is Lying To You: Scalar Valued Functions

No Values


Video Summary

In this video, I delve into scalar valued functions and their impact on query plans, particularly focusing on how they can mislead us when analyzing execution plans. I demonstrate using SQL Server 2017 to show a scenario where a scalar valued function causes significant CPU usage despite having a low cost in the overall plan. By leveraging SPBlitzCache, we explore the detailed behavior of these functions and uncover surprising insights about their execution, such as how they can run multiple times per row returned by the query. This video is part of my ongoing series on hidden issues in the plan cache that can trip up even experienced SQL Server professionals.

Full Transcript

Hello, Erik Darling here to continue or start really this is to be the first this is the first one where there’s like actual stuff going on in my series about things that lie to you in the plan cache. And we’re gonna start with everyone’s favorite bad actor, the scalar valued function. Now, I know what you can… Apparently someone is opening a crypt outside. So, I know that you’re gonna be sitting there saying to yourself, this is fixed in SQL Server 2019. No need to worry about this anymore, but that’s not true.

There are some interesting and surprising limitations and gutches around how SQL, like what kind of, what kind of function SQL Server can inline, the extent to which they can be inlined, and some other things. This particular function is the kind that cannot be inlined. There are, I don’t know, of course, other ways that you can do this sort of thing. There’s a built-in function that can do string aggregations, but it’s a little incomplete for some people.

There are some, you know, interesting side use cases where you can’t use the built-in string ag function. So, anyway, what I’m gonna show you here is let’s turn on… Actually, I don’t have to turn on query plans. I’m smart. So, I built all of these stored procedures to have set statistics XML on inside them.

So, I don’t have to turn on query plans anymore. I don’t have to remind myself to do that and then turn them off and then blah, blah, blah, blah, blah. Anyway, let’s have fun. So, I’m going to run this query. And this should take around about 10 seconds to run, which is almost the perfect amount of time for any demo query. When they take longer than that, I have to talk and fill up a lot of dead space. And that’s where things get dangerous.

That’s when I say things that I shouldn’t. That’s when I probably break NDAs and stuff. So, let’s look at the query plan. Now, it’s gonna be a little bit more obvious what happened here with… Because this is SQL Server 2017 and I’m using SSMS 18 point, whatever latest is. Just insert latest number here.

It might not even be 18. It might be 25 by the time you watch this. Who knows? But the point is, with the query operator times here, it’s a lot easier to see exactly where SQL Server spent time. And it’s gonna be pretty easy to see from here where things have gotten…

Where things could be misleading as far as the plan cache goes. So, we spend 0.000… 0.002 seconds scanning the cluster and index of the users table. However, SQL Server assigns 99% of the cost there.

Ditto this top operator, which costs 1%. And we don’t really spend any more time there. But, then we get over to this compute scalar… And… this doesn’t go so well.

This runs for 13.5 seconds and costs 0% of anything. So, lucky us. Lucky us to have this information, to have these details. So, let’s close that out. And let’s take a look at what that looks like in the plan cache.

Now, I use SPBlitzCache for all this stuff. If you want to use it, you can go to firstresponderkit.org. That’s firstresponderkit.org.

I still contribute to this thing. I love it. I believe heavily in it. And, as I was… as I was… starting to put this presentation together, I realized that a lot of the stuff in here was kind of like an ode to SPBlitzCache. And, you know, the work that people have done on it over the years.

So, it’s a really cool, handy tool to have. And, let’s run this and look directly at the stored procedure that we just ran. Now, a lot of people don’t realize that you can pass in a stored procedure name and get results for one specific stored procedure in the cache.

But, I think it’s pretty spiffy. But, what we’ll have here is we’ll have a couple warnings. One for forced serialization.

That forced serialization warning is because we call the scalar valued function. And, we’ll also have that low cost, high CPU warning. Now, if we go over here and we see this cost of 0.76 whatever whatever, that’s a very low cost query. It’s a bit suspicious that a query with that low of a cost will run… will use 50 seconds of CPU time and run for 13 seconds total.

Crazy, right? Crazy. Now, if we look at the query plan for this and we look at what happened over here, it would be fairly difficult to tell exactly why this took 13 seconds.

A lot of people would say, Damn, this clustered index scan 99%. How dare you exist? How dare you scan that clustered index?

But, reads aren’t the problem. Using a ton of CPU is. Now, we can, if we hit the properties of this compute scalar, we can look here and we can see the call to the function, right?

We can see this, the call to the function here. And this might give us some, some clue as to what was happening, perhaps. But, it would still be quite misleading if you were just, just getting started looking at execution plans.

You would focus on this clustered index scan. You would freak out for all sorts of reasons. And you would probably not know to hit F4 or to get the properties of an operator in order to, see, get more details about it.

So, now that we know what happened with, or now that we know that we called a scalar valued function, we can use SPBlitzCache to focus in on the scalar valued function. We can see exactly what it did.

Now, something interesting here. This cost 40 query bucks, right? If we remember the cost of this store procedure, it was like 0.03 or 7 or something. It was very, very low.

The cost of the function was not baked into the cost of the query that called it. Right? Because it’s black box, black boxing of things. Now, we can see, well, this is kind of a lot of warnings over here.

None of them really pertinent to what we care about. But you can see that SPBlitzCache did indeed go crazy and look at stuff. But I want to show you something kind of interesting.

The query plan for the scalar valued function went parallel. That’s why a lot of CPU got used for the calling query. The calling query itself is forced to run serially, but the body of the scalar valued function can go parallel.

So that’s why it chewed up a ton of CPU. And there’s also the fact that for the 100 rows that we executed, remember there’s a top 100 in here. For the 100 rows that we executed that function for, the function executed once per row.

Now, this, it’s a pretty convenient shortcut to say that scalar valued functions run once per row returned by the query. That’s highly dependent on where the compute scalar is positioned in the query plan. If you have a top 100 query and the scalar valued function doesn’t execute until like the very, to the very leftmost portion of the plan, then yeah, it might only execute for 100 rows.

But if that compute scalar is somewhere else in the query plan, if it’s like deeper into the plan, where like maybe you have more rows coming out of a table and like you, you use, you, you call the function earlier in the query, then it could end up running a lot more times.

In this case, it just happens to run a hundred times because of the top, top 100. And the fact that it, you know, the only, it’s at the very beginning, beginning end of the query plan. It’s at the finale of the query plan.

So we can see that it used a lot of CPU for the, well, I mean, in total, we’ll use it about 500 milliseconds on average for the 100 calls. So you can see that it did chew up. That was, did chew up a lot of CPU and time. And that’s where we spent most of our, our awful stuff, time things doing whatever.

It’s early in the morning. It’s 934. I haven’t been up this early in weeks, school vacation and everything. I’ve been sleeping in baby. Oh, it’s back to the grind.

Anyway, I’m going to call, I’m going to call it a day on this video. And I’m going to record the next video, which in case you, your eyes have not wandered, will be about table variables, which still aren’t fixed, still aren’t fully fixed in SQL Server 2019.

Anyway, thank you for watching. I hope you learned some stuff. Hope you enjoy yourselves and I will see you over in the next video.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.