Learn T-SQL With Erik: Multi-Statement Functions
Video Summary
In this video, I delve into the world of multi-statement table-valued functions in T-SQL, explaining their structure and behavior while highlighting key differences from scalar UDFs. I explore how these functions operate within SQL Server 2017 and later versions, focusing on concepts like interleaved execution and deferred compilation. By walking through detailed examples and query plans, I aim to demystify the intricacies of multi-statement table-valued functions and their impact on performance tuning. If you’re interested in diving deeper into this topic or exploring other advanced T-SQL topics, be sure to check out my full course content available in the video description.
Full Transcript
Erik Darling here with Darling Data, finest SQL Server consultancy outside of New Zealand. I just got my BeerGut Magazine Magic Quadrant notification that I am a leader in SQL Server performance tuning consultancies. So we are proud to announce that. So, you know, we just can’t be stopped. We cannot be beat. We cannot be beaten. Even though there’s only me, we cannot be beaten. All right. Team effort. Anyway, we’re going to keep talking about T-SQL today. We’re going to expand our T-SQL horizons with, by talking about multi-statement UDFs. Multi-statement UDFs, of course, return a table variable. If you’ve been watching the content on my channel for any amount of time, you’ll know exactly what my feelings on table variables are. They are good for a few niche use cases. Aside from that, they are not of much use or value. But, you know, we find people using them in all the familiar ways and all the familiar places. And we are able to remedy that situation when we come across them. But multi-statement UDFs really have no choice but to use a table variable. So we need to talk about these today.
If you would like to buy this course, which, of course, all the videos that I’m doing here are just small nuggets and morsels from the full course content. And if you would like to buy the full course, down in the video description, where this helpful array of fingers is pointing, you can find links to buy the full training, hire me for consulting, become a contributing via money member of the channel. You can also do things that are free, like ask me office hours questions. And of course, if you if you enjoy this content, please do like subscribe and tell a friend or two. It doesn’t have to just be one. Tell one you’re not maximizing your efforts. And this MLM is never going to get off the ground. If you don’t maximize your efforts, is it? All right. Anyway, if you would like to see me out live and in person, the nice folks at Redgate have decided to pull me from my wine cellar and drag me out to various and sundry locations across the world, Dallas, September 15th to 16th, Utrecht, Netherlands, October 1st and 2nd, and Seattle, Washington, November 17th to 21st.
Quite notably, in Seattle, I will be doing two days of T-SQL pre-cons, some of it based on this material with Ms. Kendra Little, and it will be the best two days of your lives. All other days will pale in comparison to those two days. So you can, you can come see me and we can compare how pale we are together. Anyway, let’s talk about multi-statement table valued functions in T-SQL. We must go to the correct spawn of SSMS in order to engage with this material. So let’s do that. And let’s talk a little bit about multi-statement table valued functions. Now back in 2017, imagine where the time goes. SQL Server 2017. It feels like it was just yesterday. It was just one plague ago. SQL Server 2017 was around, and there was sort of a precursor feature to a later feature called interleaved execution for multi-statement table valued functions, where when you invoked a multi-statement table valued function, rather than just guessing like one row or a hundred rows, depending on which cardinality, estimation model you’re using. SQL Server would wait until the function had executed and then use table cardinality to drive, you know, some of the cardinality estimates for the rest of the query.
Not a bad start, but, you know, we still don’t get column level statistics histograms with table variables, no matter if they are used in multi-statement table valued functions, if they are used normally as declared tables, or even if they are used as table valued parameters. I suppose we should talk about table valued parameters at some point, but not today. Not today, buddy.
But the precursor, the feature that this was a precursor to was, of course, table variable deferred compilation, where SQL Server would now wait for a table variable to be fully populated, and then it would use table cardinality for the table variable in order to inform various plan choices. But again, still no statistics, still no histograms.
So even though we knew how many rows were in the table variable, we still did not have any idea of the contents of them. But this is what a pretty typical multi-statement table valued function looks like. It returns a table variable with a set definition, and that’s about it.
Between the begin and end, you can put a lot of different stuff in there. There are some usual set of restrictions on functions where you can’t do a lot of side-effecting stuff, execute store procedures, yada, yada, yada. But today we’re just going to look at a very basic single insert into a table variable and some things about the query plans.
All right. So the first one is just like scalar UDFs. If when you get, say, an estimated execution plan for a multi-statement table valued function, the query that calls it will only have some sort of hidden information about the user-defined function itself, right?
You’ll see the table valued function with the name and then a table scan where data is pulled out of the function. The query plan won’t always look exactly like this. We’ll talk about why in a moment.
But then down in the estimate, down when you call the estimated plan, you’ll also have one additional estimated execution plan for the function body itself. You can see this is where we hit the post table, do some stuff with the row numbers, some filtering, yada, yada. But up here, we did not see any direct reference to the post table.
This is all happening within the body of the function. If we look in the properties here, we will see the reason why the query plan looks like this is because we have this contains interleaved execution candidates true attribute in the plan XML. That will not be there for all multi-statement table valued functions, of course.
Some of them, we will be denied access to that feature. But when we run the query and get the actual execution plan, well, this query does take a little bit to run. But the actual execution plan just shows us this.
And this is a little funny because we don’t see any time in here at all, even though the query took about four seconds to run. If we rerun this, I believe this is where we’ll see the operator times show up. And here we’ll see 4.6 seconds here and 4.6 seconds here.
So this is a very tricky thing, is when interleaved execution is used in your multi-statement table valued function, the first execution where you gather an actual execution plan will have no operator times for it. The second execution will.
So a little bit of a tricky troubleshooting thing there. So just be very mindful of that when you are query tuning. But notice that for this, we don’t get that separate execution plan down here where we had one before. This is the exact same thing like with the scalar UDFs.
All the work is, all the things that happen inside the function are hidden inside these operators in the plan. Well, mostly this operator in the plan. But this one is where data was pulled out of the multi-statement table valued function.
We see the two rows leave here. So that’s where, like, the work is all sort of in here, but the stuff that gets pulled out is here. The table scan is, of course, just the definition of the table that we have in here.
So that’s fun. And this gets, of course, particularly tricky when we want to do something with a query that calls a UDF and joins it to something else. Where, you know, now we just have this very simple looking execution plan up here with absolutely no references to all the work that occurs in here.
Now, since this type of function at some point has to insert into a table variable, what we’re going to see is some sort of non-parallel plan reason. The non-parallel plan reason, if you are on SQL Server 22-ish bits or better, will be table variable transactions do not support parallel nested transaction, which is a nice way of saying you can’t use a parallel execution plan when you modify a table variable that includes inserts. That is true even inside of these functions.
Of course, you can’t see that when you get the actual execution plan because the query plan for the function body is neglected from being shown in there for the very similar reasons to that of the scalar UDF where for every row that this thing has to process, we do pass data in and get data out. So if we get the actual execution plan for this thing, it’ll take a little bit longer. How long?
Who knows? Who can ever tell? Right? Could be a very long time. This function doesn’t seem very quick to me. Perhaps there is some indexing work that we could do with this function to speed it up.
The world may never know. Maybe an index on the post table would help us out here. Right?
Maybe this is just too painful. Oh my God. 27 seconds. But if we go and look at this. Now notice that we did not, for this query, we don’t have that sequence operator in the query plan, which means that we do not have an interleaved execution function in here. Notice that over here, we just don’t have that, what do you call it?
We don’t have the contains interleaved candidate attribute. It’s just not there. Right?
It’s not there in saying false. It’s just not there. Now we just get this generic non-parallel plan reason. Could not generate valid parallel plan. But even though in the function body we had, like when we had the estimated plan for the function body, we could see it was because of the table variable. So now we just see that as the higher level query plan for it there.
And we can see that we spent 27 seconds inside of here. Of course, this doesn’t look like very much of a big deal in the plan itself. Like if you were to be, if you were looking at, say, an estimated execution plan or, you know, you were like, you know, something like you got it from the plan cache or from query store where the actual runtime metrics aren’t included in the plan.
You would see this 0% cost and you would think nothing of it, wouldn’t you? You would say, my goodness, we scanned this cluster. 97% of the cost is scanning the clustered index of the users table.
We have to do something about this. We must, we must remedy this. But no matter what indexing you do up here, it is not going to help the 27 seconds that we spend in here. It’s really only by getting the plan for the function body that we can make any sort of reasonable assessment of what we might be able to do in order to tune this query using the multistatement table valued function.
Sometimes it is good enough to, say, add an index to support whatever the body of the function is doing. In this case, even just an index on the owner user ID column would be enough to get us somewhere. But, you know, we’re not going to do that here.
This isn’t necessarily about query tuning. This is teaching you about the T-SQL that makes up these various modules and how they might mess with your T-SQL query plans and queries. So, multistatement table valued functions, generally, don’t want to use them.
Again, they have some niche use cases just like T-SQL scalar UDFs. But generally, one should attempt to avoid them at all costs if one cares deeply about performance. So, thank you for watching.
I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video, maybe. You never know. Tomorrow’s never promised.
All that stuff. Anyway, thank you for watching. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.