The Broken fn_xe_file_target_read_file DMF In SQL Server

The Broken fn_xe_file_target_read_file DMF In SQL Server



Thanks for watching!

Video Summary

In this video, I delve into a frustrating issue with the `sys.fn_xe_file_target_read_file` dynamic management function in SQL Server, which has been broken since 2017. Despite Microsoft’s addition of an event date column to make filtering easier, something went terribly wrong due to uncoordinated development efforts. I demonstrate how this defect impacts query results and execution plans, showing you a workaround using explicit date conversion. Additionally, I share my experience with the `SP_Human_Events` stored procedure, which I’ve developed to handle similar issues efficiently. The video concludes with a thought-provoking question about the quality of production code in Microsoft’s software products, encouraging viewers to consider the thoroughness of testing and the potential for hidden bugs.

Full Transcript

Erik Goshdarnit Darling here with Darling Damnit Data. And in today’s video, I’m a little fuzzy. Let’s fix that fuzziness. Let’s take the fuzz off. I don’t need help looking any fuzzier, do I? Things are fuzzy wuzzy enough in this face. Today we’re going to talk about a rather disappointingly broken dynamic dynamic dynamic dynamic management function related to extended events. Now this is one of those things that for as long as, so this function used to be fine, but then in 2017 Microsoft added an event date column to it so that you could filter out date stuff from the DMF rather than going all the way into the XML to do it, which, sounded great. Except the summer interns were up to their old hijinks and apparently they did not talk to the Windows file system folks because something went terribly wrong. I’m going to show you what went terribly wrong and how you can fix it. Now this is now, you know, extended events have been the replacement for Profiler since 2012-ish, right? I mean, I know that they dropped in 2008 and that’s when they were supposed to launch it.

like really get awesome, but they just never did. Microsoft adds a lot of extended events every version of SQL Server, but the whole process of collecting and, you know, like analyzing extended events has always been awful. That’s why I’ve spent thousands of lines of code and hundreds of hours of my life writing stored procedures like SP human events to overcome some of the large gaps that I’ve found working with extended events generally. But before we do that, I’m going to blow your minds on this beautiful Friday and I’m going to tell you that you can support this channel for $4 a month by signing up for a membership at the link in the video description. If $4 a month is just too much for you, you would rather stuff that money in your mattress and hope that it doesn’t get moldy and rotten, get mouse eaten or rat eaten or something.

You can like and comment and subscribe. You can fill my heart with joy by your mere presence on the internet. If you have a SQL Server that you need consulting help with, that’s what I do for a living. Also shocking, I know, I am not just a YouTube celebrity. I also get my hands dirty and do actual work with SQL Server that other people either don’t want to do or don’t have time to do or can’t figure out what to do. So that’s me. And as always, my rates are reasonable.

If you would like some very high quality, very low cost training and you want to spend money before all those Black Friday sales kick in and people want to charge you way more money for SQL Server training content, you can get all of mine. About 24 hours worth for the rest of your God-given life for 75% off, which brings you to about $150 USD. It’s a hard deal to beat. Black Friday or no Black Friday.

If you are still on the fence about Past Data Summit and you’ve never seen one of my videos before, you can rest assured that I will be there November 4th and 5th co-hosting two days of wonderful performance tuning pre-cons with Kendra Little. It will be more fun than you’ve ever had in your life. Probably more fun than you thought could possibly be had with SQL Server.

So you could do that too. But with that out of the way, let’s have fun. All right. So this is a very quick Friday video.

Now, this is the DMF, the Dammit Management function that I’m talking about. Sys.fn.exe file target read file. Rolls right off the summer intern’s tongue, doesn’t it?

Now, this issue has already been logged here and I’ll put the link to this in the video description. But this is a known defect and Microsoft has just always been like, yeah, maybe we’ll do something about it eventually. I don’t know.

I’m on SQL Server 2022. It’s been broken since SQL Server 2017. Will they ever fix it? I don’t know. But we got ledger tables. We got dot feedback.

Priorities. Priorities. Maybe, maybe does no one uses extended event. Maybe that few people use extended events that they just don’t care.

That could be the case. But here’s the problem. If you were to query this DMF and you were to think that this timestamp UTC column would be something that you could very easily filter on with some date math, say, I want to find the last seven days of data, you would be horribly wrong. You would be mistaken.

Something in the file format of the system health extent, the XEL files. This is not limited to the system health extended event. This is all of them.

It uses like, I think it’s called like Windows epic time or something like that. And it’s just a weird number because it’s an epic, right? It’s just a big, long integer and it doesn’t get filtered correctly when you do this. Right?

It’ll get displayed correctly. It’ll get converted to a timestamp in UTC when you query it. But when you actually touch the file, things go awfully wrong. The only way to get around that is to write your query like this and actually convert that column to the correct date time to with a specificity of seven and compare that to whatever date math you want.

Now, I’ve already run these queries because, you know, I’m pretty sure this is going to be a Friday for you and I don’t want to, you know, plug up all your time running queries, even though they only take about 500 milliseconds. But here’s the first query, right? And if you look here, there is no filter predicate there.

If you look here, we’ll just read that there. If you look at this filter, it’s just saying where expression 1000 is not null. And if you look at the nested loops join, we don’t have anything there either.

It’s almost like that predicate on that column gets completely lost in the shuffle. But even worse is we don’t get any results back for the seven days that we wanted the results for. You might compare and contrast that with the query below where there is a convert on there.

And we do get a number of results back. In fact, in all, we get 566 rows back. That seems significant to me, a 566 row difference.

Actually, the difference between 566 and none. One might run that first query rather naively and assume that the SQL Server System Health Extended event just doesn’t have anything of value in it that we could use to troubleshoot our SQL Server. But that wouldn’t be true, would it?

That would be wrong. That would be incorrect. Now, the execution plan for the other one, you might notice if you’re quite eagle-eyed, has an additional filter in it right here. And this filter is where that predicate actually does get applied and applied correctly so that we can get data out of the System Health Extended event for the range of time we care about.

Now, my free store procedure, SP Health Parser, you can find it at my GitHub repo, does use very similar queries to this to pull data out of the System Health Extended event. And this was something that I struggled with a bit when I first started writing the query. And I was thinking to myself, well, you know what?

If someone is on SQL Server 2017 or better, we should not go digging into the XML to look for the timestamp of things. We should just use what’s in this dammit management function. And you know what?

This was a really annoying thing to deal with. Because, like, you’re running this query that should work and it doesn’t work. And, like, you start sanity checking yourself and then, like, you actually feel quite nuts. You feel quite preposterously insane.

And then you find, you start digging around the internet about this particular function. And you realize that it’s just broken and you have to fix your query to get around the brokenness of the function. Now, I’m going to ask you, I’m going to end this video on a question.

Is this production quality code? And I’m not talking about my query. I’m talking about the type of code that you need to do this sort of thing in order to fix. When people talk about worrying about, you know, the quality of Microsoft software products, SQL Server, the cumulative updates, well, no longer the service packs, but the cumulative updates that Microsoft puts out, some of the features that it inserts into new versions of SQL Server, one might start wondering just how thorough the testing is on these things.

And one might start to have real questions about just how high quality the code being implemented in your enterprise Ferrari database system is when you start running into stuff like this. Because who knows what else those summer interns worked on? And who knows where else stuff might be weird and broken?

So, I’m just going to leave you with that question. Ponder that for a moment. If you were reviewing code and you came across a bug like this, would you let that code go out into your production workload? It’s a good question to ask, isn’t it?

Isn’t it? Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.