Troubleshooting Security Cache Growth In SQL Server (USERSTORE_TOKENPERM And TokenAndPermUserStore)
Thanks for watching!
Video Summary
In this video, I delve into a peculiar issue that has plagued several client servers—security cache growth leading to various transient problems such as CPU spikes, plan cache issues, and memory-related anomalies. I explain how the security cache, meant to simplify login and permission handling, can balloon out of control if not managed properly. To help you tackle this problem, I share a detailed walkthrough on identifying and troubleshooting security cache issues using specific queries and scripts. Additionally, I discuss trace flags that might offer some relief, but emphasize that they need to be set as startup flags for effective management. For those looking for more proactive solutions, I provide stored procedures and agent jobs that can automatically clear the cache when it exceeds a certain size, ensuring ongoing performance stability.
Full Transcript
Erik Darling here with Darling Data. And I remembered to turn my microphone on, so we’re off to a great start here. In today’s video, we’re going to talk about a very weird problem that I’ve seen crop up on client servers a number of times now and lead to all sorts of weird transient issues. CPU spikes, plan cache stuff, just weird memory things, stack dumps, you name it. I’ve seen this thing be the root cause of all sorts of weird problems. And of course that weird thing is the security cache. It sounds like this nebulous little doohickey that is supposed to make life easier by caching security information about logins and whatnot, but if it grows unchecked, it can cause some real big problems. Before we get into all that, of course, we need to talk a little bit about you and me. And things that I like when people sign up for memberships and contribute just a little bit to making sure that this channel stays free and open source for everyone to use. It’s like four bucks a month at the low end. If you don’t have four bucks a month, I totally understand.
You know, there are things that I’d probably rather spend four bucks a month on too. But if you don’t want to do that, liking, subscribing, commenting, all that good stuff is just another way to make my little heart go all aflutter. Shut up, Intel drivers. If you’re in need of SQL Server consulting, that’s my job. Apparently, I do all this stuff and more and my rates are reasonable. So you can hire me to do what I’m going to show you today for you. It’s fun. It’s really great, fulfilling, really just life-affirming work. If you need training in the SQL Server performance tuning space, you can get about 24 hours of it for about $150 when you use the discount code SPRINGCLEANING. If you look in the video description, there’s a link with SPRINGCLEANING baked right into it and you can use that. It’s amazing technology. The advanced URL technology here at Darling Data.
If you want to see me live and in person, and who wouldn’t? I’m even better in person because you can throw things at me. And, I don’t know, give real likes and comments. You can comment in real life, in real time. I think that’s called a conversation. Weird. Weird. I’ll be in Dallas Friday, September the 6th, doing a full-day training session.
And, November 4th and 5th, I will be at PASS Data Summit in Seattle with Kendra Little doing two days of SQL Server performance pre-cons. So you should come see us at those and you should come see me in Dallas if you happen to be in the neighborhood. So now, let’s get on and talk a little bit about how we can troubleshoot security caches.
Now, my good and dear friend, Josh Darnell, who is an application developer, was able to figure out this part of the demo. I don’t take a lot of credit here aside from doing some nice formatting on it, even though there are a couple things that could use some work, apparently. You know, it’s hard to find good help these days.
And, the whole point of this thing is to inflate our security cache. So, that’s what I’ve done. I have inflated the security cache by using SetAppRoll over and over and over and over again in a loop.
I actually had this loop run. Actually, the first run of it got me to about, like, a gig. So, I ran this a few times to get it up a little bit higher.
Just because it made things a little bit more interesting for me. Not because it’s, you know, really all that fun or interesting or cool for you. But, that’s what I did.
So, what we’ve got here are a couple queries that will help you look at security cache stuff. If you look at this one, you will see that things were cruising along going just fine for a while. And, then at some point, the security cache grew.
So, that’s about 2.3 gigs plus about another gig from the ACR cache store. So, that’ll be about 3.2 gigs total from there. So, this is a tough query to remember.
It’s not very portable. It’s not very interesting. I mean, it’s kind of interesting. Actually, you know, if you look at it and you actually click on the XML column, you can get a lot more information out. I don’t like parsing this stuff out from the XML to show in the tabular result because it makes a lot of, like, duplicate lines that are just kind of messy.
I generally just zoom into where, like, things grew or when things grew or, like, if they, you know, spike up from, like, a lower number to a higher number or a high number to another higher number. And, then I sort of just dig around in here because you can see all sorts of interesting stuff about, you know, entries getting put in but not removed and the size of things. And, it is, you know, mildly interesting if you have this particular fetish.
If you want an easy way, excuse me, an easy way of figuring out if your system cache, if your token and perm user store is growing a lot, you can use my free store procedure SP pressure detector. I’ve got it set up here to only look at memory and to skip some other stuff that’s not really pertinent to us. But, if you run that right at the very top, you will have this section here and you will see user store token perm is about 3.2 gigs total, which I believe is about what we talked about it being from the XML.
When we did the XML query, it was, like, you know, 2.3 plus 0.9 something gigs. So, that’s the size of the token perm store there. Now, you can clear this out manually by running this, DBCC free system cache token and perm user store.
So, but the thing is, if this is something that happens regularly because of your application either using set app role or, like, doing impersonation stuff, like, I think execute as is another thing that can really pump this up. Switching users back and forth in queries for different reasons. I’ve seen a bunch of applications that, you know, log in as one user, switch to another user to do something, switch to another user to do a different thing.
Like, they have different permissions and schemas and stuff. All those things will inflate the security caches. So, you can totally run this to clear that out.
If this is a big long-term problem for you, there are a couple of trace flags that can help. The thing is, they don’t help if you just do this. These have to be startup trace flags for them to really make a difference.
So, if you want to look into what 4610 and 4618 do, if you’re having this problem, go crazy. They’re pretty useful if you’re having the issue, but only as startup trace flags. They don’t fix a problem if you just enable them globally.
If this is a problem that you’re having a lot and the trace flags don’t help and your security cache is still growing, over in my GitHub repo, which I’ll have a link to in the video description, I’ve got a few scripts in there that can help. One of them is a store procedure that will run, look at the size of your security cache, and there’s a parameter that you pass in to say how big of a security cache you care about.
If it grows beyond a certain size, it’ll run that DBCC free system cache call and clear it out for you. I’ve also got an agent job to set that up to run. The schedule, I think, is baked in for like every hour or something.
If that’s not often enough, you can, of course, adjust the schedule. But all of this stuff, you can just hit F5 on. And, of course, if you want to inflate your security cache for some reason, or you just want the standalone analysis scripts here, you can use that.
Also in my GitHub repo is SP Pressure Detector right down here, which you can also get totally for free. You don’t have to like or subscribe or comment on that, but you can get that and also view the biggest memory consumers on your server.
And if that user store token perm stuff is up there, you might want to think about running the DBCC command to clear it out, maybe enabling the trace flags, and maybe using this code to set up a job to clear it out on a regular basis, because you might be having all sorts of weird performance issues and reliability issues because this thing grows out of control.
As for like how big it has to be before I worry about it, generally, once it gets past the 2 gig mark is when I see signs of trouble. If it gets up past like 4, 8, 16, 20 gigs, somewhere in there, then you’re just about guaranteed to have some issues.
So I’m usually pretty aggressive on this, and I usually set that to be around like 1, 1 to 2 gigs to clear out for the store procedure there, because like really anything beyond that, you’re just kind of asking for trouble in the long term.
So I hope that this is not a problem that you have. I hope that you don’t have applications that blow out your server’s security cache, because memory is precious, right?
And if you have, you know, 8, 16, 24 gigs of security cache, that’s memory that your server can’t use for other stuff, like caching data pages or query memory grants or having a plan cache or other things like that.
So it’s a bad problem to have. If you are having that problem, you’ve got some trace flags that you can look into. You’ve got some scripts that you can run to clear it out.
If you, you know, I honestly like, you know, I say try the trace flags, but a lot of people can’t just restart SQL Server with new startup trace flags in place. It might be safer for you to just use the scripts there.
Excuse me. So look at your SQL Server with SP Pressure Detector. If you see high user store token perm or anything like over like the 2 gig or so mark, you might want to think about, you know, clearing that out, see if the problem comes back.
If it keeps coming back, I’ve got you on the scheduled stuff with the store procedure right there. The store procedure also does some logging so you can see like, you know, which runs cleared stuff out, how big the security cache was when the run cleared. So there’s some diagnostic data in there too that’s pretty helpful.
Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I hope that all your dreams come true. I hope that you just get everything you want from life.
It’s a short endeavor and feeling like you are missing out on stuff is never a good feeling. So I hope you’ve got no FOMO. I hope that you get everything that your heart desires, including this video coming to an end.
That’s what I desire right now because I feel like I’m sticking the landing a little bit here. Anyway, thank you for watching. I’m going to upload this and figure out what to do with my life next.
All right. Cool. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.