bit Obscene Episode 2: SQL Server Agent Jobs

bit Obscene Episode 2: SQL Server Agent Jobs


Video Summary

In this video, I delve into the world of SQL Server agent jobs and their impact on database performance, alongside my co-host Joe Obish, a renowned expert in the field. We explore common issues such as overlapping schedules that can lead to CPU spikes and how these jobs can be scheduled more efficiently. Joe shares his experiences from various client engagements where he has helped optimize agent job cadences to avoid unnecessary load on servers. The discussion also touches on the challenges of using Query Store for workload analysis, especially when dealing with frequent agent job runs, and suggests strategies like “fuzzy scheduling” to better manage these tasks. We emphasize the importance of flexibility in scheduling to meet business requirements without overwhelming server resources. By sharing practical insights and real-world examples, this video aims to provide actionable advice for database administrators looking to improve their SQL Server environments.

Full Transcript

Erik Darling, Hello and welcome to the second episode. It’s no longer a podcast. It is now officially a radio show. We’ve given up on the podcast idea. Apparently, it’s just a saturated market and radio is where it’s at anyway. But anyway, welcome to the second episode of Bit Obscene, the radio show. I’m Erik Darling. I do consulting things with SQL Server. And joining me today, again, keeps coming back is Joe Obish, world, world renowned SQL Server agent job expert extraordinaire. And if you hit, if you hadn’t guessed by that introduction, the topic of today’s show is SQL Server agent jobs and like a SQL Server agent in general. So let’s get started with that. Joe, take it away. You apparently have some thoughts on agent jobs and I’d love to hear them. What? This is a radio show. Why bother digging this SQL Server agent job on my closet? You know, you can’t see people on the radio. My second thought is we’re flipping the script a little, Joe. I thought we’re doing so well that we had our first sponsor. You know, they rearranged. Send everything. You seem to be telling me that that I might have misread your email and I realized as not sponsoring. This is out.

They were technically anti-sponsoring us. They wanted you to not show their products in the background. It’s too late now. Yeah. We’re screwed. Here come the lawyers. You know, now our viewers might be thinking, well, punching bags has to do with databases. And I think there’s bigger connection. You know, for example, when you’re in SMS, you know, and you see the little like a database.

What’s what’s shaped like a cylinder, right? It’s also shaped like a cylinder. It is. Until you punch it enough. It gets a little. And speaking of punching, I’m sure you’ve met many developers. You basically treat the database as their own personal punching bag.

You know, various metaphors would apply there, you know, punching bag, toilet. You know, there’s lots of things that lots of maltreatment of databases by developers out there. And for the most part, databases can’t even fight back. They can’t defend themselves.

You know, they just have to sit there and take whatever abuse is hurled at them. I think databases have their own way of, you know, sort of protesting developers various ways. I think there’s a marking up to it here from Microsoft.

You go to the cloud and you get a free punching bag because you’re, I mean, you’re certainly going to need it. Well, when you move to the cloud, you are the punching bag. Microsoft just continues to hang you upside down and punch money out of you.

When I’m looking at a database and I see, like, my, like, 10th parameter sniffing issue of the day, you know, the punching bag really comes in half. Do you get parameter sniffing from your agent jobs, Joe? Oh, okay. You want me to end the account of your team?

No, I want you. I didn’t know if you were looking for a segue or not. So Asian jobs are, I guess, another example of how databases can be abused. It’s also an example of how Microsoft abuses us.

The other day I was looking into getting some, you know, like, they have Asian jobs for Azure SQL database now. Just what it is. It’s like this type of thing you have to buy.

You charge, like, $23 a month just to run Asian jobs. It’s, uh… $23? That seems cheap. It seems cheap, but, you know, it’s got to be pure profit for them.

Oh, yeah, totally. I mean, Asian is existing code. Nickel and dimed or, you know, run Asian jobs. Microsoft is turning into Oracle in all sorts of fun new ways.

They have subscriptions to everything. You want to plan cash? $10 a month. I thought we wanted to have, like, Microsoft guys on here one day.

You’re kind of, you know, kind of sinking. I, uh, I’m not entirely sure who from Microsoft would be brave enough to join us. Brave and foolish.

I can think of one man who’s brave and foolish, you know? Yeah, me too, but, you know, I don’t know if he has a working headset. It’s a big question.

I have somewhat of a narrow view when it comes to, you know, what problems people have with their databases. And by that I mean I’m not a consultant. You know, I don’t put my suit and tie on in the morning every day.

I haven’t seen that many production databases. Now, you, on the other hand, have a very wide view of the outside world. You’ve seen my databases while you’re doing your, I don’t know what’s called, Erik Darling Deep Dives?

Is that what you call it? Uh, you know, I don’t have an official name for anything. Oh, you’re very marketing.

Yeah, you know, it’s sort of an undefined relationship. It’s whatever you need me to do with SQL Server, we can work through it together, you know? So while you’ve been doing what people need, how often have you informed them that they have agent job problems or difficulties or, you know, there’s areas to improve with that aspect?

So, I mean, the majority of my work is, of course, SQL Server performance tuning. So if someone has an agent job that is doing something particularly heinous from a performance perspective, then, you know, tuning whatever is inside that agent job is certainly on the menu. Sometimes I’ll help people set up certain agent jobs to do maintenance and stuff if they need that sort of help.

A lot of the times when that is something that people require, that usually takes the form of me fixing their index maintenance to stop doing stupid, useless rebuilds and reorgs and just continue to update statistics. That’s probably the most common thing that’s, aside from, like, you know, performance tuning, whatever is encapsulated by the agent job. Yeah.

What kind of things have you seen, Joe? And now you’ve said that, I’m wondering if my experience is a bit atypical. You know, maybe it’ll be instructive anyway. So I think there’s a trap that can be quite easily fallen into depending on how much you use agent jobs.

Like, it’s natural to, you know, like, say you have a bunch of tests you want to run. Okay. I’ll run these tests every hour.

These tests every 15 minutes. These tests every five minutes. These every one minutes. It’s a lot of overlapping job schedules there, Joe. Yeah.

So, like, you know, like, it’s a very natural thing to think, you know, like, you’re basically thinking in terms of, like, like, how would a human do a chore, right? Like, oh, every hour, every 15 minutes. Yeah.

And as you pointed out, when you pick these, like, you know, human-friendly frequencies, they all overlap. In the big, in the grown-up world, we call them cadences, Joe. Cadence.

Cadence. I don’t think I’ve ever seen the word cadence in SQL Server. And what cadence would you like this job to run? Anyway, if you do the math or if you, you know, like, envision a schedule, like, every hour, you’re going to have every single agent job kick up. Yeah.

If you follow the, you know, every five or 15 or one hour. Yeah. It’s like, I have definitely seen servers that have huge CPU spikes every 15 minutes. Mm-hmm.

Just because, you know, do the hell agent jobs are scheduled, you just get that huge burst of work. Yeah. Every 15 minutes. Yeah. I mean, it’s to the point where, you know, like, it was definitely, the server was noticeably slower during those times.

Like, you know, like, if you’re in the application, you can notice it. If I was doing great performance tuning, it’s like, oh, this, this, you’re just taking longer than I thought. And I would look at the time and, oh, well, it’s, you know, 3.15 p.m.

So, of course, it’s taking longer. Also, wait a minute. So, a schedule, you’ll shoot up or you’re blocked or something. Yeah.

And, you know, like, it’s also, if you’re taking a holistic view of late stats, it can really trip you up there, too. Like, oh, well, you know, over this four-hour time period, I’ve got 50% CPU average but a ton of SOS waits. You know, like, why do I have only 50% CPU but a ton of processes are spending their time waiting on CPU?

It’s a scheduling problem. But, you know, like, and there’s certainly inefficiencies in scheduling. But for the workload I was looking at, it just ended up being a lot of it was just Asian jobs.

You know, you’d have these big bursts. And then when you look at a bigger window, they’ll throw everything off and make the weight stats analysis misleading. Well, yeah, you know, I think it’s especially perilous because, you know, Microsoft does not give you any particularly useful tooling from the server level to show you when exactly weight stats happened.

You know, you can get some database level stuff out of Query Store, but you might not have Query Store enabled for every database. Not every database needs that level of monitoring. And, you know, there’s a real disservice there because what may look like a real problematic workload generally is really just a problematic workload on the hours.

Right? Yeah. Can you change the Query Store, like, roll up interval?

I don’t remember. What, for weight stats? No, just in general. Because, I mean, by default, things are summarized by the hour. Yeah, you can change the interval, I think, to different cadences.

But that’s something that I generally do just because usually by the hours is okay enough. I would say it’s okay enough unless you have this problem or, you know, many other problems that we’re not going to talk about today. Yeah.

Yeah. And, you know, like, maybe there’s this argument for, well, I do wonder if some people think, oh, well, like, I have to run this code every hour for business purposes. You know, the business demands that this is my requirements.

I have to do it. Sure. So, I don’t have a choice. Maybe people are expecting some data set to refresh every hour. Yeah. And, like, I think that needs to be decoupled from the actual scheduling. I think I’m a big fan of what I’ll call fuzzy scheduling.

Like, you need to run every hour. You know, like, don’t run it every hour. Like, run it every 3,593 seconds, for example. I would even take a different approach than that.

I think that if your job is, or rather, if your agent job is focused on some sort of data refresh or data purging, you should run that much more frequently. Because if you think to yourself, it’s sort of like the, for me, it’s sort of like log backups, right? Like, people will set up log backups in some schedule that meets their RPO goals, right?

Like, let’s say the business is like, you can’t lose more than 15 minutes of data. So, you set up log backups every 15 minutes. But that’s kind of a stupid approach because you should be aiming to beat whatever goals you’re given so that, like, if anything ever happens, you’re safe, right?

Like, if you don’t want to lose 15 minutes of data, you should be taking log backups, like, every five minutes. Because that way, you’re protected if anything happens to one of those log backups. In the same sort of vein, I would say that if you need to do an hourly data refresh, if you count on doing that refresh every hour and you, that task starts taking longer and longer, you’re going to start missing that goal.

So, you should do that more often because that way, if you do it more often, you’re moving smaller chunks of data and you have a better chance at meeting that hourly goal. You no longer put all the strain at that 60-minute mark. I think the point you just made leads into what I was going to say, which is, you know, it’s not like anyone cares when a process starts.

They care when it finishes. So, to even say, well, you know, like, I’m thinking about, you know, okay, well, for whatever business reason, we can’t refresh our item or quicker than an hour. Or, you know, that’s what the customer says they want.

Maybe the data would be incomplete if they didn’t refresh or something like that. I don’t know. But, like, even then, like, starting at the hour, on the hour exactly isn’t meaningful because you have no idea how busy the server will be. You don’t know if you’ll be waiting for locks.

You don’t know if you’re thinking, well, does that right fail? Right? You know, last time we talked about the many, many ways your perfect code can still fail. Absolutely.

Right? So, I think that it’s important to be flexible in how you schedule your agent jobs. If you, you know, have a problem like this where you have many agent jobs and there is a nozzle tax on your server CPU. And along the lines of what you’re saying, changing the cadence could be helpful in some cases.

I think even so, like, you still don’t want to schedule on those nice, even cadences if you can help it because then you’re going to get the overlap. And, you know, if you’ve got your four core loan machine that’s chugging along and you kick off 50 agent jobs at once, you’re not doing anything useful other than generating waste stats. Or worse, just, I mean, well, I mean, it’s still a waste stat, but just, you know, kicking off 50 agent jobs on a four core server, you have a very, very strong chance of generating some thread pool weights, I think, or resource semaphore weights, depending on what exactly.

Those queries are up to. Yeah. I mean, you’re not doing anything helpful or good or useful.

So, I mean, you’re, you’re, you’re, you’re basically launching a denial of service attack on yourself. So, I’m of the opinion that, you know, scheduling agent jobs every minute or five minutes or whatever, you just should do it. You know, like, even just doing it, like, every 61 seconds or 59 seconds and changing it for a job is, you know, that way you’re not getting that.

You know, it’ll overlap, you know, it’ll be offset. Um, the other thing I’ve done sometimes is just, you know, combining agent jobs.

Sure. Like, five things running every minute, you know, you could, you could use the, uh, you could have, like, multiple steps for job or some other method, you know, things like that. Sure.

Um, I was able to get pretty far with those types of approach, both combining agent jobs that had similar cadences into one. Yeah. And then changing the cadences to not be even minutes.

Yeah. You play a lot of FizzBuzz game with them. One, one annoying thing is if you create, um, if you create an agent job with, with a frequency of many seconds, I think it’s over a hundred. Mm-hmm.

SSMS throws an error if you try to open it. Hmm. No, I’ve never tried to do that. I, I, I guess some developers assume that the number of seconds in a frequency would always be two digits. So if you, so if you pick something like 301 seconds, you know, five minutes plus one.

Mm-hmm. And then you try to view the, the schedule. Yeah. And that’s the mass that there was an error. Yeah.

Other than that, it still works. They were probably expecting 15, 59 seconds as being like that. Yeah. Yeah. Because, you know, like, because why would someone pick 301 seconds as opposed to five minutes? Yeah.

It’s just, well, why wasn’t, why wasn’t that developer defensive enough to, to say I can convert this number of seconds to a number of minutes and have to have a remainder? I mean, I’m a, I’m a math idiot and I know you can do that.

I’m sure it was done like 20 years ago and, you know. What? Before math. No one cares. Um.

So I’ve done tricks like that to, you know, you know, when looking at servers of many Asian jobs and you have the CPU spikes, that’s one way of addressing it. Yep.

Yep. But sometimes even that isn’t enough. Yep. Um. For example, if you have many, like, tenant databases, you have one database per customer or whatever, and they all have the same Asian jobs. You know, like, if you, if you use scheduling tricks, okay, I’m not going to do it every five minutes.

I’m going to do it every five minutes plus one second. I’m going to combine things. If you have 10 databases and they’re all doing the same Asian job work, then you’re still multiplying that work by 10 times. Mm.

So. Yeah. That’s something I’ve run into as a problem that had to be solved. Yeah. One of my least favorite things in the world is when I, was when I expand the agent job node and I’m greeted by a list of like a thousand GUIDs because someone has SSRS reports all, you know, named in there. And SSRS reports aren’t named anything helpful.

They’re named job GUIDs. And then the job names are all GUIDs rather. And when you start looking at how those are scheduled, they’re all very, very tightly coupled. And those, those reports have a really nasty way of walking all over each other because like there’s some job step that like locks some other table and then starts the running the job.

And the whole thing is a nightmare. So that’s, that’s one of my least favorite things to see when I, when I open up agent is like, you know, multiple agent jobs, you know, you can at least, figure out what’s going on by the name to some degree, but you see those GUIDs, man, you’re, you know, you’re in trouble.

You know, you’re in a bad, bad way. Sorry. I’ve never done that.

So, uh, actually I did that once and that was really bad. Yeah. Yeah. Yeah. Uh, I didn’t know. I was, I was like, I was paralyzed. They were like, what do you want to drink? And I was like, I, I, the, the alcohol you don’t have.

Last time I’d ever do that. Yeah. I mean, uh, it’s, it’s, it’s good.

They got raised equal Saturdays because now that won’t happen anymore. Uh, well, I thought, I thought the fellow that was running SQL Saturday, Madison moved or something. Bridge, bridge.

Uh, not yet. It, you know, it’s, uh, anyway, I thought he was moving to New York. It’s, it’s, uh, all in the past, you know, your, your, uh, traumatic experience. It’s all.

Well, I, I blame that more on the, the, the people who planned that meal more than, more than anything, because there was a really good opportunity to go to a restaurant that had like a wine list. And everyone wanted to go to someplace like 20 minutes outside of Madison that had like a soda, a soda, a soda list. Great barbecue.

I’m sure. But who could tell because you can’t drink with it. You possibly enjoy a meal. So speaking of not enjoying things. Yeah.

How do you, how do you solve that SSRS problem? I have not yet had the pleasure of working with SSRS. That’s a mix of things. You know, SSRS reports are typically written by absolute buffoons and just need some tender loving care to fix them up. And so you, if you can make all the reports fast enough, generally the scheduling problem is a little bit less, a little bit less wooly.

Uh, then, then, uh, then like, then like with agent jobs that actually do like meaningful work. Do you think any of those buffoons watch our videos? God, I hope so.

So they hear that. Hmm. Good. Yeah. Well, if, if, if, if the view count goes down between episode two and three, then I’ll have some theories as, as to why that was. Yeah.

Well, that’s fine. You know, uh, I’m sure Everlast will be quite, quite pleased if, if our view counts go down. Um, Jimmy Everlast.

So, you know, and he said before, I mean, like sometimes agent jobs need to be optimized. Sometimes they shouldn’t exist. Yeah. Sometimes they’re not needed. And, you know, like, well, the, um, you know, like before, if you do a problem before you optimize, um, figuring out like what actually needs to be there is definitely good exercise.

Um, you know, my limited experience, you know, oh, you, someone who, they have a job that has like temp in the name was created months ago. You still need it. You have a job that’s aired out every day for the past year.

Should not be hearing out. You have a job that was, especially when it’s DBCC check TV. Like, wait a minute. You have a job that was mysteriously disabled months ago.

Yep. Should be disabled. It should be enabled. Yeah. Um, especially when it’s DBCC check TV. I’ve, I mean, who needs that, right?

I’m a big fan of, I don’t know, either source control or some source of truth where, you know, what Asian jobs you should have. Because I have gone through their size of, you know.

Yeah. Hey, look at all these great Asian jobs that are airing out and are disabled and whatever else. Do we need that? Um, I’m guilty of this too, but you know, when you create Asian jobs, there’s that little, like, that little box where you can like put notes or like, did they call it a description?

Yeah. You ever see that fill down? Uh, I do when I create them. Oh, wow. All right. Yeah.

Look at you. The, the, the, the, the model SQL Server you deserve. Well, I mean, let’s, let’s, let’s be real here. Uh, the, I, I don’t, I actually don’t, I don’t recall exactly how many agent jobs I distribute in my GitHub repo, but I, I make, I make darn sure everyone knows that that agent job came from darling data in the description so that when they open it up, they go, ah, I see.

Okay. That’s where that came from. The little, do a little self-promotion for the 0% of people who open agent jobs and pay any attention to what the description is.

I, I do think though, if you’re creating a temporary job, you know, you’ve got, you know, your future self will thank you if you like write even like one sentence or maybe even a couple of words in that description saying like what this thing has.

Well, I mean, I, I think it would be very helpful if agent jobs had like a very easy expiration date to apply to them. Like this job will self-destruct in three months when.

Oh, we can use the word easy. I think easy is the key word. Oh yeah. Easy is definitely the key word there. There’s definitely hard ways to do that. It’s not, not a lot of easy ways to do that.

I, I think I, and I, I have that, uh, I have that Asian job open source thing I did. Yeah. I don’t think I have anything written to the, to the description of the notes.

Oh, there you go. You are, you’re guilty of your own crimes too. It’s true. It’s crazy. Maybe I should go back.

Crazy to think about. Maybe I should sell that as, as advertising space, you know. There you go. Yeah. Just put like everlasting there and see, see, see how fast, see how, see how well that goes. Wow.

Joe’s everlasting agent jobs. No explanation. I mean, a lot of Asian jobs are, are everlasting. You know, they get created. No one knows why they’re there and they never get deleted. It’s true. Yeah.

They just run forever and ever. They error out forever and ever. Um. Speaking of agent job errors. One of, one of my least favorite facets of SQL Server agent is that it does not run with fancy set options that match what you get from SQL Server management studio.

If you just crack a query editor window open. And that has led to a lot of grief from people who, uh, try to use things like, you know, filtered indexes or, uh, computed columns or index views, because all of a sudden agent jobs start failing with these mysterious errors that did not fail when you, when you tested your code carefully in a SQL Server management studio tab.

And I think. That’s interesting. What? I actually didn’t know that. Um. Yeah. It’s crazy. Can you create a server procedure that forces the right options?

Well, I mean, you, you just have to put the right anti set options and like the job step or whatever. Like if it’s called a procedure, you can put it in the store procedure. If you’re just calling like code in a job step.

Yeah. You can put the set options in the like pre code run, but yeah. I don’t want to do that. Oh no. It’s. No.

It’s not fun and sexy to mess with. I don’t see set options. This would just be right. Uh, I guess along the, that topic, how do you feel about, you know, agents absolute with, with a ton of TC goal, just in that little unwieldy box.

Dim. Very dim. Dim. Very dim. Yeah. I’m with you on that one. Um, I mean, I’m sure in some cases you, you can’t use a procedure because you’re doing like illegal things, but if, if it can all be helped, you know, I’d much rather see agents out calling a procedure and that procedure has where you can, that way you can.

You can’t. You can’t. Totally. You can’t. You can’t. You can’t. You can’t. You can’t. It’s just, you know, um, that box is a perilous place. Um, the, the number of times when like, I I’ve tried like, like I’ve highlighted everything and like tried to do something and then like accidentally hit the space bar and had it all disappear on me.

It’s probably more often than I’d like to admit, but like, you know, trying to like edit code in there. It’s not fun. Uh, yeah, I, I, I really, I really dislike that. It’s not fun.

I mean, it’s, it’s just not, it’s not highlighted. It’s not formatted. There’s no, nothing friendly about it. Yeah. I don’t think that there’s a way to, to, to parse it. Right.

No, no. Yeah. There’s just, it’s just a blob. It’s like dynamic SQL in a, in a, in a box. It’s very annoying. I’ve seen people put dynamic, dynamic SQL in the box. Yeah.

Same. And most of the time it doesn’t work. Yeah. I like, I like when I opened that up and there’s like five or six cursors declared at the very beginning and you’re just like, Oh, where’s this going? You’re like, I don’t know.

I’m not following this here. It’s a billing hours opportunity. Huh? Yeah. Nothing says billable hours, like editing cursor code and SSMS prompt. Really, really valuable to people out there in the world.

Wow. If only someone would charge us an exorbitant amount of money to, to edit this code. I mean, don’t you mean a very reasonable, um, industry competitive amount of money? Yeah.

But you know, to, to, to a lot of people, those words don’t resonate, uh, to a lot of people. They’re like, I just want the most expensive thing. So I’m trying to position myself as the most expensive thing and worth it. Like worth every penny.

Like a Rolls Royce Bentley. I, I, I, I better be more careful with the billable hours I have left. Huh? Uh, I don’t think you have any left. That’s not true.

It is now. Don’t you realize recording this podcast goes towards your billable hours? This is, I think I’m doing this for free. I don’t think you want to make a video of yourself, like shaking down your customers.

Right. I mean, it’s not a, I don’t know, man. Uh, I, I, I’m not a business guy, but if you could care for my advice, it would be, uh, you know, do it in private when the camera’s not rolling.

Well, you know, uh, no one ever said it was a good businessman. Never been accused of that. You know what?

Another thing that really rankles me about agent jobs is, uh, way back when forget how, I forget how, how far in the way back machine this is. But, uh, I got this great idea that I would chain agent jobs together by having job steps that called, uh, like whatever, whatever the execute job store procedure is.

And my, my thinking was it’s going to call a different agent job. That different agent job is going to have its own like erroring, alerting, uh, and whatever other stuff going on.

And then I don’t have to worry about having like 50 job steps that are like, if this doesn’t finish, uh, throw an error and maybe do this other thing. And, uh, it didn’t work as procedurally as I expected it to, uh, apparently a job step just saying execute this job step is enough.

And so what would, what happened is I ended up running DBCC check DB on like 20 databases simultaneously rather than a single database at a time. And, uh, that was, that was a less than enjoyable, um, less than enjoyable time for me.

Uh, I didn’t expect that. And when I saw it in action, I was like, wow, like, I mean, that’s maybe a cool way to call like asynchronous code, like, and just have like parallelize some process.

But, um, yeah, uh, bunch of egg on my face for that one. I’m just understanding stuff. You had, uh, agent job.

Step that was just calling the procedure to execute or to like start another. What was the job or her step? Uh, so there was the overall agent job and, uh, each agent job.

Well, the, the main agent job had like a bunch led, like, I don’t know, like 10 steps in it. And each one of those steps was to call, uh, like whatever SP start job. And like, rather than waiting for the job to like start and finish it just went, well, I made that call.

It’s off doing something onto the next one. And so, yeah, I ended up, I ended up parallelizing check DB. Yeah.

I mean, to be fair, like, I think that’s, that’s the best way for it to work. Um, unless you’re counting on it to, to, to be synchronous. But well, I was, I was counting on it to be like, this job step, the job step completes. The next job step starts, calls the other job, the next job’s upside.

But no, it just went all of you at once. And I was like, ah, I’m just going to leave this out. If, if, you know, the various, you know, it’s hard to know what Microsoft envisioned with Asian jobs. Maybe, maybe it’s just the wild west support anything, but, you know, because I have that, that open source thing, which the thing implemented was comps for maintenance.

Yes. And, you know, like, I’m definitely letting them run many parallel jobs if they want. You know, I’m curious if, you know, if that’s an abuse of the framework or if Microsoft envisioned us doing it that way.

You know, I think when, when it comes to agent jobs, like there’s like, like there’s, there’s even the concept of like an agent job server where you can like have a bunch of centralized agent jobs and have them connect to other servers to do stuff. So I really do, I really do think that agent was designed for like a lot of flexibility and like a lot of like, who knows what crazy stuff anyone is going to do with any of this. And it wasn’t in like, I’ve never seen release notes where like there was an improvement or some reigning in of what agent does and can do.

Because I think that was, that was like Pandora’s box. And once they, once they let that out as it is, there was no way to bring it back. There was no way to like add any sanity to SQL Server agent.

It was just kind of like, well, it is what it is now. And just hope, like hopefully people will be responsible with it. Everyone heard Eric, you know, feel free to go wild with your agent jobs.

Go crazy. You’re not going to get brains out. Yeah. And then come pay my either reasonable industry competitive rates or my super exorbitant weights, depending on which, which you prefer. However you, however you want mentally want to classify my rates so that you can, so that you want to pay that invoice.

That’s fine with me. But if you are looking to do crazy stuff, you know, you, you want a lot of parallelism. Yeah.

As opposed to Eric, you didn’t want parallelism. I think my code is still on GitHub and it works. Why wouldn’t it still be on GitHub? I don’t know. I mean, I got deleted. I mean, I haven’t, I haven’t checked on it. It’s not like people ask questions or anything.

Well, you also didn’t really do a great job of marketing it. Wow. Okay. You didn’t. I mean, you know, it’s, it’s not enough to just.

It’s not an agent job. It’s just some sexy thing you can market. Well, but, but, but sure, you’re right. Right. But what you are offering, which was the columnstore maintenance thing is something that people fail miserably at. Don’t have a lot of good, not a lot of good alternatives.

Like all the scripts don’t handle columnstore very fluently. That tiger toolbox thing. I don’t, I don’t, I don’t even, I don’t remember.

I remember you talking about it at some point and you said it was okay, but not fantastic. You know, Nico had all those scripts in his CISL library, but he started working for Microsoft and those died. You know, so those are a little, those are a little shoebox coffin buried in the backyard.

And so like you had this, you had all this stuff that was really smart and they would do really good maintenance on columnstore. But then no one knew about it because you like. You should link in the description.

Sure. I’m sure. I’m absolutely do that. But you know, if you, if you want to call attention to something, you can’t just, you know, it’s not just a matter of putting it on the internet. People have to be able to find it and get to it. Know it exists and know it’s all, know what problem it solves.

Yeah. Yeah. Yeah. So I, so I talk about all my stupid scripts constantly. I want, I want people to, to get them and use them. And they put MTV in the video name.

No, the video name is called SQL Server agent jobs. Okay. Yeah. What else would I call it?

Hashtagotempty. Do we, do we, do you want to talk about anything related to tempty B in this? No, I was just referring to your, your, your blog post. I have like a million inch tags at the bottom.

Oh yeah. Yeah. It’s a story for another day. Maybe you should tell them the cameras are.

Well, it doesn’t really matter much anymore. I do know that. There are people out there who use the comps or maintenance thing, which is, you know, a professionally satisfying thing to know.

Yeah. I have no idea how many people there are. You know, it could be, yeah, I just have no idea. You can look at your GitHub repo and you can see what like the, the download stats and stuff are.

Oh, wait, wait. They have download stats. I thought there’s, oh, I didn’t know that. Yeah. You can see. I thought it was just like the fork count. No, you can, you can see like forks and stars at the top. And you can, if you go into like some special, like, like admin view, you can see how much traffic your repo gets.

And I believe there’s also a download, download count on there as well. Oh man. I wonder if I’d feel better or worse after checking that, you know? Uh, I don’t know.

It could go either way, right? I don’t know. It’s a big number. You feel good. Just walk in with low expectations. Don’t expect, don’t expect anything. You know, like these conversations, don’t expect anything.

That’s right. Don’t expect to learn anything or walk with anything useful. We’re still having fun. Yeah, exactly.

And, and punching things apparently. Should I get some sports equipment in my background? Uh, I thought you’re, you’re never willing to show your background. Oh, well, I mean, my background is a, is a green screen for other recording purposes.

So I could, I could, I could maybe, I could draw some sports equipment on it. Draw like a weight bench and stuff. If you want, we could like have a subscriber goal or something where if we get enough people, I could like punch the bag.

You, you, you, a lefty or a righty? Something like that. A righty. A righty. All right. I want to, I want to see you throw, I want to see you throw left-handed punches then. Okay.

I mean, you, you, you’ll have to subscribe to yourself then. You can start making some, some burner accounts and, you know, get those numbers up. You know, I, I do frequently subscribe to myself, Joe.

Don’t know. I don’t know who else would do it. I suspect there’s not really a big appetite for, you know, like I have some maintenance thing. I want to run.

I want to be like super parallel and efficient and run, you know, between 2 a.m. and 4 a.m. And that, you know, like people just don’t have those types of problems or they do what they’ll know about it. Or they do, but don’t, like there isn’t a big enough need to solve it.

Yeah. You know what I mean? Like, I, I, I do think you’re already reaching the edge of, you know, like what DBA is, or I guess, I know his developers, like, are even thinking about when it comes to, you know, solving a problem like that. Yes and no.

I do think that there is some demand in it, demand for it. So semi-recently, I couldn’t tell you exactly when, but, you know, Ola did add options to his scripts to run, run, run commands in parallel. I know it’s available for backup.

I want to say it’s also available for check DB. I don’t know if it’s available for index maintenance, but you can run Ola’s scripts and use like the backup in parallel parameter to run multiple backups at the same time. I think for maintenance tasks, the bigger rarity would be finding people who were on boxes that could accommodate that sort of maintenance in parallel.

You know, especially backups would, you know, usually a single backup does enough to, you know, completely saturate that, you know, one gig iSCSI connection they have to their SAN, which is, you know, you know, in the cloud, it’s worse. God forbid. But, you know, there is, so there is some demand for it.

And I’ve worked on at least a couple processes where the implementation, they wanted it to run in parallel. One of them was a dev refresh thing where I had to basically write, like, I used a table that would help generate dynamic BCP commands. And that would essentially like BCP out from a list of tables.

And that was designed to run in parallel. And then, like, there was an import side that was also designed to run in parallel so that they could refresh, like, a smaller subset of prod data to dev servers. And then there was another one that was a schema migration thing where they were moving from one big, long, flat table to, like, six or seven smaller, narrower tables.

And that was also designed to run in parallel. So, you know, there’s definitely a need for it in some circumstances. And I think there’s definitely a general desire for that sort of thing.

But, you know, really, I think the limit you hit is definitely server hardware and resources before, you know, like the capabilities of, you know, server being, like, you know, there’s a difference between, like, being able to do it and being able to do it well. So, one of them was multi-threaded via PowerShell. I remember that.

And the other one was multi-threaded via just setting up a bunch of agent jobs to run the tasks. So, no, this was before any such framework existed. Yeah.

This was very early in 2020. Not early in 2020. I’ll tell you that much.

If only I had been a few months sooner, you know. Yeah. Well, you know. Day late, dollar short, one parallel task shy of success. It’s all a cry and shame. You know what’s not paralyzable that I really like to be?

What’s that? All right. So, this is more of a conceptual thing. Doing a reinitialization of replication. Oh, God.

At least I think it can be paralyzed. I’ve looked. I’ve tried. I think I found, like, you know. Because the thing about replication is you go search for it and it’s like some guy I was talking about, like, you know, like 15 years ago. Mm-hmm.

Right? Like, there’s ever anything, like, new or current or like that. Replication has largely fallen by the wayside, I think. Like, I see fewer and fewer people actually using it or interested in it or wanting to know more about it. And there are those of us who are stuck with it now forever.

We inherited it. Yeah. And, well, there was some, like, weirder place in Microsoft that sounded really terrible. I don’t remember what it’s called. But maybe we can, you know, one day just read the documentation, like, make fun of how bad it is.

But, yeah. That’s a great way to make listener count go down is talk about this feature no one remembers the name of. It may be replaced replication.

That’s. It can be cathartic to hear experienced database professionals make fun of the software. Well, is it?

At least it is for me. Would it be fair to make fun of it without trying it? Yeah. Okay. I mean, what, I mean, like, surely you can make fun of big data clusters of that triangle. Well, they died.

So, I mean, yeah, yeah. Like, it died so funny. And think about how much material has to be there. You know, it made it. Couldn’t even make it to the next release. Well, I mean, I know that there were a few books about it. And, you know, there are a lot of people who told you if you didn’t get on board with big data clusters, you were, you might as well just dig your own grave.

I don’t know if that, I don’t know if that quite turned out. But, you know, the hype cycle moved to Synapse and now the hype cycle is on Fabric. And so we’ll just have to wait and see what the next hype cycle is so we can make fun of the Synapse and Fabric, right?

We can’t make fun of the current hype cycle. I feel like Microsoft has a community obligation to buy up all the books that those poor people wrote about features that they, like, immediately killed off. You know, because it’s got to feel so horrible to write a book about something and then Microsoft just, you know, kills it.

Yeah. I mean, so there’s two ways that Microsoft kills features, right? They either, like, outright kill them like they did with big data clusters or they kill them with negligence.

So, you know, stuff like, you know, I don’t know, what’s one that has been neglected for a long time? Like, let’s say service broker. I know that there was a lot of information for a minute and a lot of people who were like, yeah, service broker.

This is going to, like, revolutionize the modern DBA for the next 10 years. And then everyone was just like, I have to send XML messages. But, you know, I’m not doing that.

It’s not fun. I don’t know. I mean, Microsoft is, like, sort of silently killed, like, I don’t know, partitioning and indexed views because they won’t put an ounce of effort into them. So, you know, there’s the outright bang death and then there’s, like, the ghosting death.

Microsoft has a bad habit of ghosting features after it goes on these hot and heavy dates with them. All right, Microsoft. Well, if you’re listening, it appears that Eric is going to complain about the state of partitioning on every single one of these that we do.

Damn right. So, you know, if you want to stop, you’re going to have to do some piece of it. You know, I’ve got to put a little bit of effort in.

Just support min and max. So it’s a good start. Support min and max in partitions. It’s a great start. I think we should wrap back around what we’re supposed to be talking about.

Agent jobs? So, yeah. Okay. Like, you know, like, let’s say you have 10 databases. They all have the same agent jobs. So you’re getting that clipback.

Collision of cadences. So to speak. And you’ve got your CP specs every 15 minutes. I don’t know if this is the best way to solve it.

But the way I solved it was to, you know. Is there a easier way to do it? No, not really.

I like how you work that out in front of everyone. I changed the start and end times. Yes. By, like, a random factor.

So, like, it would essentially skip up to one execution around midnight. Yeah. And that random factor.

Well, I mean, like, it actually wasn’t random. It was based off, like, a check sum of the database name and the agent name. So, like, in effect, you know, like, say you have some Asian jobs that run every 301 seconds. For one database, they’d start running at, like, you know, at, like, one minute after midnight.

For another, it would be, like, one minute 35 seconds after midnight. And so on. Yeah.

So that was a way to, you know, to shift all those previously concurring Asian jobs. Yep. And now they’re just running throughout the day. Yeah.

I admit it was a bit distasteful to, like, you know, to, like, lose an execution, possibly. And they justified it, but they, well, because, you know, Asian jobs can always fail.

Sure. Right. Or, like, Asian jobs gets turned off. Sure. It goes down. Like, there isn’t really a universe where you can say, okay, I’m going to run this every five minutes. And it’s going to check the last five minutes of this queue table that I’m looking at.

Yep. Like, that just isn’t safe to be. Right. So, assuming all our Asian jobs were already written correctly, which is, you know, not at all a bad and reasonable assumption. Mm-hmm.

It’s perfectly fine to miss up to one around midnight. Sure. And that was the way that I, like, spread out that word. Yeah. I mean, I think, you know, the nice thing about Agent is that, you know, there is some built-in retry facility. Because if you put it on a schedule to run every X whatever, then even if one fails, the next job will hopefully just pick up whatever didn’t go right last time.

Or it will, you know, just either that or, like, something just real screwy happened and it’s just going to continue to fail until a human intervenes anyway. Right? Like for replication?

Yeah. Can you stop saying the R word on this radio show? Give me the willies. Oh. Oh.

And now I know what to spend my billable hours on. You will get very little useful content out of me if that’s how you choose to, that’s the path you choose to walk. But anyway, I do have to wrap things up here.

I have to get my kid to an after-school thing. So, why don’t we summarize neatly SQL Server Agent? So, in my case, I was able to solve the production problem.

You know, CPUs smoothed out. There weren’t any jumps. I didn’t have to look at the clock and be sad because it was half an hour on the mark. So, if you’ve got a lot of agent jobs, you might have problems where you have CPU spikes during the day because the schedule is going to roll out.

Some ways to solve it are combine agent jobs together, you know, run them in steps. Don’t run things every five minutes or every hour. You know, schedule them like a computer would.

Think like a computer. Run it every 31 seconds, every 61 seconds. That way you’re getting that offset. And if need be, if you’ve met many databases all running the same jobs, you can even offset the start time so it’s not overlapping.

And, you know, like, just as a general sanity thing, have some source of truth for what agent jobs should be there. Either source control or something else. You know, get your agent jobs you don’t need.

Fix ones that are airing out. Things along those lines. Anything you want to add? Yeah, I mean, you know, for me, the general caution with agent jobs is, you know, aside from everything you mentioned, is that they don’t run with the ANSI set options you might expect. And that if you attempt to chain agent jobs together, you might be in for a very parallel surprise.

So, those are my two points of wisdom about SQL Server agent. And that little text box and SMS. Don’t use that.

Yeah, don’t use that. And also write descriptions for your agent jobs so people know what the heck they’re doing. Documentation is important. Something like that. All right.

Well, thank you, everyone, for joining us for the second episode of the Bit Obscene radio show. It’s a little tardier than we expected. First, Joe was sick.

And then my family had some sort of, like, hyper-advanced, extended, played RSV remix going around. And last week, I was a disaster area. So, I’m just getting back to normal now.

And hopefully, we will have a brand spanking new episode for you on a more timely cadence for your listening pleasure. So, thank you for listening and or watching. Joe, would you like to say goodbye?

Or is goodbye too hard for you? This is why I don’t make promises about when the next episode will be. So, I feel very vindicated.

And, you know, my lack of commitment. All right. Well, your lack of commitment is obvious. Thank you, Joe. Thank you, Joe.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.