Test SQL Server Backup Performance For Free

Chapters

*00:00:00* – Introduction
*00:00:30* – Free Stuff on GitHub
*00:01:00* – Stored Procedure Overview
*00:01:27* – Backup Performance Factors
*00:01:57* – Testing and Iterations
*00:02:12* – Stored Procedure Functionality
*00:02:34* – Parameter Parsing
*00:03:10* – Cross Join and Temp Tables
*00:03:34* – Logging and Progress
*00:04:09* – Backup Command Construction
*00:04:24* – Reporting Results
*00:04:35* – Throughput Ranking
*00:04:52* – Compression Settings Report
*00:04:57* – Impact Analysis
*00:05:11* – RAM Usage Report
*00:05:30* – Consistency Report
*00:05:59* – Example Run
*00:07:06* – Cleanup Process
*00:08:09* – Results Display
*00:08:30* – Small Database Test
*00:08:46* – Larger Database Considerations
*00:09:02* – Usage and Testing
*00:09:13* – GitHub Repository
*00:09:20* – Open Source Contribution
*00:09:32* – Script Naming
*00:09:45* – GitHub Issue Guidance
*00:10:12* – Conclusion

Full Transcript

Erik Darling here, Darling Data, continuing my epic journey through monitoring tool mogulhood. I want to walk through a stored procedure that I wrote to help, I mean, I wrote it for one person to help them test different backup configuration performance, but I’m sharing it with you because that’s how I roll, right? Lots of stuff gets out in the world for free. It’s over at my GitHub repo at code.erikdarling.com. That’s also where you can find other free stuff that I’ve done. This one’s in the Darling data repo where you find the majority of my other SQL Server related scripts, but there’s other stuff up there too, like performance monitors, like this one right here. Look at this handsome devil telling you all sorts of stuff about SQL Server. And then there’s an alternate version of the monitoring tool that also tells you all sorts of useful stuff about SQL Server. You can go read about all the differences between them and what they do. And then there’s Performance Studio. Where you can get good advice about query plans. Just open one up in there and you have me looking at your query plan for you and giving you feedback on it. So there’s all sorts of free stuff that I do that I just want to point out. So this is a stored procedure though. And just kind of walking through some of the code. The whole idea here is that there’s all sorts of stuff that you can do to try to make backups faster.

So for example, you can mess with how many files you stripe it out to. You can choose to compress a backup or not. You can mess with more sort of rocket science-y settings like buffer count and max transfer size. And there’s stuff that you might want to know about your backups while they’re running, like how are they making progress. So you can choose that here. And you can also choose how many times you want to run through the test because just running a test once, you know, you can get some good information. But you kind of want to run it maybe more than once to figure out, like, was that an outlier? Was that like the good average of what happened? So there’s things that we can do programmatically that are a lot easier to do. But it requires testing, right? We’ve got to know what we’re in for here. So the stored procedure, what it essentially does, there’s a help section here in case you forget what any of those things mean. But there’s also an example execution here if you’re into that sort of thing.

But it’s kind of what this thing does after it, you know, does some initial sort of smart stuff is it, it takes all of the parameters. So the you pass in parameters is a comma separated list. And what what I do, there’s a table where we hold the sort of logging results for each run through. What it does is, it takes all of the different things that you pass in and those CSVs. So like, like compression settings, number of files you’re striping to max transfer sizes, you know, buffer counts, things like that. And to put in and it parses those comma separated lists out into temp tables. And then what it does is it cross joins all of those temp tables together, right?

And what that does is it creates a combination of everything that essentially explodes all those CSVs out into into a whole bunch of different test combinations of things. So you can test combinations of like all that stuff in one run through over many, many iterations, then, you know, of course, the usual sort of, you know, logging and, you know, message tab awareness of where you are in the loop, because that’s important stuff, right? If you’re running things, you kind of want to know where they are. So you’re not like, hey, when’s this thing going to be over? What am I doing here?

So that’s what like a lot of this code does. And then, you know, after we kind of go through all this, the cursor that does important things, and more printing things out, that’s fun. And then when we get down, though, this is where we build up the sort of backup command that we’re going to run. And then we grab a whole bunch of information about what happened there, and we log it for each run through.

And then that allows us at the end to not only show you sort of a general, give you a general sense of which ones work best, but also to sort of do some reporting on things, right? So like down here, this will, the first report result that you’ll get back is going to be the result of all configurations ranked by which one had the best throughput. So like which one, which one completed the backups the fastest.

Then there’s another report that will tell you the best config per compression setting. So if you test compressed and uncompressed, this doesn’t, like if you only, if you only test compressed or only test uncompressed, this one’s a little bit less useful. But if you’re testing both, then you’ll get the best configuration per compression setting.

And then the third one is which parameters had the most impact. So like when we want to figure out like, you know, like, like, like across all of the backups that we took, which ones like had the biggest impact on like, like, like every backup that they were involved in. That’s what this will tell you.

And then, which might be important for your environment, right? That’s, it might be a good thing to know about. And then we have another report that tells you the best throughput per megabyte of RAM. So like, you know, different types of backups require different amounts of memory.

Uncompressed backups, you know, can, like the amount of memory buffers required is going to depend on max transfer size and buffer count. And then like compressed backups are actually sort of like 3x the memory size usually because you have three different compression streams that you kind of have to deal with. So you use like, like, like, well, compression is wonderful for like what ends up on disk.

The memory that it uses while the backup is being taken can be, you know, is more by 3x. And then there is a final fifth report that only runs if you did multiple iterations where we talk, where I basically report on how consistent each backup setting was. So the lower the standard deviation is, the more predictable it is.

And that is ordered by most to least stable, which helps you sort of figure out if there were outliers or if like, you know, taking like one set of backup options just had wildly different metrics after each run through, then that’s what, you know, that’s what this would show. But anyway, there’s an example run of this over here where I tested this just backing up to null. If you do backup to a path, it will clean up the backups after each iteration through.

So you’re not going to have to worry about like disk space blowing up. It uses the built in sys.delete, xp delete file thing that like all the scripts use it. It’s like like one of like the purge things and maintenance plans, but it allows you to delete backups from a particular path.

So if you are using a test backup path, don’t back, don’t run these backups to your like default location. That’s insane. But it will delete like the backups that it made.

It’s not going to delete all your backups. It keeps track of which, like the backup files that it created and like the name of them and the path to them. And it just deletes those. It’s not deleting every backup.

Don’t worry. I don’t want to get in trouble for you being like, I went to test my backups and Eric deleted all my backups. I wouldn’t do that. That’s cruel. So, but I just ran a sort of quick round through these and the results look about like so. It is maybe not terribly interesting backing up to null.

Oh, look at that. We expanded all our results at once. What a good time. So this is what it’ll show you. And I use the Stack Overflow 2010 database for this, which is maybe not the greatest example because it’s only about like 8, 10 gigs. All right.

It’s pretty small. But I just needed something that I can run through kind of quickly to show you what the results look like. So this just shows you sort of what I got locally backing up to null with a pretty small database. For you with bigger databases, you might find different things make more sense.

You might find that different configurations, maybe even depending on like how your database is set up internally with like files and file groups. You might find that different backup strategies work a little bit better for you. But for me, this is what the numbers look like.

Again, take these with a grain of salt. I’m not saying that this is like my results here or how you should take your backups. What I’m saying is you should use this procedure to test and see which ones work best for you, which is the whole point of why I wrote this. So, yeah.

Again, this is all available at code.erikdarling.com. You can go find it in my GitHub repo and it’s in the Darling Data one. It’s going to be called Test Backup Performance.

It’s got its own folder with a little readme in there so you can read all about it and enjoy yourselves. And maybe you’ll even use it and run it. Maybe you’ll even, I don’t know, want to do something with it. So, this is not part of the bunch of scripts that I usually run to do like server analysis.

Note that there is no SP underscore in front of the name because this one is a little bit outside of what I would normally be doing with someone. But I had to do it. I thought you might like it.

I thought you might find it useful. So, there we have it. Anyway, again, that’s at code.erikdarling.com. Check it out. If you run into any issues with it, open a GitHub issue because that’s where issues get solved. Don’t email me.

Say, I hit an error running your procedure. Help me. I’m going to tell you to open a GitHub issue. All this stuff is on GitHub for a reason. Open an issue where it makes sense.

Anyway, thank you for watching. Hope you enjoyed yourselves. I hope you learned something. I hope that you will test your backup performance. And I will see you in tomorrow’s video where we will talk about, where we’re going to get back to, I’m going to get back to our roots a little bit and we’re going to talk about SQL Server stuff that is not an open source tool or vectors or backups. Backups.

Can you imagine me having to do backups in this day and age? Anyway. All right. Thank you for watching.

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.