Making It Easy To Tell When Data Changed With Change Data Capture In SQL Server
Thanks for watching!
Video Summary
In this video, I delve into tweaking change data capture (CDC) tables to make them more human-readable by adding a `change_time` column that captures the exact timestamp of when rows were modified. This simple tweak can significantly ease the process of working with CDC data, especially for tasks like populating data warehouses or performing other data movements. I also share my personal disdain for change tracking (CT), describing it as one of SQL Server’s worst features due to its tendency to cause server backups under high concurrency. The video walks through enabling CDC on a sample table and demonstrates how the `change_time` column can be used to streamline data movement processes, making it easier to identify new or updated records based on timestamps.
Full Transcript
Erik Darling here. Still Darlinging the data, as far as I know. Haven’t quite had any acquisition offers lately, so, you know, who knows, maybe someday. You know, the dream of every founder is to be acquired and, you know, go start some new business. So, you know, any day now, I’m sure. My billions of weight. Maybe I got into the wrong line of work for acquisitions, though, who knows. Anyway, today’s video, we are going to talk about how you can tweak change data capture tables so that you have a more human-readable idea of when rows ended up in your CDC tables. This can make all sorts of things easier. Now, look, there are built-in functions that can help you with this that do all sorts of things easier. There are lots of wacky stuff with log sequence numbers. The problem is, they’re complicated. It’s hard. I don’t like doing it. I like things to be easy, especially when my goal is to build bigger tasks around change data capture. Whether it’s populating a data warehouse or, you know, doing some other data movement stuff. This is one of my favorite tricks. Now, for so many reasons, I absolutely loathe change tracking.
Change tracking is one of the worst features that Microsoft has ever rolled into SQL Server. It is crap on wheels. I have seen it take down so many servers under high concurrency because what happens is you start to stack up these commit table weights. It’s commit underscore table weights. It’s a monster. And when your server backs up on those commit table weights, there is no helping it. You’re going to flame out. Again, change tracking, CT, is god-awful. If you turn it on, no one’s going to want to help you. I don’t even know if anyone at Microsoft still does anything with it.
I think it’s abandonedware at this point. Change data capture is beautiful. It’s wonderful. It is my precious angel because change data capture works asynchronously. You set up change data capture for a table or tables. And as changes happen to those tables, change data capture reads from the transaction log and moves that data into tables that you can actually see and do things with in your database, which is wonderful. Because you can do all sorts of neat stuff with those tables, including index them any way you want. You can add columns to them. You can, I don’t know, let’s see, you can prune them out whenever you want.
It’s not like change tracking where you have to do magic incantations and hope that whatever thing you run actually clears things out of the change tracking tables. Again, change tracking, nightmare nuisance, change data capture, beautiful, wonderful angel. So if you take one piece of advice from me and you work at a company where someone is like, do we want change tracking or change data capture?
Please say change data capture and save yourself all the problems in the world. Now, neither one of those things will tell you who changed something. Right? That’s auditing. That’s a completely separate thing.
You just need to track what data changed. Change data capture is what you should be using. So in my database called crap, and if you do not have a database called crap, I question whether you are a real database person, because every database person needs a database called crap to do crap like this in.
It’s just the way it goes. So I’ve created a table called posts that mimics, make that formatted correctly, that mimics the post table in the Stack Overflow database. At least, you know, the publicly available copy of it.
I’m sure the actual, you know, production copy of the post table is a nightmare. And I’ve just stuck a thousand rows in there to make life easy, because I felt like it, because I don’t need a huge table to show you this. So what we’re going to do is we’re going to walk through enabling change data capture for the crap database.
Oh, that happened so quickly. That was nice. And then we’re going to enable change data capture for the post table. And this will take a couple seconds to kick in, because it does all sorts of stuff in the background.
And, well, you know, okay, well, whatever. Cool. And what we’re going to do is we’re going to add a column to the post table that adds a… We’re going to… Jeez.
We’re going to add a column to the change data capture instance for the post table called change time, with a default of the sys date time that will tell us exactly when data changed. So let’s go and alter our table.
That happened very quickly. That is lovely. Now, if we go and look in here, there will be nothing, because we have not captured any data changes. There simply have been no data changes to capture.
That’s okay, because we’re going to make some changes. It’s going to be great, right? It’s going to be so much fun. So we’re going to update the table, and we’re going to plug that in there. And one row got affected.
And now, wonderfully, magically, we’re going to have data in our change data capture table. Look at all this great stuff. Now, if you were looking to use change data capture professionally with the built-in functions, you could use the LSNs in here and whatever to figure out when things change.
But that’s Bush League. You want to do things in a way that people will actually be able to use, right? It’s like extended events.
Like, Microsoft just made it as hard as possible to do anything. And they’re like, why doesn’t anyone use this? Why do people still use Profiler?
Why? Well, I’ll tell you why. Because you made it impossible. You jammed it full of XML. What did you expect was going to happen when you made something as difficult as possible? You built something that was great for, like, you know, like smart support engineers to use.
You ruined it for normal people. Much like using LSNs to track when things changed. But now, if we scroll over a little bit in our change data capture results, we will way over at the end here have this lovely column called Change Time with the exact time and date that something changed in.
This gives away when I’m recording this, doesn’t it? Crap. You’ve got me. Now, if we make a different update to that table, we’re going to just add 1,000 points onto the score here.
And we revisit our, oh, that highlight did not go well at all. And we revisit the change data capture results. Now we have some additional data in there. And if we scroll over a little bit further, we will see that the change time column is now incremented to exactly when we made the change over here.
Right? So this is about a minute and a half apart. So we see exactly what time something came in. This can be useful for all sorts of things.
Like, if you’re the type of person who uses change data capture to push data to another source, or rather another destination, another target, it’s a lot easier to use this change time in your queries to figure out if this is new data that you need to change.
Because in whatever process you do, you can say, last data movement was it this time. Is the change time column greater than this time? Yes!
We move that data. Is it less than that? No, we don’t touch it. No more futzing around with anything else. You can have a jolly old time moving your data over without having to worry too much about it. And so for these reasons, which I believe to be self-evident, change data capture is superior in absolutely every single way to change tracking.
If you don’t like it, again, I’ve got the Delta Miles to come fight you. I’m happy to do it. So now we’re going to finish this thing off by turning off change data capture for our table and database.
Because if I don’t, then every time I run SPHoo is active, I’ll have some weird job running in the background that just makes things weird and confusing. And no one needs that. So there we go.
I’m going to finish this video by saying, thank you for watching. I hope you learned something. I hope you enjoyed yourselves. I hope you never use change tracking in your life. I hope you always opt to use change data capture instead.
I should really start doing this stuff at the beginning of the video before everyone stops watching. But, you know, if you like this video, I like thumbs ups and I like comments.
Again, even mean comments because I do have the Delta Miles to come fight you. But still, you know, take your chances. Why not?
If you enjoy this type of SQL Server content, you can subscribe to the channel and join now over 4,000 data darlings in their joyful bliss every time they get a notification that I publish a video. So you should do all three of those things.
Except, I don’t know, maybe be nice. You could choose to be nice. What does it people say? It costs nothing to be nice.
I don’t know if that’s true. Sometimes it takes a physical, there’s a physical cost to being nice. There’s a physical toll on being nice. But other times, you know, it’s cool. It’s fine.
You know, tip your bartenders. It’s basically my, be nice to your bartenders. I’m all, I’m, the people in the world who I am the nicest to are bartenders because anyone who just keeps bringing me drinks is best friend.
So much like the way a dog will look at you when you give them a liver treat, that’s how I look at bartenders who bring me drinks. So that’s, that’s me though.
You might have your own kink. I don’t know. Whatever. Anyway, I’m going to get going. Apparently I have a phone call starting in a few minutes. So I should probably, I should probably attend to that. That would be a client call.
One of the lovely people who paid me money to do SQL Server Consulting so that I can keep recording these videos for free. If you find yourself in the market for a SQL Server Consultant to help you with performance or choosing between change tracking and change data capture, you know how to get in touch with me.
Here I am. My rates are reasonable. Thank you for watching. I’m going to go, I’m going to go entertain someone for money now. That sounds, that sounds wronger than it actually is, but gosh darn it, it’s the truth. All right.
Cool. Adios.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.