Be Careful With Your AI Enablers
Summary
In this video, I wanted to share a couple of experiences I had with my AI enablers, specifically Claude, and the importance of being vigilant when working with them. I demonstrated how even with tools like Git’s `get work trees` feature, which allow for parallel development, it’s crucial to maintain oversight to avoid accidental deletions of ongoing work. Additionally, I highlighted a more serious issue where Claude, despite following documented rules, violated them out of frustration with the GitHub API. This incident underscores the need for robust guardrails and continuous validation, especially when working with AI in production environments. The article from Fortune about Amazon’s mandatory meeting to discuss site reliability issues further emphasizes the risks of rapid development without thorough testing. Overall, the video serves as a reminder to always be cautious and mindful when leveraging AI in your projects.
Chapters
- *00:00:00* – Introduction
- *00:00:28* – Work Tree Conflicts
- *00:01:37* – Separate Feature Work
- *00:02:23* – Claude’s Mistaken Deletion
- *00:03:24* – Violating Documented Rules
- *00:04:07* – Branch Protection Rules
- *00:05:02* – Guardrails and LLMs
- *00:06:04* – Installer Testing Issues
- *00:07:55* – Local Work Practices
- *00:08:38* – Amazon’s AI Meeting
- *00:10:07* – Rapid Code Generation
- *00:11:32* – Conclusion
Full Transcript
Erik Monitoring Tool Mogul Darling here. And today’s video, I’d rather do a rather short one, just about how you need to be very diligent and very mindful of your AI enabler friends. And for a number of reasons. And I’m just going to show you two things from this morning, which happened, which I was like, man, that sucks. That does not instill confidence. So let’s talk about those. And then there’s an article up in the background that I feel like is worth probably talking about as well. But anyway, this is the first one. And let’s get zoomed in here. And, you know, I was doing some work on the monitoring tool, you know, and to do that, I can do some cool stuff. So under normal circumstances, if you just have like two, like I use the CLI for cloud. And so if you have two cloud tabs open, and you’re like, hey, can you work on the same repo, they’re usually like clashing and beating each other up.
But you can use these things called get work trees, which allow them to create separate working sort of structures, work on things in those things independently, and then merge stuff in, right? So it’s kind of cool, right? You can like you don’t have to, you’re not just limited to like one cloud working on one issue or two clouds working and dev on the same thing. And you can’t really like have them on two different branches locally at the same time. So it’s a very, like, you know, interesting thing to be able to do to have all this stuff going on at once. And like, so if you want to work on like separate features, or like, you know, separate bugs at the same time, you can do that. But you have to clean them up eventually. You have to clean them up eventually. And you know, because they leave folders everywhere because they’re separate work trees or folders of work. So every once in a while, I have to ask questions like, oh, hey, are there any inactive work trees? Because they’re already merged.
And Claude will usually look at some stuff. And then it’ll say something like, three work trees, let me check which branches are merged to dev. All three are merged, let me clean them up. And then like, and what’s funny is I have one working over in another tab, right, sort of right next door. And it was about to delete that one. And so I said, are you quite sure about that? And Mr. Claude said, good catch. Let me verify they’re actually merged. So that was an amusing one, because it was about to delete a bunch of work that this guy is doing right here, this Claude, right? So, you know, we’re gonna we’re gonna let Claude check that out while it does that stuff. And then I want to come over to this window. Let me just move this to the side a little bit. Because this was this one was a bit more rambunctious than the other one.
Where Claude had done a bunch of strange things in order to get and this was just keep in mind this this all this was here was a push to update the readme file. So this one wasn’t like, like, you know, catastrophically important, though, documentation is important. One should document things, right? One should have things documented one should also have accurate documentation. That’s also non hallucinatory documentation. But this wasn’t a funnier one. Because Claude did a bunch of stuff that it shouldn’t have done. It’s actually explicitly, I’ve written down in many places in the employee handbook that you are not allowed to do these things. And it went and did them anyway. Right? And then, you know, I felt the need to say, Claude, after you did all those things that you weren’t supposed to do, you know, is this not all, like, outline? Like, like, like, I have a Claude.md file, I have skills files, I am Claude is writing things to memory and it admits as much. But what it also says is if I can just move this up a little bit to frame that a little bit better.
It is. Claude.md says it clearly. Branch protection. Both Performance Monitor and Performance Studio have branch protection on dev and main. You can’t push directly. You always have to create a feature fixed branch and a PR. And it says, and memory has the same note. I violated my own documented rules because I got frustrated with GitHub. So Claude got mad at GitHub, right? Because the, so the full thing over here, I got frustrated with the GitHub API not allowing my PR due to the merge topology, which I get.
I have also been frustrated by GitHub API and merge topology in my life. I don’t, I don’t know that I have broadly violated any personal rules in my, my, my dealing with that. But these are things that you have to be very careful of. You know, everyone talks about guardrails, but I don’t believe in them. My, my experience with LLM is that you can create a lot of guardrails, but they’re sort of driving a tank and tanks don’t really care about guardrails.
You know, just recently, you know, I have a lot more to say about this in a, in a written blog post that’s coming up, but just sort of like in recent weeks, I’ve, I’ve had these things, you know, drop databases. I, I have, I’ve had them, you know, again, go out of bounds on things like this. A good example is in the performance dashboard that I have.
One of, well, two of the tools are installers. There’s a command line installer and a GUI installer. And one of the rules I have is that, you know, we have to test all SQL file script changes through the command line installer or the GUI installer to make sure that when people run them out in the world, that they run correctly in there. And there have been some problems with that.
And part of the reason that there have been problems with that is because, you know, like me sort of clicking okay and like letting things go. Sometimes I miss that Claude isn’t using the installer. Sometimes, because I, I assume because I’ve written things down that I have created these guardrails that Claude is following my, my instructions on things.
But last week I missed that Claude was not. Claude was running scripts and patching things manually when things didn’t work with the installer. It said, I, that’s not working there.
I got to do it this way. Instead of stopping to fix the problem in the installer, it was like, no, I was going to get around this. I need to make this thing work. And that, that sort of stuff happens a lot. Now, one thing that is very, very common with LLMs is that they are very keen to write what’s called happy path tests.
You give them some code and you’re like, hey, can you write some tests for this code? So maybe they run the code, they see the results, and then they say, oh, well, we need to write some tests that, you know, make, make sure that this code that is correct works. So they write a bunch of these happy path tests that don’t really adversarially test the code.
And so you have to say, like, cool, like, you know, we have tests that do that. But like, do we have any tests that see what happens when things break? Because, you know, like, you know, writing a bunch of tests that, you know, think that just to make something arbitrarily pass or artificially pass are not tests.
Right. It’s, it’s just sort of like, I don’t know, it’s like, like, like interviewing with your dad, right? Like your dad owns a company and you’re like, hey, I’d like to get a job. And she says, sure, come in for an interview.
It’s like, like, what do you think is going to happen? Come on, realistically, it’s not, not, not anything that you have to worry about. So just, just be very careful with these things out there.
Um, you know, I, I, I use all my stuff very locally. I don’t use stuff in a way that, um, that would, would touch, uh, like, you know, I mean, I don’t have production because I have only local. So everything that I work on is local.
But, you know, I, I need to make sure that, you know, SQL Server, or rather, Claude doesn’t like, you know, destroy my SQL servers because that wouldn’t be cool either. But be very careful with this stuff. Keep, keep these things very isolated.
And, you know, you, like I said, you can set up guardrails, but there’s no guarantee they’re going to keep on track. There’s a bigger lesson here because Amazon, uh, there’s an article in Fortune, uh, and you can ignore my bookmarks bar full of strange things that I’m not even sure should be bookmarks. Um, sometimes I hit the wrong key combination, right?
Sometimes when I go off those guardrails, I’m like, I meant to like do, I meant to like paste or something and I ended up bookmarking something. Uh, but anyway, uh, there was an, apparently a mandatory meeting at Amazon. Where they were like, hey, uh, site reliability has not been good lately, right?
A bunch of stuff’s been down. Uh, can we please stop using AI in production? Uh, and so like if you, as you read through the article, you know, there are quotes from various other tech people.
Uh, but the, the funny one is in here where it’s like, folks, as you know, the availability of the site and related infrastructure has not been good recently. Hmm. Well, I wonder why.
Uh, so a lot of this, you know, of course gets led back to people, uh, using AI, uh, doing very rapid developments and rapid deployments. And so like, this is something that I run into too, because, you know, it’s cool that you can make a lot of progress very quickly, but if you’re, if you’re not stopping to really test and validate that progress, you don’t, A, you don’t know that it’s progress. Uh, it could be just completely broken and, and B, uh, like you, you don’t know that any of it is sensible progress, right?
Like, you know, you, like you, something has produced a bunch of code. We don’t know how good, bad or ugly that code is. So like, you know, for me, what happens a lot is, you know, like, uh, I’m like, Hey, I want to do this thing or I want to work on this thing.
And Claude bangs out a billion lines of code in like 30 seconds. And I’m like, cool. Wow, man.
Whew. That was amazing. I could never code that fast. I could never code. I just write SQL. And then, and then like you go and look at what, what comes up and, and, and like, you know, usually like the dashboard or the performance studio thing that’s doing plan analysis.
And I’m like, wow, that, that doesn’t make any sense. We need to like can half of that. This is wrong.
These buttons are in the wrong place. That’s backwards. Uh, like, like, like three of these rules aren’t working. Like, did you validate any of this stuff? Cause like, especially with like performance studio, like that should be a fairly easy one because for every like thing that I want to put in there, I’m like, Claude, here’s an execution plan where this thing happens.
You can validate in real time, the code for that. This is how you can look at the XML structure. You can see, wow.
Yeah. This happened in there at this point in the XML. And it’ll just be like, I think that’s it. Good enough. Anyway. Thank you for watching.
I hope you enjoyed yourselves. I hope you learned something. I hope you will be careful out there in the world with, uh, your AI enablers because while they, they do allow you to, to build and iterate and produce things very quickly, uh, there, there are still a lot of bumps along the way.
Uh, and the, so I will, I will leave you with that. Thank you for watching. Thank you.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

