
Working With Claude Code: The Good, The Bad, and The Ugly
I've been using Claude Code daily across 5+ projects for the past few months — personal Laravel apps, a self-hosted radio station manager, infrastructure automation, a portfolio site, and this Monster Hunter fashion set community. Here's what it's actually like after the honeymoon phase wears off.
The Good
It's genuinely fast for the boring stuff. Server migrations, deployment config, wiring up new integrations — the kind of work where you know what needs to happen but it's just... tedious. I migrated three apps from Kamal on DigitalOcean to Dokku on Hetzner. Claude handled the rsync commands, Dokku config, DNS cutover steps, Cloudflare SSL dance, and storage migration. What would've been a full weekend of careful copy-paste-pray was done in a few focused sessions.
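For context, the migration work above boils down to a short runbook per app. The app name, paths, and hostnames here are placeholders, and exact flags vary by Dokku version, so treat this as a sketch of the shape, not a copy-paste script:

```shell
# On the new Hetzner box: create the app and its persistent storage.
dokku apps:create myapp
dokku storage:ensure-directory myapp-data
dokku storage:mount myapp /var/lib/dokku/data/storage/myapp-data:/app/storage

# Copy uploaded files across from the old DigitalOcean host.
rsync -avz old-host:/path/to/storage/ /var/lib/dokku/data/storage/myapp-data/

# Point the domain at the new app, then deploy.
dokku domains:set myapp myapp.example.com
git remote add dokku dokku@new-host:myapp
git push dokku main

# Finally, cut DNS over in Cloudflare and adjust SSL mode once the new origin answers.
```

The ordering matters: storage has to exist and be populated before the first deploy, and DNS flips last so the old host keeps serving until the new one is proven.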
MCP integrations are a game changer. This is the thing that surprised me most. I have Sentry's MCP server connected, so Claude can query my APM data directly. Tonight I asked it to check for slow page loads on my fashion site. It pulled span data, grouped by transaction, found N+1 queries I didn't even know I had, traced them back to specific controller methods, and I had fixes deployed in 15 minutes. That's not "AI wrote my code" — that's "AI did the detective work so I could write targeted fixes." The Datadog MCP exists too, so the same workflow is possible at work.
Memory across sessions actually works. Claude remembers that my auth system is magic links + passkeys (not Discord OAuth, which was removed months ago). It remembers that RadioBox should stay simple, that I hate unnecessary abstractions, that all my Laravel projects use Sail. It builds up context over time so I don't have to re-explain my stack every conversation. When it works, it feels like pair programming with someone who actually paid attention last time.
It reads code before changing it. This sounds basic but it's huge. It reads the controller, reads the model, checks the migration, checks the observer, then makes changes. It finds the existing patterns and follows them instead of inventing new ones.
The Bad
It guesses when it should research. This burned me hard on RadioBox. Deployment was failing with a frozen string error in Kamal, and instead of stopping to Google the actual error, Claude just kept retrying the same command with slight variations. We wasted 30 minutes and a bunch of tokens before I told it to stop and look up the actual issue. The fix was a one-line env var. The same thing happened with a Dokku plugin — it kept trying to fix the plugin instead of just removing it. I had to explicitly train it: "never guess at deploy fixes, research the error first."
It over-engineers when you're not watching. I once asked it to make a user an admin. Instead of adding a simple hardcoded check, it invented an ADMIN_ prefix convention, asked me to rename things in 1Password, and tried to rewrite the user sync system. For a one-liner. I've had to hammer in "do exactly what I asked, nothing more" as a persistent feedback memory.
Confidence without verification is dangerous. Multiple times across projects, Claude has said "everything works" without actually running the code. On RadioBox, it shipped a Liquidsoap config with an incompatible buffer() function that crashed the station. On kaiyo-shard, it changed model scopes without checking all the places those scopes were used — broke the homepage, the admin panel, and the sitemap. I now have a memory that says "NEVER say 'it should work' — prove it works." The hard rules in my CLAUDE.md exist because of real incidents.
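"Prove it works" has a concrete shape for a Sail-based Laravel app: run the suite, then hit the pages that scope changes tend to break. The URLs here are placeholders for whatever your project actually serves:

```shell
# Run the test suite inside Sail; a nonzero exit stops the conversation.
./vendor/bin/sail artisan test

# Smoke-check the pages that model scope changes tend to break.
curl -fsS -o /dev/null https://myapp.example.com/ && echo "homepage ok"
curl -fsS -o /dev/null https://myapp.example.com/sitemap.xml && echo "sitemap ok"
```

If Claude can run this and paste the output, "it works" becomes a claim I can check instead of a vibe.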
It forgets cross-project context. Each project's memory is siloed. It doesn't know that the RadioBox deployment lessons apply to mh-fashion too, or that the Dokku debugging rules I learned the hard way on one project should carry over to all of them. I end up duplicating feedback memories across projects.
The Ugly
The 3 AM Dokku incident. Deploy failed after I added a custom Discord notification plugin. Claude's response? It ran dokku git:from-image, which corrupted the deploy state entirely. Now the original problem was compounded by a self-inflicted wound. The actual fix was removing the broken plugin — a 10-second operation. But we spent 30 minutes making things worse first because Claude was guessing instead of diagnosing. That one earned a very strongly-worded feedback memory.
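The lesson generalizes: on a failed deploy, read state before mutating it. The 10-second fix looks something like this, where the plugin name is a stand-in since I haven't named the real one:

```shell
# 1. Look before touching anything: what's installed, what state is the app in?
sudo dokku plugin:list
dokku ps:report myapp

# 2. The new plugin was the only thing that changed. Remove it (name is a placeholder).
sudo dokku plugin:uninstall discord-notify

# 3. Only then retry the deploy, from a known-clean state.
git push dokku main
```

Every command before step 2 is read-only. That's the whole point: diagnosis first, mutation second, and never a state-rewriting command like git:from-image as a first resort.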
The Discord OAuth ghost. For weeks, Claude kept referencing "sign in with Discord" in UI copy and documentation — a feature I'd removed months ago. The old CLAUDE.md still mentioned it, and Claude trusted the stale docs over the actual codebase. I had to create a memory in all caps basically yelling "NEVER mention Discord auth." It's fixed now, but the fact that it trusts written docs over code reality is a footgun.
Root-owned .git corruption. On kaiyo-shard, Claude ran a version-tagging artisan command inside Docker. The command created git objects as root. Suddenly I couldn't do anything git-related on the host without sudo. Small thing, but the kind of "obvious in hindsight" mistake that an experienced dev wouldn't make. Now there's a memory: "never run git commands inside Docker containers."
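If you hit the same root-owned .git state, the recovery is a one-liner on the host, run from the repo root (this assumes GNU coreutils and sudo access):

```shell
# Reclaim ownership of everything under .git that a root process created.
sudo chown -R "$(id -un)":"$(id -gn)" .git

# Verify: list anything under .git still owned by someone else.
find .git ! -user "$(id -un)"   # prints nothing once the fix worked
```

Cheap insurance going forward: run in-container commands as your host UID (`docker compose exec -u "$(id -u)"`), which is what Sail does by default when WWWUSER is set.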
What I've Learned About Making It Work
1. Write your hard rules down. My CLAUDE.md files have grown from "here's the tech stack" to "here are the 5 things that will get you fired." The hard rules section exists because of real incidents. They work — Claude follows them.
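For a sense of what "hard rules" means in practice, here's the shape mine take: a reconstruction built from the incidents described above, not my literal file.

```markdown
## Hard rules

- Never guess at deploy fixes. Research the error first, propose a plan, wait for approval.
- Never say "it should work". Prove it works: run the tests, load the page.
- Never run git commands inside Docker containers (they create root-owned .git objects).
- Never mention Discord auth anywhere. Sign-in is magic links + passkeys.
- Do exactly what was asked, nothing more. No new conventions, no renames.
```

Each rule is a sentence, each one traces back to a real incident, and none of them is advice. That specificity seems to be why they actually get followed.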
2. Give feedback aggressively. Every time Claude does something dumb, I save a feedback memory with why it was wrong. Every time it does something surprisingly right, I save that too. The feedback memories are probably more valuable than the project memories.
3. Don't let it guess on infrastructure. Code changes? Sure, let it try things. But deployment, Docker, DNS, production databases? Make it research first, propose a plan, and get your approval before running anything.
4. MCP integrations multiply the value. Claude writing code is useful. Claude with access to your Sentry traces, your GitHub PRs, your database — that's a different thing entirely. The Sentry APM workflow tonight was the most impressive thing I've seen it do. Push for getting these connected at work.
5. Trust but verify, always. Run the tests. Check the browser. Read the diff. Claude is a very fast, very capable junior dev who occasionally has supreme confidence in broken code. That's fine — just don't merge without checking.
Would I Go Back?
No. Even with the ugly parts, the raw throughput is too valuable. I'm shipping features across 5 projects as a solo dev that would've taken me 3x longer without it. The infrastructure migrations alone saved me an entire weekend. The performance audit tonight was genuinely fun.
But I wouldn't trust it unsupervised on production infrastructure. Not yet. Maybe not ever — and honestly, that's fine. The best workflow is Claude doing the heavy lifting while I keep my hands on the wheel for anything that matters.
It's a power tool. Power tools are great. Power tools also take your fingers off if you're not paying attention.

