Posted by speckx 8 hours ago
Lmk how you feel when you're constantly building integrations with legacy software by hand.
Defining “Gambling” like this isn’t really helpful.
You can't keep paying to play the "refinancing game" until you get a good rate. At least, it's not like pulling the lever again and again: you have to wait a long time, and you won't call the same bank over and over until they suddenly offer an amazing rate. It's a different experience, and the psychology is different.
That’s only half of the transition.
The other half - and when you know you’ve made it through the “AI sux” phase - is when you learn to automate the mopping up. Give the agent the info it needs to know whether it did good work - and if it didn’t, give it the information it needs to fix things. Trust that it wants to fix those things. Automate how that info is provided (using code!) and suddenly you are out of the loop. The amount of code needed is surprisingly small, and your agent can write it! Hook a few hundred lines of script up to your harness at key moments, and you will never see dumb AI mistakes again - because the agent fixed them before presenting the work to you, because your script told it about the mistakes while you were off doing something else.
Think of it like linting but far more advanced - your script can walk the code AST and assess anything, or use regexes; your agent will make that call when you ask it for the script. If the script exits with code 2, its stderr is shown to the agent! So you (via your script) can print to stderr what the agent did wrong: what line, what file, what mistake.
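A minimal sketch of the kind of check script described above, assuming the exit-code-2/stderr convention the comment mentions. The specific check (flagging bare `except:` clauses) and the CLI shape are illustrative choices of mine, not something from the linked docs:

```python
import ast
import sys

def check_file(path):
    """Return 'file:line: message' complaints for one Python file."""
    with open(path) as f:
        tree = ast.parse(f.read(), filename=path)
    problems = []
    for node in ast.walk(tree):
        # Illustrative AST check: a bare `except:` silently swallows every error.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            problems.append(
                f"{path}:{node.lineno}: bare except hides failures; "
                "catch a specific exception"
            )
    return problems

def main(paths):
    complaints = [c for p in paths for c in check_file(p)]
    if complaints:
        # Exit code 2 is the signal: stderr is fed back to the agent,
        # telling it exactly what file, line, and mistake to fix.
        print("\n".join(complaints), file=sys.stderr)
        return 2
    return 0

if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

Wired into the harness at the right moment, the agent sees the complaint, fixes the bare `except:`, and reruns the check before the work ever reaches you.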
It’s what I do every day and it works (200k LOC codebase, 99.5% AI-coded) - there’s info and ideas here: https://codeleash.dev/docs/code-quality-checks
This is just another technique to engineer quality outcomes; you’re just working from a different starting point.
When I have Claude create something from scratch, it all appears very competent, even impressive, and it usually will build/function successfully…on the surface. I have noticed on several occasions that Claude has effectively coded the aesthetics of what I want but left the substance out. A feature will appear to have been implemented exactly as I asked, but when I dig into the details, it’s a lot of very brittle logic that will almost certainly become a problem in the future.
This is why I refuse to release anything it makes for me. I know that it’s not good enough, that I won’t be able to properly maintain it, and that such a product would likely harm my reputation, sooner or later. What frightens me is there are a LOT of people who either don’t know enough to recognize this, or who simply don’t care and are looking for a quick buck. It’s already getting significantly more difficult to search for software projects without getting miles of slop. I don’t know how this will ultimately shake out, but if it’s this bad at the thing it’s supposedly good at, I can only imagine the kinds of military applications being leveraged right now…
It also depends on what you're coding with;
- If you're coding with opus4.6, then it's not gambling for a while.
- If you're coding with gemini3-flash, then yeah.
One thing I have noticed, though, is that you have to spend a lot of tokens to keep the error/hallucination rate low as your codebase increases in size. The math of this problem makes sense: as the codebase grows, there's physically more surface where something could go wrong. To avoid that, you have to consistently and efficiently make that surface and all its features visible to the model. If you have coded with a model for a week and it has produced some code, the model is not more intelligent after that week - it still has the same layers and parameters - so keeping the context relevant is a moving target as the codebase increases (and that's probably why it feels like gambling to some people).
> you have to spend a lot of tokens to keep the error/hallucination rate low
Ironically, I find your comment more effective at convincing me AI coding is gambling than the original article. You're talking about it the exact same way that gamblers do about their games.
- Was there any more intelligence that you wanted to add to your argument?