Posted by reasonableklout 5 hours ago
Purely AI-written systems will scale to a point of complexity that no human can ever understand. The defect close rate will taper down, the token burn per defect will scale up, and eventually AI changes will cause on average more defects than they close, leaving the whole system unstable. It will become a special kind of process to clean-room such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.
Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first place, but it will take us 20 years to learn them, just like original software eng took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).
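A toy model of the dynamic described above, as a rough sketch only. It assumes each change closes exactly one defect and introduces new defects in proportion to current complexity; that linear relationship, the function names, and every number are hypothetical and not measured from any real system. Under those assumptions the open-defect count falls until complexity passes a break-even point, then climbs.

```python
def net_defects_per_change(complexity: float, defects_per_unit: float) -> float:
    """Expected change in open defects from one change.

    Assumption: a change closes one defect and introduces
    defects_per_unit * complexity new ones on average.
    """
    return defects_per_unit * complexity - 1.0


def break_even_complexity(defects_per_unit: float) -> float:
    """Complexity at which a change introduces as many defects as it closes."""
    return 1.0 / defects_per_unit


def simulate(
    initial_defects: float,
    initial_complexity: float,
    growth_per_change: float,
    defects_per_unit: float,
    changes: int,
) -> list[float]:
    """Return the open-defect count after each change, starting with the initial count."""
    open_defects = initial_defects
    complexity = initial_complexity
    history = [open_defects]
    for _ in range(changes):
        delta = net_defects_per_change(complexity, defects_per_unit)
        # The open-defect count cannot go below zero.
        open_defects = max(0.0, open_defects + delta)
        # Assumption: every change adds a fixed amount of complexity.
        complexity += growth_per_change
        history.append(open_defects)
    return history


# Arbitrary illustrative parameters, not data.
history = simulate(
    initial_defects=50.0,
    initial_complexity=10.0,
    growth_per_change=1.0,
    defects_per_unit=0.01,
    changes=200,
)
turning_point = history.index(min(history))
```

With these made-up parameters the backlog shrinks at first and only starts growing once complexity crosses `break_even_complexity(0.01)`; changing the growth rate moves the turning point but not the shape.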
Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
Do they??
My team lead has worked on the same software for 30 years. He has the ability to hear me discuss a bug I noticed, and then pinpoint not only the likely culprit, but the exact function that's causing it.
https://www.joelonsoftware.com/2000/04/06/things-you-should-...
A decade ago, I was sitting in on a meeting about a rewrite and, before I could say anything, someone in the first year of her career asked why anyone thought a rewrite would be any cleaner once all the edge cases were handled. Afterwards, I asked her where she learned this. She said "I don't know, it just seems kind of obvious." She went on to be a great engineer and is now a great manager.
Including all of the above.
maybe some that people said were that bad. but they just needed some elbow grease. remember, it takes guts to be amazing!
The reason Oracle can continue failing at those massive projects is simple: everyone fails at them routinely and often it’s the customer’s fault.
it will kill all the people in that hospital too
What do you think the fake Delve attestation scandal was about? https://news.ycombinator.com/item?id=47444319
(Screams in "deployed in 2026 a new product that only works in internet explorer" in healthcare).
Definitely cleaning up other people's AI mess for them for free is not a good use of time.
“These are highly complicated pieces of equipment… almost as complicated as living organisms.
In some cases, they’ve been designed by other computers.
We don’t know exactly how they work.”
Now how did that work out ;-)
I think the problem will get worse. I dislike the marketing around AI, but I do think it is a useful tool to help those who have experience move faster. If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
I've been watching non-developers vibe code stuff, and the general failure mode seems to be ignorance of 3-pick-2 tradeoffs.
They'll spam "make it more reliable" or some such, and AI will best-effort add more intermediary redis caches or similar patterns.
But because the vibe coders don't actually know what a redis cache is or how it works, they'll never make the architectural trade-offs to truly fix things.
I often wonder if it’s the statistical nature of the LLM mixed with a request in the prompt.
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
Plausible?
I have already experienced claude 4.7 handle pretty complex refactors without issues. Scale and correctness aren’t even 1% of the issue they were last year. You just have to get the high level design right, or explicitly ask it to critique your design before building it.
Do you think people are not giving their agents specs and asking for input?
Commits, design reviews, whitepapers, code reviews, test suites. And, pretty concerning: chat logs and even keystrokes from employees nowadays.
The way we train specialized bots now is incredibly inefficient, that part is rapidly improving.
That's serious levels of circular thinking right there.
We train humans to do things untrained humans can not do.
- AI Hype
- AI Psychosis
- AI keeps getting better and better until it can work around big AI slop code bases
The belief in this is a form of AI psychosis, I think.
Maybe in the future but certainly no evidence of this anytime soon
Here's some anecdotal evidence from me - I cleaned up multiple GPT 4.x era vibecoded projects recently with the latest claude model and integrated one of those into a fairly large open source codebase.
This is something AI completely failed at last year.
Maybe you should try something like this or listen to success stories before claiming 'certainly no evidence' in future?
What evidence is there that we're not at or close to a plateau of what LLMs are capable of? How do you know the growth rate from 2023 to present will continue into 2029? eg. Is it more training data? More GPUs? What if we're kind of reaching the limits of those things already?
I don't see why we would assume that we are at a plateau for RL. In many other settings, Go for instance, RL continues to scale until you reach compute limits. Some things are more easily RL'd than others, but ultimately this largely unlocks data. We are not yet compute/energy/physical world constrained. I think you would start observing clear changes in the world around you before that becomes a true bottleneck. Regardless, currently the vast majority of compute is used for inference not training so the compute overhang is large.
Assuming that we plateau at {insert current moment} seems wishful and I've already had this conversation any number of times on this exact forum at every level of capability [3.5, 4, o1, o3, 4.6/5.5, mythos] from Nov 2022 onwards.
And the answer appears to be that the improvement is accelerating. So how could it be stopping?
1) same business logic implemented in two different places, with extra code to sync between them
2) fixing apparently simple bugs results in lots of new code being written
It’s a sign I need to at least temporarily dedicate more effort to overseeing work in that area.
I somewhat agree with the AI psychosis framing of the OP. It takes some taste and discipline to avoid letting things dissolve into complete slop.
* A belief that AI will keep getting better, presented without evidence, does not yield a lot of skepticism around these parts.
* Your comment saying it is wrong to believe AI will keep getting better, also presented without evidence, is downvoted.
I think it will be needless verbose complexity.
I kind of imagine someone having an unlimited budget of free amazon stuff shipped to their house.
In theory, they are living a prosperous life of plenty.
In reality, they will be drowning in something that isn't prosperity.
You have not seen the spreadsheets that accounts run the firm on.
Bloody kids!
The issues have all been structural, not local. It's easier to treat it like a rewrite using the original as a super detailed product spec. Working on the existing codebase works, but you have to aggressively modularize everything anyway to untangle it rather than attack it from the top down.
All of these projects have gone well, but I haven't run into a case where a feature they thought was implemented isn't possible. That will happen eventually.
It's honestly good, quick work as a contractor. But I do hope they invest in building expertise from that point rather than treating it like a stable base to continue vibecoding on.
I exaggerate only a little.
Are you sure about this? Yes, there is a stable set, but they are used in all of the wrong places, particularly in places where they don't belong because juniors and now AIs can recite them and want to use them everywhere. That's not even discussing whether the stable set itself is correct or not - it's dubious at this point.
(None of the above is theoretical)
Violets are blue
AI is great
And so are you
In their current forms, it's unlikely for a product that actually needs to work.
It's not getting that complex and working with current LLMs.
I thought the same when I saw development outsourced to Indians that struggled to write a for loop.
I was wrong.
It turns out that customers will keep doubling down on mistakes until they’re out of funds, and then they’ll hire the cheapest consultants they can find to fix the mess with whatever spare change they can find under the couch cushions.
Source: being called in with a one week time budget to fix a mess built up over years and millions of dollars.
Scrape off all the soil, put it in casks, and bury it in a concrete bunker for 10000 years. Then relocate everyone and attempt to rebuild.
We didn't create the dna we rely on to produce food and lumber, we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine and honorable and valuable function for society, but I have no interest in being a farmer. I build things, I don't plant seeds and pray to the gods and hope they grow into something I want.
If the farming situation were as dire as you seem to suggest, we'd have unpredictable famines all the time, but we don't
Planting is merely setting up the conditions. We didn't write the dna, we couldn't write the dna if we wanted to because we are an infinity away from understanding all the actual processes that descend from the dna. And when we utilize the dna that we simply found and didn't and couldn't hope to write, it's always, at best, a case of hoping it goes right again this time.
It's really nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.
I don't think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe what it tells you then you have AI psychosis. You see this a lot with financial people and VCs on twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about the topic instead of just doing a little bit of thinking themselves.
These things are dog shit when it comes to ideas, thinking, or providing advice because they are pattern matchers: they are just going to give you the pattern they see. Most people see this if they just try to talk to it about an idea. They often just spit out the most generic dog shit.
This, however, is pretty useful for certain tasks where pattern matching is actually beneficial, like writing code, but again you just can't let it do the thinking and decision making.
Here's some other topics I've written on it:
- https://mitchellh.com/writing/my-ai-adoption-journey
- https://mitchellh.com/writing/building-block-economy
- https://mitchellh.com/writing/simdutf-no-libcxx (complex change thanks to AI, shows how I approach it rationally)
I wish I had written that.
>Amazon workers under pressure to up their AI usage are making up tasks
compare 100 pollocks vs 2-3
Claiming that the people who disagree with you must be experiencing a form of psychosis, experiencing actual hallucinations and unable to tell what is real, is a weak ad hominem that comes off no better than calling them retarded or schizophrenic.
If you genuinely think one of your friends is going through a psychotic episode, you should be trying to get them professional help. But don’t assume you can diagnose a human psyche just because you can diagnose a software bug.
To the wider audience on HN the phrasing is pretty clear. An outsider with a tiny bit of intellectual charity wouldn't come to conclusions like you do.
https://en.wikipedia.org/wiki/Chatbot_psychosis
https://www.rollingstone.com/culture/culture-features/ai-spi...
https://www.nytimes.com/2025/06/13/technology/chatgpt-ai-cha...
The key factor is losing touch with reality, which results in individual or collective harm.
There is also such a thing as mass psychosis, and those are unfortunately a more difficult situation because the government and corporations are generally the ones driving them, and they are culturally normalized.
If he meant mass psychosis, he should have said mass psychosis. And again, since he is not a public health scientist or any flavor of psych professional, he probably shouldn’t make those proclamations. And should probably call for a wellness check instead of posting on social media if he were truly concerned for their health.
For people who are considered neurotypical, social coherence often overwrites reality. It's a mechanism for achieving consensus within groups while spending the least amount of brain compute energy. Same goes for messages tagged with social metainfo: they are more likely to influence reality perception, subconsciously. E.g.: if a rich guy says you should be hyped, the people who wanna get rich will feel hyped, and emotional contagion can spread between people who belong to the same "tribe".
It's very visible for us atypical folk who can't participate well in groupthink at all
They almost always generate logically correct text, but sometimes that text has a set of incorrect implicit assumptions and decisions that may not be valid for the use case.
Generating a correct solution requires proper definition of the problem, which is arguably more challenging than creating the solution.
Does it make it better than us? No because ultimately the thing itself doesn’t ‘know’ right from wrong.
The standard of most employment is already to produce mediocre, plausible outputs as cheaply and rapidly as possible. It's a match made in heaven!
It's an incredible tool but it's also very derpy sometimes, full of biases, blind spots etc.
the trick is to be mindful, aware, and deliberate about what decisions are being outsourced. this requires slowing down, losing that absurd 10x vibe coding gain. in exchange, you're more "in-the-loop" and accumulate less cognitive debt.
find ways to let the agent make the boring decisions, like how to loop over some array, or how to adapt the output of one call into the input of another.
make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
tell the agent to halt on ambiguity.
a good engineer will get a 2x or 3x speedup without the downsides.
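one way to picture the "decide ahead of time, delegate the boring parts, halt on ambiguity" split, as a minimal sketch only. every name here (Spec, AmbiguityError, resolve, the example keys) is hypothetical and made up for illustration; it is not the API of any agent framework.

```python
from dataclasses import dataclass


class AmbiguityError(Exception):
    """Raised when the spec neither decides a question nor delegates it."""


@dataclass(frozen=True)
class Spec:
    # Real decisions, made by a human ahead of time.
    decided: dict[str, str]
    # Boring decisions the agent is explicitly allowed to make on its own.
    delegated: frozenset[str]


def resolve(spec: Spec, question: str) -> str:
    """Answer a design question from the spec, or halt if the spec is silent."""
    if question in spec.decided:
        return spec.decided[question]
    if question in spec.delegated:
        return "agent decides"
    # Halt on ambiguity instead of guessing.
    raise AmbiguityError(f"spec is silent on {question!r}; stop and ask")


# Hypothetical example spec; the keys and values are illustrative only.
EXAMPLE_SPEC = Spec(
    decided={
        "pii_logging": "never log raw email addresses",
        "error_handling": "return typed errors at the API boundary",
    },
    delegated=frozenset({"loop_style", "adapter_glue"}),
)
```

anything outside both sets, say `resolve(EXAMPLE_SPEC, "retry_policy")`, raises instead of letting the agent silently pick an answer.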
That kind of advice ultimately doesn't matter. If you're familiar with a programming project, you'll also be familiar with the constructs and API, so looping over an array or mapping some data is obvious. Just like you needn't consult a dictionary to write "Thank you", you just write it.
And if you're not, ultimately you need to check the doc for the contract of some function or the lifecycle of some object to have any guarantee that the software will do what you want it to do. And after a few days of doing that, you'll be familiar with the constructs.
> make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
The only way to do that is if you have implemented the algorithm before and are now redoing it for some reason (instead of using the previous project). If you compare nice specs like the IETF RFCs and the USB standards with their implementations in an OS like FreeBSD, you will see that the implementation often bears no resemblance to how it's described. The spec is important, but getting a consistent implementation based on it is hard work too.
That consistency is hard to get right without getting involved in the details. Because it's ultimately about fine grained control.
Or random consultants.
Is "AI said it was a good idea" any worse than "we were following industry trends"?
Based on the stuff I've seen, yes it seems a lot worse.
I can't imagine how bad it would be if your employer started doing this from the leadership. You'd be pressured to get on board or fear getting fired. Nobody would be trying to moderate your thinking except your coworkers who disagree with it, but those people are going to leave or be fired. If you want to keep your job, you have to play along.
Their entire organization has been handed Codex/Claude and told to "go all in on AI" and "automate everything". So the mandate is for people that do not know how to code and have the keys to the castle to unleash these things upon their systems.
This is at a large organization with tens of thousands of employees.
I am waiting with bated breath for the ultimate outcome!
this leads to naive AI adoption, which is the worst of both worlds (no real speedup, outsourced thinking, ai slop PRs, skill rot).
> your coworkers who disagree with it, but those people are going to leave or be fired.
Personally I expect that I will be this person soon, probably fired. I'm not sure what I will do for a career after, but I sure do hate AI companies now for doing this to my career
This is the right definition. LLM outputs have undefined truth value. They’re mechanized Frankfurtian Bullshitters. Which can be valuable! If you have the tools or taste to filter the things that happen to be true from the rest of the dross.
However! We need a nicer word for it. Suggesting someone has “AI psychosis” feels a bit too impolitic.
Maybe we reclaim “toked out” from our misspent youths?
e.g. “This piece feels a little toked out. Let’s verify a few of Claude’s claims”
[1] here I don't mean to imply agency, just vigor.
Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.
---
Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it. Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed. Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind. Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around. For drafting, brainstorming, or casual questions, ease off and match the task.
---
Beware though that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not fool-proof, but they help (at the start of the conversation, at least).
I'm seeing it with lawyers, too. Like, about law. (Just not in their subject matter.) To the point that I had a lawyer using Perplexity to disagree with actual legal advice I got from a subject-matter expert.
While you have to think about things objectively no matter what, when I start researching topics like physics, using AI as suggested in that article has proven very useful.
To me AI psychosis is the handful of friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover, the one guy who won’t speak to his family directly but has them talk to ChatGPT first and then has ChatGPT generate his response, or the two who are confident that they have discovered that physics and mathematics are incorrect and have discovered the truth of reality through their conversations with the models.
But language is a shared technology so maybe the term is being used for less egregious behavior than I was using it for.
My understanding is that regular psychosis involves someone taking bits and pieces of facts or real world events and chaining them into a logical order or interpolating meanings or explanations which feel real and obvious to the patient but are not sufficiently backed by evidence and thus not in line with our widely accepted understanding of reality.
AI psychosis is then this same phenomenon occurring at a more widespread scale due to the next-word-prediction nature of LLMs facilitating this by lowering the activation energy for this to happen. LLMs are excellent at taking any idea, question, theory and spinning a linear and plausibly coherent line of conversation from it.
I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
The fact that they were hurt by that sudden loss is totally healthy. It's just part of moving on. The real problem was getting into an unhealthy relationship with a fictitious partner under the control of an abusive company willing to exploit their loneliness in exchange for money.
Hopefully they now know better, but people (especially desperate ones) make poor choices all the time to get what's missing in their lives or to distract themselves from it.
Ah, I forgot about the ai relationship companies. No this guy was using the browser based ChatGPT for coding and ended up in love with the model. No relationship was sold at all.
It's so interesting how easy it is to steer LLMs based on context into arriving at whatever conclusion you engineer out of them. They really are like improv actors, and the first rule of improv is "yes, and".
So part of the psychosis is when these people unknowingly steer their LLM into their own conclusions and biases, and then they get magnified and solidified. It's gonna end in disaster.
Right now, prompters are setting up whole company infrastructure. I personally know one. He migrated the company's database to a newer Postgres version. He was successful in the end, but I was gnashing my teeth when he described every step of the process.
It sounded like "And then, I poured gasoline on the servers while smoking a cigarette. But don't worry, I found a fire extinguisher in the basement. The gauge says it's empty, but I can still hear some liquid when I shake it..."
If he leaves the company, they will need an even more confident prompter to maintain their DB infrastructure.
the top reply is from someone doing exactly that, arguing "but the agents are so fast!"
Maybe they're assuming that doubling the code-base/features is more beneficial versus the damage from doubling the number of bugs... Well, at least for this quarter's news to investors...
The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
I mean, yes, logic is useful, but ignorance of risks? Assuming that moving blazingly fast and pulverizing things will result in good eventually?
This AI thing is not progressing well. I don't like this.
Let's say I'm polar opposite of them, and we're on the same page with you.
The whole "you'll be forced to do it" comes from the alternative being that you lose. You no longer get to be a player in the "game". In the same way that coopers and cobblers are no longer a significant thing, but we still have barrels and we still have shoes. Software engineers who refuse to employ any LLMs won't be market competitive. If you adopt it, you at least get to remain playing the game until the game changes/corrects. That's the part that's "not so bad".
Choosing your own survival isn't ethically bankrupt.
You'll be forced to do it, or lose. The unstated assumptions are that, first, it will work, and second, that you can't afford to lose. But let's just assume those for the sake of argument.
> It can't be that bad
That does not follow at all. It can in fact be that bad. That was what made the game theory of MAD different from the game theory of most other things.
Oof. Potential "bad" outcomes of "game theory" should be calibrated to include all the bloody wars and genocides throughout recorded history.
Why did the Foi-ites kill every man, woman and child of the conquered Bar-ite city? Because if they didn't, then they'd be at a disadvantage if the Bar-ites didn't reciprocate in the cities they conquered...
The problem was not him, but the number of people who think like him. They may word it in a more benign form, but the idea is the same.
So obsessed with being the first mover and winning the battle, never thinking whether they should, or what would happen with that scenario.
Missing the whole forest and beyond for a single branch of a single tree.
Thanks. :)
i don't think it's 'our side' that has the psychosis.
plot twist: it's Starbuck
Show HN here: https://news.ycombinator.com/item?id=48151287
I think Mitchell's point is well taken -- it's possible for these tools to introduce rotten foundations that will only be found out later when the whole structure collapses. I don't want to be in the position of being on the hook when that happens and not having the deep understanding of the code base that I used to.
But humans have introduced subtle yet catastrophic bugs into code forever too... A lot of this feels like an open empirical question. Will we see many systems collapse in horrifying ways that they uniquely didn't before? Maybe some, but will we also not learn that we need to shift more to specification and validation? Idk, it just seems to me like this style of building systems is inevitable even as there may be some bumps along the way.
I feel like many in the anti camp have their own kind of reactionary psychosis. I want nothing to do with AI but I also can't deny my experience of using these tools. I wish there were more venues for this kind of realist but negative discussion of AI. Mitchell is a great dev for this reason.
I use AI coding tools every day, but AI tools have no concept of the future.
We've relied on the selfish thinking an engineer does ("If this breaks in prod, I won't be able to fix it. And they'll page me at 3AM") to build stable systems.
The same goes for the general laziness of looking for a perfect library on CPAN so that I don't have to do this work (often taking longer to not find a library than writing it by hand).
I have written thousands of lines of code with AI tools which ended up in prod, and mostly it feels natural, because since 2017 I've been telling people to write code instead of typing it all on my own & setting up pitfalls to catch bad code in testing.
But one thing it doesn't do is "write less code"[1].
[1] - https://xcancel.com/t3rmin4t0r/status/2019277780517781522/
Maybe it's just my prompt or something but my coding agent (Opus 4.7 based) says things like "this is the kind of thing that will blow up at 2am six months from now" all the time.
And we have not even gotten into potential adversarial tactics. If you have no morals, what is better than using agents to flood your competitor with fake bug reports?